HoneyHive

HoneyHive is an AI evaluation and observability platform for Generative AI applications. HoneyHive’s platform gives developers enterprise-grade tools to debug complex retrieval pipelines, evaluate performance over large test suites, monitor usage in real-time, and manage prompts within a shared workspace. Teams use HoneyHive to iterate faster, detect failures at scale, and deliver exceptional AI products.

By integrating Solvio with HoneyHive, you can:

Trace vector database operations
Monitor latency, embedding quality, and context relevance
Evaluate retrieval performance in your RAG pipelines
Optimize parameters such as chunk_size or chunk_overlap
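To make the last point concrete: chunk_size and chunk_overlap control how source text is split before it is embedded. The sketch below is a minimal character-based splitter, offered only as an illustration. The helper name chunk_text and its defaults are hypothetical and are not part of HoneyHive or Solvio; real pipelines often split on tokens or sentences instead.

```python
def chunk_text(text: str, chunk_size: int = 200, chunk_overlap: int = 50) -> list:
    """Split text into character windows of chunk_size that overlap by chunk_overlap.

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks
```

Larger chunks carry more context per vector but dilute the match; more overlap reduces the chance of cutting a relevant sentence in half at the cost of storing more points. Tracing each variant makes it possible to compare them on the same queries.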

Prerequisites

A HoneyHive account and API key
An OpenAI API key
A running Solvio instance (local Docker or Solvio Cloud)
Python 3.8+

Installation

Install the required packages:

pip install solvio-client openai honeyhive

Basic Integration Example

The following example demonstrates a complete RAG pipeline with HoneyHive tracing for Solvio operations. We’ll break down each component step by step.

Initialize Clients and Setup

First, set up the necessary clients and configuration for HoneyHive, OpenAI, and Solvio:

from solvio_client import SolvioClient
from solvio_client.http.models import PointStruct, VectorParams, Distance
import os
from honeyhive.tracer import HoneyHiveTracer
from honeyhive.tracer.custom import trace
from openai import OpenAI

# Read API keys from the environment
openai_api_key = os.getenv("OPENAI_API_KEY")
honeyhive_api_key = os.getenv("HONEYHIVE_API_KEY")

# Initialize HoneyHive Tracer
HoneyHiveTracer.init(
    api_key=honeyhive_api_key,
    project="solvio-rag-example",
    session_name="solvio-integration-demo"
)

# Initialize OpenAI client
openai_client = OpenAI(api_key=openai_api_key)
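Because os.getenv returns None for an unset variable, a missing key would otherwise only surface later as an authentication error. As a hedged sketch, a small check can be run before initializing the clients. The helper missing_env_vars and the REQUIRED_ENV_VARS tuple are hypothetical names introduced here for illustration.

```python
import os

# Environment variables read by the setup code above
REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "HONEYHIVE_API_KEY")

def missing_env_vars(names=REQUIRED_ENV_VARS, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]
```

One way to use it is to call missing_env_vars() at startup and raise an error listing the returned names if the list is not empty.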

Connect to Solvio

You can connect to Solvio in two ways: self-hosted (local) or cloud-hosted (Solvio Cloud):

Option 1: Self-Hosted Solvio (Local)

To run Solvio locally, install Docker and run the following commands:

docker pull solvio/solvio
docker run -p 6333:6333 -p 6334:6334 -v "$(pwd)/solvio_storage:/solvio/storage" solvio/solvio

Then connect to the local Solvio instance:

# Connect to local Solvio
client = SolvioClient(url="http://localhost:6333")
print("Connected to local Solvio instance")

Option 2: Solvio Cloud

For Solvio Cloud, you’ll need your cluster host and API key:

# Solvio Cloud configuration
QDRANT_HOST = os.getenv("QDRANT_HOST")  # e.g., "your-cluster-id.eu-central.aws.cloud.solvio.io"
QDRANT_API_KEY = os.getenv("QDRANT_API_KEY")

# Connect to Solvio Cloud
client = SolvioClient(url=QDRANT_HOST, api_key=QDRANT_API_KEY)
print("Connected to Solvio Cloud")

Create a Collection

Create a collection to store document embeddings:

collection_name = "documents"
vector_size = 1536  # For text-embedding-3-small
vector_distance = Distance.COSINE

# Create collection if it doesn't exist
if not client.collection_exists(collection_name):
    client.create_collection(
        collection_name=collection_name,
        vectors_config=VectorParams(size=vector_size, distance=vector_distance)
    )
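The cosine distance setting means vectors are compared by direction rather than by magnitude. The following pure-Python sketch shows the underlying formula only; it is not how Solvio computes similarity internally, and the function name cosine_similarity is introduced here for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

The length check mirrors a constraint of the collection itself: every vector stored in it must have the configured size (1536 here), so changing the embedding model usually means creating a new collection.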

Define Embedding Function with Tracing

Create a function to generate embeddings with HoneyHive tracing:

@trace()
def embed_text(text: str) -> list:
    """Generate embeddings for a text using OpenAI's API."""
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding
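To give an intuition for what a tracing decorator does, here is a simplified stand-in that records the function name, duration, and any error. It is a conceptual sketch, not HoneyHive's implementation: simple_trace, TRACE_LOG, and fake_embed are hypothetical names, and the real @trace decorator sends richer span data to the HoneyHive backend.

```python
import functools
import time

# Hypothetical in-memory sink used only for this illustration
TRACE_LOG = []

def simple_trace(func):
    """Record the name, wall-clock duration, and error (if any) of each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            error = repr(exc)
            raise  # re-raise so callers still see the failure
        finally:
            TRACE_LOG.append({
                "name": func.__name__,
                "duration_s": time.perf_counter() - start,
                "error": error,
            })
    return wrapper

@simple_trace
def fake_embed(text):
    """Stand-in for an embedding call; returns a one-element vector."""
    return [float(len(text))]
```

Because the decorator wraps the whole call, nested decorated functions each produce their own record, which is what lets a trace view show per-step latency.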

Insert Documents with Tracing

Create a function to insert documents into Solvio with tracing:

@trace()
def insert_documents(docs):
    """Insert documents into Solvio collection."""
    points = []
    for idx, doc in enumerate(docs):
        vector = embed_text(doc)
        points.append(PointStruct(
            id=idx + 1,
            vector=vector,
            payload={"text": doc}
        ))
    
    client.upsert(
        collection_name=collection_name,
        points=points
    )
    return len(points)

# Sample documents
documents = [
    "Solvio is a vector database optimized for storing and searching high-dimensional vectors.",
    "HoneyHive provides observability for AI applications, including RAG pipelines.",
    "Retrieval-Augmented Generation (RAG) combines retrieval systems with generative models.",
    "Vector databases like Solvio are essential for efficient similarity search in RAG systems.",
    "OpenAI's embedding models convert text into high-dimensional vectors for semantic search."
]

# Insert documents
num_inserted = insert_documents(documents)
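Note that the IDs above are positional (idx + 1), so inserting a different list later overwrites points 1 to N. One possible alternative is to derive the ID from the document text. The sketch below assumes that Solvio accepts UUID strings as point IDs, which should be verified against the Solvio documentation; stable_point_id and DOC_NAMESPACE are hypothetical names.

```python
import uuid

# Hypothetical fixed namespace so IDs are reproducible across runs
DOC_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")

def stable_point_id(text: str) -> str:
    """Derive a deterministic UUID (version 5) from the document text."""
    return str(uuid.uuid5(DOC_NAMESPACE, text))
```

With content-derived IDs, re-inserting the same document updates the existing point instead of creating a duplicate, and unrelated documents no longer collide on small integers.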

Retrieve Documents with Tracing

Create a function to retrieve relevant documents from Solvio with tracing:

@trace()
def get_relevant_docs(query: str, top_k: int = 3) -> list:
    """Retrieve relevant documents for a query."""
    # Embed the query
    q_vector = embed_text(query)
    
    # Search in Solvio
    search_response = client.query_points(
        collection_name=collection_name,
        query=q_vector,
        limit=top_k,
        with_payload=True
    )
    
    # Extract results
    docs = []
    for point in search_response.points:
        docs.append({
            "id": point.id,
            "text": point.payload.get("text"),
            "score": point.score
        })
    
    return docs
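The search returns the top_k nearest points even when none of them is a close match. A minimal post-filter on the returned list might look like the following sketch. The helper filter_by_score and the 0.3 default are hypothetical; a suitable threshold depends on the embedding model and data and would need tuning.

```python
def filter_by_score(docs, min_score=0.3):
    """Keep only retrieved documents whose similarity score is at least min_score."""
    return [doc for doc in docs if doc["score"] >= min_score]
```

Applied to the output of get_relevant_docs, this leaves an empty list when nothing is relevant enough, which the answer step can then handle explicitly.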

Generate Response with Tracing

Create a function to generate a response using OpenAI with tracing:

@trace()
def answer_query(query: str, relevant_docs: list) -> str:
    """Generate an answer for a query using retrieved documents."""
    if not relevant_docs:
        return "Could not retrieve relevant documents to answer the query."

    # Format context from retrieved documents
    context_parts = []
    for i, doc in enumerate(relevant_docs):
        context_parts.append(f"Document {i+1} (ID: {doc['id']}, Score: {doc['score']:.4f}):\n{doc['text']}")
    context = "\n\n".join(context_parts)

    # Create prompt
    prompt = f"""Answer the question based ONLY on the following context:

Context:
{context}

Question: {query}

Answer:"""

    # Generate answer
    completion = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that answers questions based strictly on the provided context. If the answer is not in the context, say so clearly."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.2
    )

    return completion.choices[0].message.content.strip()
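The context-formatting step can also be written as a standalone helper that caps the amount of text placed in the prompt. This is a sketch under assumptions: build_context and the max_chars budget are hypothetical, and a character count is only a rough proxy for the model's token limit.

```python
def build_context(relevant_docs, max_chars=4000):
    """Format retrieved documents into a context string, stopping at a character budget.

    The first document is always included; the budget counts document parts,
    not the blank-line separators between them.
    """
    parts = []
    used = 0
    for i, doc in enumerate(relevant_docs, start=1):
        part = f"Document {i} (ID: {doc['id']}, Score: {doc['score']:.4f}):\n{doc['text']}"
        if parts and used + len(part) > max_chars:
            break
        parts.append(part)
        used += len(part)
    return "\n\n".join(parts)
```

Since results arrive ordered by score, truncating from the end drops the least similar documents first.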

Complete RAG Pipeline

Create a function to run the complete RAG pipeline with tracing:

@trace()
def rag_pipeline(query: str) -> dict:
    """End-to-end RAG pipeline."""
    # Get relevant documents
    relevant_docs = get_relevant_docs(query)
    
    # Generate answer
    answer = answer_query(query, relevant_docs)
    
    return {
        "query": query,
        "answer": answer,
        "retrieved_documents": relevant_docs
    }

Batch Processing

For larger document sets, insert documents in batches so that each upsert request stays small:

@trace()
def batch_insert_documents(documents_to_insert, batch_size=10, start_id_offset=0):
    """Insert documents in batches."""
    total_inserted = 0
    
    for i in range(0, len(documents_to_insert), batch_size):
        batch_docs = documents_to_insert[i:i+batch_size]
        points = []
        
        for local_idx, doc in enumerate(batch_docs):
            relative_idx = i + local_idx
            vector = embed_text(doc)
            point_id = relative_idx + start_id_offset + 1
            points.append(PointStruct(
                id=point_id,
                vector=vector,
                payload={"text": doc}
            ))
        
        if points:
            client.upsert(
                collection_name=collection_name,
                points=points
            )
            total_inserted += len(points)
    
    return total_inserted
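The slicing arithmetic in batch_insert_documents can be seen in isolation with a small generator. This is an illustrative sketch; iter_batches is a hypothetical helper and is not part of the Solvio client.

```python
def iter_batches(items, batch_size=10):
    """Yield (start_index, batch) pairs, where each batch has at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield start, items[start:start + batch_size]
```

For 25 items and a batch size of 10, this yields batches of 10, 10, and 5 items starting at indices 0, 10, and 20. The start index plays the role of i in the function above, so start + local_idx + start_id_offset + 1 reproduces the same point IDs.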

Test the RAG Pipeline

Here’s how to test the complete RAG pipeline:

# Test query
test_query = "What is Solvio used for?"
result = rag_pipeline(test_query)

print(f"Query: {result['query']}")
print(f"Answer: {result['answer']}")
print("\nRetrieved Documents:")
for i, doc in enumerate(result['retrieved_documents']):
    print(f"Document {i+1} (ID: {doc['id']}, Score: {doc['score']:.4f}): {doc['text']}")
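A single query only gives an anecdotal check. As a rough sketch of evaluating retrieval over several queries, the helper below computes the fraction of queries whose expected document appears in the top k results. The function hit_rate_at_k and the shape of its inputs are assumptions made for this example; the expected IDs would come from a small hand-labeled set.

```python
def hit_rate_at_k(results_by_query, expected_ids, k=3):
    """Fraction of queries whose expected document ID appears in the top-k results.

    results_by_query: dict mapping query -> list of dicts with an "id" key, best first
    expected_ids: dict mapping query -> ID of the document that should be retrieved
    """
    if not expected_ids:
        raise ValueError("expected_ids must not be empty")
    hits = 0
    for query, expected_id in expected_ids.items():
        top_ids = [doc["id"] for doc in results_by_query.get(query, [])[:k]]
        if expected_id in top_ids:
            hits += 1
    return hits / len(expected_ids)
```

Here results_by_query could be built by calling get_relevant_docs for each labeled query, so the same traced function serves both the pipeline and the evaluation.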

Viewing Traces in HoneyHive

After running your RAG pipeline with Solvio, you can view the traces in the HoneyHive UI:

Navigate to your project in the HoneyHive dashboard
Click on the “Traces” tab to see all the traces from your RAG pipeline
Click on a specific trace to see detailed information about each step in the pipeline
Analyze the performance of your vector operations, embeddings, and retrieval processes

With these traces in place, you can monitor the latency and retrieval quality of each step in your Solvio-powered RAG pipeline and tune it based on what you observe.