PGVector
Quick Summary
DeepEval allows you to evaluate your PGVector retriever and optimize retrieval hyperparameters like top-K, embedding model, and similarity function.
To get started, install the PostgreSQL driver and the pgvector Python client using the following command:
pip install psycopg2 pgvector
PGVector is an open-source PostgreSQL extension that enables semantic search and similarity-based retrieval directly within PostgreSQL, making it a scalable, SQL-native solution for LLM applications and RAG pipelines. Learn more about PGVector here.
This diagram illustrates how PGVector fits into your RAG pipeline.

Setup PGVector
To get started, connect to your PostgreSQL database.
import os

import psycopg2

# Connect to PostgreSQL database
conn = psycopg2.connect(
    dbname="your_database",
    user="your_user",
    password=os.getenv("PG_PASSWORD"),  # Set in environment variable
    host="localhost",
    port="5432",
)
cursor = conn.cursor()
Next, create a table to store text and embedding vectors.
# Enable the pgvector extension (only needed once)
cursor.execute("CREATE EXTENSION IF NOT EXISTS vector;")

# Define table schema for text and embeddings
cursor.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id SERIAL PRIMARY KEY,
        text TEXT,
        embedding vector(384) -- Defines a 384-dimension vector
    );
""")
conn.commit()
Finally, convert document chunks into vectors using an embedding model and insert them into PostgreSQL.
# Load an embedding model
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
# Example document chunks
document_chunks = [
    "PGVector brings vector search to PostgreSQL.",
    "RAG improves AI-generated responses with retrieved context.",
    "Vector search enables high-precision semantic retrieval.",
    # ...
]
# Store chunks with embeddings in PGVector
for chunk in document_chunks:
    embedding = model.encode(chunk).tolist()  # Convert text to vector
    cursor.execute(
        "INSERT INTO documents (text, embedding) VALUES (%s, %s);",
        (chunk, embedding),
    )
conn.commit()
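A driver-independent way to pass an embedding is pgvector's text input format, `[x1,x2,...]`, combined with an explicit `::vector` cast. The following is a minimal sketch of that idea, assuming the 384-dimension column defined above; `to_pgvector_literal` and `EXPECTED_DIM` are hypothetical names introduced for illustration and are not part of pgvector or psycopg2.

```python
import math

EXPECTED_DIM = 384  # assumption: must match the vector(384) column definition


def to_pgvector_literal(embedding, expected_dim=EXPECTED_DIM):
    """Format a sequence of numbers as pgvector's text form, e.g. '[0.1,0.2]'."""
    values = [float(x) for x in embedding]
    if len(values) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(values)}"
        )
    # Guard against NaN and infinity before sending the value to the database
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinite values")
    return "[" + ",".join(repr(v) for v in values) + "]"


# Usage with the cursor from the setup step (not executed here):
# cursor.execute(
#     "INSERT INTO documents (text, embedding) VALUES (%s, %s::vector);",
#     (chunk, to_pgvector_literal(embedding)),
# )
```

Checking the dimension on the client side surfaces a mismatch between the embedding model and the table schema before the insert reaches PostgreSQL.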
To use PGVector as part of your RAG pipeline, perform a similarity search in PostgreSQL to retrieve the most relevant document chunks and insert them into your prompt template. This ensures your model has the necessary context to generate accurate and well-informed responses.
Evaluating PGVector Retrieval
Evaluating your PGVector retriever consists of two steps:
- Preparing an input query along with the expected LLM response, and using the input to generate a response from your RAG pipeline to create an LLMTestCase containing the input, actual output, expected output, and retrieval context.
- Evaluating the test case using a selection of retrieval metrics.
An LLMTestCase allows you to create unit tests for your LLM applications, helping you identify specific weaknesses in your RAG application.
Preparing your Test Case
Since the first step in generating a response from your RAG pipeline is retrieving the relevant retrieval_context from your PGVector table, perform this retrieval for your input query.
def search(query, top_k=3):
    query_embedding = model.encode(query).tolist()
    # Cast the parameter to vector so the distance operator resolves
    cursor.execute("""
        SELECT text FROM documents
        ORDER BY embedding <=> %s::vector -- <=> is cosine distance; <-> is L2 distance
        LIMIT %s;
    """, (query_embedding, top_k))
    return [row[0] for row in cursor.fetchall()]
query = "How does PGVector work?"
retrieval_context = search(query)
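pgvector exposes several distance operators: `<->` for L2 (Euclidean) distance, `<=>` for cosine distance, and `<#>` for negative inner product. Choosing among them is the "similarity function" hyperparameter. The pure-Python sketch below mirrors what the first two compute, to show how the choice can change the ranking; the helper names and toy vectors are illustrative only.

```python
import math


def l2_distance(a, b):
    """What pgvector's <-> operator computes: Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """What pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query_vec = [1.0, 0.0]
same_direction = [3.0, 0.0]  # same direction as the query, larger magnitude
nearby = [1.0, 0.5]          # close in space, slightly different direction

candidates = [same_direction, nearby]

# Cosine distance ignores magnitude; L2 distance does not
by_cosine = sorted(candidates, key=lambda v: cosine_distance(query_vec, v))
by_l2 = sorted(candidates, key=lambda v: l2_distance(query_vec, v))

print("closest by cosine:", by_cosine[0])
print("closest by L2:", by_l2[0])
```

For unit-length embeddings the two orderings coincide, so the difference matters mainly when vectors are not normalized.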
Next, pass the retrieved context into your LLM's prompt template to generate a response.
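prompt_template = """
Answer the user question based on the supporting context
User Question:
{input}
Supporting Context:
{retrieval_context}
"""
# Fill the template with the query and the retrieved chunks
prompt = prompt_template.format(
    input=query,
    retrieval_context="\n".join(retrieval_context),
)
actual_output = generate(prompt)  # hypothetical function, replace with your own LLM
print(actual_output)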
Let's examine the actual_output generated by our RAG pipeline:
PGVector enables efficient vector search within PostgreSQL for AI applications.
Finally, create an LLMTestCase using the input and expected output you prepared, along with the actual output and retrieval context you generated.
from deepeval.test_case import LLMTestCase

test_case = LLMTestCase(
    input=query,  # the query string defined earlier, not Python's built-in input
    actual_output=actual_output,
    retrieval_context=retrieval_context,
    expected_output="PGVector is an extension that brings efficient vector search capabilities to PostgreSQL.",
)
Running Evaluations
To run evaluations on the LLMTestCase, we first need to define relevant deepeval metrics to evaluate the PGVector retriever: contextual recall, contextual precision, and contextual relevancy.
These contextual metrics help assess your retriever. For more retriever evaluation details, check out this guide.
from deepeval.metrics import (
    ContextualRecallMetric,
    ContextualPrecisionMetric,
    ContextualRelevancyMetric,
)

contextual_recall = ContextualRecallMetric()
contextual_precision = ContextualPrecisionMetric()
contextual_relevancy = ContextualRelevancyMetric()
Finally, pass the test case and metrics into the evaluate function to begin the evaluation.
from deepeval import evaluate

evaluate(
    [test_case],
    metrics=[contextual_recall, contextual_precision, contextual_relevancy],
)
Improving PGVector Retrieval
Below is a table outlining the hypothetical metric scores for your evaluation run.
| Metric | Score |
|---|---|
| Contextual Precision | 0.85 |
| Contextual Recall | 0.92 |
| Contextual Relevancy | 0.44 |
Each contextual metric evaluates a specific hyperparameter. To learn more about this, read this guide on RAG evaluation.
To improve your PGVector retriever, you'll need to experiment with various hyperparameters and prepare LLMTestCases using generations from different retriever versions.
Ultimately, analyzing improvements and regressions in contextual metric scores (the three metrics defined above) will help you determine the optimal hyperparameter combination for your PGVector retriever.
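One way to organize such an experiment is to sweep a single hyperparameter, here top-K, and collect one set of test-case fields per setting. The sketch below is an assumption about how you might structure this, not a DeepEval API: `build_cases`, `search_fn`, and `generate_fn` are hypothetical names, and the stub functions stand in for the PGVector search and your LLM call so the example runs without a database. Each resulting dict uses the same keyword names as LLMTestCase, so it can be unpacked with `LLMTestCase(**fields)` before evaluation.

```python
def build_cases(query, expected_output, search_fn, generate_fn, top_k_values):
    """Return {top_k: test-case fields} for each retriever setting."""
    cases = {}
    for top_k in top_k_values:
        retrieval_context = search_fn(query, top_k=top_k)
        prompt = (
            "Answer the user question based on the supporting context\n"
            f"User Question:\n{query}\n"
            "Supporting Context:\n" + "\n".join(retrieval_context)
        )
        cases[top_k] = {
            "input": query,
            "actual_output": generate_fn(prompt),
            "expected_output": expected_output,
            "retrieval_context": retrieval_context,
        }
    return cases


# Stand-ins (hypothetical) so the sketch runs without PostgreSQL or an LLM
_CHUNKS = [
    "PGVector brings vector search to PostgreSQL.",
    "RAG improves AI-generated responses with retrieved context.",
    "Vector search enables high-precision semantic retrieval.",
]


def fake_search(query, top_k=3):
    return _CHUNKS[:top_k]


def fake_generate(prompt):
    return "PGVector enables efficient vector search within PostgreSQL."


cases = build_cases(
    query="How does PGVector work?",
    expected_output="PGVector is an extension that brings efficient vector search capabilities to PostgreSQL.",
    search_fn=fake_search,
    generate_fn=fake_generate,
    top_k_values=[1, 2, 3],
)

for top_k, fields in cases.items():
    print(top_k, len(fields["retrieval_context"]))
```

Keeping the query and expected output fixed across settings means any change in the contextual metric scores can be attributed to the retriever setting being varied.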
For a more detailed guide on tuning your retriever’s hyperparameters, check out this guide.