RAG Development with pgvector Vector Database (PostgreSQL)
pgvector is a PostgreSQL extension that adds a vector data type and vector search operations. If your primary data is already in PostgreSQL, pgvector allows you to implement RAG without introducing a separate vector database. Suitable for moderate volumes (up to 1–5M vectors) and teams that don't want to add a new infrastructure component.
Installing pgvector
-- Install extension
CREATE EXTENSION IF NOT EXISTS vector;
-- Table for document chunks
CREATE TABLE document_chunks (
id BIGSERIAL PRIMARY KEY,
content TEXT NOT NULL,
source VARCHAR(512),
doc_type VARCHAR(64),
page_number INTEGER DEFAULT 0,
metadata JSONB,
embedding vector(1536), -- dimension = embedding model
created_at TIMESTAMP DEFAULT NOW()
);
-- HNSW index for fast search
CREATE INDEX ON document_chunks USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
Indexing via Python
import psycopg2
from openai import OpenAI
import json
conn = psycopg2.connect("postgresql://user:pass@localhost:5432/ragdb")
openai_client = OpenAI()
def index_chunk(text: str, source: str, doc_type: str, metadata: dict):
# Get embedding
response = openai_client.embeddings.create(
model="text-embedding-3-small",
input=text,
)
embedding = response.data[0].embedding
with conn.cursor() as cur:
cur.execute("""
INSERT INTO document_chunks (content, source, doc_type, metadata, embedding)
VALUES (%s, %s, %s, %s, %s)
""", (text, source, doc_type, json.dumps(metadata), embedding))
conn.commit()
Vector Search with Filtering
def search_similar(query: str, doc_type: str = None, limit: int = 5) -> list:
query_embedding = openai_client.embeddings.create(
model="text-embedding-3-small",
input=query,
).data[0].embedding
sql = """
SELECT content, source, doc_type, metadata,
1 - (embedding <=> %s::vector) AS similarity
FROM document_chunks
WHERE ($2::text IS NULL OR doc_type = $2)
ORDER BY embedding <=> %s::vector
LIMIT %s
"""
with conn.cursor() as cur:
cur.execute(sql, (query_embedding, doc_type, query_embedding, limit))
results = cur.fetchall()
return [
{"text": r[0], "source": r[1], "similarity": r[4]}
for r in results
]
pgvector operators:
-
<=>— cosine distance -
<->— euclidean distance -
<#>— inner product (negative)
Timeline
- Setting up pgvector + table: 1 day
- Ingestion pipeline: 2–4 days
- RAG pipeline: 3–5 days
- Total: 1–2 weeks







