RAG Development with ChromaDB Vector Database
ChromaDB is an open-source vector database focused on ease of use. For local use it runs embedded in the Python process, with no separate server to deploy, and it supports both in-memory and persistent (on-disk) modes. ChromaDB is a common choice for prototyping RAG systems and for small production deployments (up to several million documents).
Installation and Connection
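# Install first: pip install chromadb
import chromadb

# Pick ONE of the clients below; each assignment replaces the previous one.
# In-memory (for development and testing; data is lost when the process exits)
client = chromadb.EphemeralClient()
# Persistent (file storage under the given path)
client = chromadb.PersistentClient(path="./chroma_db")
# HTTP client (production; expects a Chroma server already running at host:port)
client = chromadb.HttpClient(host="localhost", port=8000)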
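Because the three clients are alternatives, one way to switch between them is a small factory keyed by a mode string. This is only a sketch: `make_client` and its `mode` values are hypothetical names introduced here, not part of ChromaDB. The constructors it calls are the three shown in this section.

```python
def make_client(mode: str = "ephemeral", path: str = "./chroma_db",
                host: str = "localhost", port: int = 8000):
    """Return a ChromaDB client for the given mode (hypothetical helper)."""
    if mode not in ("ephemeral", "persistent", "http"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Imported lazily so the argument check above works without chromadb installed
    import chromadb
    if mode == "ephemeral":
        return chromadb.EphemeralClient()
    if mode == "persistent":
        return chromadb.PersistentClient(path=path)
    return chromadb.HttpClient(host=host, port=port)
```

The mode could come from an environment variable or a config file, so that tests run against the in-memory client while a deployment points at the HTTP server.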
Collection Creation and Indexing
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction

embedding_fn = OpenAIEmbeddingFunction(
    api_key="...",
    model_name="text-embedding-3-small",
)
collection = client.get_or_create_collection(
    name="knowledge_base",
    embedding_function=embedding_fn,
    metadata={"hnsw:space": "cosine"},  # Similarity metric
)
# Add documents (documents, metadatas, and ids must all have the same length)
collection.add(
    documents=["Text chunk 1", "Text chunk 2"],  # ...one entry per chunk
    metadatas=[
        {"source": "contract.pdf", "page": 1, "doc_type": "contract"},
        {"source": "faq.md", "page": 0, "doc_type": "faq"},
    ],
    ids=["chunk_001", "chunk_002"],  # ...one unique id per chunk
)
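The `collection.add` call takes three parallel lists, so in practice they are usually built from source text by a chunking step. The following is a minimal sketch of that step, assuming simple fixed-size character windows with overlap. The helper names (`chunk_text`, `build_records`), the default sizes, and the id scheme are illustrative assumptions, not part of ChromaDB.

```python
from typing import Dict, List


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into character windows of `size` that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
        start += size - overlap
    return chunks


def build_records(text: str, source: str, doc_type: str, **chunk_kwargs) -> Dict[str, list]:
    """Build equal-length documents/metadatas/ids lists for one source text."""
    chunks = chunk_text(text, **chunk_kwargs)
    return {
        "documents": chunks,
        "metadatas": [
            {"source": source, "doc_type": doc_type, "chunk": i}
            for i in range(len(chunks))
        ],
        # Hypothetical id scheme: source name plus a zero-padded chunk index
        "ids": [f"{source}:{i:04d}" for i in range(len(chunks))],
    }
```

With a collection like the one above, the result could be passed as `collection.add(**build_records(text, "contract.pdf", "contract"))`. An empty text produces empty lists, so check for that case before calling `add`.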
RAG Query
from openai import OpenAI

openai_client = OpenAI()  # Reads OPENAI_API_KEY from the environment

def rag_answer(question: str, n_results: int = 4) -> str:
    # Find relevant chunks
    results = collection.query(
        query_texts=[question],
        n_results=n_results,
        where={"doc_type": {"$in": ["contract", "regulation"]}},  # Metadata filter
    )
    # query() returns one result list per query text; take the list for our single question
    context = "\n\n".join(results["documents"][0])
    # Generate answer
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer only based on the context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        temperature=0,
    )
    # content may be None, so fall back to an empty string to honor the return type
    return response.choices[0].message.content or ""

answer = rag_answer("What is the contract duration?")
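The query step joins the raw chunks into one context string. A variation worth considering is to label each chunk with its source and to drop weak matches before prompting. The sketch below works on the dictionary returned by `collection.query`, assuming the default result fields (`documents`, `metadatas`, `distances`) are present. `build_context` and `max_distance` are hypothetical names, and any threshold value is an assumption to tune per dataset. With the cosine space configured earlier, the distance is 1 minus cosine similarity, so lower means closer.

```python
from typing import Optional


def build_context(results: dict, max_distance: Optional[float] = None) -> str:
    """Format the hits of the first query as source-labelled context blocks."""
    docs = results["documents"][0]
    metas = results["metadatas"][0]
    dists = results["distances"][0]
    parts = []
    for doc, meta, dist in zip(docs, metas, dists):
        if max_distance is not None and dist > max_distance:
            continue  # skip chunks that are too far from the question
        meta = meta or {}  # a chunk stored without metadata may come back as None
        label = f"[{meta.get('source', 'unknown')}, p. {meta.get('page', '?')}]"
        parts.append(f"{label}\n{doc}")
    return "\n\n".join(parts)
```

If the returned string is empty, the caller can answer "not found in the knowledge base" directly instead of sending an empty context to the model.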
Timeline
- RAG prototype with ChromaDB: 2–5 days
- Production version with monitoring: 2–3 weeks







