RAG Development with Weaviate Vector Database
Weaviate is an open-source vector database with GraphQL, REST, and gRPC APIs, a modular architecture, and built-in support for several search modes (vector, BM25, hybrid). Its distinctive features are native integration modules for embedding providers (OpenAI, Cohere, HuggingFace), GraphQL for complex queries, and a rich schema for metadata.
Installation and Initialization
import weaviate
import weaviate.classes as wvc
from weaviate.classes.config import Configure, Property, DataType
# Connect to local Weaviate
client = weaviate.connect_to_local(
    host="localhost",
    port=8080,
    grpc_port=50051,
)
# Or to Weaviate Cloud
# (connect_to_wcs is the older, deprecated name for this helper)
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="https://your-cluster.weaviate.network",
    auth_credentials=weaviate.auth.AuthApiKey("..."),
)
Creating Collection Schema
client.collections.create(
    name="KnowledgeBase",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-large",
        dimensions=3072,
    ),
    generative_config=Configure.Generative.openai(model="gpt-4o"),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),
        Property(name="doc_type", data_type=DataType.TEXT),
        Property(name="page_number", data_type=DataType.INT),
        Property(name="date", data_type=DataType.DATE),
        Property(name="department", data_type=DataType.TEXT),
    ],
)
Weaviate vectorizes text automatically through the configured module, so there is no need to call an embedding API manually during indexing.
Document Indexing
collection = client.collections.get("KnowledgeBase")
# Batch loading
with collection.batch.dynamic() as batch:
    for chunk in document_chunks:
        batch.add_object(
            properties={
                "content": chunk.page_content,
                "source": chunk.metadata["source"],
                "doc_type": chunk.metadata.get("doc_type", "general"),
                "page_number": chunk.metadata.get("page", 0),
                "department": chunk.metadata.get("department", ""),
            }
        )
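The schema declares a `date` property that the batch loop above does not fill. A minimal sketch of a mapping helper follows, assuming chunks look like LangChain-style documents and that the date arrives as an ISO 8601 string under a `date` metadata key. The `Chunk` class, the `chunk_to_properties` name, and the UTC default are all hypothetical, not part of the original pipeline.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class Chunk:
    """Hypothetical stand-in for a LangChain-style document chunk."""
    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def chunk_to_properties(chunk: Chunk) -> dict[str, Any]:
    """Map a chunk onto the KnowledgeBase properties from the schema."""
    props: dict[str, Any] = {
        "content": chunk.page_content,
        "source": chunk.metadata["source"],  # required: raises KeyError if absent
        "doc_type": chunk.metadata.get("doc_type", "general"),
        "page_number": int(chunk.metadata.get("page", 0)),
        "department": chunk.metadata.get("department", ""),
    }
    raw_date = chunk.metadata.get("date")  # assumption: ISO 8601 string
    if raw_date:
        parsed = datetime.fromisoformat(raw_date)
        if parsed.tzinfo is None:
            # Assumption: naive timestamps are treated as UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        props["date"] = parsed.isoformat()  # RFC 3339 compatible string
    return props
```

The returned dictionary can be passed as `properties=` to `batch.add_object`. After the batch context exits, it is worth inspecting `collection.batch.failed_objects`, since individual objects can fail without raising an exception.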
Search Types in Weaviate
Vector search (near_text):
results = collection.query.near_text(
    query="contract approval procedure",
    limit=5,
    # Vector search reports a distance; score applies to BM25 and hybrid queries
    return_metadata=wvc.query.MetadataQuery(distance=True),
    filters=wvc.query.Filter.by_property("doc_type").equal("contract"),
)
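To interpret the `distance` value returned with each object, it helps to see what it measures. The sketch below assumes the collection uses the cosine metric, where distance is one minus cosine similarity. The function name is illustrative, and the toy vectors stand in for real 3072-dimensional embeddings.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity: 0 = same direction, 2 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


# Toy 2-dimensional "embeddings" for illustration only
print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # identical direction
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # orthogonal
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

Smaller distances mean closer matches, so results come back in ascending order of distance.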
BM25 search:
results = collection.query.bm25(
    query="rental agreement approval",
    limit=5,
    query_properties=["content"],  # Fields for BM25
)
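For intuition about what a BM25 score rewards, here is a simplified scorer over whitespace tokens. It is a textbook sketch, not Weaviate's implementation (which uses its own tokenization and per-property handling). The values `k1=1.2` and `b=0.75` are common defaults used here as an assumption.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Score each document against the query with the classic BM25 formula."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Number of documents containing each term
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates and is normalized by document length
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "rental agreement approval steps",
    "vacation policy for employees",
    "approval of rental contracts",
]
print(bm25_scores("rental agreement approval", docs))
```

Documents that contain more of the query terms, and rarer ones, score higher. A document with no matching term scores zero, which is why exact identifiers such as article numbers work well with BM25.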
Hybrid search:
results = collection.query.hybrid(
    query="approval procedure",
    alpha=0.75,  # 0=BM25, 1=vector
    limit=5,
    fusion_type=wvc.query.HybridFusion.RELATIVE_SCORE,  # or RANKED
)
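The effect of `alpha` and of the two fusion types can be illustrated with a small simulation. This is a simplified sketch of the general idea (min-max normalization plus a weighted sum for relative-score fusion, a weighted reciprocal-rank sum for ranked fusion), not Weaviate's internal code. The constant `k=60`, the function names, and the toy scores are assumptions.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def relative_score_fusion(vector: dict[str, float], keyword: dict[str, float], alpha: float):
    """Blend normalized scores: alpha weights vector, (1 - alpha) weights BM25."""
    v, k = min_max(vector), min_max(keyword)
    fused = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * k.get(doc_id, 0.0)
        for doc_id in set(v) | set(k)
    }
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


def ranked_fusion(vector_ranking: list[str], keyword_ranking: list[str], alpha: float, k: int = 60):
    """Blend by rank position only, ignoring the raw score values."""
    fused: dict[str, float] = {}
    for weight, ranking in ((alpha, vector_ranking), (1 - alpha, keyword_ranking)):
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Toy inputs: vector similarities (higher is better) and BM25 scores
vector = {"a": 0.9, "b": 0.5, "c": 0.1}
keyword = {"b": 12.0, "d": 6.0, "c": 3.0}
print(relative_score_fusion(vector, keyword, alpha=0.75))
print(ranked_fusion(["a", "b", "c"], ["b", "d", "c"], alpha=0.75))
```

Relative-score fusion keeps information about how far apart the scores are, while ranked fusion rewards documents that appear in both lists regardless of score gaps. Which one works better depends on the data, so it is worth evaluating both.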
Generative Search (RAG via Weaviate)
Weaviate can perform RAG directly through its generative module:
# Built-in RAG — retrieval + generation in single query
response = collection.generate.near_text(
    query="What is the approval process for procurement?",
    limit=3,
    single_prompt="Based on the following document answer the question: {content}\n\nQuestion: What is the approval process for procurement?",
    grouped_task="Summarize key steps of the procurement approval procedure based on provided documents.",
)
print(response.generated)  # grouped_task result over all retrieved objects
for obj in response.objects:
    print(obj.generated)  # single_prompt result for each object
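In the query above the user question is written out twice, once in `query` and once inside `single_prompt`. A small helper can build the prompt from the question while leaving the `{content}` placeholder intact for Weaviate to fill per object. The helper name and the prompt wording are hypothetical.

```python
def build_single_prompt(question: str, content_property: str = "content") -> str:
    """Build a per-object prompt; the {property} placeholder is left for Weaviate to fill."""
    placeholder = "{" + content_property + "}"
    return (
        "Answer the question using only the document below.\n\n"
        f"Document: {placeholder}\n\n"
        f"Question: {question}"
    )


question = "What is the approval process for procurement?"
print(build_single_prompt(question))
```

The same `question` variable can then be passed both as `query=` and through `single_prompt=build_single_prompt(question)`, so retrieval and generation cannot drift apart.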
Practical Case: RAG for Law Firm
Task: an assistant for lawyers working with Russian legislation that searches laws, judicial practice, and internal guidelines.
Volume: 28,000 documents (~4.2M chunks of 300 tokens each).
Weaviate Configuration:
- Self-hosted on k8s (3 replicas)
- text2vec-openai (text-embedding-3-large, dimension=3072)
- Hybrid search, alpha=0.65 (slightly more weight to dense)
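A quick back-of-the-envelope calculation shows what this configuration implies for vector storage. It is a rough sketch that assumes vectors are stored as 4-byte floats and counts raw vectors only. Index overhead, object payloads, the inverted index, and replication are deliberately left out.

```python
def raw_vector_bytes(n_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw size of the stored vectors, assuming float32 values by default."""
    return n_vectors * dimensions * bytes_per_value


N_DOCS = 28_000
N_CHUNKS = 4_200_000
DIMENSIONS = 3072  # text-embedding-3-large

chunks_per_doc = N_CHUNKS / N_DOCS
size_bytes = raw_vector_bytes(N_CHUNKS, DIMENSIONS)

print(f"Average chunks per document: {chunks_per_doc:.0f}")
print(f"Raw vectors: {size_bytes / 1e9:.1f} GB ({size_bytes / 2**30:.1f} GiB)")
```

At roughly 50 GB for the raw vectors alone, memory planning matters at this scale. Options such as reduced embedding dimensions or vector compression are worth evaluating against retrieval quality.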
RAGAS Results:
| Metric | Dense only | Hybrid (α=0.65) | Hybrid + rerank |
|---|---|---|---|
| Context Precision | 0.71 | 0.82 | 0.89 |
| Context Recall | 0.74 | 0.81 | 0.84 |
| Faithfulness | 0.79 | 0.88 | 0.92 |
Hybrid search raised Context Precision from 0.71 to 0.82 compared to pure dense retrieval (+0.11 absolute, roughly +15% relative). The gain was largest for queries with exact terms, such as article numbers and specific legal constructs that embedding models differentiate poorly.
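Improvements in a table like this can be quoted as absolute points or as relative percentages, and the two differ noticeably. The snippet below recomputes both from the values in the table. The helper name is illustrative.

```python
def deltas(baseline: float, value: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain in percent) over the baseline."""
    absolute = value - baseline
    return round(absolute, 2), round(absolute / baseline * 100, 1)


# (dense only, hybrid, hybrid + rerank), copied from the RAGAS table
ragas = {
    "Context Precision": (0.71, 0.82, 0.89),
    "Context Recall": (0.74, 0.81, 0.84),
    "Faithfulness": (0.79, 0.88, 0.92),
}

for metric, (dense, hybrid, reranked) in ragas.items():
    print(metric, "hybrid:", deltas(dense, hybrid), "hybrid + rerank:", deltas(dense, reranked))
```

Stating which of the two figures is meant avoids ambiguity when comparing configurations.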
Multi-Tenancy in Weaviate
For SaaS products or data isolation between clients:
# Create collection with multitenancy
client.collections.create(
    name="ClientDocs",
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    ...
)
# Create tenant
collection = client.collections.get("ClientDocs")
collection.tenants.create([wvc.tenants.Tenant(name="client_001")])
# Query in context of specific tenant
tenant_collection = collection.with_tenant("client_001")
results = tenant_collection.query.hybrid(query="...", limit=5)
Timeline
- Weaviate setup + schema: 2–3 days
- Ingestion pipeline: 3–7 days
- RAG pipeline with evaluation: 1–2 weeks
- Multi-tenancy and production: 1–2 weeks
- Total: 2–5 weeks







