RAG Development with Weaviate Vector Database

We design and deploy artificial intelligence systems, from prototypes to production-ready solutions. Our team combines expertise in machine learning, data engineering, and MLOps to make AI work not just in the lab, but in real business settings.


Weaviate is an open-source vector database with GraphQL/REST API, modular architecture, and built-in support for multiple search formats (vector, BM25, hybrid). Distinctive features: native integration modules with embedding providers (OpenAI, Cohere, HuggingFace), GraphQL for complex queries, and rich schema for metadata.

Installation and Initialization

import weaviate
import weaviate.classes as wvc
from weaviate.classes.config import Configure, Property, DataType

# Connect to local Weaviate
client = weaviate.connect_to_local(
    host="localhost",
    port=8080,
    grpc_port=50051,
)

# Or connect to Weaviate Cloud (connect_to_wcs is a deprecated alias)
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="https://your-cluster.weaviate.network",
    auth_credentials=weaviate.auth.AuthApiKey("..."),
)

Creating Collection Schema

client.collections.create(
    name="KnowledgeBase",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-large",
        dimensions=3072,
    ),
    generative_config=Configure.Generative.openai(model="gpt-4o"),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),
        Property(name="doc_type", data_type=DataType.TEXT),
        Property(name="page_number", data_type=DataType.INT),
        Property(name="date", data_type=DataType.DATE),
        Property(name="department", data_type=DataType.TEXT),
    ],
)

Weaviate vectorizes text automatically through the configured module; there is no need to call an embedding API manually during indexing.
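The schema's DataType.DATE property stores RFC 3339 timestamps, which must carry a timezone offset. If you pass dates as strings, a small hypothetical helper (not part of the Weaviate client) keeps them well-formed:

```python
from datetime import datetime, timezone

def to_rfc3339(dt: datetime) -> str:
    """Format a datetime as an RFC 3339 string for a Weaviate DATE property.

    Naive datetimes are assumed to be UTC; DATE values without a timezone
    offset are rejected by Weaviate.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.isoformat()

stamp = to_rfc3339(datetime(2024, 3, 15, 10, 30))
# stamp == "2024-03-15T10:30:00+00:00"
```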

Document Indexing

collection = client.collections.get("KnowledgeBase")

# Batch loading
with collection.batch.dynamic() as batch:
    for chunk in document_chunks:
        batch.add_object(
            properties={
                "content": chunk.page_content,
                "source": chunk.metadata["source"],
                "doc_type": chunk.metadata.get("doc_type", "general"),
                "page_number": chunk.metadata.get("page", 0),
                "department": chunk.metadata.get("department", ""),
            }
        )
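The `document_chunks` iterable above is assumed to come from an upstream splitter (the `page_content`/`metadata` shape matches LangChain-style documents). A minimal self-contained sketch, with a hypothetical `Chunk` type and naive word-based splitting:

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    """Hypothetical chunk type matching the attributes used in the batch loop."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def split_document(text: str, source: str,
                   chunk_size: int = 300, overlap: int = 50) -> list[Chunk]:
    """Naive splitter: ~chunk_size words per chunk, overlapping by `overlap` words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = " ".join(words[start:start + chunk_size])
        chunks.append(Chunk(
            page_content=piece,
            metadata={"source": source, "page": start // step},
        ))
        if start + chunk_size >= len(words):
            break
    return chunks

document_chunks = split_document("word " * 700, "contracts/lease.pdf")
```

In production you would split on token counts and semantic boundaries rather than raw words, but the interface to the batch loader stays the same.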

Search Types in Weaviate

Vector search (near_text):

results = collection.query.near_text(
    query="contract approval procedure",
    limit=5,
    return_metadata=wvc.query.MetadataQuery(score=True, distance=True),
    filters=wvc.query.Filter.by_property("doc_type").equal("contract"),
)

BM25 search:

results = collection.query.bm25(
    query="rental agreement approval",
    limit=5,
    query_properties=["content"],  # Fields for BM25
)
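BM25 ranks documents by exact term overlap, weighting rare terms more heavily. A toy scorer over a tiny in-memory corpus illustrates the standard Okapi BM25 formula (this is not Weaviate's internal implementation):

```python
import math
from collections import Counter

def bm25_scores(query: str, docs: list[str],
                k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Okapi BM25 over whitespace-tokenized, lowercased documents."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter()                      # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append(score)
    return scores

docs = ["rental agreement approval steps",
        "office rental pricing",
        "holiday schedule"]
scores = bm25_scores("rental agreement approval", docs)
```

The first document matches all three query terms and scores highest; the last shares no terms and scores zero, which is exactly why BM25 excels at exact-term queries that embeddings blur.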

Hybrid search:

results = collection.query.hybrid(
    query="approval procedure",
    alpha=0.75,   # 0=BM25, 1=vector
    limit=5,
    fusion_type=wvc.query.HybridFusion.RELATIVE_SCORE,  # or RANKED
)
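RELATIVE_SCORE fusion min-max normalizes each result set's scores to [0, 1] and blends them with alpha (RANKED fuses by rank position instead). A pure-Python sketch of the blending step, not Weaviate internals:

```python
def relative_score_fusion(vector_scores: dict[str, float],
                          bm25_scores: dict[str, float],
                          alpha: float = 0.75) -> dict[str, float]:
    """Min-max normalize each score set, then blend:
    alpha * vector + (1 - alpha) * bm25."""
    def normalize(scores: dict[str, float]) -> dict[str, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    v, b = normalize(vector_scores), normalize(bm25_scores)
    docs = set(v) | set(b)  # a doc missing from one set contributes 0 there
    return {d: alpha * v.get(d, 0.0) + (1 - alpha) * b.get(d, 0.0) for d in docs}

fused = relative_score_fusion(
    {"doc_a": 0.9, "doc_b": 0.5, "doc_c": 0.1},   # vector similarities
    {"doc_b": 12.0, "doc_c": 3.0, "doc_a": 1.0},  # raw BM25 scores
    alpha=0.75,
)
```

Normalization is what makes the two scales comparable: raw BM25 scores are unbounded, while vector similarities are not, so blending them directly would let one side dominate.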

Generative Search (RAG via Weaviate)

Weaviate can perform RAG directly through its generative module:

# Built-in RAG — retrieval + generation in single query
response = collection.generate.near_text(
    query="What is the approval process for procurement?",
    limit=3,
    single_prompt="Based on the following document answer the question: {content}\n\nQuestion: What is the approval process for procurement?",
    grouped_task="Summarize key steps of the procurement approval procedure based on provided documents.",
)

print(response.generated)  # LLM response
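The two parameters behave differently: single_prompt is rendered once per retrieved object, with {content}-style placeholders filled from that object's properties, while grouped_task produces one answer over all retrieved objects together. A sketch of the prompt construction (hypothetical helpers, not client code):

```python
def render_single_prompts(template: str, objects: list[dict]) -> list[str]:
    """single_prompt: one rendered prompt (and one LLM call) per object,
    {property} placeholders filled from that object's properties."""
    return [template.format(**obj) for obj in objects]

def render_grouped_task(task: str, objects: list[dict]) -> str:
    """grouped_task: all retrieved objects concatenated into one prompt."""
    context = "\n\n".join(obj["content"] for obj in objects)
    return f"{context}\n\nTask: {task}"

objs = [{"content": "Step 1: submit a request form."},
        {"content": "Step 2: manager sign-off."}]
singles = render_single_prompts("Answer using: {content}", objs)
grouped = render_grouped_task("Summarize the approval steps.", objs)
```

In practice this means single_prompt costs one generation per retrieved object, while grouped_task costs one generation total; choose based on whether you need per-document answers or a single synthesis.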

Practical Case: RAG for Law Firm

Task: an assistant for lawyers working with Russian legislation, covering search across statutes, case law, and internal guidelines.

Volume: 28,000 documents (~4.2M chunks of ~300 tokens each).

Weaviate Configuration:

  • Self-hosted on k8s (3 replicas)
  • text2vec-openai (text-embedding-3-large, dimension=3072)
  • Hybrid search, alpha=0.65 (slightly more weight to dense)

RAGAS Results:

Metric              Dense only   Hybrid (α=0.65)   Hybrid + rerank
Context Precision   0.71         0.82              0.89
Context Recall      0.74         0.81              0.84
Faithfulness        0.79         0.88              0.92

Hybrid search lifted context precision from 0.71 to 0.82 compared to pure dense retrieval, with the largest gains on queries containing exact terms (article numbers and specific legal constructs that embedding models differentiate poorly).
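Context precision can be read as rank-weighted precision over the retrieved chunks. A simplified sketch of the metric (the actual RAGAS implementation uses LLM-judged relevance rather than boolean labels):

```python
def context_precision(relevance: list[bool]) -> float:
    """Simplified RAGAS-style context precision: the mean of precision@k
    taken at each rank k where the retrieved chunk is relevant.

    `relevance[i]` says whether the chunk at rank i+1 was relevant."""
    precisions = []
    hits = 0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Relevant chunks at ranks 1 and 3 out of three retrieved:
score = context_precision([True, False, True])
# score == (1/1 + 2/3) / 2 == 5/6
```

The rank weighting is why reranking moves this metric even when recall barely changes: pushing relevant chunks toward the top of the result list raises precision@k at the relevant positions.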

Multi-Tenancy in Weaviate

For SaaS products or data isolation between clients:

# Create collection with multitenancy
client.collections.create(
    name="ClientDocs",
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    ...
)

# Create tenant
collection = client.collections.get("ClientDocs")
collection.tenants.create([wvc.tenants.Tenant(name="client_001")])

# Query in context of specific tenant
tenant_collection = collection.with_tenant("client_001")
results = tenant_collection.query.hybrid(query="...", limit=5)

Timeline

  • Weaviate setup + schema: 2–3 days
  • Ingestion pipeline: 3–7 days
  • RAG pipeline with evaluation: 1–2 weeks
  • Multi-tenancy and production: 1–2 weeks
  • Total: 2–5 weeks