Building RAG with OpenSearch Vector Database
OpenSearch is a fork of Elasticsearch created by AWS, now evolving as an independent open-source project under the Apache 2.0 license. It supports k-NN search through the k-NN plugin, which offers the HNSW and IVF algorithms backed by the NMSLIB, Faiss, and Lucene engines. If your infrastructure is built on AWS (Amazon OpenSearch Service), it is a natural first choice for RAG.
Creating a k-NN Index
```python
from opensearchpy import OpenSearch
from opensearchpy.helpers import bulk

client = OpenSearch(
    hosts=[{"host": "localhost", "port": 9200}],
    use_ssl=False,
)
```
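For anything beyond a local experiment, the client usually needs TLS and authentication. A minimal sketch; the host and credentials below are placeholders, not values from a real deployment:

```python
# Production-style client settings (sketch; host and credentials are placeholders)
prod_client_kwargs = {
    "hosts": [{"host": "search.example.com", "port": 9200}],
    "http_auth": ("admin", "admin"),  # or SigV4 signing for Amazon OpenSearch Service
    "use_ssl": True,
    "verify_certs": True,
}
# client = OpenSearch(**prod_client_kwargs)
```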
```python
# k-NN index configuration
index_config = {
    "settings": {
        "index.knn": True,
    },
    "mappings": {
        "properties": {
            "content": {
                "type": "text",
                "analyzer": "standard",
            },
            "source": {"type": "keyword"},
            "doc_type": {"type": "keyword"},
            "embedding": {
                "type": "knn_vector",
                "dimension": 1536,
                "method": {
                    "name": "hnsw",
                    # space_type belongs in the method definition;
                    # the index-level index.knn.space_type setting is deprecated
                    "space_type": "cosinesimil",
                    "engine": "nmslib",  # or "faiss" / "lucene"
                    "parameters": {
                        "m": 16,
                        "ef_construction": 128,
                    },
                },
            },
        }
    },
}

client.indices.create(index="knowledge_base", body=index_config)
```
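The `bulk` helper imported earlier is the usual way to load documents into this index. A minimal sketch, assuming embeddings are computed beforehand; field names follow the mapping above, and the `"article"` default for `doc_type` is an assumption:

```python
def build_bulk_actions(docs: list, index: str = "knowledge_base") -> list:
    # Each doc is assumed to carry a precomputed 1536-dim "embedding"
    return [
        {
            "_index": index,
            "_source": {
                "content": doc["content"],
                "source": doc["source"],
                "doc_type": doc.get("doc_type", "article"),  # assumed default
                "embedding": doc["embedding"],
            },
        }
        for doc in docs
    ]

# bulk(client, build_bulk_actions(docs))  # opensearchpy.helpers.bulk from above
```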
Hybrid Search in OpenSearch
```python
def opensearch_hybrid_search(query: str, top_k: int = 5) -> list:
    # get_embedding is an external helper returning a 1536-dim vector,
    # matching the index mapping
    query_embedding = get_embedding(query)
    body = {
        "query": {
            "bool": {
                "should": [
                    # BM25 search
                    {
                        "match": {
                            "content": {
                                "query": query,
                                "boost": 0.3,
                            }
                        }
                    },
                    # k-NN search via script_score
                    {
                        "script_score": {
                            "query": {"match_all": {}},
                            "script": {
                                "source": "knn_score",
                                "lang": "knn",
                                "params": {
                                    "field": "embedding",
                                    "query_value": query_embedding,
                                    "space_type": "cosinesimil",
                                },
                            },
                            "boost": 0.7,
                        }
                    },
                ]
            }
        },
        "size": top_k,
        "_source": ["content", "source", "doc_type"],
    }
    response = client.search(index="knowledge_base", body=body)
    return [hit["_source"] for hit in response["hits"]["hits"]]
```
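Since OpenSearch 2.10 there is also a first-class alternative to manual boosting: the `hybrid` query combined with a search pipeline's `normalization-processor`, which normalizes BM25 and k-NN scores onto one scale before fusing them. A sketch of the two request bodies; the pipeline name `hybrid-rag` is a placeholder, and the weights mirror the 0.3/0.7 split above:

```python
def make_hybrid_pipeline(weights=(0.3, 0.7)) -> dict:
    # PUT /_search/pipeline/hybrid-rag with this body (pipeline name is a placeholder)
    return {
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {
                        "technique": "arithmetic_mean",
                        "parameters": {"weights": list(weights)},
                    },
                }
            }
        ]
    }

def make_hybrid_query(query: str, query_embedding: list, top_k: int = 5) -> dict:
    # GET /knowledge_base/_search?search_pipeline=hybrid-rag with this body
    return {
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {"content": {"query": query}}},
                    {"knn": {"embedding": {"vector": query_embedding, "k": top_k}}},
                ]
            }
        },
        "size": top_k,
    }
```

Compared to the `bool`/`script_score` approach, min-max normalization keeps the lexical and vector contributions comparable even when their raw score ranges differ widely.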
Amazon OpenSearch Service: Managed Variant
When deploying on AWS, Amazon OpenSearch Service pairs naturally with Amazon Bedrock for embeddings:
```python
import boto3
import json

bedrock_client = boto3.client("bedrock-runtime", region_name="us-east-1")

def get_embedding_bedrock(text: str) -> list:
    # Titan Text Embeddings V2 supports 256, 512, or 1024 output dimensions
    response = bedrock_client.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text, "dimensions": 1024}),
    )
    return json.loads(response["body"].read())["embedding"]
```
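Note that Titan V2's 1024 dimensions differ from the 1536 used in the index above: the `knn_vector` mapping must match whatever the embedding model returns. A mapping fragment for the Titan case:

```python
# Mapping fragment for Titan Text Embeddings V2 (1024 dims) -- "dimension"
# must equal the length of the vectors the model actually returns
titan_embedding_mapping = {
    "embedding": {
        "type": "knn_vector",
        "dimension": 1024,
        "method": {"name": "hnsw", "space_type": "cosinesimil", "engine": "faiss"},
    }
}
```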
Comparison with Elasticsearch
OpenSearch and Elasticsearch expose nearly identical APIs for k-NN, but there are differences:
| Parameter | OpenSearch | Elasticsearch |
|---|---|---|
| License | Apache 2.0 | SSPL/Elastic License |
| AWS managed | Amazon OpenSearch Service | Elastic Cloud on AWS |
| k-NN engines | NMSLIB, FAISS, Lucene | Lucene HNSW |
| Score fusion | Search pipelines (normalization, RRF) | Native RRF (8.14+) |
| ML Commons | Built-in | No equivalent |
OpenSearch ML Commons allows embedding models to be integrated directly into the cluster:
```python
# Register a pretrained embedding model inside OpenSearch ML Commons.
# Enables semantic search without an external embedding API.
body = {
    "name": "huggingface/sentence-transformers/paraphrase-multilingual-mpnet-base-v2",
    "version": "1.0.1",
    "model_format": "TORCH_SCRIPT",
}
client.transport.perform_request("POST", "/_plugins/_ml/models/_register", body=body)
# Registration returns a task_id; once the task completes, deploy the model:
# POST /_plugins/_ml/models/<model_id>/_deploy
```
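Once the model is deployed, the `neural` query type lets OpenSearch embed the query text itself (documents must have been indexed through a `text_embedding` ingest pipeline using the same model). A sketch; the `model_id` is a placeholder for the id returned by the deploy step:

```python
def make_neural_query(query_text: str, model_id: str, k: int = 5) -> dict:
    # Neural search: OpenSearch embeds query_text with the in-cluster model
    return {
        "query": {
            "neural": {
                "embedding": {
                    "query_text": query_text,
                    "model_id": model_id,  # placeholder: returned on deployment
                    "k": k,
                }
            }
        }
    }
```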
Timelines
- OpenSearch setup + index: 2–3 days
- Ingestion pipeline: 3–7 days
- Hybrid search + RAG pipeline: 1–2 weeks
- Total: 2–4 weeks