Graph RAG Implementation (Knowledge Graph Retrieval)


Graph RAG is an architecture that extends standard vector RAG with a knowledge graph. Instead of retrieving only semantically similar chunks, the system can traverse the graph: starting from an entity and following its relations, it reaches related concepts that share no keywords with the query but are still relevant to it. The most influential implementation of this approach is GraphRAG, published by Microsoft Research in 2024.

When Graph RAG Is Needed

Standard RAG fails with:

  • Questions about relationships between entities ("How are company X and contract Y related?")
  • Global summarizing questions ("What are the main topics in the document corpus?")
  • Multi-hop reasoning ("Who is the head of the department responsible for contract №123?")
  • Tracking how entities and their relationships change over time
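The multi-hop case is where the graph pays off most directly: the answer sits two edges away from the entity the query names, so no keyword or embedding match can reach it in one retrieval step. A toy traversal illustrating the contract question (data and names are invented for illustration):

```python
import networkx as nx

# Toy graph: one edge per extracted relation
g = nx.DiGraph()
g.add_edge("Contract №123", "Legal Dept", relation="RESPONSIBLE")
g.add_edge("Ivan Petrov", "Legal Dept", relation="MANAGES")

# Hop 1: which department is responsible for the contract?
dept = next(t for _, t, d in g.out_edges("Contract №123", data=True)
            if d["relation"] == "RESPONSIBLE")

# Hop 2: who manages that department?
head = next(s for s, _, d in g.in_edges(dept, data=True)
            if d["relation"] == "MANAGES")
# head is "Ivan Petrov" — a fact unreachable by chunk similarity alone,
# since no chunk needs to mention both the contract and the manager
```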

Microsoft GraphRAG Architecture

Documents
    ↓
LLM extracts entities and relations
    ↓
Knowledge Graph (NetworkX/Neo4j)
    ↓
Hierarchical community detection (Leiden algorithm)
    ↓
Community summaries → Community reports
    ↓
Two search modes:
├── Local search: vector + graph-traversal from point
└── Global search: community reports summarization
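The community-detection step above can be sketched with NetworkX. Greedy modularity maximization stands in here for the Leiden algorithm (which needs the igraph/leidenalg packages); the function name is ours, not from the GraphRAG library:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def detect_communities(graph: nx.DiGraph) -> list:
    # GraphRAG runs community detection on the undirected projection of the
    # entity graph; each resulting community later gets an LLM-written report
    undirected = graph.to_undirected()
    return [set(c) for c in greedy_modularity_communities(undirected)]
```

In the real pipeline the detection is hierarchical: communities are recursively split into sub-communities, and reports are written at every level.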

Extracting Entities and Relations via LLM

from openai import OpenAI
import json

client = OpenAI()

ENTITY_EXTRACTION_PROMPT = """Extract entities and relations from the following text.
Return JSON:
{{
  "entities": [
    {{"id": "1", "name": "...", "type": "PERSON|ORG|CONTRACT|REGULATION|CONCEPT", "description": "..."}}
  ],
  "relationships": [
    {{"source": "id1", "target": "id2", "relation": "SIGNED|MANAGES|REFERS_TO|PART_OF", "description": "..."}}
  ]
}}

Text:
{text}"""

def extract_graph_elements(text: str) -> dict:
    """Asks the LLM for entities and relations as structured JSON."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": ENTITY_EXTRACTION_PROMPT.format(text=text)}],
        response_format={"type": "json_object"},
        temperature=0,  # deterministic extraction
    )
    return json.loads(response.choices[0].message.content)
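Extraction runs chunk by chunk, since a whole contract rarely fits one prompt. A minimal sliding-window chunker (the helper name and defaults are ours, not from any library):

```python
def chunk_text(text: str, size: int = 2000, overlap: int = 200) -> list:
    """Split text into overlapping windows; the overlap keeps entities
    that straddle a boundary visible to at least one extraction call."""
    assert size > overlap, "window must be larger than the overlap"
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```

Each chunk is then passed to extract_graph_elements and the per-chunk results are merged into one graph.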

Building Knowledge Graph with NetworkX

import networkx as nx
from typing import List

class KnowledgeGraph:
    def __init__(self):
        self.graph = nx.DiGraph()
        self.entity_embeddings = {}

    def add_elements(self, elements: dict, source_doc: str):
        # NOTE: the LLM numbers entities per extraction call ("1", "2", ...);
        # when merging results from many chunks, remap these ids (for example,
        # key nodes by normalized entity name) or unrelated entities collide.
        # Add entities
        for entity in elements["entities"]:
            self.graph.add_node(
                entity["id"],
                name=entity["name"],
                type=entity["type"],
                description=entity["description"],
                source=source_doc,
            )

        # Add relations
        for rel in elements["relationships"]:
            self.graph.add_edge(
                rel["source"],
                rel["target"],
                relation=rel["relation"],
                description=rel["description"],
            )

    def get_subgraph(self, entity_id: str, depth: int = 2) -> nx.DiGraph:
        """Returns subgraph around entity"""
        nodes = {entity_id}
        for _ in range(depth):
            neighbors = set()
            for node in nodes:
                neighbors.update(self.graph.predecessors(node))
                neighbors.update(self.graph.successors(node))
            nodes.update(neighbors)
        return self.graph.subgraph(nodes)

    def serialize_subgraph(self, subgraph: nx.DiGraph) -> str:
        """Converts subgraph to text for LLM context"""
        lines = []
        for node_id, data in subgraph.nodes(data=True):
            lines.append(f"Entity: {data.get('name')} ({data.get('type')})")
            lines.append(f"  Description: {data.get('description', '')}")

        for source, target, data in subgraph.edges(data=True):
            source_name = subgraph.nodes[source].get("name", source)
            target_name = subgraph.nodes[target].get("name", target)
            lines.append(f"Relation: {source_name} → {target_name} ({data.get('relation')})")
            lines.append(f"  {data.get('description', '')}")

        return "\n".join(lines)
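Because the extractor numbers entities per chunk, the same real-world entity appears repeatedly under different ids and spelling variants. A minimal name-based deduplication sketch (production systems typically add embedding similarity; both helper names are ours):

```python
def normalize_name(name: str) -> str:
    # Collapse case and whitespace variants: "ACME  corp" -> "acme corp"
    return " ".join(name.lower().split())

def dedupe_entities(entities: list) -> dict:
    """Merge entities whose normalized names match, concatenating
    their descriptions so no extracted context is lost."""
    merged = {}
    for entity in entities:
        key = normalize_name(entity["name"])
        if key in merged:
            merged[key]["description"] += "; " + entity["description"]
        else:
            merged[key] = dict(entity)
    return merged
```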

Local Search: GraphRAG Query

from langchain_openai import OpenAIEmbeddings

class GraphRAGRetriever:
    def __init__(self, knowledge_graph: KnowledgeGraph, vectorstore, embeddings):
        self.kg = knowledge_graph
        self.vectorstore = vectorstore
        self.embeddings = embeddings

    def local_search(self, query: str, top_k: int = 5) -> str:
        """
        Local Search: combines vector search
        with graph-traversal from found entities
        """
        # 1. Vector search chunks
        vector_docs = self.vectorstore.similarity_search(query, k=top_k)

        # 2. Extract entities from found chunks
        mentioned_entities = self._extract_entities_from_docs(vector_docs, query)

        # 3. Graph traversal: expand context through related nodes
        graph_contexts = []
        for entity_id in mentioned_entities[:3]:
            subgraph = self.kg.get_subgraph(entity_id, depth=2)
            graph_context = self.kg.serialize_subgraph(subgraph)
            graph_contexts.append(graph_context)

        # 4. Combine text and graph context
        vector_context = "\n\n".join([d.page_content for d in vector_docs])
        graph_context = "\n\n".join(graph_contexts)

        return f"## Text Context\n{vector_context}\n\n## Knowledge Graph Context\n{graph_context}"

    def _extract_entities_from_docs(self, docs, query: str) -> list:
        """Naive entity linking: match known entity names against chunk text."""
        text = " ".join(d.page_content for d in docs).lower()
        return [
            node_id
            for node_id, data in self.kg.graph.nodes(data=True)
            if data.get("name") and data["name"].lower() in text
        ]
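Global search, the second mode from the architecture diagram, has no code in this section. A simplified map-reduce sketch over community reports (the injected llm_call callable is our assumption for testability, not GraphRAG's actual API):

```python
def global_search(query: str, community_reports: list,
                  llm_call, batch_size: int = 5) -> str:
    """Map: answer the query against each batch of community reports.
    Reduce: merge the partial answers into one final response."""
    partials = []
    for i in range(0, len(community_reports), batch_size):
        batch = "\n\n".join(community_reports[i:i + batch_size])
        partials.append(llm_call(
            f"Using only these community reports, answer: {query}\n\n{batch}"
        ))
    return llm_call(
        f"Merge these partial answers to '{query}':\n" + "\n".join(partials)
    )
```

Batching keeps each map call within the context window; the reduce call is what lets the system answer corpus-wide questions that no single chunk covers.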

Practical Case: Corporate Documentation Analysis

Task: an assistant for the legal department that analyzes relationships between contractors, contracts and employees (6,500 contracts, 12 years of history).

Questions that standard RAG could not answer:

  • "Which suppliers participated in tenders whose winner was later declared bankrupt?"
  • "Which contracts will be affected by a leadership change at company X?"

Graph: 45,000 entities, 180,000 relations (Neo4j).

Results:

  • Multi-hop questions (2+ hops): 12% answered correctly with standard RAG → 71% with Graph RAG
  • Global summarization questions: 34% → 82%
  • Standard fact-lookup questions: comparable, minor regression (-3%)
  • Graph building time: 4 days (GPT-4o for extraction, $240)

Tools for Graph RAG

  • Microsoft GraphRAG library: pip install graphrag — the reference implementation from Microsoft
  • Neo4j + LangChain: Neo4jGraph + GraphCypherQAChain for Cypher queries
  • LlamaIndex + Knowledge Graph: KnowledgeGraphIndex
  • NetworkX: lightweight in-memory graphs in pure Python, no external database required

Timelines

  • Developing extraction pipeline (LLM → graph): 2–3 weeks
  • Building graph from existing documents: 1–4 weeks
  • Local/Global search implementation: 2 weeks
  • Testing and evaluation: 1–2 weeks
  • Total: 6–11 weeks