AI Agent Development with Database Access

We design and deploy artificial intelligence systems: from prototype to production-ready solutions. Our team combines expertise in machine learning, data engineering and MLOps to make AI work not in the lab, but in real business.
Showing 1 of 1 servicesAll 1566 services
AI Agent Development with Database Access
Medium
from 1 week to 3 months
FAQ
AI Development Areas
AI Solution Development Stages
Latest works
  • image_website-b2b-advance_0.png
    B2B ADVANCE company website development
    1212
  • image_web-applications_feedme_466_0.webp
    Development of a web application for FEEDME
    1161
  • image_websites_belfingroup_462_0.webp
    Website development for BELFINGROUP
    852
  • image_ecommerce_furnoro_435_0.webp
    Development of an online store for the company FURNORO
    1041
  • image_logo-advance_0.png
    B2B Advance company logo design
    561
  • image_crm_enviok_479_0.webp
    Development of a web application for Enviok
    822

AI Agent Development with Database Access

An AI agent with database access can independently query data, generate SQL queries based on natural language questions, and perform write operations with necessary validations. Text-to-SQL is a key technology that enables non-technical users to interact with data through dialogue.

Text-to-SQL: Converting Questions to SQL

from langchain_openai import ChatOpenAI
from langchain_community.utilities import SQLDatabase
from langchain_community.agent_toolkits import SQLDatabaseToolkit
from langchain.agents import create_sql_agent

# Connect to PostgreSQL
db = SQLDatabase.from_uri(
    "postgresql://user:password@localhost:5432/company_db",
    include_tables=["orders", "customers", "products", "inventory"],
    sample_rows_in_table_info=3,  # Include sample data in schema
)

llm = ChatOpenAI(model="gpt-4o", temperature=0)
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

# SQL agent with automatic error correction
agent = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=10,
)

# Query examples
result = agent.invoke({"input": "What are the top 5 customers by revenue in the last 3 months?"})
result = agent.invoke({"input": "Show products with less than 10 units in stock"})

Security: Read-Only Access

from langchain_community.utilities import SQLDatabase
from sqlalchemy import create_engine, text

# Read-only PostgreSQL user
READ_ONLY_USER_URI = "postgresql://readonly_user:pass@localhost:5432/db"

# Additional validation: prevent DML operations
def validate_sql_query(query: str) -> bool:
    """Check that the query is SELECT only"""
    forbidden_keywords = ["INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER", "TRUNCATE"]
    query_upper = query.upper()
    for keyword in forbidden_keywords:
        if keyword in query_upper:
            return False
    return True

class SafeSQLTool:
    def __init__(self, db_uri: str):
        self.engine = create_engine(db_uri)

    def execute_query(self, query: str) -> str:
        if not validate_sql_query(query):
            return "ERROR: Only SELECT queries are allowed"

        # Limit number of rows
        if "LIMIT" not in query.upper():
            query = f"{query.rstrip(';')} LIMIT 100"

        with self.engine.connect() as conn:
            result = conn.execute(text(query))
            rows = result.fetchall()
            columns = result.keys()
            return str([dict(zip(columns, row)) for row in rows])

Database Schema with Context for LLM

Text-to-SQL quality depends critically on the quality of schema description:

SCHEMA_CONTEXT = """
Database tables:

1. orders
   - id: PK, INTEGER
   - customer_id: FK → customers.id
   - status: VARCHAR (pending, confirmed, shipped, delivered, cancelled)
   - total_amount: DECIMAL(12,2) — order amount in currency
   - created_at: TIMESTAMP
   - shipped_at: TIMESTAMP (NULL if not shipped)

2. customers
   - id: PK
   - name: VARCHAR — company name or full name
   - inn: VARCHAR(12) — tax ID for legal entity/individual entrepreneur
   - segment: VARCHAR (enterprise, mid, small) — customer segment
   - manager_id: FK → employees.id — responsible manager

3. products
   - sku: VARCHAR — article number
   - name: VARCHAR
   - category: VARCHAR
   - price_rub: DECIMAL
   - cost_rub: DECIMAL — cost price

IMPORTANT: Order statuses: 'delivered' = successfully completed. 'cancelled' = cancelled.
           Revenue = sum of total_amount for orders with 'delivered' status.
"""

system_prompt = f"""You are a data analyst. Convert questions to SQL queries.
Use the following database schema:

{SCHEMA_CONTEXT}

Rules:
- SELECT queries only
- Always add LIMIT (no more than 1000)
- Use aliases for readability
- When aggregating — add ORDER BY"""

Practical Use Case: BI Agent for E-commerce

Task: analytical assistant for a commercial director — sales analysis, ABC analysis of assortment, order funnel, cohort retention.

Database: PostgreSQL, 15 tables, 3M orders.

Dialog examples:

User: "What is the conversion from pending to delivered in the last 30 days by customer segment?"

Agent generates:

SELECT
    c.segment,
    COUNT(*) FILTER (WHERE o.status = 'pending') AS pending,
    COUNT(*) FILTER (WHERE o.status = 'delivered') AS delivered,
    ROUND(
        COUNT(*) FILTER (WHERE o.status = 'delivered')::decimal /
        NULLIF(COUNT(*), 0) * 100, 1
    ) AS conversion_pct
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.created_at >= NOW() - INTERVAL '30 days'
GROUP BY c.segment
ORDER BY conversion_pct DESC
LIMIT 100;

Results:

  • Time to get analytics: 2 days → 3 minutes
  • SQL accuracy (questions → correct SQL): 87%
  • Typical errors: incorrect JOINs in complex queries (resolved by few-shot examples in prompt)

Few-Shot Examples for Improved Accuracy

FEW_SHOT_EXAMPLES = """
Examples of correct queries:

Question: Top 10 products by margin in the last quarter
SQL:
SELECT p.name, p.sku,
       SUM(oi.quantity * (p.price_rub - p.cost_rub)) AS margin_rub
FROM order_items oi
JOIN products p ON oi.sku = p.sku
JOIN orders o ON oi.order_id = o.id
WHERE o.status = 'delivered'
  AND o.created_at >= DATE_TRUNC('quarter', CURRENT_DATE) - INTERVAL '3 months'
GROUP BY p.name, p.sku
ORDER BY margin_rub DESC
LIMIT 10;

Question: Average order value by month in 2025
SQL:
SELECT DATE_TRUNC('month', created_at) AS month,
       ROUND(AVG(total_amount), 0) AS avg_check,
       COUNT(*) AS order_count
FROM orders
WHERE status = 'delivered'
  AND created_at BETWEEN '2025-01-01' AND '2025-12-31'
GROUP BY 1
ORDER BY 1;
"""

Timeline

  • Develop Text-to-SQL agent: 2–3 weeks
  • Configure schema and few-shot examples: 1 week
  • Test on real queries: 1–2 weeks
  • Total: 4–6 weeks