AI Code Generation System Development


An AI code generation system autonomously creates production-ready code from textual descriptions or specifications. It covers requirement understanding, generation grounded in the existing codebase, test execution, and iterative improvement. The architecture is more complex than a single LLM call: it requires code context management, result verification, and CI/CD integration.

System Architecture

Context Manager: collects the relevant context (database schema, API interfaces, existing models, code style guide).

Generation Engine: an LLM agent with tools for reading files, running tests, and searching the codebase.

Verification Layer: syntax checking, test execution, and linting.

Feedback Loop: feeds test and lint errors back to the agent, which regenerates the code until the checks pass or the iteration limit is reached.
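
The four components above fit together as a generate, verify, retry loop. The following is a minimal standard-library sketch of that loop, not the production design: `fake_generate` is a hypothetical stand-in for the LLM call, and verification is reduced to a syntax check (a real verification layer would also run tests and a linter).

```python
import ast
from typing import Callable, Optional


def verify(code: str) -> Optional[str]:
    """Verification layer, reduced to a syntax check. Returns an error message or None."""
    try:
        ast.parse(code)
        return None
    except SyntaxError as exc:
        return f"Syntax error: {exc}"


def generate_with_feedback(
    task: str,
    context: str,
    generate: Callable[[str, str, list], str],
    max_iterations: int = 3,
) -> dict:
    """Call `generate` until the result verifies or the iteration budget runs out."""
    errors: list = []
    code = ""
    for iteration in range(1, max_iterations + 1):
        code = generate(task, context, errors)  # previous errors are fed back in
        error = verify(code)
        if error is None:
            return {"code": code, "iterations": iteration, "ok": True, "errors": errors}
        errors.append(error)
    return {"code": code, "iterations": max_iterations, "ok": False, "errors": errors}


def fake_generate(task: str, context: str, errors: list) -> str:
    """Hypothetical stand-in for the LLM: the first attempt is broken, the retry is fixed."""
    if not errors:
        return "def add(a, b):\n    return a +"
    return "def add(a, b):\n    return a + b\n"


result = generate_with_feedback("write add()", "", fake_generate)
print(result["ok"], result["iterations"])
```

In this sketch the first attempt fails the syntax check, the error is passed back to the generator, and the second attempt succeeds. The iteration cap plays the role of `max_iterations` in the agent state.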

Code Generation Agent

from langgraph.graph import StateGraph, END
from langchain_anthropic import ChatAnthropic  # Claude models need the Anthropic client, not ChatOpenAI
from langchain_core.tools import tool
from typing import TypedDict, Annotated, Optional
import subprocess
import ast
import operator

llm = ChatAnthropic(model="claude-opus-4-5", temperature=0.1)

class CodeGenState(TypedDict):
    task_description: str
    existing_code_context: str
    generated_code: Optional[str]
    test_results: Annotated[list, operator.add]
    iteration: int
    max_iterations: int
    errors: Annotated[list, operator.add]
    final_code: Optional[str]

@tool
def read_file(file_path: str) -> str:
    """Read a file from codebase for context."""
    try:
        with open(file_path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return f"File {file_path} not found"

@tool
def search_codebase(query: str, directory: str = "./src") -> str:
    """Search the codebase with grep to find similar code."""
    result = subprocess.run(
        # -e keeps a query that starts with "-" from being parsed as a grep option
        ["grep", "-r", "--include=*.py", "-n", "-e", query, directory],
        capture_output=True, text=True
    )
    return result.stdout[:3000] if result.stdout else "Nothing found"

@tool
def run_python_syntax_check(code: str) -> str:
    """Check Python code syntax."""
    try:
        ast.parse(code)
        return "Syntax is correct"
    except SyntaxError as e:
        return f"Syntax error: {e}"

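@tool
def run_tests(test_file_path: str) -> str:
    """Run pytest tests and return results."""
    try:
        result = subprocess.run(
            ["python", "-m", "pytest", test_file_path, "-v", "--tb=short"],
            capture_output=True, text=True, timeout=60
        )
    except subprocess.TimeoutExpired:
        # Report the timeout to the agent instead of crashing the tool call
        return "Tests timed out after 60 seconds"
    output = result.stdout + result.stderr
    return output[-3000:]  # Keep the last 3000 characters, where pytest prints its summary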

@tool
def write_file(file_path: str, content: str) -> str:
    """Write code to a file."""
    with open(file_path, "w", encoding="utf-8") as f:
        f.write(content)
    return f"File {file_path} written ({len(content)} characters)"
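
Because the agent chooses the paths it passes to `read_file` and `write_file`, it can help to confine those tools to the project directory. The sketch below shows one possible guard. `resolve_inside` and `safe_write` are hypothetical names that do not appear in the code above, and the choice to reject (rather than rewrite) outside paths is an assumption.

```python
from pathlib import Path


def resolve_inside(project_root: str, requested: str) -> Path:
    """Resolve `requested` against `project_root` and reject paths that escape it."""
    root = Path(project_root).resolve()
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"{requested} is outside {root}")
    return candidate


def safe_write(project_root: str, requested: str, content: str) -> str:
    """Hypothetical sandboxed variant of write_file."""
    target = resolve_inside(project_root, requested)
    target.parent.mkdir(parents=True, exist_ok=True)  # create missing package directories
    target.write_text(content, encoding="utf-8")
    return f"File {target} written ({len(content)} characters)"
```

A path such as `../outside.py` or an absolute path outside the root raises `ValueError`, which a tool wrapper could return to the agent as an error message.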

CODE_GEN_SYSTEM = """You are a Senior Software Engineer. Generate production-quality code.

Principles:
- Follow existing codebase patterns
- Write typed code (type hints)
- Each function — one level of abstraction
- Handle errors explicitly
- Minimal external dependencies if standard alternatives exist

Process:
1. Read existing code for context
2. Generate code matching that style
3. Check syntax
4. Run tests
5. Fix errors iteratively"""

from langgraph.prebuilt import create_react_agent

# create_react_agent binds the tools to the model itself, so the bare model is passed in.
# Recent LangGraph releases take the system prompt as `prompt`; older ones call it `state_modifier`.
code_gen_agent = create_react_agent(
    llm,
    tools=[read_file, search_codebase, run_python_syntax_check, run_tests, write_file],
    prompt=CODE_GEN_SYSTEM,
)

Generation with Codebase Context

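class ContextAwareCodeGenerator:
    """Collects codebase context and runs the agent on a task.

    The helpers identify_relevant_files, file_exists, read_file_async and
    extract_test_status are not shown in this section.
    """

    def __init__(self, project_root: str):
        self.project_root = project_root
        self.context_cache = {}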

    async def gather_context(self, task: str) -> str:
        """Gathers relevant context for a task"""

        # Find similar files through LLM
        relevant_files = await self.identify_relevant_files(task)

        context_parts = []

        # Read database schema
        if await self.file_exists("models.py"):
            models = await read_file_async(f"{self.project_root}/models.py")
            context_parts.append(f"## Data Models\n{models[:2000]}")

        # Read base classes and interfaces
        for file_path in relevant_files[:3]:
            content = await read_file_async(file_path)
            context_parts.append(f"## {file_path}\n{content[:1500]}")

        # Add code style guide
        if await self.file_exists(".codestyle.md"):
            style = await read_file_async(f"{self.project_root}/.codestyle.md")
            context_parts.append(f"## Code Style\n{style[:1000]}")

        return "\n\n".join(context_parts)

    async def generate(self, task: str, output_file: str) -> dict:
        context = await self.gather_context(task)

        result = await code_gen_agent.ainvoke({
            "messages": [{
                "role": "user",
                "content": f"""Task: {task}

Codebase context:
{context}

Output file: {output_file}

Generate code, verify it, and write to file."""
            }]
        })

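        # The prebuilt agent state has no "iteration" key, so count run_tests calls instead
        test_runs = [
            m for m in result["messages"]
            if getattr(m, "type", None) == "tool" and getattr(m, "name", None) == "run_tests"
        ]

        return {
            "task": task,
            "output_file": output_file,
            "iterations": len(test_runs),
            "tests_passed": self.extract_test_status(result),
        }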
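
The class above calls several helpers that are not defined in this section. As an illustration only, three of them could look like the sketch below. The signatures are assumptions: here they are standalone functions rather than methods, and `extract_test_status` takes a list of `run_tests` output strings (which a caller would collect from the agent's messages) instead of the raw agent result.

```python
import asyncio
import re
from pathlib import Path


async def read_file_async(path: str) -> str:
    """Read a text file without blocking the event loop (runs in a worker thread)."""
    return await asyncio.to_thread(Path(path).read_text, encoding="utf-8")


async def file_exists(project_root: str, relative_path: str) -> bool:
    """Check for a file relative to the project root."""
    return await asyncio.to_thread((Path(project_root) / relative_path).is_file)


def extract_test_status(tool_outputs: list[str]) -> bool:
    """True if the most recent pytest summary reports passes and no failures or errors."""
    for output in reversed(tool_outputs):
        # pytest ends a run with a line such as "==== 3 passed in 0.12s ===="
        summary = re.search(r"=+ (.*) in [\d.]+s", output)
        if summary:
            text = summary.group(1)
            return "passed" in text and "failed" not in text and "error" not in text
    return False  # no test run found
```

Matching on the pytest summary line is a heuristic. Parsing a JUnit XML report or using the pytest exit code would be more robust, at the cost of changing the `run_tests` tool.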

Template-based Generation with LLM Filling

import json

class CRUDGenerator:
    """Generates CRUD modules from an entity schema."""

    # Skeleton of the module; the {placeholders} are filled with str.format()
    CRUD_TEMPLATE = """
# Module for working with entity {entity_name}
from sqlalchemy import Column, Integer, String, DateTime, func
from sqlalchemy.orm import Session
from pydantic import BaseModel
from typing import Optional, List
from datetime import datetime

# COLUMNS - list of SQLAlchemy columns
{columns}

# PYDANTIC_FIELDS - Pydantic schema fields
{pydantic_fields}

# BUSINESS_LOGIC - business-specific logic
{business_logic}
"""

    async def generate_crud_module(self, entity_spec: dict) -> str:
        """entity_spec: {name, fields, business_rules, relationships}"""

        # LLM fills in the entity-specific parts (the generate_* helpers are not shown here)
        columns = await self.generate_sqlalchemy_columns(entity_spec["fields"])
        schemas = await self.generate_pydantic_schemas(entity_spec["fields"])
        business_logic = await self.generate_business_logic(entity_spec.get("business_rules", []))

        # Put the generated parts into the template so they are actually used
        skeleton = self.CRUD_TEMPLATE.format(
            entity_name=entity_spec["name"],
            columns=columns,
            pydantic_fields=schemas,
            business_logic=business_logic,
        )

        # Assemble the final module from the skeleton and the specification
        result = await llm.ainvoke(f"""Create a complete CRUD module for entity {entity_spec['name']}.

Specification: {json.dumps(entity_spec, ensure_ascii=False)}

Draft skeleton to complete:
{skeleton}

Stack: FastAPI + SQLAlchemy 2.0 + Pydantic v2
Include: model, pydantic schemas, CRUD functions, FastAPI router with dependency injection
Code standards: async/await, type hints, docstrings""")

        return result.content
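
Chat models often wrap generated code in a Markdown fence or add a sentence around it, so the returned text is not always a valid module as-is. One possible post-processing step, sketched here with the standard library only, extracts the code and checks its syntax before it is written to disk. `extract_python` and `validated_module` are hypothetical names, not part of the generator above.

```python
import ast
import re

TICKS = "`" * 3  # built programmatically so the example can itself sit inside a fence
_FENCE = re.compile(TICKS + r"(?:python|py)?[ \t]*\n(.*?)" + TICKS, re.DOTALL)


def extract_python(llm_output: str) -> str:
    """Return the first fenced code block, or the whole text when no fence is present."""
    match = _FENCE.search(llm_output)
    return match.group(1) if match else llm_output


def validated_module(llm_output: str) -> str:
    """Strip the Markdown wrapper and fail fast if the result is not valid Python."""
    code = extract_python(llm_output)
    ast.parse(code)  # raises SyntaxError on invalid code
    return code
```

A caller could wrap the generator's return value in `validated_module(...)` and treat a `SyntaxError` as a signal to retry generation.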

Practical Case: Automation in a FinTech Startup

Starting point: a team of 4 developers shipped 3–5 new API endpoints per week. Each CRUD endpoint, including tests, took 4–6 hours.

What the AI code generation system covered:

  • CRUD module generation from OpenAPI specifications
  • Automatic pytest test generation for endpoints
  • Alembic migration generation on model changes
  • Code review suggestions from an AI code review agent

Results:

  • Time to create a standard CRUD endpoint: 5 h → 50 min (15 min generation + 35 min review)
  • Test coverage of new endpoints: 45% → 82%
  • Code uniformity: improved, since all generated endpoints follow one pattern
  • Rework rate: 14% of PRs required substantial rework of the business logic
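
To put the time figure in weekly terms, the arithmetic can be worked through as below. This is a back-of-the-envelope estimate that only uses the numbers reported above. It assumes every endpoint is a standard CRUD endpoint and ignores the extra time spent on the 14% of PRs that needed rework.

```python
HOURS_BEFORE = 5.0           # per endpoint, from the results above
HOURS_AFTER = 50 / 60        # 15 min generation + 35 min review
ENDPOINTS_PER_WEEK = (3, 5)  # range reported for the team

saved_per_endpoint = HOURS_BEFORE - HOURS_AFTER
saved_per_week = tuple(round(n * saved_per_endpoint, 1) for n in ENDPOINTS_PER_WEEK)
speedup = HOURS_BEFORE / HOURS_AFTER

print(f"Saved per endpoint: {saved_per_endpoint:.2f} h")
print(f"Saved per week: {saved_per_week[0]}-{saved_per_week[1]} h")
print(f"Speed-up: {speedup:.0f}x")
```

Under these assumptions the saving is about 4.17 hours per endpoint, or roughly 12.5 to 20.8 developer-hours per week, a sixfold speed-up on standard endpoints.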

Timeline

  • Basic generator with context: 2–3 weeks
  • Agent cycle with tests and iterations: 2–3 weeks
  • CI/CD and IDE integration: 2–3 weeks
  • Total: 6–9 weeks