AI Model Request and Response Logging Setup

Logging all AI model requests and responses is the foundation for debugging, auditing, quality analysis, and compliance. For LLMs this is not trivial: data volumes are large, prompts may contain PII, and many logged fields have high-cardinality values.

Structured logging

import structlog

logger = structlog.get_logger()

def log_llm_request(
    request_id: str,
    user_id: str,
    model: str,
    messages: list[dict],
    response: str,
    usage: dict,
    latency_ms: float,
    error: Exception | None = None,
):
    # PII filtering before logging
    safe_messages = pii_filter(messages)

    logger.info(
        "llm_request",
        request_id=request_id,
        user_id=hash_user_id(user_id),  # pseudonymization
        model=model,
        prompt_tokens=usage.get("prompt_tokens"),
        completion_tokens=usage.get("completion_tokens"),
        total_tokens=usage.get("total_tokens"),
        cost_usd=calculate_cost(model, usage),
        latency_ms=latency_ms,
        has_error=error is not None,
        error_type=type(error).__name__ if error else None,
        # Full messages only when detailed logging is enabled
        messages=safe_messages if DETAILED_LOGGING else None,
        response_preview=response[:200] if response else None,  # first 200 characters
    )

Storing full logs (for auditing)

Compliance often requires complete requests and responses. Keep them separate from standard logs: full payloads are large, so cheap object storage is a better fit:

import gzip
import json
from datetime import datetime, timezone

import boto3

class LLMRequestStore:
    def __init__(self, s3_bucket: str, retention_days: int = 90):
        self.s3 = boto3.client("s3")
        self.bucket = s3_bucket
        self.retention_days = retention_days

    def store(self, request: LLMRequest) -> str:
        # Key layout: year/month/day/hour/request_id.json.gz
        key = f"llm-logs/{datetime.now(timezone.utc).strftime('%Y/%m/%d/%H')}/{request.id}.json.gz"

        data = gzip.compress(json.dumps(request.to_dict()).encode())
        self.s3.put_object(
            Bucket=self.bucket,
            Key=key,
            Body=data,
            ContentEncoding="gzip",
        )
        return key
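Because keys are partitioned by UTC hour, fetching everything logged in a given hour is a plain prefix listing. A sketch; `hour_prefix` is a helper introduced here, not part of the class above:

```python
from datetime import datetime, timezone

def hour_prefix(dt: datetime, base: str = "llm-logs") -> str:
    """S3 key prefix covering all requests logged in the given UTC hour."""
    return f"{base}/{dt.strftime('%Y/%m/%d/%H')}/"

# Usage with the store above (requires AWS credentials):
# paginator = store.s3.get_paginator("list_objects_v2")
# for page in paginator.paginate(Bucket=store.bucket,
#                                Prefix=hour_prefix(datetime.now(timezone.utc))):
#     for obj in page.get("Contents", []):
#         print(obj["Key"])
```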

Protecting personal data in logs

import re

class PIIFilter:
    PATTERNS = [
        (r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b', '[CARD_NUMBER]'),
        (r'\b\d{3}-\d{2}-\d{4}\b', '[SSN]'),
        (r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', '[EMAIL]'),
        (r'\+?[\d\s\-\(\)]{10,15}', '[PHONE]'),
    ]

    def filter(self, text: str) -> str:
        for pattern, replacement in self.PATTERNS:
            text = re.sub(pattern, replacement, text)
        return text
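Pattern order matters: the card-number pattern must run before the phone pattern, otherwise the greedy 10–15 character phone regex eats part of a 16-digit card number and leaves the rest unmasked. A self-contained sketch of just those two patterns:

```python
import re

# Card numbers first, phones second; reversing the order leaves card digits exposed.
patterns = [
    (r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b', '[CARD_NUMBER]'),
    (r'\+?[\d\s\-()]{10,15}', '[PHONE]'),
]

def mask(text: str) -> str:
    for pattern, repl in patterns:
        text = re.sub(pattern, repl, text)
    return text

masked = mask("card 4111 1111 1111 1111, call +15551234567")
# → "card [CARD_NUMBER], call [PHONE]"
```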

Retention and archiving

  • Hot logs (< 7 days): Elasticsearch or ClickHouse for fast search.
  • Warm logs (7–30 days): S3 Standard.
  • Cold logs (30–365 days): S3 Glacier Instant Retrieval.
  • After retention_days, delete via S3 Lifecycle rules.
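The S3 side of this tiering maps to a single Lifecycle configuration; a sketch (the bucket name is an assumption, and objects sit in S3 Standard until the transition fires):

```python
# Applied with boto3, e.g.:
# s3.put_bucket_lifecycle_configuration(Bucket="my-llm-logs-bucket",
#                                       LifecycleConfiguration=lifecycle)
lifecycle = {
    "Rules": [
        {
            "ID": "llm-logs-tiering",
            "Status": "Enabled",
            "Filter": {"Prefix": "llm-logs/"},
            # Move to Glacier Instant Retrieval after the warm period.
            "Transitions": [
                {"Days": 30, "StorageClass": "GLACIER_IR"},
            ],
            # Delete at the end of the cold period.
            "Expiration": {"Days": 365},
        }
    ]
}
```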