AI Model Request and Response Logging Setup

Logging all AI model requests and responses is the foundation for debugging, auditing, quality analysis, and compliance. For LLMs this is not trivial: data volumes are large, prompts may contain PII, and many logged fields have high-cardinality values.

Structured logging

import structlog

logger = structlog.get_logger()

def log_llm_request(
    request_id: str,
    user_id: str,
    model: str,
    messages: list[dict],
    response: str,
    usage: dict,
    latency_ms: float,
    error: Exception | None = None,
):
    # PII filtering before logging
    safe_messages = pii_filter(messages)

    logger.info(
        "llm_request",
        request_id=request_id,
        user_id=hash_user_id(user_id),  # pseudonymization
        model=model,
        prompt_tokens=usage.get("prompt_tokens"),
        completion_tokens=usage.get("completion_tokens"),
        total_tokens=usage.get("total_tokens"),
        cost_usd=calculate_cost(model, usage),
        latency_ms=latency_ms,
        has_error=error is not None,
        error_type=type(error).__name__ if error else None,
        # Full messages only when detailed logging is enabled
        messages=safe_messages if DETAILED_LOGGING else None,
        response_preview=response[:200] if response else None,  # first 200 characters
    )

Storing full logs (for auditing)

Compliance often requires complete requests and responses. Keep them separate from standard logs: full payloads are large, so cheap object storage is a better fit:

import gzip
import json
from datetime import datetime, timezone

import boto3

class LLMRequestStore:
    def __init__(self, s3_bucket: str, retention_days: int = 90):
        self.s3 = boto3.client("s3")
        self.bucket = s3_bucket
        self.retention_days = retention_days

    def store(self, request: LLMRequest) -> str:
        # Key layout: year/month/day/hour/request_id.json.gz
        key = f"llm-logs/{datetime.now(timezone.utc).strftime('%Y/%m/%d/%H')}/{request.id}.json.gz"

        data = gzip.compress(json.dumps(request.to_dict()).encode())
        self.s3.put_object(
            Bucket=self.bucket,
            Key=key,
            Body=data,
            ContentEncoding="gzip",
        )
        return key
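Because keys are partitioned by UTC hour, fetching everything logged in a given hour is a plain prefix listing. A sketch; `hour_prefix` is a helper introduced here, not part of the class above:

```python
from datetime import datetime, timezone

def hour_prefix(dt: datetime, base: str = "llm-logs") -> str:
    """S3 key prefix covering all requests logged in the given UTC hour."""
    return f"{base}/{dt.strftime('%Y/%m/%d/%H')}/"

# Usage with the store above (requires AWS credentials):
# paginator = store.s3.get_paginator("list_objects_v2")
# for page in paginator.paginate(Bucket=store.bucket,
#                                Prefix=hour_prefix(datetime.now(timezone.utc))):
#     for obj in page.get("Contents", []):
#         print(obj["Key"])
```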

Protecting personal data in logs

import re

class PIIFilter:
    PATTERNS = [
        (r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b', '[CARD_NUMBER]'),
        (r'\b\d{3}-\d{2}-\d{4}\b', '[SSN]'),
        (r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', '[EMAIL]'),
        (r'\+?[\d\s\-\(\)]{10,15}', '[PHONE]'),
    ]

    def filter(self, text: str) -> str:
        for pattern, replacement in self.PATTERNS:
            text = re.sub(pattern, replacement, text)
        return text
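Pattern order matters: the card-number pattern must run before the phone pattern, otherwise the greedy 10–15 character phone regex eats part of a 16-digit card number and leaves the rest unmasked. A self-contained sketch of just those two patterns:

```python
import re

# Card numbers first, phones second; reversing the order leaves card digits exposed.
patterns = [
    (r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b', '[CARD_NUMBER]'),
    (r'\+?[\d\s\-()]{10,15}', '[PHONE]'),
]

def mask(text: str) -> str:
    for pattern, repl in patterns:
        text = re.sub(pattern, repl, text)
    return text

masked = mask("card 4111 1111 1111 1111, call +15551234567")
# → "card [CARD_NUMBER], call [PHONE]"
```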

Retention and archiving

  • Hot logs (< 7 days): Elasticsearch or ClickHouse for fast search.
  • Warm logs (7–30 days): S3 Standard.
  • Cold logs (30–365 days): S3 Glacier Instant Retrieval.
  • After retention_days, delete via S3 Lifecycle rules.
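The S3 side of this tiering maps to a single Lifecycle configuration; a sketch (the bucket name is an assumption, and objects sit in S3 Standard until the transition fires):

```python
# Applied with boto3, e.g.:
# s3.put_bucket_lifecycle_configuration(Bucket="my-llm-logs-bucket",
#                                       LifecycleConfiguration=lifecycle)
lifecycle = {
    "Rules": [
        {
            "ID": "llm-logs-tiering",
            "Status": "Enabled",
            "Filter": {"Prefix": "llm-logs/"},
            # Move to Glacier Instant Retrieval after the warm period.
            "Transitions": [
                {"Days": 30, "StorageClass": "GLACIER_IR"},
            ],
            # Delete at the end of the cold period.
            "Expiration": {"Days": 365},
        }
    ]
}
```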