OpenAI Structured Outputs Integration for Response Parsing



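Structured Outputs constrains the model's response to a JSON schema that you supply. JSON mode (response_format={"type": "json_object"}) only guarantees that the reply is syntactically valid JSON; Structured Outputs goes further and enforces a specific schema through constrained decoding, so the model cannot emit tokens that would violate it. Two cases still need handling in code: the model may refuse the request, in which case message.refusal is set instead of content, and a reply cut off by the token limit (finish_reason "length") can be incomplete.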
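To make the distinction concrete, the sketch below uses only the standard library to show that a reply can be valid JSON and still miss the schema. The conforms helper and both sample replies are hypothetical; the helper checks only required keys, extra keys, and primitive types, and it says nothing about how constrained decoding is implemented on the server.

```python
import json

# Hypothetical, deliberately tiny schema check: required keys, no extra keys,
# and primitive types only. Real JSON Schema validation covers far more.
PY_TYPES = {"string": str, "number": (int, float), "boolean": bool}

def conforms(data: dict, schema: dict) -> bool:
    props = schema["properties"]
    if set(data) != set(schema["required"]):
        return False  # a key is missing or unexpected
    for key, value in data.items():
        declared = props[key]["type"]
        # bool is a subclass of int in Python, so reject it for "number"
        if isinstance(value, bool) and declared != "boolean":
            return False
        if not isinstance(value, PY_TYPES[declared]):
            return False
    return True

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "price": {"type": "number"}},
    "required": ["name", "price"],
}

# What JSON mode may return: it parses, but keys and types are up to the model.
json_mode_reply = '{"product": "Desk lamp", "cost": "19.99 USD"}'
# What a schema-constrained reply looks like.
structured_reply = '{"name": "Desk lamp", "price": 19.99}'

json_mode_ok = conforms(json.loads(json_mode_reply), schema)
structured_ok = conforms(json.loads(structured_reply), schema)
print(json_mode_ok, structured_ok)
```

Both replies pass json.loads; only the second one passes the schema check, which is the gap Structured Outputs is meant to close.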

Basic Integration with Pydantic

from openai import OpenAI
from pydantic import BaseModel
from typing import Literal, Optional

client = OpenAI()

# Schema for data extraction.
# InvoiceItem is defined first so Invoice can reference it directly,
# without a forward reference or a model_rebuild() call.
class InvoiceItem(BaseModel):
    description: str
    quantity: float
    unit_price: float
    total: float

class Invoice(BaseModel):
    vendor_name: str
    invoice_number: str
    date: str
    total_amount: float
    currency: str
    line_items: list[InvoiceItem]
    vat_amount: Optional[float] = None  # nullable: None when the invoice shows no VAT

# Parse the reply straight into the Invoice model
def extract_invoice(text: str) -> Invoice:
    response = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Extract invoice data from text"},
            {"role": "user", "content": text}
        ],
        response_format=Invoice,
    )
    # .parsed is an Invoice instance; it is None if the model refused
    return response.choices[0].message.parsed
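Because message.parsed can be empty when the model refuses or the reply is cut off, callers may want one place that turns those cases into an exception. The following is a minimal sketch under stated assumptions: StructuredOutputError, unwrap_parsed, and fake_response are hypothetical names, and the stand-in objects only mimic the attributes used here (choices, finish_reason, message.parsed, message.refusal). Recent SDK versions raise their own exception for truncated output inside parse, so the length check is mainly defensive.

```python
from types import SimpleNamespace

class StructuredOutputError(RuntimeError):
    """Hypothetical error type for replies that carry no parsed object."""

def unwrap_parsed(response):
    """Return message.parsed, or raise if the reply was cut off or refused."""
    choice = response.choices[0]
    if choice.finish_reason == "length":
        raise StructuredOutputError("output truncated; raise the token limit")
    refusal = getattr(choice.message, "refusal", None)
    if refusal:
        raise StructuredOutputError(f"model refused: {refusal}")
    if choice.message.parsed is None:
        raise StructuredOutputError("no parsed object in the reply")
    return choice.message.parsed

# Stand-in objects that mimic the shape of a parsed chat completion,
# so the helper can be exercised without calling the API.
def fake_response(parsed=None, refusal=None, finish_reason="stop"):
    message = SimpleNamespace(parsed=parsed, refusal=refusal)
    choice = SimpleNamespace(message=message, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice])

ok = unwrap_parsed(fake_response(parsed={"vendor_name": "ACME"}))
```

With a helper like this, extract_invoice could return unwrap_parsed(response) and its return annotation of Invoice would hold on every successful path.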

Classification with Enum

from enum import Enum

class TicketCategory(str, Enum):
    technical = "technical"
    billing = "billing"
    feature_request = "feature_request"
    complaint = "complaint"
    general = "general"

class TicketClassification(BaseModel):
    category: TicketCategory
    priority: Literal["low", "medium", "high", "critical"]
    sentiment: Literal["positive", "neutral", "negative", "angry"]
    requires_human: bool
    summary: str
    tags: list[str]

def classify_ticket(text: str) -> TicketClassification:
    response = client.beta.chat.completions.parse(
        model="gpt-4o-mini",  # Structured Outputs available in mini too
        messages=[{"role": "user", "content": f"Classify ticket: {text}"}],
        response_format=TicketClassification,
        temperature=0,
    )
    return response.choices[0].message.parsed
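One practical benefit of constraining fields to an Enum or Literal is that downstream code can branch on exact values without normalizing free text. The sketch below illustrates that with a hypothetical route_ticket function and made-up queue names; the Classification dataclass is a stand-in carrying only the fields used here, so the example runs without the API.

```python
from dataclasses import dataclass

@dataclass
class Classification:
    """Stand-in with the subset of TicketClassification fields used here."""
    category: str
    priority: str
    sentiment: str
    requires_human: bool

def route_ticket(c: Classification) -> str:
    """Pick a (hypothetical) queue name from the classification fields."""
    # Escalate first: explicit flag, top priority, or an angry customer.
    if c.requires_human or c.priority == "critical" or c.sentiment == "angry":
        return "human_escalation"
    if c.category == "billing":
        return "billing_queue"
    if c.category == "technical":
        return "support_queue"
    return "auto_reply"

billing = route_ticket(Classification("billing", "low", "neutral", False))
angry = route_ticket(Classification("general", "medium", "angry", False))
```

Since TicketCategory subclasses str, the same string comparisons work on a real TicketClassification instance.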

Structured Outputs via JSON Schema (without Pydantic)

# A raw JSON Schema works when Pydantic is not available
# (the same payload shape applies to SDKs in other languages)
import json

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Product data"}],  # placeholder prompt
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "product_data",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "price": {"type": "number"},
                    "in_stock": {"type": "boolean"},
                    "categories": {
                        "type": "array",
                        "items": {"type": "string"}
                    }
                },
                "required": ["name", "price", "in_stock", "categories"],
                "additionalProperties": False,
            }
        }
    }
)
# message.content is a JSON string here, so it has to be decoded manually
data = json.loads(response.choices[0].message.content)
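Without Pydantic the decoded result is a plain dict, so type information is lost at this point. One option, sketched here with the standard library only, is to map the dict onto a dataclass. Product, load_product, and the sample string are hypothetical; the sample stands in for message.content so the example runs without calling the API.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Product:
    name: str
    price: float
    in_stock: bool
    categories: list[str]

def load_product(content: str) -> Product:
    """Decode message.content and map it onto the dataclass.

    Product(**data) raises TypeError on a missing or unexpected key, which
    acts as a cheap guard if the schema and the dataclass drift apart.
    """
    data = json.loads(content)
    return Product(**data)

# Stand-in for response.choices[0].message.content
sample_content = (
    '{"name": "Desk lamp", "price": 19.99, '
    '"in_stock": true, "categories": ["lighting"]}'
)
product = load_product(sample_content)
```

Note that a dataclass does not check value types at runtime; the guarantee about types still comes from the schema enforced on the API side.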

Structured Outputs Limitations

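  • strict: True requires additionalProperties: False on every object in the schema, and every property must be listed in required
  • Optional fields are expressed as nullable ones: "type": ["string", "null"] or an anyOf with a null branch; the field itself stays in required
  • Nesting depth is limited: 5 levels at launch, raised in later documentation, so check the current limit before relying on deep nesting
  • For recursive schemas, use $ref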
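Hand-written schemas tend to miss additionalProperties on a nested object. A small preprocessing step can apply the strict-mode requirements mechanically. The sketch below is one possible approach, not an official utility: make_strict is a hypothetical helper that walks a schema, sets additionalProperties to False on every object, and lists all of its properties as required (which strict mode also expects). The example schema includes a recursive "$ref": "#" back to the root.

```python
import copy

def make_strict(schema: dict) -> dict:
    """Return a copy in which every object forbids extra keys
    and requires all of its declared properties."""
    schema = copy.deepcopy(schema)  # leave the caller's schema untouched

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object" and "properties" in node:
                node["additionalProperties"] = False
                node["required"] = list(node["properties"])
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(schema)
    return schema

category_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "meta": {
            "type": "object",
            "properties": {"slug": {"type": "string"}},
        },
        # Recursive reference back to the root schema
        "children": {"type": "array", "items": {"$ref": "#"}},
    },
}

strict_schema = make_strict(category_schema)
```

The result can be passed as the "schema" value of a json_schema response_format with "strict": True.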

When to Use

Scenario                              | Method
Data extraction from documents        | Structured Outputs
Classification                        | Structured Outputs
Responses with predictable structure  | Structured Outputs
Free-form JSON (unknown structure)    | json_object mode
Simple responses                      | Plain text

Timeline

  • Basic extraction with Pydantic: 0.5-1 day
  • Complex nested schemas: 1-2 days