# YandexGPT API Integration
YandexGPT is a large language model from Yandex, available through the Yandex Foundation Models service in Yandex Cloud. Key advantages for the Russian market: data is processed inside Russia (compliance with Federal Law 152-ФЗ on personal data), integration with other Yandex Cloud services, and native-level Russian language support.
## Access Setup

```python
# Required:
# 1. Yandex Cloud account with a linked billing account
# 2. A folder (folder_id)
# 3. An IAM token or a service account API key
import requests
import json

FOLDER_ID = "your-folder-id"
IAM_TOKEN = "your-iam-token"  # IAM tokens are short-lived; refresh at least every 12 hours
API_KEY = "your-api-key"      # Or a long-lived API key for a service account
                              # (the examples below authenticate with API_KEY)
```
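If you authenticate as a user account rather than a service account, the short-lived IAM token can be obtained by exchanging a Yandex OAuth token via the Yandex Cloud IAM API. A minimal sketch (error handling and token caching are left to you):

```python
import requests

IAM_TOKEN_URL = "https://iam.api.cloud.yandex.net/iam/v1/tokens"

def get_iam_token(oauth_token: str) -> str:
    """Exchange a Yandex OAuth token for a short-lived IAM token."""
    response = requests.post(
        IAM_TOKEN_URL,
        json={"yandexPassportOauthToken": oauth_token},
    )
    response.raise_for_status()
    return response.json()["iamToken"]
```

In production you would call this on a schedule (well before the 12-hour expiry) and reuse the cached token between requests.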
## Synchronous Request via REST API

```python
def yandexgpt_chat(
    prompt: str,
    model: str = "yandexgpt",
    temperature: float = 0.1,
    max_tokens: int = 2000,
) -> str:
    url = "https://llm.api.cloud.yandex.net/foundationModels/v1/completion"
    headers = {
        "Authorization": f"Api-Key {API_KEY}",
        "x-folder-id": FOLDER_ID,
    }
    body = {
        "modelUri": f"gpt://{FOLDER_ID}/{model}",
        "completionOptions": {
            "stream": False,
            "temperature": temperature,
            "maxTokens": max_tokens,
        },
        "messages": [
            {"role": "user", "text": prompt}
        ],
    }
    response = requests.post(url, headers=headers, json=body)
    response.raise_for_status()
    return response.json()["result"]["alternatives"][0]["message"]["text"]
```
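The endpoint can return 429 when quota limits are hit, as well as transient 5xx errors, so production calls are worth wrapping in a retry loop. A generic stdlib-only sketch with exponential backoff (the set of retriable status codes is an assumption; tune it to your quota policy):

```python
import time

def with_retries(call, max_retries=3, base_delay=1.0, retriable=(429, 500, 503)):
    """Run `call()`, retrying with exponential backoff when it raises an
    HTTP-style error whose response carries a retriable status code."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception as exc:
            status = getattr(getattr(exc, "response", None), "status_code", None)
            if status not in retriable or attempt == max_retries:
                raise  # non-retriable error, or retries exhausted
            time.sleep(base_delay * 2 ** attempt)
```

Usage: `with_retries(lambda: yandexgpt_chat("Tell me about Moscow"))` — `requests.HTTPError` from `raise_for_status()` carries `.response`, which is what the helper inspects.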
```python
# With a system prompt
def yandexgpt_with_system(system: str, user_prompt: str) -> str:
    url = "https://llm.api.cloud.yandex.net/foundationModels/v1/completion"
    body = {
        "modelUri": f"gpt://{FOLDER_ID}/yandexgpt",
        "completionOptions": {"stream": False, "temperature": 0.1, "maxTokens": 2000},
        "messages": [
            {"role": "system", "text": system},
            {"role": "user", "text": user_prompt},
        ],
    }
    response = requests.post(
        url,
        headers={"Authorization": f"Api-Key {API_KEY}", "x-folder-id": FOLDER_ID},
        json=body,
    )
    response.raise_for_status()
    return response.json()["result"]["alternatives"][0]["message"]["text"]
```
## Official SDK (yandex-cloud-ml-sdk): Sync, Async, Streaming

```python
from yandex_cloud_ml_sdk import YCloudML

sdk = YCloudML(folder_id=FOLDER_ID, auth=API_KEY)
model = sdk.models.completions("yandexgpt")

# Synchronous call
result = model.configure(temperature=0.5).run("Tell me about Moscow")

# Async call (inside an async function)
result = await model.configure(temperature=0.5).run_async("Request")

# Streaming: print partial results as they arrive
for event in model.configure(temperature=0.5).run_stream("Long request"):
    print(event.alternatives[0].text, end="")
```
## YandexGPT Models

| Model | Description | Context |
|---|---|---|
| yandexgpt | Main model, quality/speed balance | 8K |
| yandexgpt-lite | Lightweight version, faster and cheaper | 8K |
| yandexgpt-32k | Long-context variant | 32K |
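The model segment of `modelUri` can also carry a version branch — per the Yandex Cloud documentation, `/latest` (the default), `/rc`, and `/deprecated` are accepted. A small helper for building the URI explicitly (the folder id in the example is a placeholder):

```python
def model_uri(folder_id: str, model: str = "yandexgpt", version: str = "latest") -> str:
    """Build a completion modelUri like gpt://<folder_id>/yandexgpt/latest."""
    return f"gpt://{folder_id}/{model}/{version}"
```

Pinning `/latest` vs `/rc` explicitly makes model upgrades a deliberate change rather than a silent one.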
## Yandex Embeddings

```python
# text-search-doc — for indexing documents
# text-search-query — for search queries
def get_yandex_embedding(text: str, embedding_type: str = "text-search-doc") -> list[float]:
    response = requests.post(
        "https://llm.api.cloud.yandex.net/foundationModels/v1/textEmbedding",
        headers={"Authorization": f"Api-Key {API_KEY}", "x-folder-id": FOLDER_ID},
        json={
            "modelUri": f"emb://{FOLDER_ID}/{embedding_type}",
            "text": text,
        },
    )
    response.raise_for_status()
    return response.json()["embedding"]
```
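With this asymmetric pair, a typical retrieval flow embeds documents with `text-search-doc`, embeds the user's query with `text-search-query`, and ranks documents by cosine similarity. A stdlib-only sketch of the ranking step:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_documents(query_vec: list[float], doc_vecs: list[list[float]]) -> list[int]:
    """Return document indices sorted by similarity to the query, best first."""
    return sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
```

For anything beyond a few thousand documents you would precompute the doc vectors and move ranking into a vector store rather than scoring in Python.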
## Practical Case Study

A government enterprise was required to process all data inside Russia and needed a system for automatically answering citizens' requests. YandexGPT was chosen because:
- Data does not leave Russia (152-ФЗ compliance)
- Integration with Yandex SpeechKit for voice input
- Strong output quality in Russian
## Timeline

- Basic REST integration: 1–2 days
- SDK integration with async/streaming: 2–3 days
- Integration with other Yandex Cloud services: 1 week