AI Assistant Development Based on YandexGPT for Mobile App

TRUETECH develops, supports, and maintains mobile applications for iOS, Android, and PWA. We have extensive experience publishing mobile applications in popular stores such as Google Play, App Store, Amazon, and AppGallery.
We develop and support all types of mobile applications:

  • Information and entertainment: news apps, games, reference guides, online catalogs, weather apps, fitness and health apps, travel apps, educational apps, social networks and messengers, quizzes, blogs and podcasts, forums, aggregators
  • E-commerce: online stores, B2B apps, marketplaces, online exchanges, cashback services, dropshipping platforms, loyalty programs, food and goods delivery, payment systems
  • Business process management: CRM systems, ERP systems, project management, sales team tools, financial management, production management, logistics and delivery management, HR management, data monitoring systems
  • Electronic services: classified ads platforms, online schools, online cinemas, electronic service platforms, cashback platforms, video hosting, thematic portals, online booking and scheduling platforms, online trading platforms

These are just some of the types of mobile applications we work with, and each of them may have its own specific features and functionality, tailored to the specific needs and goals of the client.


Building an AI Assistant with YandexGPT in a Mobile Application

YandexGPT is the practical choice for an AI assistant when requirements include data processing on servers in Russia, high-quality Russian language support, and integration with the Yandex ecosystem (search, maps, marketplace). For applications targeting the Russian market with strict data localization requirements, this is not just a preference—it's compliance.

Yandex Foundation Models API

YandexGPT is accessible through the Yandex Cloud Foundation Models API. The synchronous completion endpoint is https://llm.api.cloud.yandex.net/foundationModels/v1/completion.

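Authentication uses either an IAM token (sent as Authorization: Bearer <token>) or a service account API key (sent as Authorization: Api-Key <key>), the latter being the usual choice for server proxies. IAM tokens are valid for at most 12 hours and must be renewed. Neither credential should be embedded in a mobile client: the app talks to your proxy, and the proxy talks to Yandex Cloud.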
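The two credential types differ only in the Authorization header scheme and in whether they need renewal. A minimal sketch of how a proxy might model this, assuming a hypothetical Credential enum and a one-hour refresh margin (both are illustrative choices, not part of the Yandex API):

```swift
import Foundation

// Hypothetical wrapper for the two credential types described above.
enum Credential {
    case iamToken(String, issuedAt: Date)
    case apiKey(String)
}

// IAM tokens are valid for at most 12 hours.
let iamTokenLifetime: TimeInterval = 12 * 60 * 60

// Value for the Authorization header of a Foundation Models API call.
func authorizationHeader(for credential: Credential) -> String {
    switch credential {
    case .iamToken(let token, _):
        return "Bearer \(token)"
    case .apiKey(let key):
        return "Api-Key \(key)"
    }
}

// True when an IAM token is within `margin` of expiry.
// The one-hour default margin is an assumption, not an API requirement.
func needsRefresh(_ credential: Credential,
                  now: Date = Date(),
                  margin: TimeInterval = 60 * 60) -> Bool {
    switch credential {
    case .iamToken(_, let issuedAt):
        return now.timeIntervalSince(issuedAt) >= iamTokenLifetime - margin
    case .apiKey:
        return false // static API keys are not tied to the 12-hour IAM lifetime
    }
}
```

The proxy would call needsRefresh before each upstream request and exchange the service account credentials for a new IAM token when it returns true.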

Request structure:

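import Foundation

// Top-level body of a completion request.
struct YandexGPTRequest: Encodable {
    let modelUri: String  // "gpt://{folder_id}/yandexgpt/latest"
    let completionOptions: CompletionOptions
    let messages: [YandexMessage]
}

struct CompletionOptions: Encodable {
    let stream: Bool
    let temperature: Double  // 0..1
    let maxTokens: String    // string, not number (API quirk)
}

// One turn of the conversation.
struct YandexMessage: Encodable {
    let role: String  // "system", "user", or "assistant"
    let text: String
}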

Important: maxTokens is passed as a string, not a number. The underlying field is a 64-bit integer, and the API's JSON mapping represents those as strings. This violates the principle of least surprise and periodically breaks auto-generated clients.
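One way to keep this quirk out of application code is to store maxTokens as an Int and convert it only at encoding time. The sketch below is an alternative to the struct above, not the required shape; the custom encode(to:) is the only non-standard part:

```swift
import Foundation

// Keeps maxTokens as an Int in app code and writes it as a JSON string.
struct CompletionOptions: Encodable {
    let stream: Bool
    let temperature: Double
    let maxTokens: Int

    enum CodingKeys: String, CodingKey {
        case stream, temperature, maxTokens
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(stream, forKey: .stream)
        try container.encode(temperature, forKey: .temperature)
        // Convert here so callers never handle the string form.
        try container.encode(String(maxTokens), forKey: .maxTokens)
    }
}

let encoder = JSONEncoder()
encoder.outputFormatting = [.sortedKeys]
let options = CompletionOptions(stream: true, temperature: 0.5, maxTokens: 500)
let json = String(data: try! encoder.encode(options), encoding: .utf8)!
print(json) // maxTokens appears as "500" (quoted) in the output
```

Call sites can then do arithmetic on maxTokens without parsing, and the string conversion lives in one place.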

modelUri is constructed as gpt://{folder_id}/{model_name}/{version}. The folder_id is the Yandex Cloud folder identifier and must be stored on the server, not in the app.
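Since the folder ID lives on the server, the URI is best assembled in the proxy. A small sketch, assuming a hypothetical YandexModel enum and a basic sanity check on the folder ID (the validation rules are illustrative):

```swift
import Foundation

// Model names used in the URI; the enum itself is a hypothetical convenience.
enum YandexModel: String {
    case lite = "yandexgpt-lite"
    case pro = "yandexgpt"
}

// Builds gpt://{folder_id}/{model_name}/{version}.
// Returns nil when the folder ID is empty or contains a slash.
func modelURI(folderID: String,
              model: YandexModel,
              version: String = "latest") -> String? {
    let trimmed = folderID.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty, !trimmed.contains("/") else { return nil }
    return "gpt://\(trimmed)/\(model.rawValue)/\(version)"
}
```

For example, modelURI(folderID: "b1gexample", model: .lite) yields "gpt://b1gexample/yandexgpt-lite/latest", where b1gexample is a placeholder folder ID.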

Synchronous and Asynchronous Modes

YandexGPT supports two modes:

  • synchronous (/completion) — wait for full response, maximum 60 seconds
  • asynchronous (/completionAsync) — receive operation_id, then poll for result
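The asynchronous mode boils down to a polling loop over the returned operation. The sketch below assumes a hypothetical OperationStatus type holding the id and done fields of the operation object, with the network call and the delay injected as closures so the loop can be exercised without a server:

```swift
import Foundation

// Subset of the operation object needed for polling.
struct OperationStatus {
    let id: String
    let done: Bool
    let text: String?  // hypothetical: text extracted from the finished response
}

enum PollError: Error {
    case timedOut
}

// Polls `fetch` until the operation reports done or attempts run out.
// `fetch` and `wait` are injected; the default wait grows up to 5 seconds.
func pollOperation(
    id: String,
    maxAttempts: Int,
    fetch: (String) throws -> OperationStatus,
    wait: (Int) -> Void = { attempt in
        Thread.sleep(forTimeInterval: min(Double(attempt), 5))
    }
) throws -> OperationStatus {
    for attempt in 1...max(1, maxAttempts) {
        let status = try fetch(id)
        if status.done { return status }
        if attempt < maxAttempts { wait(attempt) }
    }
    throw PollError.timedOut
}
```

In a real proxy, fetch would issue an HTTP request for the operation's status. Polling suits background jobs such as document summarization better than an interactive chat.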

For a mobile assistant that displays text as it is generated, use streaming mode (stream: true in a synchronous request). The server returns a chunked response with partial results. Each chunk is a complete JSON object containing the accumulated text so far, not a delta. This matters for rendering: replace the previous text with the new one on every chunk instead of appending, as you would with OpenAI.

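import Foundation

// Hypothetical view model; YandexCompletionResponse is the decoded chunk type.
final class ChatViewModel {
    var currentMessage = ""

    // Each chunk contains the FULL text so far, not a delta.
    func handleChunk(_ response: YandexCompletionResponse) {
        let fullText = response.result.alternatives.first?.message.text ?? ""
        DispatchQueue.main.async { [weak self] in
            self?.currentMessage = fullText  // replace, not append
        }
    }
}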
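Network reads do not necessarily align with chunk boundaries, so the client needs to buffer bytes until a full JSON object is available. The sketch below assumes chunks arrive as one JSON object per line (an assumption to verify against the actual transport) and decodes only the fields needed for rendering; StreamBuffer is a hypothetical helper:

```swift
import Foundation

// Minimal response shape: only the fields needed to render text.
struct YandexCompletionResponse: Decodable {
    struct Result: Decodable { let alternatives: [Alternative] }
    struct Alternative: Decodable { let message: Message }
    struct Message: Decodable { let role: String; let text: String }
    let result: Result
}

// Buffers raw bytes and returns the newest full text once complete lines arrive.
// Returns nil when no complete, decodable line was found in this call.
struct StreamBuffer {
    private var pending = Data()

    mutating func append(_ data: Data) -> String? {
        pending.append(data)
        var latest: String?
        while let newline = pending.firstIndex(of: 0x0A) {
            let line = Data(pending[pending.startIndex..<newline])
            pending = Data(pending[pending.index(after: newline)...])
            guard !line.isEmpty,
                  let chunk = try? JSONDecoder().decode(YandexCompletionResponse.self, from: line),
                  let text = chunk.result.alternatives.first?.message.text
            else { continue }
            latest = text // each chunk carries the full text so far
        }
        return latest
    }
}
```

Because every chunk carries the full text, skipping intermediate chunks is harmless: only the last decoded value needs to reach the UI.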

YandexGPT Lite vs Pro

| Parameter | YandexGPT Lite | YandexGPT Pro |
|---|---|---|
| Response quality | Basic | Higher, especially on long instructions |
| Speed | Faster | Slower |
| Cost | Cheaper | More expensive |
| Context | 8192 tokens | 8192 tokens |

For most mobile assistant tasks (helper, FAQ, text processing), Lite is sufficient. Pro is justified for complex analytical tasks and working with long documents.

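The Embeddings API (/textEmbedding) is useful for semantic search over a local knowledge base: use the text-search-query/latest model for queries and text-search-doc/latest for documents. Embedding model URIs use the emb:// scheme, for example emb://{folder_id}/text-search-query/latest.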
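Once document embeddings are stored, search reduces to ranking them by similarity to the query embedding. A minimal sketch using cosine similarity, assuming the vectors have already been obtained from the API; KnowledgeEntry and topMatches are hypothetical names:

```swift
import Foundation

// Cosine similarity between two embedding vectors of equal length.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "vectors must have the same dimension")
    let dot = zip(a, b).reduce(0.0) { $0 + $1.0 * $1.1 }
    let normA = sqrt(a.reduce(0.0) { $0 + $1 * $1 })
    let normB = sqrt(b.reduce(0.0) { $0 + $1 * $1 })
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA * normB)
}

// Hypothetical knowledge-base entry with a precomputed document embedding.
struct KnowledgeEntry {
    let title: String
    let embedding: [Double]
}

// Returns up to `limit` entries, most similar to the query embedding first.
func topMatches(query: [Double],
                in entries: [KnowledgeEntry],
                limit: Int) -> [KnowledgeEntry] {
    let scored = entries.map { (entry: $0, score: cosineSimilarity(query, $0.embedding)) }
    return scored.sorted { $0.score > $1.score }.prefix(limit).map { $0.entry }
}
```

A linear scan like this is adequate for a few thousand entries on a phone. Larger knowledge bases would need an index, which is outside this sketch.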

Integration with Yandex SpeechKit

For voice input and output in a Russian-language application, Yandex SpeechKit is the natural companion: it offers strong Russian speech recognition and synthesis quality. SDKs for iOS and Android are available through CocoaPods and Maven.

STT: streaming recognition with partial results through the SpeechKit API v3, which is exposed as a gRPC service at stt.api.cloud.yandex.net:443 rather than over WebSocket. TTS: via REST with voice selection (alena, filipp, jane); SSML is supported.
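For the REST synthesis path, the proxy sends a form-encoded body. The sketch below shows one way to build it. The field names (ssml, lang, voice, format) and the default values are assumptions to check against the SpeechKit TTS reference, and both helper functions are hypothetical:

```swift
import Foundation

// Percent-encodes a value for an application/x-www-form-urlencoded body.
// Only ASCII unreserved characters pass through; Cyrillic is encoded as UTF-8 bytes.
func formEncode(_ value: String) -> String {
    let unreserved = CharacterSet(charactersIn:
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")
    return value.addingPercentEncoding(withAllowedCharacters: unreserved) ?? ""
}

// Builds the form body for a synthesis request from SSML markup.
func ttsFormBody(ssml: String,
                 voice: String,
                 lang: String = "ru-RU",
                 format: String = "oggopus") -> String {
    let fields = [("ssml", ssml), ("lang", lang), ("voice", voice), ("format", format)]
    return fields.map { "\($0.0)=\(formEncode($0.1))" }.joined(separator: "&")
}
```

An explicit ASCII allow-list is used because CharacterSet.alphanumerics also matches Cyrillic letters, which would leave Russian text unencoded.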

Workflow

Start: configure a Yandex Cloud account, create a service account, assign it the ai.languageModels.user role, and set up a server proxy that stores the credentials securely.

Development: API client → streaming UI that replaces the message with each full-text chunk → history management → optional SpeechKit integration.
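History management mostly means keeping the conversation within the model's context window. A minimal sketch that keeps the system prompt and the newest messages that fit a token budget; the token estimate is a crude placeholder, not the model's tokenizer, and trimHistory is a hypothetical helper:

```swift
import Foundation

struct YandexMessage: Codable, Equatable {
    let role: String  // "system", "user", or "assistant"
    let text: String
}

// Rough token estimate; the divisor is a placeholder assumption.
func estimatedTokens(_ text: String) -> Int {
    return max(1, text.count / 3)
}

// Keeps system messages plus the newest other messages that fit within `budget`.
func trimHistory(_ messages: [YandexMessage], budget: Int) -> [YandexMessage] {
    let system = messages.filter { $0.role == "system" }
    var remaining = budget - system.reduce(0) { $0 + estimatedTokens($1.text) }
    var kept: [YandexMessage] = []
    // Walk from newest to oldest and stop at the first message that does not fit.
    for message in messages.reversed() where message.role != "system" {
        let cost = estimatedTokens(message.text)
        if cost > remaining { break }
        remaining -= cost
        kept.append(message)
    }
    return system + kept.reversed()
}
```

The budget should leave room for the reply: with an 8192-token context and maxTokens set to 1000, a budget of around 7000 is a reasonable starting point.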

Timeline Estimates

Text assistant with streaming: 1–2 weeks. With voice via SpeechKit and a server proxy: 3–4 weeks.