Translation Bot Implementation in Mobile Applications
A translation bot is not just a call to a translation API. The interesting scenarios start where you need to preserve dialog context, switch languages on the fly, and work offline. These details separate "translation that works" from "translation that works well."
Choosing Translation API
DeepL API. Best translation quality for European languages. Free tier — 500K characters/month. Supports formal/informal tone (the formality parameter) — important for business content. Russian ↔ other language pairs don't match Google/Yandex quality.
Google Cloud Translation API. 100+ languages, high quality for Russian. v3 supports glossaries — dictionaries of terms that shouldn't be translated, or should be translated in a specific way. For medical and legal content and brand names, this is a must-have feature.
Yandex Translate API. Best results for Russian ↔ European languages. Language detection is built in. A good choice for apps with a Russian-speaking audience.
LLMs (GPT-4o / Claude). Contextual translation that accounts for tone and style. They win on specialized texts, idioms, and humor, but are more expensive and slower than specialized APIs for simple translations.
Key Technical Tasks
Language detection. The user enters text — the bot needs to identify the source language: either an explicit choice via a picker, or auto-detect. Google Translation API returns detectedSourceLanguage in the response; Yandex returns lang.
Auto-detect works well for long texts and poorly for a word or two. For short requests, it's better to offer an explicit language choice.
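This length-based fallback can be sketched as a small wrapper around whichever detection API you use. The function name, the 12-character threshold, and the injected detector callable are illustrative assumptions, not part of any SDK:

```python
from typing import Callable, Optional

# Assumed threshold below which auto-detect is considered unreliable;
# tune against your API's actual behavior.
MIN_AUTODETECT_LENGTH = 12

def resolve_source_language(
    text: str,
    detector: Callable[[str], str],
    user_choice: Optional[str] = None,
) -> Optional[str]:
    """Prefer an explicit user choice; fall back to API auto-detect
    only when the text is long enough for it to be trustworthy.
    Returns None when the UI should show a language picker instead."""
    if user_choice:
        return user_choice
    if len(text.strip()) < MIN_AUTODETECT_LENGTH:
        return None  # too short: ask the user instead of guessing
    return detector(text)
```

In production the detector would wrap the cloud call (e.g. Google's detect endpoint); here it is any callable returning a language code.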
Dialog context. If the user translates a series of related messages (a dialog, or a document in parts), an LLM that sees the translation history gives a more consistent result than independent API calls: names, pronouns, and domain terms are preserved across messages.
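One way to carry that history into each LLM request is a rolling window of recent source/translation pairs baked into the prompt. This is a minimal sketch — the class name, prompt wording, and window size of 5 are all assumptions:

```python
from collections import deque

class DialogTranslator:
    """Keeps the last N source/translation pairs and builds an LLM prompt
    so names, pronouns, and terms stay consistent across messages."""

    def __init__(self, target_lang: str, max_history: int = 5):
        self.target_lang = target_lang
        self.history = deque(maxlen=max_history)  # (source, translation) pairs

    def build_prompt(self, text: str) -> str:
        lines = [f"Translate into {self.target_lang}, keeping terminology "
                 f"consistent with the previous messages."]
        for src, tgt in self.history:
            lines.append(f"Source: {src}\nTranslation: {tgt}")
        lines.append(f"Source: {text}\nTranslation:")
        return "\n\n".join(lines)

    def record(self, source: str, translation: str) -> None:
        """Call after the LLM responds, so the pair joins the context."""
        self.history.append((source, translation))
```

The deque with maxlen keeps the prompt bounded, so token cost stays flat even in long dialogs.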
Terminology glossary. The Google Translation v3 Glossary API lets you register a list of terms the model should not translate, or should translate in a specific way:
from google.cloud import translate_v3

client = translate_v3.TranslationServiceClient()

# Create a glossary from a CSV stored in GCS: original_term,translation
operation = client.create_glossary(
    parent=f"projects/{project_id}/locations/us-central1",
    glossary=translate_v3.Glossary(
        name=glossary_name,
        language_pair=translate_v3.Glossary.LanguageCodePair(
            source_language_code="en",
            target_language_code="ru",
        ),
        input_config=translate_v3.GlossaryInputConfig(
            gcs_source=translate_v3.GcsSource(input_uri=glossary_gcs_uri),
        ),
    ),
)
operation.result(timeout=300)  # create_glossary is a long-running operation
Offline Mode
For apps whose users are in zones with unreliable internet — offline translation on the device.
iOS. Google's ML Kit Translation supports downloading language models for offline use. TranslateLanguage.allLanguages() returns the list of available languages. One language model weighs ~30MB.
import MLKitTranslate

let options = TranslatorOptions(
    sourceLanguage: .russian,
    targetLanguage: .english
)
let translator = Translator.translator(options: options)

// Download the model on first use if it isn't on the device yet
let conditions = ModelDownloadConditions(
    allowsCellularAccess: true,
    allowsBackgroundDownloading: true
)
translator.downloadModelIfNeeded(with: conditions) { error in
    guard error == nil else { return }
    translator.translate("Hello, world") { result, error in
        print(result ?? "")
    }
}
Android. The same flow via TranslatorOptions and Translator from com.google.mlkit:translate.
Offline models also double as a privacy mode — user text never leaves the device.
Voice Input and Translation Narration
A logical addition to a translator: the user speaks → the bot translates → the translation is read aloud.
STT for the input language: native APIs or Whisper. TTS for the target language: AVSpeechSynthesizer on iOS supports AVSpeechSynthesisVoice(language: "fr-FR") — system voices for dozens of languages. On Android, TextToSpeech works similarly via setLanguage(Locale("fr", "FR")).
Important: check that the required voice exists on the device before narration — AVSpeechSynthesisVoice.speechVoices() returns the list of available voices.
Camera: Real-Time Translation
The most impressive scenario — point the camera at a menu / sign / document and see the translation overlaid on the image. Technically: ML Kit Text Recognition (TextRecognizer) → translate the text blocks → render over the camera preview using the OCR bounding boxes.
Gotcha: OCR text coordinates are tied to a frame that changes 30 times per second. Stabilizing results (matching blocks to the previous frame by bounding-box IoU) reduces flicker.
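The IoU matching idea is platform-independent, so here is a minimal sketch in Python (the function names, the (x, y, w, h) box format, and the 0.5 threshold are illustrative choices, not any SDK's API):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    ix = max(0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def stabilize(prev_blocks, new_blocks, threshold=0.5):
    """Reuse the previous frame's box for blocks that overlap enough,
    so overlaid translations don't jitter between frames.
    Each block is a ((x, y, w, h), text) pair."""
    stabilized = []
    for box, text in new_blocks:
        match = next((p for p in prev_blocks if iou(p[0], box) >= threshold), None)
        stabilized.append((match[0] if match else box, text))
    return stabilized
```

A real implementation would also match on the recognized text and decay boxes that disappear for a few frames, but the snapping above already removes most of the visible flicker.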
Implementation Process
Choose a Translation API for the target languages and use cases.
Backend development: API keys, translation caching, glossary.
Mobile UI: text input, translation history, copy/share buttons.
Optional: offline models, voice input/output, camera translation.
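The caching step in the backend list can be sketched as a thin layer in front of the paid API call, keyed by language pair and a hash of the text. The class name and in-memory dict are assumptions — in production this would typically be backed by Redis or similar:

```python
import hashlib

class TranslationCache:
    """In-memory cache keyed by (source lang, target lang, text hash),
    so identical requests don't hit the billed API twice."""

    def __init__(self, translate_fn):
        self._translate = translate_fn  # the real API call
        self._store = {}
        self.hits = 0

    @staticmethod
    def _key(src, tgt, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return (src, tgt, digest)

    def translate(self, src, tgt, text):
        key = self._key(src, tgt, text)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._translate(src, tgt, text)
        self._store[key] = result
        return result
```

Hashing the text keeps keys a fixed size regardless of input length; the (src, tgt) part of the key matters because the same text translated into different languages must not collide.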
Timeline Estimates
A basic translation bot via a cloud API — 2–3 days. With offline mode, voice input, and camera translation — 1.5–2 weeks.