Translation Bot Implementation in Mobile Applications
A translation bot is not just a call to a translation API. The interesting scenarios start where you need to preserve dialog context, switch languages on the fly, and work offline. These details separate "translation that works" from "translation that works well."
Choosing Translation API
DeepL API. Best translation quality for European languages. Free tier — 500K characters/month. Supports formal/informal tone (the formality parameter) — important for business content. Russian ↔ other language pairs don't match Google/Yandex quality.
Google Cloud Translation API. 100+ languages, high quality for Russian. v3 supports glossaries — dictionaries of terms that shouldn't be translated, or should be translated in a specific way. For medical and legal content and brand names, this is a must-have feature.
Yandex Translate API. Best results for Russian ↔ European languages. Language detection is built in. A good choice for apps with a Russian-speaking audience.
LLMs (GPT-4o / Claude). Contextual translation that accounts for tone and style. They win on specialized texts, idioms, and humor, but are more expensive and slower than specialized APIs for simple translations.
Key Technical Tasks
Language detection. The user enters text — the bot needs to identify the source language: either an explicit choice via a picker, or auto-detect. Google Translation API returns detectedSourceLanguage in the response; Yandex returns lang.
Auto-detect works well for long texts and poorly for a word or two. For short requests, it's better to offer an explicit language choice.
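This length-based fallback can be sketched as a small wrapper around whichever detection API you use. The function name, the 12-character threshold, and the injected detector callable are illustrative assumptions, not part of any SDK:

```python
from typing import Callable, Optional

# Assumed threshold below which auto-detect is considered unreliable;
# tune against your API's actual behavior.
MIN_AUTODETECT_LENGTH = 12

def resolve_source_language(
    text: str,
    detector: Callable[[str], str],
    user_choice: Optional[str] = None,
) -> Optional[str]:
    """Prefer an explicit user choice; fall back to API auto-detect
    only when the text is long enough for it to be trustworthy.
    Returns None when the UI should show a language picker instead."""
    if user_choice:
        return user_choice
    if len(text.strip()) < MIN_AUTODETECT_LENGTH:
        return None  # too short: ask the user instead of guessing
    return detector(text)
```

In production the detector would wrap the cloud call (e.g. Google's detect endpoint); here it is any callable returning a language code.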
Dialog context. If the user translates a series of related messages (a dialog, or a document in parts), an LLM that sees the translation history gives a more consistent result than independent API calls: names, pronouns, and domain terms are preserved across messages.
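One way to carry that history into each LLM request is a rolling window of recent source/translation pairs baked into the prompt. This is a minimal sketch — the class name, prompt wording, and window size of 5 are all assumptions:

```python
from collections import deque

class DialogTranslator:
    """Keeps the last N source/translation pairs and builds an LLM prompt
    so names, pronouns, and terms stay consistent across messages."""

    def __init__(self, target_lang: str, max_history: int = 5):
        self.target_lang = target_lang
        self.history = deque(maxlen=max_history)  # (source, translation) pairs

    def build_prompt(self, text: str) -> str:
        lines = [f"Translate into {self.target_lang}, keeping terminology "
                 f"consistent with the previous messages."]
        for src, tgt in self.history:
            lines.append(f"Source: {src}\nTranslation: {tgt}")
        lines.append(f"Source: {text}\nTranslation:")
        return "\n\n".join(lines)

    def record(self, source: str, translation: str) -> None:
        """Call after the LLM responds, so the pair joins the context."""
        self.history.append((source, translation))
```

The deque with maxlen keeps the prompt bounded, so token cost stays flat even in long dialogs.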
Terminology glossary. The Google Translation v3 Glossary API lets you register a list of terms the model should not translate, or should translate in a specific way:
from google.cloud import translate_v3

client = translate_v3.TranslationServiceClient()

# Create a glossary from a CSV stored in GCS: original_term,translation
operation = client.create_glossary(
    parent=f"projects/{project_id}/locations/us-central1",
    glossary=translate_v3.Glossary(
        name=glossary_name,
        language_pair=translate_v3.Glossary.LanguageCodePair(
            source_language_code="en",
            target_language_code="ru",
        ),
        input_config=translate_v3.GlossaryInputConfig(
            gcs_source=translate_v3.GcsSource(input_uri=glossary_gcs_uri),
        ),
    ),
)
operation.result(timeout=300)  # create_glossary is a long-running operation
Offline Mode
For apps whose users are in zones with unreliable internet — offline translation on the device.
iOS. Google's ML Kit Translation supports downloading language models for offline use. TranslateLanguage.allLanguages() returns the list of available languages. One language model weighs ~30MB.
import MLKitTranslate

let options = TranslatorOptions(
    sourceLanguage: .russian,
    targetLanguage: .english
)
let translator = Translator.translator(options: options)

// Download the model on first use if it isn't on the device yet
let conditions = ModelDownloadConditions(
    allowsCellularAccess: true,
    allowsBackgroundDownloading: true
)
translator.downloadModelIfNeeded(with: conditions) { error in
    guard error == nil else { return }
    translator.translate("Hello, world") { result, error in
        print(result ?? "")
    }
}
Android. The same flow via TranslatorOptions and Translator from com.google.mlkit:translate.
Offline models also double as a privacy mode — user text never leaves the device.
Voice Input and Translation Narration
A logical addition to a translator: the user speaks → the bot translates → the translation is read aloud.
STT for the input language: native APIs or Whisper. TTS for the target language: AVSpeechSynthesizer on iOS supports AVSpeechSynthesisVoice(language: "fr-FR") — system voices for dozens of languages. On Android, TextToSpeech works similarly via setLanguage(Locale("fr", "FR")).
Important: check that the required voice exists on the device before narration — AVSpeechSynthesisVoice.speechVoices() returns the list of available voices.
Camera: Real-Time Translation
The most impressive scenario — point the camera at a menu / sign / document and see the translation overlaid on the image. Technically: ML Kit Text Recognition (TextRecognizer) → translate the text blocks → render over the camera preview using the OCR bounding boxes.
Gotcha: OCR text coordinates are tied to a frame that changes 30 times per second. Stabilizing results (matching blocks to the previous frame by bounding-box IoU) reduces flicker.
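The IoU matching idea is platform-independent, so here is a minimal sketch in Python (the function names, the (x, y, w, h) box format, and the 0.5 threshold are illustrative choices, not any SDK's API):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    ix = max(0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def stabilize(prev_blocks, new_blocks, threshold=0.5):
    """Reuse the previous frame's box for blocks that overlap enough,
    so overlaid translations don't jitter between frames.
    Each block is a ((x, y, w, h), text) pair."""
    stabilized = []
    for box, text in new_blocks:
        match = next((p for p in prev_blocks if iou(p[0], box) >= threshold), None)
        stabilized.append((match[0] if match else box, text))
    return stabilized
```

A real implementation would also match on the recognized text and decay boxes that disappear for a few frames, but the snapping above already removes most of the visible flicker.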
Implementation Process
Choose a Translation API for the target languages and use cases.
Backend development: API keys, translation caching, glossary.
Mobile UI: text input, translation history, copy/share buttons.
Optional: offline models, voice input/output, camera translation.
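The caching step in the backend list can be sketched as a thin layer in front of the paid API call, keyed by language pair and a hash of the text. The class name and in-memory dict are assumptions — in production this would typically be backed by Redis or similar:

```python
import hashlib

class TranslationCache:
    """In-memory cache keyed by (source lang, target lang, text hash),
    so identical requests don't hit the billed API twice."""

    def __init__(self, translate_fn):
        self._translate = translate_fn  # the real API call
        self._store = {}
        self.hits = 0

    @staticmethod
    def _key(src, tgt, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return (src, tgt, digest)

    def translate(self, src, tgt, text):
        key = self._key(src, tgt, text)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._translate(src, tgt, text)
        self._store[key] = result
        return result
```

Hashing the text keeps keys a fixed size regardless of input length; the (src, tgt) part of the key matters because the same text translated into different languages must not collide.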
Timeline Estimates
A basic translation bot via a cloud API — 2–3 days. With offline mode, voice input, and camera translation — 1.5–2 weeks.