Machine Translation Content Integration for 1C-Bitrix
Translating a 20,000-product catalog into three languages with human translators takes months of work and a substantial budget. Machine translation covers 80–90% of the volume in hours, leaving translators to proofread only the most critical content: the homepage, landing pages, and legal documents. But integrating an MT service with 1C-Bitrix is not simply a matter of "calling an API": it raises several non-trivial technical challenges.
Choosing an MT Service: What Matters for 1C-Bitrix
The main candidates are DeepL, Google Cloud Translation, and Yandex Translate. All three provide a REST API, but differ in quality for specific language pairs and in how they handle the particular challenges of e-commerce content:
DeepL: best quality for European languages (DE, FR, EN, PL). API v2 supports the tag_handling=html parameter, which translates text while preserving HTML tags. This is critical for product descriptions that contain markup.
Google Cloud Translation: broad language coverage (well over 100 languages), including languages of the CIS region such as Kazakh (kk), Uzbek (uz), and Azerbaijani (az). The Basic edition (API v2) is simpler to call; the Advanced edition (API v3) adds glossaries, batch requests, and custom models.
Yandex Translate: a strong option for the Russian→Kazakh (ru→kk) and Russian→Belarusian (ru→be) language pairs. API v2 supports batched submission of up to 10,000 characters per request.
Problem #1: HTML in Product Descriptions
Descriptions stored in the DETAIL_TEXT field of the b_iblock_element table often contain HTML markup. If such text is submitted to an MT API as plain text, the service may translate the tags themselves: <strong> becomes <сильный> in Yandex (a real-world case that occurs when the format is not specified).
Solution using Google Translation v3:
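use Google\Cloud\Translate\V3\Client\TranslationServiceClient;
use Google\Cloud\Translate\V3\TranslateTextRequest;

// Assumes a google/cloud-translate version with request-object clients;
// $projectId is your Google Cloud project ID.
$client = new TranslationServiceClient();
$request = (new TranslateTextRequest())
    ->setParent(TranslationServiceClient::locationName($projectId, 'global'))
    ->setContents([$htmlContent])
    ->setMimeType('text/html') // Critical! Without it the input is treated as plain text
    ->setSourceLanguageCode('ru')
    ->setTargetLanguageCode('en');
$response = $client->translateText($request);
$translatedHtml = $response->getTranslations()[0]->getTranslatedText();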
DeepL: use the tag_handling=html parameter in the request. With this flag, DeepL translates only text nodes, leaving tag attributes and markup structure untouched.
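A minimal sketch of the DeepL variant, assuming the JSON form of the v2 translate endpoint. The helper names buildDeeplPayload and sendToDeepl are hypothetical, and the endpoint host depends on the plan (api.deepl.com for paid plans, api-free.deepl.com for the free tier).

```php
<?php
// Hypothetical helper: builds the JSON body for POST /v2/translate.
// tag_handling=html tells DeepL to translate text nodes and keep the markup.
function buildDeeplPayload(array $htmlTexts, string $sourceLang, string $targetLang): string
{
    return json_encode([
        'text'         => array_values($htmlTexts),
        'source_lang'  => strtoupper($sourceLang),
        'target_lang'  => strtoupper($targetLang),
        'tag_handling' => 'html',
    ], JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR);
}

// Hypothetical sender (requires the curl extension); returns the translated
// strings in the same order as the submitted texts.
function sendToDeepl(
    string $payload,
    string $apiKey,
    string $endpoint = 'https://api.deepl.com/v2/translate'
): array {
    $ch = curl_init($endpoint);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $payload,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER     => [
            'Authorization: DeepL-Auth-Key ' . $apiKey,
            'Content-Type: application/json',
        ],
    ]);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    if ($body === false || $status !== 200) {
        throw new RuntimeException("DeepL request failed with HTTP status $status");
    }
    $decoded = json_decode($body, true, 512, JSON_THROW_ON_ERROR);

    return array_column($decoded['translations'], 'text');
}
```

Keeping payload construction separate from the HTTP call makes it possible to unit-test the request body without network access.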
Problem #2: Variables and Shortcodes in Content
Product descriptions may contain internal substitution patterns: {SIZE_GUIDE}, [product_id=123], <!--#include file="..."-->. An MT service may "translate" these constructs, breaking their functionality.
Solution: before submitting to the MT service, replace all such constructs with placeholders that the service will not modify, then perform the reverse substitution after translation. With DeepL, an alternative is to wrap the constructs in a custom tag such as <keep> and list that tag in the ignore_tags parameter.
$placeholders = [];
// Match {UPPER_CASE} variables, [product_id=N] shortcodes, and SSI include comments
$pattern = '/\{[A-Z_]+\}|\[product_id=\d+\]|<!--#include\b.*?-->/s';
$text = preg_replace_callback($pattern, function ($match) use (&$placeholders) {
    // The closing "__" keeps PLACEHOLDER_1 from matching inside PLACEHOLDER_10
    $key = '__PLACEHOLDER_' . count($placeholders) . '__';
    $placeholders[$key] = $match[0];
    return $key;
}, $originalText);

// Translate $text into $translatedText...

// Restore placeholders in a single pass
$translatedText = strtr($translatedText, $placeholders);
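The DeepL alternative mentioned above, wrapping constructs in <keep> tags, can be sketched as follows. This is an illustration rather than a tested integration: the function names are hypothetical, and it assumes the request enables tag handling and passes ignore_tags=keep so that DeepL leaves the wrapped content untranslated.

```php
<?php
// Same two constructs as in the placeholder example: {VARIABLES} and [product_id=N].
const PROTECTED_PATTERN = '/\{[A-Z_]+\}|\[product_id=\d+\]/';

// Wrap every protected construct in <keep>...</keep> before sending the text.
function wrapProtected(string $text): string
{
    $wrapped = preg_replace_callback(
        PROTECTED_PATTERN,
        static function (array $match): string {
            return '<keep>' . $match[0] . '</keep>';
        },
        $text
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $wrapped ?? $text;
}

// Strip the <keep> wrappers from the translated text, keeping their content.
function unwrapProtected(string $translated): string
{
    return preg_replace('#<keep>(.*?)</keep>#s', '$1', $translated) ?? $translated;
}
```

Compared with opaque placeholders, this variant keeps the original construct visible in the request, which can make failed translations easier to debug.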
Batch Processing and API Limits
For a catalog of 20,000 items, translating one product per API request is both slow and expensive. Batch processing limits:
- Google (API v3): up to 1,024 strings per request, with a recommended total of under 30,000 code points
- DeepL: up to 50 texts per request
- Yandex: up to 10,000 characters total per request
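The limits above translate into a simple greedy packing step. The sketch below is one possible approach, not a prescribed one: chunkTexts is a hypothetical helper that closes a batch once either the item limit or the character limit would be exceeded. A single text longer than the character limit still gets its own batch and would need to be split separately.

```php
<?php
// Count characters, not bytes; fall back to strlen() if mbstring is unavailable.
function textLength(string $text): int
{
    return function_exists('mb_strlen') ? mb_strlen($text, 'UTF-8') : strlen($text);
}

// Greedily pack texts (keyed by element ID) into batches that respect
// both a per-request item limit and a per-request character limit.
function chunkTexts(array $texts, int $maxItems, int $maxChars): array
{
    $batches = [];
    $current = [];
    $currentChars = 0;

    foreach ($texts as $id => $text) {
        $length = textLength($text);
        $itemLimitHit = count($current) >= $maxItems;
        $charLimitHit = $current !== [] && $currentChars + $length > $maxChars;

        if ($itemLimitHit || $charLimitHit) {
            $batches[] = $current;
            $current = [];
            $currentChars = 0;
        }

        $current[$id] = $text; // keep the element ID so results can be saved back
        $currentChars += $length;
    }

    if ($current !== []) {
        $batches[] = $current;
    }

    return $batches;
}
```

For example, chunkTexts($descriptions, 50, PHP_INT_MAX) would produce DeepL-sized batches, while chunkTexts($descriptions, PHP_INT_MAX, 10000) would follow the Yandex character limit.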
Implement a translation queue in which each product is a task. A worker picks up a batch, submits it to the MT API, and saves the result. On error, the worker retries with exponential backoff.
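The retry step can be sketched as a small wrapper around the API call. The function name retryWithBackoff, the default attempt count, and the base delay are illustrative assumptions; a production worker would usually retry only transient failures (for example HTTP 429 or 5xx) and add random jitter.

```php
<?php
// Run $operation, retrying on exceptions with delays of base, 2*base, 4*base, ...
// $sleep is injectable so the delays can be observed in tests instead of waited for.
function retryWithBackoff(
    callable $operation,
    int $maxAttempts = 5,
    int $baseDelayMs = 500,
    ?callable $sleep = null
) {
    $sleep = $sleep ?? static function (int $milliseconds): void {
        usleep($milliseconds * 1000);
    };

    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (\Exception $e) {
            if ($attempt >= $maxAttempts) {
                throw $e; // give up and let the queue mark the task as failed
            }
            // Exponential backoff: the delay doubles after every failed attempt
            $sleep($baseDelayMs * 2 ** ($attempt - 1));
        }
    }
}
```

A worker would wrap the call that submits one batch, so that a failed batch is retried as a unit and the rest of the queue is not affected.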
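Where translations are saved depends on how the project models multilingual content: the standard infoblock schema has no per-language element table, and CIBlockElement::SetPropertyValues() takes no LANGUAGE_ID argument. Typical options are per-language properties on the same element (written with CIBlockElement::SetPropertyValuesEx()) or a separate infoblock for each language. After saving, invalidate the element cache.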
Post-Editing: Flags for Translators
Machine translation is not the final step: translators need a way to see which texts are still raw machine output and to record which ones they have checked or rewritten. We add an infoblock list property MT_STATUS for each language with the values auto (machine-translated, not yet reviewed), reviewed, and manual (edited by hand). Translators filter by the auto status and see only the elements that still need attention, so they do not have to sift through the entire catalog.
Estimated Timelines
| Scenario | Timeline |
|---|---|
| MT API integration, batch translation of names and descriptions | 2–4 weeks |
| + HTML handling, placeholders, queue with retries | 4–6 weeks |
| + post-editing interface in the 1C-Bitrix admin panel | +2–3 weeks |
Pricing is calculated individually. Factors include: content volume, number of languages, chosen MT service, and quality requirements.