Machine translation integration for 1C-Bitrix

Translating a 20,000-product catalog into three languages using human translators takes months of work and a substantial budget. Machine translation covers 80–90% of the volume in hours, leaving translators only to proofread the most critical content: the homepage, landing pages, and legal documents. But integrating an MT service with 1C-Bitrix is not simply a matter of "calling an API" — there are non-trivial technical challenges involved.

Choosing an MT Service: What Matters for 1C-Bitrix

The main candidates are DeepL, Google Cloud Translation, and Yandex Translate. All three provide a REST API, but differ in quality for specific language pairs and in how they handle the particular challenges of e-commerce content:

DeepL: strong quality for European languages (DE, FR, EN, PL). API v2 accepts the tag_handling=html parameter, which translates the text while preserving HTML tags. This is critical for product descriptions that contain markup.

Google Cloud Translation: broad language coverage (200+), including CIS languages such as Kazakh (kk), Uzbek (uz), and Azerbaijani (az). The Basic edition (translate/v2) and the Advanced edition (translate/v3) both use the NMT model; v3 adds glossaries, batch translation, and custom models.

Yandex Translate: a good fit for the Russian→Kazakh (ru→kk) and Russian→Belarusian (ru→be) language pairs. API v2 accepts up to 10,000 characters per request.
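The comparison above can be turned into a small routing rule. The sketch below is only an illustration: the function name pickMtProvider, the constant MT_PROVIDER_BY_TARGET, and the provider labels are hypothetical, the source language is assumed to be Russian, and targets are given as ISO 639-1 codes (kk for Kazakh, be for Belarusian).

```php
<?php
// Hypothetical routing table derived from the comparison above; adjust per project.
const MT_PROVIDER_BY_TARGET = [
    'de' => 'deepl',
    'fr' => 'deepl',
    'en' => 'deepl',
    'pl' => 'deepl',
    'kk' => 'yandex',
    'be' => 'yandex',
];

/**
 * Pick an MT provider for a target language code such as "de" or "EN-US".
 * Anything not listed falls back to Google because of its broader coverage.
 */
function pickMtProvider(string $targetLang): string
{
    // Keep only the two-letter language part: "EN-US" becomes "en"
    $lang = strtolower(substr($targetLang, 0, 2));

    return MT_PROVIDER_BY_TARGET[$lang] ?? 'google';
}
```

With this table, pickMtProvider('kk') selects Yandex, while an unlisted language such as 'uz' falls through to Google.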

Problem #1: HTML in Product Descriptions

Descriptions stored in the DETAIL_TEXT field of the b_iblock_element table often contain HTML markup. If such text is submitted to an MT API as plain text, the tags may be translated literally: <strong> becomes <сильный> in Yandex (a real-world case that occurs when the format is not specified).

Solution using Google Translation v3:

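use Google\Cloud\Translate\V3\TranslationServiceClient;

$client = new TranslationServiceClient();
// google/cloud-translate 1.x signature: contents, target language, parent, options
// (newer releases of the library take a TranslateTextRequest object instead)
$response = $client->translateText(
    [$htmlContent],
    'en',
    'projects/{project}/locations/global',
    [
        'mimeType' => 'text/html',  // Critical: translate text nodes, keep tags intact
        'sourceLanguageCode' => 'ru',
    ]
);
$translatedHtml = $response->getTranslations()[0]->getTranslatedText();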

DeepL: use the tag_handling=html parameter in the request. With this flag, DeepL translates only text nodes, leaving tag attributes and markup structure untouched.
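A minimal sketch of the DeepL variant, assuming the JSON form of the /v2/translate endpoint and the DeepL-Auth-Key authorization header. The function names are hypothetical, and the endpoint host depends on the plan (free-plan keys use api-free.deepl.com).

```php
<?php
/**
 * Build the JSON body for DeepL's /v2/translate endpoint with HTML handling on.
 */
function buildDeeplHtmlRequest(array $htmlTexts, string $sourceLang, string $targetLang): string
{
    return json_encode([
        'text' => array_values($htmlTexts),
        'source_lang' => strtoupper($sourceLang),
        'target_lang' => strtoupper($targetLang),
        'tag_handling' => 'html', // translate text nodes, keep markup
    ], JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR);
}

/**
 * Send the request and return the translated strings in input order.
 * $authKey and $endpoint are assumptions supplied by the caller.
 */
function sendDeeplRequest(
    string $jsonBody,
    string $authKey,
    string $endpoint = 'https://api.deepl.com/v2/translate'
): array {
    $ch = curl_init($endpoint);
    curl_setopt_array($ch, [
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => $jsonBody,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER => [
            'Authorization: DeepL-Auth-Key ' . $authKey,
            'Content-Type: application/json',
        ],
    ]);
    $raw = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    if ($raw === false || $status !== 200) {
        throw new RuntimeException('DeepL request failed with HTTP status ' . $status);
    }
    $decoded = json_decode($raw, true, 512, JSON_THROW_ON_ERROR);

    // Each entry looks like {"detected_source_language": "...", "text": "..."}
    return array_column($decoded['translations'], 'text');
}
```

Keeping the body builder separate from the HTTP call makes it possible to check the payload without hitting the API.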

Problem #2: Variables and Shortcodes in Content

Product descriptions may contain internal substitution patterns: {SIZE_GUIDE}, [product_id=123], <!--#include file="..."-->. An MT service may "translate" these constructs, breaking their functionality.

Solution: before submitting to the MT service, replace all such constructs with placeholders that the MT service will not modify, then perform the reverse substitution after translation. With DeepL, an alternative is to wrap the constructs in a custom tag such as <keep> and list that tag in the ignore_tags parameter.

$placeholders = [];
// Matches {SIZE_GUIDE}, [product_id=123] and <!--#include ...--> constructs
$pattern = '/\{[A-Z_]+\}|\[product_id=\d+\]|<!--#include\b.*?-->/s';
$text = preg_replace_callback($pattern, function ($match) use (&$placeholders) {
    $key = 'PLACEHOLDER_' . count($placeholders);
    $placeholders[$key] = $match[0];
    return $key;
}, $originalText);

// Translate $text...

// Restore placeholders. strtr() tries the longest key first, so PLACEHOLDER_1
// is not replaced inside PLACEHOLDER_10, which sequential str_replace() calls would do.
$translatedText = strtr($translatedText, $placeholders);
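The DeepL tag-based variant can be sketched as follows. It assumes XML tag handling (tag_handling=xml) with ignore_tags listing the custom tag; the <keep> tag name, the function names, and the PROTECTED_PATTERN constant are illustrative. This sketch covers only the {SIZE_GUIDE} and [product_id=123] forms, and whether XML mode suits descriptions that also contain loose HTML should be checked against the DeepL documentation.

```php
<?php
// Constructs to protect; covers {SIZE_GUIDE} and [product_id=123] only.
const PROTECTED_PATTERN = '/\{[A-Z_]+\}|\[product_id=\d+\]/';

/** Wrap every protected construct in <keep>...</keep> before translation. */
function wrapProtected(string $text): string
{
    return preg_replace_callback(PROTECTED_PATTERN, function (array $m): string {
        return '<keep>' . $m[0] . '</keep>';
    }, $text);
}

/** Strip the <keep> wrappers from the translated text. */
function unwrapProtected(string $translated): string
{
    return preg_replace('#</?keep>#', '', $translated);
}

// Parameters to send alongside the wrapped text in the DeepL JSON body.
$deeplOptions = [
    'tag_handling' => 'xml',
    'ignore_tags' => ['keep'],
];
```

Unlike numbered placeholders, the wrapped constructs stay visible in the translated string, which makes a missing or damaged one easier to spot during review.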

Batch Processing and API Limits

For a catalog of 20,000 items, translating one product per API request is both slow and expensive. Batch processing limits:

  • Google: up to 1,024 strings per request
  • DeepL: up to 50 texts per request
  • Yandex: up to 10,000 characters total per request
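Splitting the catalog into batches that respect both kinds of limit (item count and total characters) can be sketched like this. The function name chunkForMt is hypothetical; the limits are passed in rather than hard-coded, since they differ per service.

```php
<?php
/**
 * Split texts into batches that respect a per-request item limit and a
 * per-request character limit. Array keys (e.g. element IDs) are preserved.
 * A single text longer than $maxChars still gets its own batch and has to
 * be split upstream.
 */
function chunkForMt(array $texts, int $maxItems, int $maxChars): array
{
    $batches = [];
    $current = [];
    $currentChars = 0;

    foreach ($texts as $id => $text) {
        // Count characters, not bytes, when mbstring is available
        $length = function_exists('mb_strlen') ? mb_strlen($text, 'UTF-8') : strlen($text);

        $full = count($current) >= $maxItems
            || ($current && $currentChars + $length > $maxChars);
        if ($full) {
            $batches[] = $current;
            $current = [];
            $currentChars = 0;
        }

        $current[$id] = $text;
        $currentChars += $length;
    }

    if ($current) {
        $batches[] = $current;
    }

    return $batches;
}
```

For example, 20,000 short product names at 50 texts per request come out as 400 batches per target language, instead of 20,000 single-item requests.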

Implement a translation queue: each product is a task in the queue. A worker picks up a batch, submits it to the MT API, and saves the result. On error — retry with exponential backoff.
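The retry step can be sketched as a small helper. The name retryWithBackoff and the default values are assumptions; the sleep function is injectable so that the delay schedule can be checked without actually waiting.

```php
<?php
/**
 * Call $task until it succeeds or $maxAttempts is reached, doubling the
 * delay after every failure. $sleep receives the delay in milliseconds;
 * by default it calls usleep().
 */
function retryWithBackoff(
    callable $task,
    int $maxAttempts = 5,
    int $baseDelayMs = 500,
    ?callable $sleep = null
) {
    $sleep = $sleep ?? function (int $ms): void {
        usleep($ms * 1000);
    };

    for ($attempt = 1; ; $attempt++) {
        try {
            return $task();
        } catch (Throwable $e) {
            if ($attempt >= $maxAttempts) {
                throw $e; // Give up; the queue can mark the task as failed
            }
            // With the default base: 500 ms, 1 s, 2 s, 4 s, ...
            $sleep($baseDelayMs * 2 ** ($attempt - 1));
        }
    }
}
```

A worker would wrap the submission of one batch in this helper. In production it may be worth adding random jitter to the delay and retrying only on transient errors such as HTTP 429 or 5xx responses.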

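Translations have to be stored explicitly per language: the standard infoblock module has no per-language element table, and CIBlockElement::SetPropertyValues() takes no LANGUAGE_ID argument. A common approach is to write each translation to a per-language property (for example, a hypothetical DETAIL_TEXT_EN) with CIBlockElement::SetPropertyValuesEx(), or to keep a separate infoblock for each language version of the site. After saving, invalidate the element cache.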

Post-Editing: Flags for Translators

Machine translation is not the final step. Translators need the ability to mark a translation as "requires review" or "manually edited". We add an infoblock property MT_STATUS (list: auto, reviewed, manual) for each language. Translators see only elements with auto status — they do not need to sift through the entire catalog.

Estimated Timelines

Scenario | Timeline
MT API integration, batch translation of names and descriptions | 2–4 weeks
+ HTML handling, placeholders, queue with retries | 4–6 weeks
+ post-editing interface in the 1C-Bitrix admin panel | +2–3 weeks

Pricing is calculated individually. Factors include: content volume, number of languages, chosen MT service, and quality requirements.