Auto-Filling Product Attributes from External Sources in 1C-Bitrix

Correctly populated attributes are the foundation of faceted filtering. Without them, the bitrix:catalog.smart.filter component has no values to build filter options from, and shoppers cannot narrow the catalog by product parameters. Automating attribute population is harder than automating descriptions: the info block property schema must be maintained, heterogeneous data must be normalized, and the values of enum properties must be kept consistent.

Designing the Property Schema for Automation

Before starting, audit the existing info block properties. Common issues in legacy catalogs:

  • Attributes stored in a single text property "Description", separated by line breaks
  • The same attribute created multiple times under different CODEs
  • Numeric values stored in string properties, so the filter cannot treat them as numeric ranges

A clean schema is required for auto-filling: each attribute is a separate property with the correct type. Use type N for numbers, type L (list) for enumerated values (brand, color, material), and type S for free text.
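As a rough illustration of this type choice during an audit, the sketch below guesses a property type from a sample of raw values. It is written in Python for brevity; a Bitrix project would implement the same check in PHP. The function name and the max_enum_size threshold are assumptions made for this example and are not part of Bitrix.

```python
from typing import Iterable


def infer_property_type(values: Iterable[str], max_enum_size: int = 50) -> str:
    """Guess a Bitrix property type ('N', 'L' or 'S') from sample values.

    max_enum_size is an assumed cut-off: more distinct values than this
    are treated as free text rather than a list.
    """
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return "S"  # nothing to infer from, fall back to string

    def is_number(text: str) -> bool:
        try:
            float(text.replace(",", "."))  # accept a decimal comma
            return True
        except ValueError:
            return False

    if all(is_number(v) for v in cleaned):
        return "N"

    # A small set of values that repeat across products looks like an enum.
    distinct = {v.lower() for v in cleaned}
    if len(distinct) <= max_enum_size and len(distinct) < len(cleaned):
        return "L"
    return "S"
```

A guess like this is only a starting point for the audit; the final type of each property should still be confirmed by hand.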

Attribute Sources

Icecat XML — the most complete source for electronics and home appliances. Search by EAN via https://icecat.us/api/. Data is structured and attribute names are standardized.

GS1 / GEPIR — a barcode database containing basic product attributes.

Manufacturer API — if the manufacturer provides partner access to specifications.

Manufacturer website scraping — fallback when no API is available. Specification tables are parsed as described in the article on attribute scraping.
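One way to combine these sources is a priority chain that tries each one in turn and stops at the first that returns data. The sketch below assumes every source is wrapped in a callable that takes an EAN and returns a dictionary of raw attributes or None. The names fetch_attributes and Provider are hypothetical, and the actual Icecat, GS1 or manufacturer requests are deliberately not shown.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# A provider takes an EAN and returns raw attributes, or None when it has no data.
Provider = Callable[[str], Optional[Dict[str, str]]]


def fetch_attributes(
    ean: str, providers: Iterable[Tuple[str, Provider]]
) -> Tuple[Optional[str], Dict[str, str]]:
    """Return (source_name, attributes) from the first provider that has data."""
    for name, provider in providers:
        try:
            attrs = provider(ean)
        except Exception:
            # A failing source must not stop the import; try the next one.
            continue
        if attrs:
            return name, attrs
    return None, {}  # no source knew this product
```

Returning the source name together with the attributes matters later: the normalizer needs to know which source a name such as "Diagonal" came from.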

Normalization and Quality Control

The main challenge: the same parameter from different sources may have different names and different units of measurement.

Normalization architecture:

  1. Canonical name dictionary: a property_canonical_map table that maps each source-specific name to one canonical code and records the unit that source uses:
    source: 'icecat', source_name: 'Screen Size', canonical: 'display_diagonal', unit: 'inch'
    source: 'supplier_a', source_name: 'Diagonal', canonical: 'display_diagonal', unit: 'cm'

  2. Unit converter: at import time, converts each value to the unit chosen for its canonical attribute (for example, cm to inches)
  3. Range validator: numeric values are checked for plausibility (a phone weighing 5000 g is clearly an error)
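The three steps can be sketched as a single function. This is a minimal Python illustration, not a finished normalizer: the in-memory dictionaries stand in for the property_canonical_map table, and the target unit and plausibility bounds are assumed values chosen for the example.

```python
from typing import Dict, Optional, Tuple

# (source, source_name) -> (canonical code, unit used by that source)
CANONICAL_MAP: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("icecat", "Screen Size"): ("display_diagonal", "inch"),
    ("supplier_a", "Diagonal"): ("display_diagonal", "cm"),
}

# Unit the catalog stores for each canonical attribute (assumed).
TARGET_UNIT = {"display_diagonal": "inch"}

# Multipliers keyed by (from_unit, to_unit).
UNIT_FACTORS = {("cm", "inch"): 1 / 2.54, ("inch", "cm"): 2.54}

# Plausible ranges in the target unit (illustrative bounds).
PLAUSIBLE_RANGE = {"display_diagonal": (1.0, 120.0)}


def normalize(source: str, source_name: str, raw_value: str) -> Optional[Tuple[str, float]]:
    """Map a source attribute to (canonical_code, value); None means rejected."""
    mapping = CANONICAL_MAP.get((source, source_name))
    if mapping is None:
        return None  # name not in the dictionary yet
    canonical, unit = mapping
    try:
        value = float(raw_value.replace(",", "."))
    except ValueError:
        return None  # not a number
    target = TARGET_UNIT[canonical]
    if unit != target:
        value *= UNIT_FACTORS[(unit, target)]
    low, high = PLAUSIBLE_RANGE[canonical]
    if not low <= value <= high:
        return None  # fails the plausibility check
    return canonical, round(value, 2)
```

With these tables, a 15.5 cm diagonal from supplier_a and a 6.1 inch screen size from Icecat both end up as display_diagonal = 6.1. In practice, rejected values are worth logging rather than silently dropping, since they often point to a missing dictionary entry.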

Managing Enum Values for List Properties

Type L properties require each value to exist in b_iblock_property_enum before it can be assigned to an element. During auto-filling, new values appear constantly, so a strategy for handling them is needed:

Auto-create — a new value is automatically added to the enum. Risk: garbage values from parsing errors (typos, HTML tags in the value).

Moderation queue — a new value is placed in a queue; a manager approves or rejects it. Safer, but requires attention.

Recommended approach: auto-create with filtering (length > 2 and < 100 characters, no HTML, no special characters) + notification to the manager about new values.
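A minimal sketch of such a filter is shown below, again in Python. The set of allowed punctuation is an assumption and should be tuned to the catalog. The known set and the notifications list stand in for the enum table and the manager notification, which are not shown, and both function names are hypothetical.

```python
import re
from typing import List, Optional, Set

_HTML_TAG = re.compile(r"<[^>]+>")
# Letters, digits, whitespace and a few punctuation marks (assumed whitelist).
_ALLOWED = re.compile(r"[\w\s.,/+\-()%]+")


def clean_enum_value(raw: str) -> Optional[str]:
    """Return a tidied value, or None if it looks like parsing garbage."""
    value = " ".join(raw.split())  # trim and collapse whitespace
    if _HTML_TAG.search(value):
        return None  # leftover markup
    if not 2 < len(value) < 100:
        return None  # too short or too long
    if not _ALLOWED.fullmatch(value):
        return None  # contains special characters
    return value


def register_value(raw: str, known: Set[str], notifications: List[str]) -> Optional[str]:
    """Auto-create a value if it passes the filter and queue a notification."""
    value = clean_enum_value(raw)
    if value is None:
        return None
    key = value.casefold()  # treat "Red" and "red" as the same enum value
    if key not in known:
        known.add(key)
        notifications.append(value)
    return value
```

Comparing case-folded keys keeps spelling variants such as "Red" and "red" from becoming two separate list values.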

Incremental Updates

Attributes change less frequently than prices — weekly updates are sufficient for most catalogs. Optimization:

  • Store a hash of an element's attribute set: md5(serialize($properties))
  • When updating from a source, compare the hash — if unchanged, skip the DB write
  • This reduces load on b_iblock_element_property at high volumes
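The comparison can be sketched as follows. This Python version hashes a JSON serialization with sorted keys instead of PHP's serialize(), which is a substitution made for the example. Sorting the keys means the same attributes arriving in a different order do not count as a change. The function names are hypothetical, and where the stored hash lives is left open.

```python
import hashlib
import json
from typing import Dict, Optional


def attribute_hash(properties: Dict[str, object]) -> str:
    """Stable MD5 of an attribute set, independent of key order."""
    canonical = json.dumps(
        properties, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def needs_update(stored_hash: Optional[str], properties: Dict[str, object]) -> bool:
    """True when the element has no stored hash or its attributes changed."""
    return stored_hash != attribute_hash(properties)
```

MD5 is used here only as a change fingerprint, not for security, so its known collision weaknesses are not a concern in this role.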

Project Timeline

Phase | Duration
Audit and restructuring of the info block property schema | 1–2 days
Source provider development | 2–4 days
Normalizer, canonical name dictionary | 2–3 days
Enum value management | 1 day
Incremental updates, monitoring | 1–2 days

Total: 7–12 working days — one of the most labor-intensive auto-filling tasks due to the need to work with the data schema.