Auto-Filling Product Attributes from External Sources in 1C-Bitrix
Correctly populated attributes are the foundation of faceted filtering. Without them, the bitrix:catalog.smart.filter component does not work and shoppers cannot filter by product parameters. Automating attribute population is more complex than automating descriptions: the info block property schema must be maintained, heterogeneous data must be normalized, and consistency of enum property values must be enforced.
Designing the Property Schema for Automation
Before starting, audit the existing info block properties. Common issues in legacy catalogs:
- Attributes stored in a single text property "Description" separated by line breaks
- The same attribute created multiple times with different CODEs
- Numeric values stored in string properties, causing the filter to malfunction
A clean schema is required for auto-filling: each attribute is a separate property with the correct type. Numbers use type N, enumerated values (brand, color, material) use type L (list), and free text uses type S.
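As a rough aid for the audit, the type of a property can be guessed from a sample of its current values. The sketch below is only a heuristic under stated assumptions: the function name `suggest_property_type` and the thresholds (at most 50 distinct values, at most 100 characters) are hypothetical and not part of Bitrix.

```python
import re

# Integers or decimals, with either a dot or a comma as the decimal separator.
_NUMERIC = re.compile(r"^-?\d+(?:[.,]\d+)?$")


def suggest_property_type(values, max_enum_values=50):
    """Suggest a property type code (N, L or S) from sample values.

    Assumption: the thresholds are illustrative and should be tuned per catalog.
    """
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return "S"  # nothing to infer from, keep it as a string

    # Every sample parses as a number: a numeric property is the safe choice.
    if all(_NUMERIC.match(v) for v in cleaned):
        return "N"

    # A small set of short values that repeat across products looks like a list.
    distinct = {v.lower() for v in cleaned}
    repeats = len(distinct) < len(cleaned)
    short = all(len(v) <= 100 for v in cleaned)
    if repeats and short and len(distinct) <= max_enum_values:
        return "L"

    return "S"
```

Running this over the values of each legacy string property gives a shortlist of candidates for conversion to N or L; the final decision still belongs to whoever owns the catalog schema.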
Attribute Sources
Icecat XML — the most complete source for electronics and home appliances. Search by EAN via https://icecat.us/api/. Data is structured and attribute names are standardized.
GS1 / GEPIR — a barcode database containing basic product attributes.
Manufacturer API — if the manufacturer provides partner access to specifications.
Manufacturer website scraping — fallback when no API is available. Specification tables are parsed as described in the article on attribute scraping.
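One way to combine these sources is a priority chain in which the first provider that returns data wins. The sketch below assumes each source is wrapped in a callable that takes an EAN and returns a dictionary of raw attributes; the function `fetch_attributes` and the provider names are hypothetical, and no real Icecat or GS1 call is shown.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A provider takes an EAN and returns raw attributes, or None if nothing was found.
Provider = Callable[[str], Optional[Dict[str, str]]]


def fetch_attributes(
    ean: str, providers: List[Tuple[str, Provider]]
) -> Tuple[Optional[str], Dict[str, str]]:
    """Try providers in priority order and return (source_name, attributes)."""
    for name, provider in providers:
        try:
            attrs = provider(ean)
        except Exception:
            # A failing source should not abort the whole import run.
            continue
        if attrs:
            return name, attrs
    return None, {}


# Usage with stub providers (real ones would call the actual services).
def icecat_stub(ean: str) -> Optional[Dict[str, str]]:
    return None  # pretend the product is missing from this source


def scraper_stub(ean: str) -> Optional[Dict[str, str]]:
    return {"Diagonal": "15.24"}


source, attrs = fetch_attributes(
    "4006381333931", [("icecat", icecat_stub), ("scraper", scraper_stub)]
)
```

Returning the source name together with the attributes matters later: normalization needs to know which source's naming and units apply to the raw values.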
Normalization and Quality Control
The main challenge: the same parameter from different sources may have different names and different units of measurement.
Normalization architecture:
- Canonical name dictionary: the property_canonical_map table:
  source: 'icecat', source_name: 'Screen Size', canonical: 'display_diagonal', unit: 'inch'
  source: 'supplier_a', source_name: 'Diagonal', canonical: 'display_diagonal', unit: 'cm'
- Unit converter: automatically converts cm to inches (or vice versa) by canonical at import time
- Range validator: numeric values are checked for plausibility (a phone weighing 5000 g is clearly an error)
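The three parts of this architecture can be sketched together as follows. This is a minimal illustration, assuming the dictionary is loaded into memory; the two map entries come from the example rows above, while the target unit, the plausibility range and the names `TARGETS`, `FACTORS` and `normalize` are hypothetical.

```python
from dataclasses import dataclass

# (source, source_name) -> (canonical code, unit used by that source)
CANONICAL_MAP = {
    ("icecat", "Screen Size"): ("display_diagonal", "inch"),
    ("supplier_a", "Diagonal"): ("display_diagonal", "cm"),
}

# Assumed storage unit and plausible range for each canonical attribute.
TARGETS = {
    "display_diagonal": {"unit": "inch", "min": 1.0, "max": 120.0},
}

# (from_unit, to_unit) -> multiplier
FACTORS = {
    ("cm", "inch"): 1 / 2.54,
    ("inch", "cm"): 2.54,
}


@dataclass
class Normalized:
    canonical: str
    value: float
    unit: str


def normalize(source, source_name, raw_value):
    """Map a raw source attribute to its canonical name, unit and checked value."""
    key = (source, source_name)
    if key not in CANONICAL_MAP:
        return None  # unknown attribute: leave it for manual mapping

    canonical, unit = CANONICAL_MAP[key]
    target = TARGETS[canonical]

    # Accept both "6.1" and "6,1".
    value = float(str(raw_value).replace(",", "."))

    # Unit converter: bring the value to the unit stored for this canonical name.
    if unit != target["unit"]:
        value *= FACTORS[(unit, target["unit"])]

    # Range validator: reject implausible values instead of writing them.
    if not target["min"] <= value <= target["max"]:
        raise ValueError(f"{canonical}={value} {target['unit']} is out of range")

    return Normalized(canonical, round(value, 2), target["unit"])
```

With these assumptions, `normalize("supplier_a", "Diagonal", "15.24")` and `normalize("icecat", "Screen Size", "6")` both end up as `display_diagonal` in inches, so the filter sees one attribute instead of two.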
Managing Enum Values for List Properties
Type L properties require their values to exist in b_iblock_property_enum before they can be assigned to elements. During auto-filling, new values appear constantly, so a strategy is needed:
Auto-create — a new value is automatically added to the enum. Risk: garbage values from parsing errors (typos, HTML tags in the value).
Moderation queue — a new value is placed in a queue; a manager approves or rejects it. Safer, but requires attention.
Recommended approach: auto-create with filtering (length > 2 and < 100 characters, no HTML, no special characters) + notification to the manager about new values.
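The filtering step of the recommended approach might look like the sketch below. The set of characters treated as acceptable is an assumption (the article does not define "special characters"), and `clean_enum_value` and `register_value` are hypothetical helpers; writing the value to the enum table and notifying the manager are represented only by a set and a list.

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")
# Assumption: letters, digits, whitespace and a few common punctuation marks are fine.
_ALLOWED = re.compile(r"^[\w\s.,/+\-()%]+$")


def clean_enum_value(raw):
    """Return a tidied enum value, or None if it should not be auto-created."""
    value = html.unescape(raw)
    value = re.sub(r"\s+", " ", value).strip()

    if _TAG.search(value):
        return None  # HTML left over from scraping
    if not 2 < len(value) < 100:
        return None  # too short or too long to be a real value
    if not _ALLOWED.match(value):
        return None  # contains characters outside the allowed set
    return value


def register_value(raw, known_values, new_values_log):
    """Classify a candidate value as 'rejected', 'existing' or 'created'."""
    value = clean_enum_value(raw)
    if value is None:
        return "rejected"
    if value.lower() in known_values:
        return "existing"
    known_values.add(value.lower())
    new_values_log.append(value)  # would feed the notification to the manager
    return "created"
```

Note that a strict reading of the length rule also rejects legitimate two-character values such as "XL", so the lower bound may need adjusting for size or code-like attributes.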
Incremental Updates
Attributes change less frequently than prices — weekly updates are sufficient for most catalogs. Optimization:
- Store a hash of an element's attribute set: md5(serialize($properties))
- When updating from a source, compare the hash; if unchanged, skip the DB write
- This reduces load on b_iblock_element_property at high volumes
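The skip-if-unchanged logic can be sketched as follows, using JSON with sorted keys as a stand-in for the PHP expression above. The helper names, the in-memory `stored_hashes` dictionary and the `write` callback are assumptions for illustration; in a real project the hash would live next to the element and `write` would perform the actual property update.

```python
import hashlib
import json


def attributes_hash(properties):
    """Hash an attribute set so that key order does not affect the result."""
    payload = json.dumps(
        properties, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def sync_element(element_id, properties, stored_hashes, write):
    """Write properties only if they changed. Returns True if a write happened."""
    new_hash = attributes_hash(properties)
    if stored_hashes.get(element_id) == new_hash:
        return False  # nothing changed, skip the DB write
    write(element_id, properties)
    stored_hashes[element_id] = new_hash
    return True
```

Sorting the keys is a deliberate choice: PHP's serialize() preserves array order, so the same attributes arriving in a different order would otherwise produce a different hash and trigger a needless write.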
Project Timeline
| Phase | Duration |
|---|---|
| Audit and restructuring of the info block property schema | 1–2 days |
| Source provider development | 2–4 days |
| Normalizer, canonical name dictionary | 2–3 days |
| Enum value management | 1 day |
| Incremental updates, monitoring | 1–2 days |
Total: 7–12 working days — one of the most labor-intensive auto-filling tasks due to the need to work with the data schema.







