Parsing Data from Excel/CSV for Import into 1C-Bitrix
Standard 1C-Bitrix import via CSV requires a strictly defined file structure: specific column headers, specific encoding, specific delimiter. Files from content managers, 1C, suppliers, and marketers don't meet these requirements. The parsing task is to take arbitrary Excel or CSV and convert it to a structure suitable for catalog import.
Standard Bitrix Import and Its Limitations
Built-in product import: Catalog → Import from CSV. Works with UTF-8 or Windows-1251 files, expects headers like NAME, PRICE, QUANTITY, XML_ID, PROPERTY_ARTICUL.
Problem: real files have arbitrary headers ("Name", "Price without VAT", "Art."), multiple header rows, merged cells, summary sheets, and decorative elements at the beginning.
Solution: intermediate parser that converts the input file to a format Bitrix accepts — or directly writes data via API.
PhpSpreadsheet Library
PhpSpreadsheet is the standard tool for reading Excel in PHP:
use PhpOffice\PhpSpreadsheet\IOFactory;
$spreadsheet = IOFactory::load('/path/to/file.xlsx');
$sheet = $spreadsheet->getActiveSheet();
$rows = $sheet->toArray();
For large files (> 10,000 rows), use ReadFilter mode — read only needed columns and rows, without loading the entire file into memory.
Excel quirks that break naive parsing:
- Formatted numbers (1 000,50 instead of 1000.50) — remove spaces, replace comma with period.
- Dates stored as numbers (Excel serial dates) — need conversion.
- Merged cells — value only in first cell of range.
- Hidden rows and sheets — ignore.
Column Mapping: Configuration Approach
Instead of hardcoding "column A is name," create mapping in config:
$mapping = [
'NAME' => 'Product Name', // header in file
'XML_ID' => 'Article',
'PRICE' => 'Price without VAT',
'QUANTITY' => 'Stock',
'SECTION' => 'Category',
];
Parser finds header row (automatically or by row number), matches with mapping, and reads data.
Auto-detect header row: scan first 10 rows, find the one with most matches to mapping keys.
Data Transformation
Data from Excel is rarely ready for direct import. Typical transformations:
| Transformation | Example |
|---|---|
| Trim | " Product name " → "Product name" |
| Number conversion | "1 250,00 rub." → 1250.00 |
| Unit normalization | "pcs.", "pcs", "Piece" → unified unit ID |
| Transliteration for CODE | "Ground Coffee" → "ground-coffee" |
| Truncate to limit | NAME to 255 characters |
| Empty values | NULL or empty string → default value |
Importing into Catalog via API
Two approaches to writing data into Bitrix:
1. Via CIBlockElement (classic API):
$el = new CIBlockElement();
$result = $el->Add([
'IBLOCK_ID' => $iblockId,
'NAME' => $row['NAME'],
'CODE' => $row['CODE'],
'XML_ID' => $row['XML_ID'],
'ACTIVE' => 'Y',
'PROPERTY_VALUES' => [
'ARTICUL' => $row['XML_ID'],
],
]);
2. Via D7 ORM (\Bitrix\Iblock\ElementTable) — read-only; writing is complex, for products use CIBlockElement.
To update existing products — first find by XML_ID, then $el->Update($id, $fields).
CSV: Typical Issues
Encoding. Files from 1C come in Windows-1251. Before processing: iconv('Windows-1251', 'UTF-8', $content).
Delimiter. Auto-detect: count commas and semicolons in first row — more frequent one is delimiter.
Line breaks within cell. In CSV, value with break is quoted. fgetcsv() handles correctly, but only if file has no quote encoding problems.
Development Timeline
| Option | Scope | Timeline |
|---|---|---|
| One-time import | Script for single file | 1 day |
| Configurable importer | Mapping via config, multiple formats | 3–4 days |
| Complete system | UI for file upload, preview before import, history | 6–8 days |







