Auto-Filling Product Images from External Sources in 1C-Bitrix

Images account for most of the data volume in catalog auto-filling. 10,000 products × 5 photos = 50,000 files that must be downloaded, validated, optimized, and correctly attached to info block elements. The system must run in the background, avoid overloading the server during peak hours, and skip photos that have already been uploaded.

Image Sources

Manufacturer API — the cleanest option. Manufacturers often provide media libraries to partners: a ZIP archive of photos indexed by SKU, or an API for downloading by EAN.

Icecat / Syndigo — a manufacturer content database. Paid access, but fully legal with good coverage of electronics and home appliances.

Manufacturer website — scraping. Priority: look for <meta property="og:image"> tags or JSON-LD image fields — these often point to high-quality images without having to parse the full gallery.
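
A minimal sketch of that lookup order, using PHP's built-in DOM extension. The function name extractImageCandidates is hypothetical, and the sketch assumes the JSON-LD block is a single top-level object with an "image" field; pages that wrap products in "@graph" or a top-level array would need extra handling.

```php
<?php
// Hypothetical helper: collect candidate image URLs from a product page.
// og:image entries come first, then the JSON-LD "image" field.
function extractImageCandidates(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $urls = [];

    foreach ($xpath->query('//meta[@property="og:image"]/@content') as $attr) {
        $urls[] = trim($attr->nodeValue);
    }

    foreach ($xpath->query('//script[@type="application/ld+json"]') as $script) {
        $data = json_decode($script->textContent, true);
        if (!is_array($data)) {
            continue; // malformed or non-object JSON-LD
        }
        $image = $data['image'] ?? [];
        // "image" may be a string, a list, or an ImageObject with a "url" key.
        if (is_string($image) || isset($image['url'])) {
            $image = [$image];
        }
        foreach ((array) $image as $item) {
            $url = is_array($item) ? ($item['url'] ?? null) : $item;
            if (is_string($url)) {
                $urls[] = trim($url);
            }
        }
    }

    // Drop empty strings and duplicates, keep the original order.
    return array_values(array_unique(array_filter($urls)));
}
```

The first URL in the returned list is the og:image value when the page has one, so a caller can simply try candidates in order until one passes validation.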

Supplier YML feed — the <picture> tag contains the URL of the main product photo.
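
Reading the feed could look roughly like this. The sketch assumes the usual YML layout (offer elements with an id attribute and one or more picture children) and a feed small enough to load into memory; picturesByOfferId is a hypothetical name.

```php
<?php
// Hypothetical helper: map offer id => list of picture URLs from a YML feed.
function picturesByOfferId(string $yml): array
{
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($yml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        return []; // not well-formed XML
    }

    $result = [];
    foreach ($xml->xpath('//offer') as $offer) {
        $id = (string) $offer['id'];
        foreach ($offer->picture as $picture) {
            $url = trim((string) $picture);
            if ($url !== '') {
                $result[$id][] = $url;
            }
        }
    }

    return $result;
}
```

For feeds of hundreds of megabytes, a streaming parser such as XMLReader would be a safer choice than loading the whole document.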

Deduplication: Do Not Download the Same File Twice

Store a cache of downloaded URLs in a Highload block or a table:

CREATE TABLE image_download_cache (
    source_url VARCHAR(512) PRIMARY KEY,  -- MySQL rejects a TEXT primary key without a prefix length
    file_id INT,  -- ID in b_file
    downloaded_at TIMESTAMP
);

Before downloading, check the cache — if the URL has already been processed and the file exists, use the existing file_id.
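
A possible shape for that check, written against plain PDO so it can be tried outside 1C-Bitrix. The function names and the $fileExists callback are assumptions for illustration; inside a real project the callback could wrap a lookup such as CFile::GetByID.

```php
<?php
// Hypothetical helper: return the cached b_file ID for a URL,
// or null if the image still has to be downloaded.
function findCachedFileId(PDO $pdo, string $url, callable $fileExists): ?int
{
    $stmt = $pdo->prepare('SELECT file_id FROM image_download_cache WHERE source_url = ?');
    $stmt->execute([$url]);
    $fileId = $stmt->fetchColumn();

    if ($fileId === false || $fileId === null) {
        return null; // URL was never processed
    }

    $fileId = (int) $fileId;
    if (!$fileExists($fileId)) {
        // Stale entry: the file was deleted, so forget it and download again.
        $pdo->prepare('DELETE FROM image_download_cache WHERE source_url = ?')
            ->execute([$url]);
        return null;
    }

    return $fileId;
}

// Hypothetical helper: record a successful download.
function rememberDownload(PDO $pdo, string $url, int $fileId): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO image_download_cache (source_url, file_id, downloaded_at) VALUES (?, ?, ?)'
    );
    $stmt->execute([$url, $fileId, date('Y-m-d H:i:s')]);
}
```

Removing stale rows on lookup keeps the cache consistent when an editor deletes a file manually.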

Image Quality Validation

Not every image found is usable. Required checks before saving:

$imageInfo = getimagesizefromstring($imageData);
// not a decodable image at all (HTML error page, truncated download)
if ($imageInfo === false) return null;
// minimum resolution: index 0 is width, index 1 is height
if ($imageInfo[0] < 400 || $imageInfo[1] < 400) return null;
// MIME type check
if (!in_array($imageInfo['mime'], ['image/jpeg', 'image/png', 'image/webp'], true)) return null;
// minimum file size (too small = placeholder or error)
if (strlen($imageData) < 10_000) return null;

Optimizing Images Before Saving

Downloaded photos are often larger than needed (3000×3000 px, 5 MB). Before saving to 1C-Bitrix:

  • Resize to a maximum of 1500 px on the longer side (for DETAIL_PICTURE)
  • Convert CMYK → RGB (a common issue with photos from print sources)
  • JPEG compression to quality 85

Using Intervention Image (a wrapper around GD/Imagick):

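use Intervention\Image\ImageManagerStatic as Image; // Intervention Image 2.x API
// fit within 1500×1500, keeping the aspect ratio and never upscaling smaller images
$image = Image::make($imageData)->resize(1500, 1500, function ($c) { $c->aspectRatio(); $c->upsize(); });
$optimized = $image->encode('jpg', 85)->getEncoded();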

1C-Bitrix generates thumbnails via its own resize mechanism (CFile::ResizeImageGet), but it is better to provide an already-optimized original.
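
Where Intervention Image is not available, the resize and compression steps can be approximated with the GD extension alone. This is a sketch with hypothetical function names; it does not handle the CMYK conversion step, and transparent PNGs lose their alpha channel when written as JPEG.

```php
<?php
// Hypothetical helper: scale dimensions so the longer side is at most $maxSide.
// Smaller images are returned unchanged (no upscaling).
function fitWithin(int $width, int $height, int $maxSide = 1500): array
{
    $longer = max($width, $height);
    if ($longer <= $maxSide) {
        return [$width, $height];
    }
    $scale = $maxSide / $longer;

    return [
        max(1, (int) round($width * $scale)),
        max(1, (int) round($height * $scale)),
    ];
}

// Hypothetical helper: resize and re-encode as JPEG using GD.
// Returns the JPEG bytes, or null if the data cannot be decoded.
function optimizeWithGd(string $imageData, int $maxSide = 1500, int $quality = 85): ?string
{
    $src = @imagecreatefromstring($imageData);
    if ($src === false) {
        return null;
    }

    [$w, $h] = fitWithin(imagesx($src), imagesy($src), $maxSide);
    $dst = imagescale($src, $w, $h);
    if ($dst === false) {
        return null;
    }

    // imagejpeg() writes to the output stream, so capture it in a buffer.
    ob_start();
    imagejpeg($dst, null, $quality);
    $jpeg = ob_get_clean();

    return $jpeg === false ? null : $jpeg;
}
```

For example, fitWithin(3000, 2000) gives [1500, 1000], while an 800×600 image keeps its original size.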

Background Processing and Queues

50,000 images cannot be processed in a single run. Architecture:

  • Worker 1: scans the info block, finds elements without images → adds them to the queue
  • Workers 2–5: download and save images in parallel (4 threads)
  • Schedule: workers run overnight 02:00–06:00 to avoid server load during the day
  • Session limit: no more than 1,000 images per run
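
The schedule and session limit from the list above can be sketched as a small worker loop. All names here are hypothetical; the queue, the download step, and the clock are passed in as callbacks so the loop itself stays independent of the queue storage.

```php
<?php
// Hypothetical helper: true if $now falls inside the nightly window.
// Assumes the window does not cross midnight (start < end).
function isWithinWindow(DateTimeImmutable $now, string $start = '02:00', string $end = '06:00'): bool
{
    $time = $now->format('H:i');

    return $time >= $start && $time < $end;
}

// Hypothetical worker loop: process queued tasks until the queue is empty,
// the session limit is reached, or the time window closes.
// $nextTask returns the next task or null; $clock returns the current time.
function runWorker(callable $nextTask, callable $process, callable $clock, int $limit = 1000): int
{
    $done = 0;
    while ($done < $limit && isWithinWindow($clock())) {
        $task = $nextTask();
        if ($task === null) {
            break; // queue is empty
        }
        $process($task);
        $done++;
    }

    return $done;
}
```

Checking the window before every task means a worker started at 05:59 stops after at most one image instead of running into the morning.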

Project Timeline

Phase                                                      Duration
Downloader development with validation and optimization    2–3 days
Deduplication and caching system                           4–8 hours
Queues, parallel workers                                   1–2 days
Info block attachment (preview + gallery)                  4–6 hours
Progress monitoring, admin interface                       1 day
Total: 6–9 working days. Initial population of 10,000 products at 4 threads takes approximately 3–4 hours.
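
As a rough check of the population estimate, assuming 5 photos per product as in the introduction, the calculation below shows the per-image time that 3–4 hours implies. The function name and the one-second figure are assumptions for illustration, not measurements.

```php
<?php
// Hypothetical estimate: wall-clock hours to process $images files
// with $workers parallel workers at $secondsPerImage each.
function estimateHours(int $images, int $workers, float $secondsPerImage): float
{
    return $images * $secondsPerImage / $workers / 3600;
}

// 10,000 products × 5 photos, 4 workers, about 1 second per image
// (download + validation + optimization + save) comes to roughly 3.5 hours.
$hours = estimateHours(50000, 4, 1.0);
```

With these inputs, a 3–4 hour run corresponds to roughly 0.9–1.15 seconds per image per worker, so slow sources or large originals would lengthen the initial population accordingly.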