Optimizing Elasticsearch Reindexing for 1C-Bitrix


On a project with 800,000 SKUs, the scheduled re-indexing from 1C took 14 hours. During that time, a queue of changes accumulated, search returned stale data, and the 1C import competed with re-indexing for resources. The goal: reduce a full re-index to 1–2 hours without stopping search.

Why Re-indexing Is Slow

Typical causes of slow re-indexing:

Sequential indexing via individual requests. Each document is sent as a separate PUT /bitrix_catalog/_doc/{id}: HTTP overhead, a TCP connection per request, and at ~200 ms per response, 800,000 documents take 160,000 seconds — about 44 hours.

Bulk API batch size too small. Batches of 10 documents instead of the optimal 200–1,000 leave most of the Bulk API's benefit unrealized.

Refresh after every batch. Forcing a POST /bitrix_catalog/_refresh after each batch is a performance killer. Refresh creates a new Lucene segment and blocks indexing for tens of milliseconds.

Incorrect refresh_interval during re-indexing. The default 1 second generates hundreds of thousands of small segments, and ES spends resources merging them.
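For reference, the Bulk API expects newline-delimited JSON: an action line followed by the document source. One request like this replaces hundreds of individual PUTs (the document fields here are illustrative):

```json
POST /bitrix_catalog_v2/_bulk
{ "index": { "_id": "101" } }
{ "name": "Product 101", "price": 1990 }
{ "index": { "_id": "102" } }
{ "name": "Product 102", "price": 2490 }
```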

Zero-Downtime Re-indexing Strategy

The key technique is indexing into a new index rather than overwriting the active one:

Current: bitrix_catalog_v1  <-- alias bitrix_catalog
New:     bitrix_catalog_v2  <-- index here
After:   switch alias to v2
// Create alias during initial setup
POST /_aliases
{
  "actions": [
    { "add": { "index": "bitrix_catalog_v1", "alias": "bitrix_catalog" } }
  ]
}

Bitrix works with the bitrix_catalog alias. While re-indexing proceeds into v2, search continues to work through v1. After completion, atomically switch the alias.
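To verify which physical index the alias currently resolves to (useful before and after the switch), query the alias endpoint:

```json
GET /bitrix_catalog/_alias

// Response: the alias still points at v1 until the switch
{
  "bitrix_catalog_v1": {
    "aliases": {
      "bitrix_catalog": {}
    }
  }
}
```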

Configuring Elasticsearch for Fast Indexing

Before re-indexing, temporarily change settings via the API:

PUT /bitrix_catalog_v2/_settings
{
  "index": {
    "refresh_interval": "-1",
    "number_of_replicas": 0,
    "translog.durability": "async",
    "translog.sync_interval": "30s"
  }
}

refresh_interval: -1 completely disables automatic refresh — documents are not available in search until an explicit refresh, but indexing speeds up 3–5x.

number_of_replicas: 0 disables replication during indexing. ES does not spend time copying shards.

translog.durability: async — the translog is flushed to disk on a timer rather than on every operation. The risk of losing the last 30 seconds of data on failure is acceptable for re-indexing, but not for production data.

After completion, restore settings:

PUT /bitrix_catalog_v2/_settings
{
  "index": {
    "refresh_interval": "5s",
    "number_of_replicas": 1,
    "translog.durability": "request"
  }
}
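With automatic refresh disabled during the run, freshly indexed documents only become searchable after a refresh; calling it explicitly right after restoring the settings avoids waiting for the next interval:

```json
POST /bitrix_catalog_v2/_refresh
```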

Then force merge to reduce segment count:

POST /bitrix_catalog_v2/_forcemerge?max_num_segments=1
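To confirm that the merge actually reduced the segment count, the _cat API reports segments per shard (the columns are selected via the h parameter):

```json
GET /_cat/segments/bitrix_catalog_v2?v&h=shard,segment,size
```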

Optimizing the PHP Indexer

Determine the Bulk API batch size empirically. Target 5–15 MB per request:

$batchSize = 500; // documents per bulk request; tune so each request stays within 5-15 MB
$bulk = [];

$flush = function (array $body) use ($client) {
    $response = $client->bulk(['body' => $body]);
    // A 200 response can still contain per-document failures
    if (!empty($response['errors'])) {
        foreach ($response['items'] as $item) {
            if (isset($item['index']['error'])) {
                error_log('Indexing failed for ID ' . $item['index']['_id']);
            }
        }
    }
};

foreach ($products as $product) {
    $bulk[] = ['index' => ['_index' => 'bitrix_catalog_v2', '_id' => $product['ID']]];
    $bulk[] = buildDocument($product);

    // Each document occupies two entries in the bulk body: action + source
    if (count($bulk) >= $batchSize * 2) {
        $flush($bulk);
        $bulk = [];
        // Do NOT call _refresh here
    }
}

if (!empty($bulk)) {
    $flush($bulk);
}

Parallel indexing via multiple processes — split the catalog by ID ranges:

# Process 1: IDs 1 - 200000
php index_products.php --from=1 --to=200000 &

# Process 2: IDs 200001 - 400000
php index_products.php --from=200001 --to=400000 &

# Process 3: IDs 400001 - 600000
php index_products.php --from=400001 --to=600000 &

wait
echo "Indexing complete"

Three parallel processes on a 3-node cluster give a near-linear speedup; the bottleneck then shifts to MySQL read throughput.
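The fixed ID ranges above can also be generated dynamically when the worker count changes; a minimal sketch (index_products.php and the 800K total are taken from the example above):

```shell
#!/bin/sh
# Split a catalog of TOTAL IDs into WORKERS equal ranges
TOTAL=800000
WORKERS=4
CHUNK=$((TOTAL / WORKERS))

i=0
while [ "$i" -lt "$WORKERS" ]; do
    FROM=$((i * CHUNK + 1))
    TO=$(((i + 1) * CHUNK))
    # The last worker picks up the remainder when TOTAL % WORKERS != 0
    if [ "$i" -eq $((WORKERS - 1)) ]; then
        TO=$TOTAL
    fi
    echo "php index_products.php --from=$FROM --to=$TO &"
    i=$((i + 1))
done
echo "wait"
```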

Switching the Alias

After indexing completes, atomically switch the alias:

POST /_aliases
{
  "actions": [
    { "remove": { "index": "bitrix_catalog_v1", "alias": "bitrix_catalog" } },
    { "add":    { "index": "bitrix_catalog_v2", "alias": "bitrix_catalog" } }
  ]
}

The operation is atomic — there is no moment between removing the old alias and adding the new one when the alias does not exist. Switching takes milliseconds, search is not interrupted.
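Once v2 is confirmed healthy, the old index can be dropped to reclaim disk space (or kept for a while as a rollback path):

```json
DELETE /bitrix_catalog_v1
```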

Result

On a catalog of 800K SKUs after optimization: full re-index takes 1 hour 20 minutes (was 14 hours), incremental updates process 200–300 documents per second via Bulk API instead of 15–20 via individual requests.