Optimizing Search Indexing in 1C-Bitrix
Optimizing Search Indexing for the Built-in Bitrix Module
The built-in Bitrix search operates through the b_search_content, b_search_content_stem, and b_search_stem tables. On a catalog of 30,000+ products, a full re-index takes 4–12 hours, the CSearchIndex::IndexAgent agent runs continuously on the server, and search quality remains mediocre: stemming cannot handle morphology properly, and relevance does not account for popularity or sales data.
The task of optimizing search indexing is addressed at two levels: tactical (speed up and stabilize the existing mechanism) and strategic (migrate to Elasticsearch when full-text search with facets is required).
Diagnosing the Current Indexing State
Examine the state of the index tables:
-- Search table sizes
SELECT
table_name,
ROUND(data_length / 1024 / 1024, 2) AS data_mb,
ROUND(index_length / 1024 / 1024, 2) AS index_mb,
table_rows
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
AND table_name LIKE 'b_search%'
ORDER BY data_length DESC;
The b_search_content_stem table on a large portal can occupy 2–5 GB. Running OPTIMIZE TABLE on it takes hours and blocks other queries — only do this during a maintenance window.
Check the progress of the current indexing:
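-- Site binding is stored in b_search_content_site, not in b_search_content itself
SELECT scs.SITE_ID, sc.MODULE_ID, sc.PARAM1,
COUNT(*) AS indexed_count,
MAX(sc.DATE_CHANGE) AS last_indexed
FROM b_search_content sc
INNER JOIN b_search_content_site scs ON scs.SEARCH_CONTENT_ID = sc.ID
GROUP BY scs.SITE_ID, sc.MODULE_ID, sc.PARAM1;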
Optimizing Indexing Parameters
In the admin interface (Settings → Search → Settings):
Minimum word length: increase from 2 to 3–4. Two-letter words pollute the index and carry no search value in the vast majority of queries.
Stop words: add prepositions, conjunctions, and articles. For a Russian-language site, the basic list includes 50–80 words.
Indexed modules: disable indexing for modules whose content users do not need to search (forums if closed; blogs; document management). Each unnecessary module in the index increases table size and re-indexing time.
Number of elements per agent pass:
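// Schematic call: set the step in the search module settings or in the agent;
// check the exact method name and signature against your module version
CSearch::ReIndex($moduleId, $start, $finish, 50); // last argument: elements per pass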
The optimal step value is 50–100 elements. With a larger step, a single pass can outlast the execution window and be interrupted by the PHP timeout; with a smaller one, the fixed overhead of each launch dominates.
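The trade-off behind the step size can be put into rough numbers. The sketch below is only an illustration: `estimateReindex` is a hypothetical helper, not part of the Bitrix API, and the per-element time, per-launch overhead, and time limit are assumed inputs that you would measure on your own server.

```php
<?php
/**
 * Hypothetical helper: estimates how a given step size behaves.
 * All timings are assumptions to be replaced with measured values.
 */
function estimateReindex(
    int $totalElements,
    int $step,
    float $secPerElement,
    float $overheadSec,
    float $timeLimitSec
): array {
    $passes = (int) ceil($totalElements / $step);
    $passSeconds = $step * $secPerElement + $overheadSec;

    return [
        'passes' => $passes,
        'pass_seconds' => $passSeconds,
        'total_seconds' => $passes * $passSeconds,
        // A pass that is longer than the PHP time limit risks being cut off
        'fits_time_limit' => $passSeconds <= $timeLimitSec,
    ];
}

// Assumed figures: 30,000 elements, 0.25 s per element, 1 s overhead, 30 s limit
$small = estimateReindex(30000, 100, 0.25, 1.0, 30.0);
$large = estimateReindex(30000, 500, 0.25, 1.0, 30.0);
```

With these assumed figures, a step of 100 stays inside the time limit, while a step of 500 does not. The real numbers depend on how heavy the indexed elements are.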
Incremental vs. Full Indexing
Reserve full re-indexing (ReIndexAll) for initial setup or for major catalog structure changes. In normal operation, incremental indexing via the agent should be active.
The CSearchIndex::IndexAgent agent works incrementally — it indexes elements modified since the last indexing run. The problem arises when DATE_CHANGE on elements is updated with every 1C synchronization (even when content has not changed). In this case, the agent effectively re-indexes the entire catalog on every sync.
Solution: in the 1C import handler, compare a hash of the significant fields before and after the update, and only update DATE_CHANGE when content has actually changed:
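// Hash only the fields that affect search; the '|' separator keeps field boundaries unambiguous
$oldHash = md5(implode('|', [$oldElement['NAME'], $oldElement['DETAIL_TEXT'], implode(',', $oldProperties)]));
$newHash = md5(implode('|', [$newElement['NAME'], $newElement['DETAIL_TEXT'], implode(',', $newProperties)]));
if ($oldHash !== $newHash) {
    // Content changed: run the normal update, which refreshes the modification date
} else {
    // Content unchanged: write stock/prices only and keep the timestamp as it was.
    // Adding TIMESTAMP_X = TIMESTAMP_X to the SET clause stops an auto-updating column from changing.
    $DB->Query("UPDATE b_iblock_element SET TIMESTAMP_X = TIMESTAMP_X WHERE ID = " . (int) $id);
}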
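The comparison is only as reliable as the hash input. The sketch below shows one way to normalize it so that a different property order or stray whitespace in the 1C export does not register as a content change. `searchContentHash` is a hypothetical helper; the field names follow the snippet above, and the normalization rules are assumptions to adapt to your data.

```php
<?php
/**
 * Hypothetical helper: builds a hash over the fields that matter for search.
 */
function searchContentHash(array $element, array $properties): string
{
    // Sort by property code so that export ordering does not affect the hash
    ksort($properties);

    $payload = [
        'NAME' => trim((string) ($element['NAME'] ?? '')),
        'DETAIL_TEXT' => trim((string) ($element['DETAIL_TEXT'] ?? '')),
        'PROPERTIES' => $properties,
    ];

    // JSON keeps field boundaries explicit, unlike plain concatenation
    return md5(json_encode($payload, JSON_UNESCAPED_UNICODE));
}

$old = searchContentHash(
    ['NAME' => 'Drill X200', 'DETAIL_TEXT' => 'Cordless drill'],
    ['COLOR' => 'red', 'BRAND' => 'Acme']
);
$new = searchContentHash(
    ['NAME' => 'Drill X200 ', 'DETAIL_TEXT' => 'Cordless drill'],
    ['BRAND' => 'Acme', 'COLOR' => 'red']
);
$contentChanged = ($old !== $new); // false here: only ordering and whitespace differ
```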
Configuring MySQL FULLTEXT Indexes
The built-in Bitrix search uses MySQL FULLTEXT indexes on the b_search_content table:
# In my.cnf for InnoDB FULLTEXT
innodb_ft_min_token_size = 3
innodb_ft_server_stopword_table = 'mydb/my_stopwords'
ft_query_expansion_limit = 20
innodb_ft_min_token_size is read at server startup, so after changing it restart MySQL and rebuild the FULLTEXT indexes on b_search_content (for InnoDB, MySQL's documented route is to drop and re-create the index with ALTER TABLE). Schedule this during a maintenance window.
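To get a feel for what a minimum token length and a stop-word list do to the text that reaches the index, the effect can be approximated in a few lines. This is an illustrative approximation only, not how MySQL or the Bitrix stemmer tokenizes; `filterTokens` and the sample stop words are hypothetical, and the code assumes the mbstring extension is available.

```php
<?php
/**
 * Hypothetical helper: keeps tokens that are long enough and not stop words.
 */
function filterTokens(string $text, int $minLength, array $stopWords): array
{
    // Split on anything that is not a letter or digit (Unicode-aware)
    $tokens = preg_split('/[^\p{L}\p{N}]+/u', mb_strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $stop = array_flip($stopWords);

    return array_values(array_filter(
        $tokens,
        static fn (string $t): bool => mb_strlen($t) >= $minLength && !isset($stop[$t])
    ));
}

$tokens = filterTokens('Drill for the home and garden, 18 V', 3, ['for', 'the', 'and']);
// $tokens: ['drill', 'home', 'garden']
```

The example also shows the cost of a higher threshold: short but meaningful tokens such as "18" are dropped too, so check catalogs with sizes, voltages, or short model codes before raising the limit.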
Cleaning Up Stale Index Records
On live sites, b_search_content accumulates records for deleted elements — "garbage" that increases index size and slows down search:
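-- Find index records whose iblock element no longer exists
-- (for the iblock module, ITEM_ID holds the element ID; PARAM1 and PARAM2 hold the iblock type and iblock ID)
SELECT sc.ID, sc.ITEM_ID, sc.PARAM1, sc.PARAM2
FROM b_search_content sc
LEFT JOIN b_iblock_element ie ON ie.ID = CAST(sc.ITEM_ID AS UNSIGNED)
WHERE sc.MODULE_ID = 'iblock'
AND sc.ITEM_ID NOT LIKE 'S%' -- skip section records
AND ie.ID IS NULL
LIMIT 10000;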
Delete in batches of 1,000 records via an agent or cron script — never run a single DELETE on 100,000 records in production.
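A batched cleanup loop can be sketched as follows. `deleteInBatches` is a hypothetical helper, and the callback here only builds the statements so that the example runs without a database. In a real project the callback would execute the deletion, preferably through the search module's own deletion method, so that rows in the related stem tables are removed as well.

```php
<?php
/**
 * Hypothetical helper: feeds IDs to a deletion callback in fixed-size batches,
 * optionally pausing between batches so other queries get a chance to run.
 */
function deleteInBatches(array $ids, int $batchSize, callable $deleteBatch, int $pauseMicroseconds = 0): int
{
    $batches = 0;
    foreach (array_chunk($ids, $batchSize) as $chunk) {
        $deleteBatch($chunk);
        $batches++;
        if ($pauseMicroseconds > 0) {
            usleep($pauseMicroseconds);
        }
    }

    return $batches;
}

// Stand-in callback: collects the statements instead of executing them
$statements = [];
$count = deleteInBatches(range(1, 2500), 1000, function (array $chunk) use (&$statements): void {
    $idList = implode(',', array_map('intval', $chunk));
    $statements[] = "DELETE FROM b_search_content WHERE ID IN ({$idList})";
});
```

Here 2,500 IDs produce three batches of 1,000, 1,000, and 500 records. A pause of a few hundred milliseconds between batches is a reasonable starting point on a loaded server.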
Monitoring Search Quality
After optimization, configure logging of search queries that return zero results:
-- Search phrase statistics are stored in b_search_phrase
SELECT PHRASE, SITE_ID, RESULT_COUNT, TIMESTAMP_X
FROM b_search_phrase
WHERE RESULT_COUNT = 0
ORDER BY TIMESTAMP_X DESC
LIMIT 100;
Queries with zero results are a signal to expand content, add synonyms, or migrate to morphological search.
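A raw list of failed queries is easier to act on once it is grouped by phrase. The sketch below counts zero-result queries after normalizing case and whitespace. `topZeroResultQueries` is a hypothetical helper, and the row keys `phrase` and `result_count` are assumptions: map them to whatever columns your log query returns.

```php
<?php
/**
 * Hypothetical helper: groups zero-result queries by normalized phrase
 * and returns them ordered by how often they occur.
 */
function topZeroResultQueries(array $rows, int $limit = 10): array
{
    $counts = [];
    foreach ($rows as $row) {
        if ((int) $row['result_count'] !== 0) {
            continue;
        }
        // Collapse case and repeated whitespace so that variants count together
        $phrase = preg_replace('/\s+/u', ' ', trim(mb_strtolower($row['phrase'])));
        $counts[$phrase] = ($counts[$phrase] ?? 0) + 1;
    }
    arsort($counts);

    return array_slice($counts, 0, $limit, true);
}

$rows = [
    ['phrase' => 'Cordless  Drill', 'result_count' => 0],
    ['phrase' => 'cordless drill', 'result_count' => 0],
    ['phrase' => 'hammer', 'result_count' => 12],
    ['phrase' => 'jigsaw blade', 'result_count' => 0],
];
$top = topZeroResultQueries($rows);
// $top: ['cordless drill' => 2, 'jigsaw blade' => 1]
```

The most frequent phrases at the top of this list are the first candidates for synonyms or new content.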
Migrating to Elasticsearch
If the built-in search cannot meet quality or performance requirements — migrate to Elasticsearch via the intec.search module or a custom integration. This is a separate service, but diagnostics and recommendations are provided as part of this engagement.
Result
Optimizing indexing parameters and the incremental indexing strategy reduces full catalog traversal time by 50–80%, brings agent load down to a level that does not impact primary requests, and improves search result quality through stop-word configuration and minimum token length tuning.







