Configuring Data Deduplication in Bitrix24 CRM
Duplicate contacts, companies, and deals are a chronic problem in any CRM that has been in use for more than six months. Managers create cards manually, leads come in from multiple sources, and Excel imports overlap with the existing database. The result: the sales team loses conversation context, analytics become unreliable, and automations fire twice.
Bitrix24 provides a built-in deduplication mechanism, but out of the box it only works on exact field matches. To make it genuinely useful, it needs to be configured for the specific database.
How Deduplication Works in Bitrix24
The system compares a new record against existing ones using a set of match criteria: phone number, email, company name, and tax ID. The logic is implemented in the crm module via the CCrmEntityMerger class and the REST method crm.duplicate.findbycomm.
Duplicate detection is triggered:
- when creating a card manually — a warning popup appears;
- during import — the "Check for duplicates" flag in the import wizard;
- via REST during integrations — the method returns an array of potential duplicates before saving.
Match criteria are configured under CRM → Settings → Duplicates. This is where you define which fields the system compares and the weight assigned to each.
Configuring Match Criteria
The key point: do not enable every available criterion. In practice, each additional criterion adds false positives and slows down checks on large databases (50k+ records).
Optimal configuration for most B2B companies:
| Entity | Field | Priority |
|---|---|---|
| Contact | Phone (normalized) | High |
| Contact | Email | High |
| Company | Tax ID | Critical |
| Company | Name | Medium (fuzzy) |
| Deal | Name + Company | Low |
Phone normalization is critical: +7 (495) 123-45-67 and 84951234567 are the same number. Bitrix24 normalizes phone numbers automatically via CPhoneNumber::Normalize(), but only if the field type is set to "Phone" — not free-form text.
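To make the idea concrete, here is a minimal sketch of what phone normalization does to the two numbers above. It is not Bitrix24's internal implementation. The function name normalizePhone is hypothetical, and the rule that an 11-digit number starting with 8 is a Russian number with a domestic trunk prefix is an assumption that fits the example.

```php
<?php
/**
 * Reduce a phone number to digits and unify the Russian trunk prefix.
 * Illustrative helper only, not Bitrix24's own normalizer.
 */
function normalizePhone(string $raw): string
{
    // Drop everything except digits: "+7 (495) 123-45-67" becomes "74951234567"
    $digits = preg_replace('/\D+/', '', $raw) ?? '';

    // Assumption: 11 digits starting with 8 means a Russian number dialed
    // with the domestic prefix, so the leading 8 is replaced with 7.
    if (strlen($digits) === 11 && $digits[0] === '8') {
        $digits = '7' . substr($digits, 1);
    }

    return $digits;
}

// Both spellings from the text collapse to the same key
var_dump(normalizePhone('+7 (495) 123-45-67') === normalizePhone('84951234567')); // bool(true)
```

The same helper can be applied to exported data before grouping, so that records entered as free-form text are compared on equal terms.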
Bulk Check of Existing Database
After enabling deduplication, existing duplicates are not addressed — the system only checks new records. For a one-time cleanup, use the built-in tool: CRM → Contacts → More → Find Duplicates.
For databases exceeding 100k records, the built-in tool is too slow. A more practical approach is to query via REST and process in batches:
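```php
// Search for duplicates by email via the REST API.
// Assumes the CRest helper class from the crest.php SDK sits next to this script.
require_once __DIR__ . '/crest.php';

$result = CRest::call('crm.duplicate.findbycomm', [
    'type' => 'EMAIL',
    'values' => ['client@example.com'], // placeholder address
    'entity_type' => 'CONTACT',
]);
// Returns an array of IDs of potential duplicates
```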
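For a bulk check, the single lookup above can be wrapped in a batching loop. The sketch below is one possible shape: the function findDuplicatesInBatches is hypothetical, the REST client is passed in as a callable (for example [CRest::class, 'call']), and the default batch size of 20 values per request is an assumption that should be checked against the REST documentation for your portal.

```php
<?php
/**
 * Look up duplicates for many values by splitting them into fixed-size batches.
 *
 * @param callable $call       function(string $method, array $params): array
 * @param string   $type       'EMAIL' or 'PHONE'
 * @param string[] $values     emails or phone numbers to check
 * @param string   $entityType e.g. 'CONTACT'
 * @param int      $batchSize  values per request (assumed limit, verify it)
 * @return array[] one raw response per batch
 */
function findDuplicatesInBatches(
    callable $call,
    string $type,
    array $values,
    string $entityType,
    int $batchSize = 20
): array {
    $responses = [];
    // De-duplicate the input first so the same value is not queried twice
    $unique = array_values(array_unique($values));
    foreach (array_chunk($unique, $batchSize) as $batch) {
        $responses[] = $call('crm.duplicate.findbycomm', [
            'type' => $type,
            'values' => $batch,
            'entity_type' => $entityType,
        ]);
    }
    return $responses;
}

// Dry run with a stub in place of a real REST client
$log = [];
$stub = function (string $method, array $params) use (&$log): array {
    $log[] = count($params['values']); // record the size of each batch
    return ['result' => []];
};

$emails = [];
for ($i = 1; $i <= 45; $i++) {
    $emails[] = "user{$i}@example.com";
}

$responses = findDuplicatesInBatches($stub, 'EMAIL', $emails, 'CONTACT');
echo implode(',', $log), PHP_EOL; // 20,20,5
```

Injecting the client as a callable keeps the loop testable without network access. In a real run, add a pause between requests to stay within the portal's rate limits.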
Typical cleanup cycle: export all contacts with email → group by normalized email → for each group of 2+ records, run the merge method (crm.contact.merge for contacts) with defined field priority rules.
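The grouping step of this cycle might look like the following sketch. It assumes the export has already been loaded into an array of rows with ID and EMAIL keys. Both function names are hypothetical, and email normalization here is limited to trimming and lowercasing.

```php
<?php
/** Normalize an email for comparison: trim whitespace and lowercase. */
function normalizeEmail(string $email): string
{
    return strtolower(trim($email));
}

/**
 * Group contact IDs by normalized email and keep only groups of 2+ records.
 *
 * @param array[] $contacts rows shaped like ['ID' => int, 'EMAIL' => string]
 * @return array<string, int[]> normalized email => IDs of suspected duplicates
 */
function groupDuplicatesByEmail(array $contacts): array
{
    $groups = [];
    foreach ($contacts as $contact) {
        $key = normalizeEmail($contact['EMAIL']);
        if ($key === '') {
            continue; // records without an email cannot be matched this way
        }
        $groups[$key][] = $contact['ID'];
    }

    // A group with a single ID is not a duplicate
    return array_filter($groups, fn(array $ids): bool => count($ids) > 1);
}

$contacts = [
    ['ID' => 1001, 'EMAIL' => 'Ivan@Example.com'],
    ['ID' => 1002, 'EMAIL' => ' ivan@example.com '],
    ['ID' => 1003, 'EMAIL' => 'olga@example.com'],
];

print_r(groupDuplicatesByEmail($contacts));
```

Here contacts 1001 and 1002 end up in one group under ivan@example.com, while 1003 is dropped because it has no match.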
Automated Merging via REST
If the process runs regularly (for example, every night after a lead import), automate it with a Bitrix24 agent or an external cron script:
```php
// Merge contacts 1002 and 1003 into master record 1001
CRest::call('crm.contact.merge', [
    'id' => 1001, // master record (kept)
    'victims' => [1002, 1003], // records to be absorbed
]);
```
Before running a merge, define the rule for selecting the "master" record — typically the one with more activities or an earlier creation date.
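One way to encode that rule is a small sorting helper, sketched below. The function name splitMasterAndVictims and the ACTIVITY_COUNT key are hypothetical; the sketch assumes the activity count was computed beforehand for each record and that DATE_CREATE holds a date string strtotime() can parse.

```php
<?php
/**
 * Pick the master record of a duplicate group: more activities wins,
 * and on a tie the record created earlier wins.
 *
 * @param array[] $group rows like ['ID' => int, 'ACTIVITY_COUNT' => int, 'DATE_CREATE' => string]
 * @return array ['master' => int, 'victims' => int[]]
 */
function splitMasterAndVictims(array $group): array
{
    usort($group, function (array $a, array $b): int {
        // Descending by activity count, then ascending by creation time
        return [$b['ACTIVITY_COUNT'], strtotime($a['DATE_CREATE'])]
            <=> [$a['ACTIVITY_COUNT'], strtotime($b['DATE_CREATE'])];
    });

    $ids = array_column($group, 'ID');

    return ['master' => $ids[0], 'victims' => array_slice($ids, 1)];
}

$group = [
    ['ID' => 1001, 'ACTIVITY_COUNT' => 3, 'DATE_CREATE' => '2023-05-01'],
    ['ID' => 1002, 'ACTIVITY_COUNT' => 7, 'DATE_CREATE' => '2024-01-10'],
    ['ID' => 1003, 'ACTIVITY_COUNT' => 7, 'DATE_CREATE' => '2022-11-20'],
];

print_r(splitMasterAndVictims($group));
```

In this sample, 1003 becomes the master: it ties with 1002 on activities but was created earlier. The resulting master and victims values map directly onto the parameters of the merge call.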
Common Problems and Solutions
Missed duplicates on company name. "LLC Romashka" and "Romashka LLC" are treated as different companies by Bitrix24. Add normalization via the OnBeforeCRMCompanyAdd hook to strip legal entity suffixes from the name.
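A rough sketch of such a normalizer follows. The list of legal forms is an illustrative assumption and should be extended for your market, the function names are hypothetical, and the handler assumes the hook passes the field array by reference with the company name under the TITLE key. The registration line is left as a comment because it only works inside a Bitrix24 on-premise installation.

```php
<?php
/**
 * Strip a leading or trailing legal-form token from a company name.
 * The list of forms is a sample, not an exhaustive dictionary.
 */
function normalizeCompanyTitle(string $title): string
{
    $forms = ['LLC', 'Ltd', 'Inc', 'JSC'];
    $alt = implode('|', array_map(fn(string $f): string => preg_quote($f, '/'), $forms));

    // Quotes around the name carry no meaning for matching
    $clean = trim(str_replace(['"', "'"], '', $title));

    // Remove the form at the start ("LLC Romashka") or at the end ("Romashka, Inc.")
    $pattern = '/^(?:' . $alt . ')\.?\s+|[\s,]+(?:' . $alt . ')\.?$/iu';
    $clean = preg_replace($pattern, '', $clean) ?? $clean;

    // Collapse repeated whitespace left over after the removal
    return trim(preg_replace('/\s+/u', ' ', $clean) ?? $clean);
}

/** Handler shape for the hook: rewrites TITLE in place before the company is saved. */
function onBeforeCompanyAdd(array &$fields): void
{
    if (isset($fields['TITLE'])) {
        $fields['TITLE'] = normalizeCompanyTitle($fields['TITLE']);
    }
}

// Registration, e.g. in init.php of an on-premise installation (not executed here):
// AddEventHandler('crm', 'OnBeforeCRMCompanyAdd', 'onBeforeCompanyAdd');

var_dump(normalizeCompanyTitle('LLC Romashka') === normalizeCompanyTitle('Romashka LLC')); // bool(true)
```

If the original spelling must be preserved for documents, a safer variant is to write the normalized name to a separate custom field and run duplicate checks against that field instead of overwriting the title.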
Duplicates from different lead sources. A client who submitted a web form and also called in generates two leads with different data sets. Configure lead-to-contact merging via a business process that checks for duplicates at the conversion stage.
Performance on large databases. Indexes on the b_crm_contact and b_crm_company tables for the PHONE and EMAIL fields are mandatory. Verify their presence in the database, especially after migrations.
Deduplication is not a one-time task — it is an ongoing process. Set up a weekly report on the number of potential duplicates and keep the metric under control.







