1C-Bitrix Disaster Recovery Monitoring and Testing



A written recovery plan without regular verification does not work: the team does not know the real recovery time (RTO), backups may be corrupted, and the production configuration may have changed since the last drill. DR monitoring is not just observing the current state; it is regularly confirming that the recovery plan can be executed within the declared timeframe.

What to Monitor in the Context of DR

Backup State

Monitor not just whether a backup was created, but its integrity:

#!/bin/bash
# Check the latest DB dump
BACKUP_FILE="/backups/db/bitrix_$(date +%Y%m%d).sql.gz"
MIN_SIZE=104857600  # 100 MB — minimum expected size

if [ ! -f "$BACKUP_FILE" ]; then
    echo "CRITICAL: Backup file not found: $BACKUP_FILE"
    exit 2
fi

FILE_SIZE=$(stat -c%s "$BACKUP_FILE")
if [ "$FILE_SIZE" -lt "$MIN_SIZE" ]; then
    echo "CRITICAL: Backup too small: ${FILE_SIZE} bytes"
    exit 2
fi

# Check gzip integrity
if ! gzip -t "$BACKUP_FILE" 2>/dev/null; then
    echo "CRITICAL: Backup file is corrupted"
    exit 2
fi

echo "OK: Backup size ${FILE_SIZE} bytes, integrity OK"

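The script follows the Nagios plugin convention (exit code 2 means CRITICAL), so it can run as an external check in Nagios or Zabbix; for Prometheus the result has to be exposed through an exporter. An alert fires if the backup is missing, too small, or corrupted. Note that gzip -t validates only the archive, not the SQL inside it, which is why restore drills are still needed.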
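The check above looks for a file carrying today's date. A complementary sketch, assuming dumps named *.sql.gz in a single directory and GNU coreutils, measures the age of the newest dump against the RPO instead. The function names and the RPO parameter are illustrative, not part of any existing tooling.

```shell
#!/bin/bash
# Sketch: alert when the newest dump is older than the RPO.
# Assumes GNU coreutils (stat -c) and dumps named *.sql.gz in one directory.

# Print the path of the most recently modified *.sql.gz file in a directory.
newest_backup() {
    local dir="$1" newest="" f
    for f in "$dir"/*.sql.gz; do
        [ -f "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Print OK/CRITICAL and return a Nagios-style code (0 or 2).
# $1 = directory, $2 = RPO in seconds, $3 = current epoch time (defaults to now)
backup_age_status() {
    local dir="$1" rpo="$2" now="${3:-$(date +%s)}" file mtime age
    file=$(newest_backup "$dir") || { echo "CRITICAL: no backups in $dir"; return 2; }
    mtime=$(stat -c %Y "$file")
    age=$((now - mtime))
    if [ "$age" -gt "$rpo" ]; then
        echo "CRITICAL: newest backup is ${age}s old (RPO ${rpo}s)"
        return 2
    fi
    echo "OK: newest backup is ${age}s old"
    return 0
}
```

A call such as backup_age_status /backups/db 14400 would correspond to a 4-hour RPO. The check measures file age only, so it is meant to run alongside the size and gzip checks rather than replace them.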

DB Replication

-- Alert when Seconds_Behind_Master > 300
-- (MySQL 8.0.22+ also offers SHOW REPLICA STATUS, which reports Seconds_Behind_Source)
SHOW SLAVE STATUS\G

In Zabbix, collect the value through a MySQL template or a custom UserParameter (zabbix_get can be used to test the item from the server):

# zabbix_agentd.conf
# Returns the lag in seconds; "NULL" means replication is stopped and should also alert
UserParameter=mysql.slave.lag,mysql -u monitor -pXXX -e "SHOW SLAVE STATUS\G" 2>/dev/null | awk '/Seconds_Behind_Master/ {print $2}'
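The raw value needs interpretation: Seconds_Behind_Master is NULL when the replication threads are not running, which is worse than any numeric lag. A minimal sketch of that mapping follows; the function name and the default thresholds are assumptions (300 s is the alert level mentioned above, 60 s is an illustrative warning level).

```shell
#!/bin/bash
# Sketch: map a Seconds_Behind_Master value to a monitoring status.
# Thresholds are illustrative defaults, not recommendations.

# $1 = raw value from SHOW SLAVE STATUS ("NULL", empty, or seconds)
# $2 = warning threshold (s), $3 = critical threshold (s)
classify_lag() {
    local lag="$1" warn="${2:-60}" crit="${3:-300}"
    case "$lag" in
        ''|NULL)  echo "CRITICAL: replication is not running"; return 2 ;;
        *[!0-9]*) echo "UNKNOWN: unexpected value '$lag'"; return 3 ;;
    esac
    if [ "$lag" -gt "$crit" ]; then
        echo "CRITICAL: lag ${lag}s"; return 2
    elif [ "$lag" -gt "$warn" ]; then
        echo "WARNING: lag ${lag}s"; return 1
    fi
    echo "OK: lag ${lag}s"
}
```

Treating an empty value as critical also covers the case where the query itself fails, for example because the monitoring user lost its privileges.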

Free Space on the Backup Server

# Warning when <20% free space remains (-P keeps each filesystem on a single line)
df -P /backups | awk 'NR==2 {gsub(/%/,"",$5); if ($5+0 > 80) print "WARNING: disk " $5 "% used"}'

Recovery Endpoint Availability

A simple healthcheck on the backup server, monitored from both the primary DC and an external monitoring service:

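<?php
// /health.php on the backup server
// (the opening tag must come first, otherwise the comment is sent as output before the headers)
header('Content-Type: application/json');

$checks = [];

// Check DB availability; the short timeout keeps the probe itself from hanging
try {
    $pdo = new PDO('mysql:host=127.0.0.1;dbname=bitrix_db', 'bitrix_ro', '***', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_TIMEOUT => 5,
    ]);
    $pdo->query('SELECT 1');
    $checks['db'] = 'ok';
} catch (Exception $e) {
    $checks['db'] = 'fail';
}

// Check Redis; connect() may return false or throw, depending on the phpredis version
try {
    $redis = new Redis();
    $checks['redis'] = $redis->connect('127.0.0.1', 6379, 2.0) ? 'ok' : 'fail';
} catch (Exception $e) {
    $checks['redis'] = 'fail';
}

// Check Bitrix filesystem
$checks['files'] = file_exists('/var/www/bitrix/bitrix/php_interface/dbconn.php') ? 'ok' : 'fail';

// Any failed check turns the response into 503 so that plain HTTP monitors notice it
$status = in_array('fail', $checks, true) ? 503 : 200;
http_response_code($status);
echo json_encode(['status' => $status === 200 ? 'ok' : 'degraded', 'checks' => $checks]);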
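On the monitoring side, the status code alone does not say which component failed. The sketch below, assuming the compact JSON that json_encode produces for the endpoint above, extracts the failing check names without a JSON parser. Both function names are hypothetical, and a tool such as jq would be more robust where it is available.

```shell
#!/bin/bash
# Sketch: list the failing components from the health endpoint's JSON body.
# Assumes compact json_encode output (no spaces around ':').

# Reads JSON on stdin, prints one failing check name per line.
failed_checks() {
    grep -o '"[A-Za-z0-9_]*":"fail"' | cut -d'"' -f2
}

# $1 = health URL. Prints a one-line summary; returns 0 only when healthy.
probe_health() {
    local url="$1" body failed
    # No -f here: the body of a 503 response is needed to see which check failed.
    body=$(curl -s --max-time 10 "$url") || { echo "CRITICAL: $url unreachable"; return 2; }
    failed=$(printf '%s' "$body" | failed_checks | paste -sd, -)
    if [ -n "$failed" ]; then
        echo "CRITICAL: failing checks: $failed"
        return 2
    fi
    case "$body" in
        *'"status":"ok"'*) echo "OK: all checks passed" ;;
        *) echo "CRITICAL: unexpected response"; return 2 ;;
    esac
}
```

Running probe_health from both the primary DC and the external monitoring service gives the two vantage points described above.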

Regular DR Drills: Methodology

Quarterly drill — full restore to an isolated test stand:

Take the latest DB and file backup
Deploy to a clean server
Time each stage
After restore — run an automated smoke test

#!/bin/bash
# dr_smoke_test.sh: runs after restore; exits non-zero if any check fails
BASE_URL="https://test-recovery.example.com"
FAILED=0

check() {
    local name="$1"
    local url="$2"
    local expected="$3"
    local response

    # With -f, curl prints no body on HTTP errors, so the match below fails
    response=$(curl -sf --max-time 30 "$url")
    # -F treats the expected text as a literal string, not a regex
    if echo "$response" | grep -qF -- "$expected"; then
        echo "PASS: $name"
    else
        echo "FAIL: $name: expected '$expected' not found"
        FAILED=1
    fi
}

check "Homepage" "$BASE_URL/" "1C-Bitrix"
check "Catalog" "$BASE_URL/catalog/" "Catalog"
check "Cart API" "$BASE_URL/bitrix/components/bitrix/sale.basket.basket/" "basket"
check "Health endpoint" "$BASE_URL/health.php" '"status":"ok"'

# Propagate the result so that a scheduler or CI job can react to it
if [ "$FAILED" -eq 0 ]; then
    echo "All checks passed"
else
    echo "Some checks FAILED"
    exit 1
fi
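The "time each stage" step of the quarterly drill can be handled by a small wrapper. This is a sketch only: the function names and the STAGE_LOG and TOTAL_SECONDS variables are illustrative, and the commands passed to it stand in for the real restore steps.

```shell
#!/bin/bash
# Sketch: time each recovery stage and compare the total with the planned RTO.

STAGE_LOG=""        # accumulated "name duration rc" lines
TOTAL_SECONDS=0

# run_stage NAME COMMAND [ARGS...]: runs the command and records its duration.
run_stage() {
    local name="$1"; shift
    local start end elapsed rc=0
    start=$(date +%s)
    "$@" || rc=$?
    end=$(date +%s)
    elapsed=$((end - start))
    TOTAL_SECONDS=$((TOTAL_SECONDS + elapsed))
    STAGE_LOG+="${name} ${elapsed}s rc=${rc}"$'\n'
    return "$rc"
}

# rto_verdict TOTAL_SECONDS PLANNED_RTO_SECONDS
rto_verdict() {
    if [ "$1" -le "$2" ]; then
        echo "WITHIN RTO: ${1}s of ${2}s"
    else
        echo "RTO EXCEEDED: ${1}s vs planned ${2}s"
        return 1
    fi
}
```

A drill script might chain calls such as run_stage "restore-db" ./restore_db.sh and run_stage "smoke-test" ./dr_smoke_test.sh (restore_db.sh is a hypothetical name), then print STAGE_LOG and call rto_verdict "$TOTAL_SECONDS" with the planned RTO in seconds. Manual steps such as DNS switching are not captured this way and still need to be timed by hand.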

Monthly drill: DB-only restore. Verify that the dump is up to date: restore it to a test server and query b_sale_order, b_iblock_element, and b_catalog_price to confirm that the latest records are no older than the RPO.

-- Check data freshness after restore
SELECT MAX(DATE_INSERT) as latest_order FROM b_sale_order;
-- Should not be older than RPO (e.g., not older than 4 hours)

SELECT COUNT(*) FROM b_iblock_element WHERE ACTIVE = 'Y';
-- Compare with the expected number of active products
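The comparison with the RPO can be automated once the MAX(DATE_INSERT) value has been read from the restored database. The sketch below assumes GNU date and timestamps in the server's local time zone; the function name is hypothetical. During quiet hours the newest order may legitimately be older than the RPO, so a stale result is a prompt to investigate rather than proof of a bad backup.

```shell
#!/bin/bash
# Sketch: decide whether the newest record in the restored DB is within the RPO.
# Assumes GNU date (-d) and timestamps in the server's local time zone.

# $1 = MAX(DATE_INSERT) value, e.g. "2024-05-01 10:30:00"
# $2 = RPO in seconds
# $3 = reference epoch time (defaults to now)
is_fresh() {
    local latest="$1" rpo="$2" ref="${3:-$(date +%s)}" latest_epoch age
    latest_epoch=$(date -d "$latest" +%s) || { echo "UNKNOWN: cannot parse '$latest'"; return 3; }
    age=$((ref - latest_epoch))
    if [ "$age" -gt "$rpo" ]; then
        echo "STALE: newest record is ${age}s old (RPO ${rpo}s)"
        return 1
    fi
    echo "FRESH: newest record is ${age}s old"
}
```

The input value could come from something like mysql -N -e "SELECT MAX(DATE_INSERT) FROM b_sale_order" run against the test server; an empty table yields NULL, which the function reports as UNKNOWN.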

DR Metrics and SLA

Metric	Target value	How it is measured
DB backup: age of last valid backup	< RPO (e.g. 4 h)	Monitoring + file timestamp
Replication: Seconds_Behind_Master	< 60 s under normal conditions	Zabbix/Prometheus
Drill duration (full restore)	Compared against RTO	Timed at each drill
Successful drills per quarter	≥ 1	Testing log
File backup age	< 24 h	rsync monitoring

DR Reporting

After each drill, record:

Date and time of drill
Plan version (revision number)
Time for each recovery stage
Actual RTO vs planned RTO
Issues discovered during the drill
Plan updates following the drill

This log is not a formality. It reveals trends: whether RTO is degrading over time (the site grows, backups become larger, the procedure is not updated).
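If the log is kept in a machine-readable form, the trend can be checked automatically. The sketch below assumes a hypothetical CSV layout of date, plan revision, actual minutes, and planned minutes, with one row per drill and no header; the function name is illustrative as well.

```shell
#!/bin/bash
# Sketch: summarize a drill log and flag drills that missed the planned RTO.
# Assumed CSV layout (hypothetical): date,plan_revision,actual_minutes,planned_minutes

# Reads CSV rows on stdin (no header), prints one line per drill plus a trend note.
rto_report() {
    awk -F, '
        NF >= 4 {
            actual = $3 + 0; planned = $4 + 0
            status = (actual <= planned) ? "ok" : "MISSED"
            printf "%s rev %s: %d/%d min %s\n", $1, $2, actual, planned, status
            if (n > 0 && actual > prev) worse++
            prev = actual; n++
        }
        END {
            # Report a trend only when every drill was slower than the previous one
            if (n > 1 && worse == n - 1) print "TREND: actual RTO grew at every drill"
        }
    '
}
```

Feeding the log through rto_report after each drill makes a steadily growing recovery time visible before it crosses the planned RTO.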

DR Monitoring Setup Timeline

Setting up backup monitoring, replication checks, and healthcheck endpoints with integration into Zabbix/Prometheus, plus the first drill with an automated smoke test — 3–5 business days.
