Monitoring 1C-Bitrix server availability

Your site may be flawlessly written, but if the server is unavailable, the client sees ERR_CONNECTION_REFUSED. Server availability monitoring sits one layer below site monitoring. Here we check the machine itself (network, disk, memory, processes) rather than the application's HTTP response. For Bitrix this is especially relevant: the platform is resource-intensive, and a typical VPS or shared hosting plan runs close to its limits.

Monitoring Levels

Server monitoring is not a single check but several levels, each catching its own class of problems.

Ping (ICMP). The most basic check: if the server responds to ping, the machine is on and the network works. If it doesn't respond, either the server is down, the network is unavailable, or a firewall blocks ICMP (in which case ping is useless as a check). Interval: 30-60 seconds.

Ports. Check that a TCP connection can be opened on the key ports:

Port    Service       What unavailability means
80      HTTP          Web server not running
443     HTTPS         Web server or SSL not working
3306    MySQL         Database unavailable
5432    PostgreSQL    Database unavailable
22      SSH           No remote access

A port may respond while the service behind it hangs: the TCP handshake completes, but no data follows. For HTTP this is caught by HTTP monitoring (TTFB above a threshold). For MySQL it is caught by specialized checks.
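The port check described above can be sketched in a few lines of Python. This is a minimal illustration, not a production monitor: the function names and the 3-second timeout are assumptions, and the check only confirms that the TCP handshake completes, so a hung service behind an open port still passes.

```python
import socket

# Ports from the table above; adjust to the services the server actually runs.
PORTS = {80: "HTTP", 443: "HTTPS", 3306: "MySQL", 5432: "PostgreSQL", 22: "SSH"}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_ports(host, ports=PORTS, timeout=3.0):
    """Map each service name to True (reachable) or False (unreachable)."""
    return {name: port_is_open(host, port, timeout) for port, name in ports.items()}
```

Calling check_ports with the server's hostname returns a dictionary such as {"HTTP": True, "MySQL": False, ...}, which a scheduler can compare against the expected state.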

System Metrics. CPU, RAM, disk, swap, load average. These are collected by an agent on the server (Zabbix agent, Telegraf, or node_exporter for Prometheus). Thresholds for a typical Bitrix server:

Metric          Warning          Critical
CPU usage       > 80% (5 min)    > 95% (5 min)
RAM usage       > 85%            > 95%
Disk usage      > 80%            > 90%
Swap usage      Any usage        > 50%
Load average    > cores          > 2x cores

Swap usage on a Bitrix server is an alarming signal. MySQL and PHP-FPM run an order of magnitude slower when swapped out. If swap usage grows, there is not enough RAM: either optimize memory consumption or add more.
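To show how such thresholds turn into alert levels, here is a small Python sketch. It covers only the CPU, RAM, disk and load average rows of the table. The function names are hypothetical, and the sketch classifies a single sample, so the "5 min" condition for CPU would have to be applied by averaging samples before calling it.

```python
# (warning, critical) thresholds in percent, taken from the table above.
THRESHOLDS = {"cpu": (80, 95), "ram": (85, 95), "disk": (80, 90)}


def classify(value, warning, critical):
    """Return 'critical', 'warning' or 'ok' for one measured value."""
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"


def classify_metric(name, percent):
    """Classify a percentage metric ('cpu', 'ram' or 'disk')."""
    warning, critical = THRESHOLDS[name]
    return classify(percent, warning, critical)


def classify_load(load_average, cores):
    """Load average: warning above the core count, critical above twice that."""
    return classify(load_average, cores, 2 * cores)
```

On Linux, the inputs for classify_load could come from os.getloadavg() and os.cpu_count(); in practice the agent (Zabbix, Telegraf, node_exporter) supplies these values.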

Processes: What Should Run

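For a typical Bitrix configuration (Apache/Nginx + PHP-FPM + MySQL), monitor that these processes are running:

  • nginx: systemctl is-active nginx. If it crashes, the client gets a connection error (or 502 Bad Gateway when another proxy sits in front of it).
  • php-fpm: systemctl is-active php8.1-fpm (the unit name depends on the PHP version and distribution). If it crashes, the site returns 502 or 500.
  • mysqld: systemctl is-active mysql. If it crashes, the site shows a white screen or a database connection error.
  • cron: systemctl is-active cron. If it crashes, Bitrix agents don't execute (when cron_events is used).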

Additionally, for a production configuration:

  • memcached or redis, if used as the cache backend (cache_type in .settings.php). A cache server failure doesn't take the site down, but it dramatically increases MySQL load.
  • sphinx or elasticsearch, if an external search index is used.
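The process checks above can be automated with a thin wrapper around systemctl. The sketch below assumes a systemd host and the unit names mentioned in the text (they differ between distributions). The helper names and the injectable run parameter are illustrative choices that make the logic testable without systemd.

```python
import subprocess

# Unit names as used in the text; adjust them to the actual distribution.
UNITS = ["nginx", "php8.1-fpm", "mysql", "cron"]


def unit_is_active(unit, run=subprocess.run):
    """Return True if `systemctl is-active` exits with code 0 for the unit."""
    # --quiet suppresses the state text; only the exit code is used.
    result = run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0


def inactive_units(units=UNITS, run=subprocess.run):
    """Return the units that are not currently active, in the given order."""
    return [unit for unit in units if not unit_is_active(unit, run)]
```

An empty list from inactive_units() means everything is running; any other result is a candidate for an immediate alert. Optional services such as memcached, redis or sphinx can be appended to the unit list where they are in use.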

Tools

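Zabbix is a full-featured monitoring system. The Zabbix agent is installed on the server, collects metrics and sends them to the Zabbix server. Ready-made templates exist for Linux, MySQL, Nginx and PHP-FPM. Example trigger: {host:system.cpu.util.avg(5m)} > 80 raises an alert when average CPU usage exceeds 80% over 5 minutes. This is the expression syntax used before Zabbix 5.4; newer versions write the same condition as avg(/host/system.cpu.util,5m) > 80.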

Prometheus + Grafana is an alternative stack. node_exporter collects metrics, Prometheus stores them, Grafana visualizes them, and Alertmanager sends notifications. It is more flexible than Zabbix for visualization, but requires more setup.

Netdata is an agent with a web interface that installs in a minute. It shows metrics in real time with per-second detail. Long-term history is limited out of the box (for that, integrate it with Prometheus or Graphite). Suitable for quick diagnostics.

Control Panel (ISPmanager, VestaCP). If the server is managed via a panel, it usually has basic monitoring: CPU, RAM and disk graphs. Alerts are usually absent.

Bitrix Server Specifics

Bitrix Environment (BitrixVM). If the server was deployed from a BitrixVM image, it comes with its own pre-installed monitoring scripts: /etc/cron.d/bx_*, with status checks via /opt/webdir/bin/bx-monitor. These scripts check MySQL, Nginx and PHP-FPM and send the results to the BitrixVM panel (https://server:8443). This is basic monitoring with no external notifications.

MySQL. The critical metrics for Bitrix are Threads_connected (current connections), Slow_queries (slow queries) and Innodb_buffer_pool_reads (buffer pool misses that had to be read from disk). Monitor them via the Zabbix MySQL template or mysqladmin extended-status.
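For setups without the Zabbix template, the three counters can be pulled out of the mysqladmin extended-status output with a small parser. The sketch below assumes the default table layout of that command ("| Variable_name | Value |" rows); the function name is hypothetical.

```python
# Status variables named in the text above.
WATCHED = ("Threads_connected", "Slow_queries", "Innodb_buffer_pool_reads")


def parse_extended_status(text, names=WATCHED):
    """Extract selected integer counters from `mysqladmin extended-status` output."""
    values = {}
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        # A data row "| Name | Value |" splits into ['', 'Name', 'Value', ''].
        if len(parts) == 4 and parts[1] in names and parts[2].isdigit():
            values[parts[1]] = int(parts[2])
    return values
```

The input text could be obtained with subprocess.run(["mysqladmin", "extended-status"], capture_output=True, text=True).stdout, assuming credentials are configured for the calling user. Slow_queries and Innodb_buffer_pool_reads are cumulative since server start, so it is the growth between two samples, not the absolute value, that is worth alerting on.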

/upload/ Size. The /upload/ directory on Bitrix sites grows uncontrollably: catalog images, 1C exchange files, temporary files. Monitor du -sh /home/bitrix/www/upload/; if it approaches the disk limit, clean up via the Bitrix tool Settings → File Cleanup or review the contents manually.
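A scripted equivalent of the du check might look like the following Python sketch. The function names are illustrative. Note that it sums apparent file sizes, whereas du reports allocated disk blocks, so the two figures can differ slightly.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all files under path (symlinks are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The file disappeared between listing and stat; skip it.
                pass
    return total


def upload_share_of_disk(path):
    """Return the directory size as a percentage of the filesystem's total size."""
    usage = shutil.disk_usage(path)
    return 100.0 * dir_size_bytes(path) / usage.total
```

Called with /home/bitrix/www/upload/, the second function gives a single number that can be tracked over time. Walking a large upload tree is slow, so running it once a day is usually enough.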

Backups. Monitor not only the server but also its backups. Check the date of the last backup (a file in the /backup/ directory or a log entry). If the latest backup is older than 24 hours, raise a warning. Losing data without a fresh backup is a catastrophe that monitoring must prevent.
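The backup freshness rule can be sketched as a check on file modification times. The code below assumes backups are written as regular files directly inside one directory. The function names are hypothetical, and an empty directory is treated as a stale backup.

```python
import os
import time

MAX_AGE_SECONDS = 24 * 3600  # the 24-hour limit from the text


def newest_backup_age(directory, now=None):
    """Return the age in seconds of the newest file, or None if there are no files."""
    now = time.time() if now is None else now
    with os.scandir(directory) as entries:
        mtimes = [entry.stat().st_mtime for entry in entries if entry.is_file()]
    if not mtimes:
        return None
    return now - max(mtimes)


def backup_is_stale(directory, max_age=MAX_AGE_SECONDS, now=None):
    """True when there is no backup at all or the newest one exceeds max_age."""
    age = newest_backup_age(directory, now)
    return age is None or age > max_age
```

A fresh modification time only shows that a file was written. Checking the file size or test-restoring the archive is a separate step that this sketch does not cover.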

Notification Channels and Escalation

For server monitoring, response speed is more critical than for application monitoring: if the server is unavailable, all sites on it are down. Escalation chain: Telegram (instantly) → SMS (after 5 minutes without acknowledgement) → phone call (after 15 minutes). For teams, integrate PagerDuty or OpsGenie with an on-call schedule.
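The escalation chain can be expressed as a small lookup. The sketch below only decides which channels are due; actually sending Telegram messages, SMS or calls is left to whatever service is in use. The names are illustrative, and the delays are those given in the text.

```python
# (minutes without acknowledgement, channel), in escalation order.
ESCALATION = [(0, "telegram"), (5, "sms"), (15, "call")]


def channels_due(minutes_unacknowledged, acknowledged=False):
    """Return the channels that should have been notified by this point."""
    if acknowledged:
        # Once someone confirms the alert, escalation stops.
        return []
    return [channel for delay, channel in ESCALATION
            if minutes_unacknowledged >= delay]
```

For example, channels_due(6) returns ["telegram", "sms"], while an acknowledged alert returns an empty list regardless of elapsed time.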