Server Availability Monitoring for 1C-Bitrix
Your site may be flawlessly written, but if the server is unavailable, the client sees ERR_CONNECTION_REFUSED. Server availability monitoring is a layer below site monitoring. Here we check not the HTTP response from the application but the machine itself: network, disk, memory, processes. For Bitrix this is especially relevant: the platform is resource-intensive, and a typical VPS or shared hosting plan runs close to its limits.
Monitoring Levels
Server monitoring is not a single check but several levels, each catching its own class of problems.
Ping (ICMP). The most basic check: if the server responds to ping, the machine is on and the network works. If it doesn't respond, either the server is down, the network is unavailable, or a firewall blocks ICMP (in which case ping is useless as a check). Interval: 30-60 seconds.
Ports. Check that a TCP connection can be opened on the key ports:
| Port | Service | What unavailability means |
|---|---|---|
| 80 | HTTP | Web server not running |
| 443 | HTTPS | Web server or SSL not working |
| 3306 | MySQL | Database unavailable |
| 5432 | PostgreSQL | Database unavailable |
| 22 | SSH | No remote access |
A port can respond while the service behind it hangs: the TCP handshake succeeds, but no data comes back. For HTTP this is caught by HTTP monitoring (TTFB above a threshold). For MySQL, it is caught by specialized checks.
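As a rough illustration of the port check, here is a minimal Python sketch that uses only the standard library. The function names `port_open` and `check_ports` and the 3-second timeout are assumptions made for this example. It confirms only that the TCP handshake succeeds, so it cannot detect a service that accepts connections and then hangs.

```python
import socket

# Ports from the table above; trim to the services this server actually runs.
PORTS = {80: "HTTP", 443: "HTTPS", 3306: "MySQL", 5432: "PostgreSQL", 22: "SSH"}


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host: str, ports: dict[int, str] = PORTS) -> dict[str, bool]:
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port) for port, name in ports.items()}


if __name__ == "__main__":
    for service, ok in check_ports("127.0.0.1").items():
        print(f"{service}: {'open' if ok else 'CLOSED'}")
```

Run the check from a machine other than the monitored server. A script on the server itself stops reporting exactly when the server goes down.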
System Metrics. CPU, RAM, disk, swap, load average. These are collected by an agent on the server (Zabbix agent, Telegraf, node_exporter for Prometheus). Critical thresholds for a typical Bitrix server:
| Metric | Warning | Critical |
|---|---|---|
| CPU usage | > 80% (5 min) | > 95% (5 min) |
| RAM usage | > 85% | > 95% |
| Disk usage | > 80% | > 90% |
| Swap usage | Any usage | > 50% |
| Load average | > cores | > 2x cores |
Swap usage on a Bitrix server is an alarming signal. MySQL and PHP-FPM run an order of magnitude slower when swapped out. If swap usage grows, there is not enough RAM: optimize memory consumption or add more.
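The warning/critical logic behind such thresholds can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for an agent: `classify`, `disk_percent`, and `load_status` are hypothetical helper names, and only the disk and load-average rows of the table are covered because the standard library can read both directly.

```python
import os
import shutil

# (warning, critical) disk thresholds in percent, taken from the table above.
DISK_THRESHOLDS = (80, 90)


def classify(value: float, warning: float, critical: float) -> str:
    """Return 'critical', 'warning', or 'ok' for a metric value."""
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"


def disk_percent(path: str = "/") -> float:
    """Percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def load_status(load: float, cores: int) -> str:
    """Load average: warning above the core count, critical above twice that."""
    return classify(load, cores, 2 * cores)


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    five_minute_load = os.getloadavg()[1]  # tuple is (1 min, 5 min, 15 min)
    print("disk:", classify(disk_percent("/"), *DISK_THRESHOLDS))
    print("load:", load_status(five_minute_load, cores))
```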
Processes: What Should Run
For a typical Bitrix configuration (Apache/Nginx + PHP-FPM + MySQL), monitor that these processes are running:
- nginx: systemctl is-active nginx. If it has crashed, connections are refused, or visitors see 502 Bad Gateway when a proxy or CDN sits in front of the server.
- php-fpm: systemctl is-active php8.1-fpm. If it has crashed, visitors see 502 or 500.
- mysqld: systemctl is-active mysql. If it has crashed, visitors see a white screen or a database connection error.
- cron: systemctl is-active cron. If it has crashed, Bitrix agents don't execute (when using cron_events).
Additionally, for a production configuration:
- memcached or redis: if used as the cache backend (cache_type in .settings.php). A cache server failure doesn't take the site down, but it dramatically increases the load on MySQL.
- sphinx or elasticsearch: if an external search index is used.
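The process checks above can be scripted around systemctl is-active. The sketch below is one possible shape, assuming the unit names from the list (they differ between distributions, for example crond or mariadb). The helper names `systemctl_state` and `failed_services` are made up for this example.

```python
import subprocess
from typing import Callable, Iterable

# Unit names as used in the list above; adjust to the actual distribution.
SERVICES = ["nginx", "php8.1-fpm", "mysql", "cron"]


def systemctl_state(unit: str) -> str:
    """Return what `systemctl is-active` prints for unit, e.g. 'active'."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # systemctl is not installed (non-systemd host).
        return "unknown"
    return result.stdout.strip() or "unknown"


def failed_services(
    units: Iterable[str],
    get_state: Callable[[str], str] = systemctl_state,
) -> list[str]:
    """Return the units whose state is anything other than 'active'."""
    return [unit for unit in units if get_state(unit) != "active"]


if __name__ == "__main__":
    down = failed_services(SERVICES)
    print("all services active" if not down else f"not active: {', '.join(down)}")
```

Passing the state lookup as a parameter lets the logic be tested without systemd, and the same function can be reused with a different probe.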
Tools
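Zabbix is a full-featured monitoring system. The Zabbix agent is installed on the server, collects metrics, and sends them to the Zabbix server. Ready-made templates exist for Linux, MySQL, Nginx, and PHP-FPM. Trigger setup: {host:system.cpu.util.avg(5m)} > 80 raises an alert when average CPU usage exceeds 80% over 5 minutes. This is the expression syntax used before Zabbix 5.4; in 5.4 and later the same trigger is written avg(/host/system.cpu.util,5m) > 80.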
Prometheus + Grafana is an alternative stack. node_exporter collects metrics, Prometheus stores them, Grafana visualizes them, and Alertmanager sends notifications. It is more flexible than Zabbix for visualization but requires more setup.
Netdata is an agent with a web interface that installs in a minute. It shows metrics in real time with per-second detail. Local history retention is limited by default, so long-term storage needs Prometheus or Graphite integration. Suitable for quick diagnostics.
Control Panel (ISPmanager, VestaCP). If the server is managed through a panel, it usually has basic monitoring: CPU, RAM, and disk graphs. Alerts are usually absent.
Bitrix Server Specifics
Bitrix Environment (BitrixVM). If the server was deployed from a BitrixVM image, it comes with its own pre-installed monitoring scripts: /etc/cron.d/bx_*, with a status check via /opt/webdir/bin/bx-monitor. These scripts check MySQL, Nginx, and PHP-FPM and send the results to the BitrixVM panel (https://server:8443). This is basic monitoring with no external notifications.
MySQL. The critical metrics for Bitrix are Threads_connected (current connections), Slow_queries (slow queries), and Innodb_buffer_pool_reads (reads that missed the buffer pool and went to disk). Monitor them via the Zabbix MySQL template or mysqladmin extended-status.
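To pull these three counters out of mysqladmin extended-status, a small parser is enough. The sketch below assumes the bordered table layout that mysqladmin prints by default. The `SAMPLE` text and its numbers are invented for illustration, and `parse_status` is a hypothetical helper name.

```python
import re

WATCHED = ("Threads_connected", "Slow_queries", "Innodb_buffer_pool_reads")

# Matches table rows such as "| Slow_queries   | 12   |".
ROW = re.compile(r"^\|\s*(\w+)\s*\|\s*(\S*)\s*\|$")

# Invented sample in the layout of `mysqladmin extended-status` output.
SAMPLE = """\
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Innodb_buffer_pool_reads | 1520  |
| Slow_queries             | 12    |
| Threads_connected        | 37    |
+--------------------------+-------+
"""


def parse_status(output: str) -> dict[str, int]:
    """Extract the watched counters from extended-status table output."""
    values: dict[str, int] = {}
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # border lines and anything that is not a table row
        name, value = match.groups()
        if name in WATCHED and value.isdigit():
            values[name] = int(value)
    return values


if __name__ == "__main__":
    for name, value in parse_status(SAMPLE).items():
        print(f"{name} = {value}")
```

Slow_queries and Innodb_buffer_pool_reads are cumulative since server start, so an alert should look at the increase between two samples rather than the absolute value.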
/upload/ Size. The /upload/ directory on Bitrix sites grows uncontrollably: catalog images, 1C exchange files, temporary files. Monitor du -sh /home/bitrix/www/upload/; if it approaches the disk limit, clean up with the Bitrix tool Settings → File Cleanup or review the files manually.
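For automated checks, the du measurement can be approximated in Python. This sketch sums apparent file sizes, so the result can differ slightly from du, which counts disk blocks. The helper names `dir_size_bytes` and `share_of_disk` are hypothetical.

```python
import os
import shutil

UPLOAD_DIR = "/home/bitrix/www/upload/"  # path from the text above


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total


def share_of_disk(root: str) -> float:
    """Percentage of the containing filesystem occupied by root."""
    return dir_size_bytes(root) / shutil.disk_usage(root).total * 100


if __name__ == "__main__":
    if os.path.isdir(UPLOAD_DIR):
        size_gb = dir_size_bytes(UPLOAD_DIR) / 1024 ** 3
        print(f"{UPLOAD_DIR}: {size_gb:.1f} GiB, "
              f"{share_of_disk(UPLOAD_DIR):.1f}% of the disk")
    else:
        print(f"{UPLOAD_DIR} not found")
```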
Backups. Monitor not only the server but also its backups. Check the date of the last backup (a file in the /backup/ directory or a log entry). If the latest backup is older than 24 hours, raise a warning. Data loss without a fresh backup is a catastrophe that monitoring must prevent.
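A backup freshness check can be sketched as follows, assuming backups are regular files placed directly in one directory and that the modification time reflects when the backup finished. The function names are hypothetical. A missing or empty directory is treated as a warning as well.

```python
import os
import time
from typing import Optional

BACKUP_DIR = "/backup/"  # directory named in the text above
MAX_AGE_HOURS = 24


def newest_file_age_hours(directory: str, now: Optional[float] = None) -> Optional[float]:
    """Age in hours of the most recently modified file, or None if there is none."""
    now = time.time() if now is None else now
    try:
        with os.scandir(directory) as entries:
            mtimes = [e.stat().st_mtime for e in entries if e.is_file()]
    except FileNotFoundError:
        return None
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600


def backup_status(directory: str, max_age_hours: float = MAX_AGE_HOURS,
                  now: Optional[float] = None) -> str:
    """Return 'ok' if the newest backup is fresh enough, otherwise 'warning'."""
    age = newest_file_age_hours(directory, now)
    if age is None or age > max_age_hours:
        return "warning"
    return "ok"


if __name__ == "__main__":
    print(f"backups in {BACKUP_DIR}: {backup_status(BACKUP_DIR)}")
```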
Notification Channels and Escalation
For server monitoring, response speed is more critical than for application monitoring: when the server is unavailable, every site on it is down. Escalation chain: Telegram (instantly) → SMS (after 5 minutes without acknowledgment) → phone call (after 15 minutes). For teams, integrate PagerDuty or OpsGenie with an on-call schedule.
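The escalation chain reduces to a small lookup: given how long an alert has gone unacknowledged, which channels should have fired by now. The sketch below models only that decision. The channel labels and the `channels_due` function are illustrative, and actual delivery through Telegram, an SMS gateway, or a call service is left out.

```python
# (delay in minutes, channel) pairs in escalation order, from the chain above.
ESCALATION = [(0, "telegram"), (5, "sms"), (15, "call")]


def channels_due(minutes_unacknowledged: float) -> list[str]:
    """Channels that should have been notified after the given waiting time."""
    return [
        channel
        for delay, channel in ESCALATION
        if minutes_unacknowledged >= delay
    ]


if __name__ == "__main__":
    for minutes in (0, 5, 15):
        print(f"{minutes:>2} min without acknowledgment: {channels_due(minutes)}")
```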







