Trading Bot Uptime Monitoring Setup

Setting up trading bot uptime monitoring

A trading bot that goes down at 3 AM without anyone noticing costs more than missed trades. Open positions are left unmanaged, stop-losses are missed, and losses can be significant by the time someone notices. Bot uptime monitoring is not Grafana for pretty charts; it is a system that wakes you up before the market does it painfully.

What to monitor

Bot uptime is more than "the process is running". A process can be alive while the bot is not trading, so three levels of checks are needed:

Process alive: the most basic level. The process is running and not frozen.

Application alive: the bot is actually processing data. This is checked with a heartbeat: the bot regularly writes the timestamp of its last activity, and if the timestamp has not been updated for N minutes, something is wrong.

Trading alive: the bot is not just working but trading as expected. Metrics: orders placed per period, P&L, and whether open positions match the strategy.
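The three levels can be folded into a single status function that reports the first level that fails. The following is a minimal sketch, not part of any particular bot: `BotSnapshot`, `assess`, the 60-second heartbeat threshold, and the minimum order count are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BotSnapshot:
    """Hypothetical snapshot of what the monitor knows about the bot."""
    process_running: bool
    heartbeat_age_s: Optional[float]  # seconds since last heartbeat, None if never seen
    orders_last_window: int           # orders placed in the observation window
    expected_min_orders: int = 1      # assumed strategy-specific minimum


def assess(snapshot: BotSnapshot, max_heartbeat_age_s: float = 60.0) -> str:
    """Return the first failing level, or "ok" if all three checks pass."""
    if not snapshot.process_running:
        return "process_down"
    if snapshot.heartbeat_age_s is None or snapshot.heartbeat_age_s > max_heartbeat_age_s:
        return "application_stale"
    if snapshot.orders_last_window < snapshot.expected_min_orders:
        return "trading_idle"
    return "ok"
```

Checking the levels in this order keeps alerts specific: a dead process is reported as such instead of also triggering the heartbeat and trading alerts.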

Tools

Healthcheck endpoint. The simplest and most reliable approach is to add an HTTP endpoint to the bot:

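import asyncio
import time

from fastapi import FastAPI
from fastapi.responses import JSONResponse

app = FastAPI()
last_heartbeat = time.time()
bot_state = {"status": "running", "last_trade": None, "open_positions": 0}

HEARTBEAT_MAX_AGE = 60  # seconds without a heartbeat before the bot counts as stale

@app.get("/health")
async def health():
    age = time.time() - last_heartbeat
    if age > HEARTBEAT_MAX_AGE:
        # Returning a (body, status) tuple does not set the status code in FastAPI,
        # so the 503 is sent explicitly.
        return JSONResponse(
            status_code=503,
            content={"status": "stale", "heartbeat_age_seconds": age},
        )
    # bot_state is nested so its own "status" key does not overwrite the health status
    return {"status": "ok", "bot": bot_state}

# In bot main loop
async def bot_loop():
    global last_heartbeat
    while True:
        last_heartbeat = time.time()
        await run_strategy()  # the bot's own trading logic
        await asyncio.sleep(5)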

Uptime Kuma is a self-hosted analog of UptimeRobot. It checks an HTTP endpoint every N seconds and sends notifications to Telegram, Discord, or PagerDuty when the endpoint is unavailable. It deploys in about 5 minutes with Docker:

docker run -d --restart=always -p 3001:3001 \
  -v uptime-kuma:/app/data louislam/uptime-kuma:1

For the bot: Monitor Type = HTTP, URL = http://your-bot-host:8080/health, interval = 30 seconds, expected status = 200. The bot's healthcheck endpoint must be served on the port used in the URL.

Better Uptime or PagerDuty are worth considering if you need SLA guarantees and escalation policies.

Telegram alert directly from the bot. The fastest way to get a notification is for the bot to send a message when an error occurs:

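import os

import httpx

# Credentials are read from the environment; the variable names are placeholders.
BOT_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
CHAT_ID = os.environ["TELEGRAM_CHAT_ID"]

async def notify_telegram(message: str):
    # The context manager closes the client and its connections after the request.
    async with httpx.AsyncClient(timeout=10) as client:
        await client.post(
            f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
            json={"chat_id": CHAT_ID, "text": message},
        )

# In exception handler
async def run_guarded():
    try:
        await run_strategy()
    except Exception as e:
        await notify_telegram(f"Bot crashed: {e}")
        raise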
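A bot that crashes in a loop would send the same message on every restart. One way to keep the chat readable is to suppress repeats of the same alert for a cooldown period. This is a sketch under assumptions: `AlertThrottle` and the 300-second cooldown are illustrative names and values, not part of the Telegram API.

```python
import time
from typing import Callable, Dict


class AlertThrottle:
    """Allow each distinct alert key at most once per cooldown period."""

    def __init__(self, cooldown_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.cooldown_s = cooldown_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_sent: Dict[str, float] = {}

    def should_send(self, key: str) -> bool:
        """Return True and record the send time if the key is not in cooldown."""
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown_s:
            return False
        self._last_sent[key] = now
        return True
```

In the exception handler, the alert would then be sent only when `throttle.should_send(type(e).__name__)` returns True, so each error type gets through at most once per cooldown.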

Prometheus + Grafana for metrics

For more detailed monitoring, export metrics to Prometheus:

import time

from prometheus_client import Counter, Gauge, start_http_server

orders_placed = Counter('bot_orders_placed_total', 'Total orders placed', ['symbol', 'side'])
open_positions = Gauge('bot_open_positions', 'Number of open positions')
pnl_gauge = Gauge('bot_unrealized_pnl_usd', 'Unrealized PnL in USD')
last_heartbeat_gauge = Gauge('bot_last_heartbeat_timestamp', 'Last heartbeat timestamp')

# Once at startup: expose /metrics for Prometheus to scrape.
# 9090 is also the default port of the Prometheus server itself, so choose
# another port if both run on the same host.
start_http_server(9090)

# In bot code
orders_placed.labels(symbol='BTCUSDT', side='buy').inc()
last_heartbeat_gauge.set(time.time())

Alerting rules in Prometheus:

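groups:
  - name: trading_bot
    rules:
      - alert: BotHeartbeatStale
        expr: time() - bot_last_heartbeat_timestamp > 120
        for: 1m
        annotations:
          summary: "Trading bot heartbeat stale for {{ $value }}s"

      - alert: BotNoOrders
        # sum() aggregates over all symbol/side series; the 30m window already
        # covers the quiet period, so "for" only needs to filter out flapping.
        expr: sum(rate(bot_orders_placed_total[30m])) == 0
        for: 5m
        annotations:
          summary: "Bot placed no orders in the last 30 minutes"

      # If the bot process dies, its metrics disappear and the rules above
      # return no data, so a failed scrape needs its own alert.
      # "trading_bot" is an assumed job name from the scrape config.
      - alert: BotMetricsDown
        expr: up{job="trading_bot"} == 0
        for: 1m
        annotations:
          summary: "Prometheus cannot scrape the trading bot"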

Watchdog process

If the bot itself cannot send an alert because the process is dead, an external watchdog is needed. The simplest option uses bash and crontab:

#!/bin/bash
# /usr/local/bin/bot-watchdog.sh

HEALTH_URL="http://localhost:8080/health"
TELEGRAM_TOKEN="..."
CHAT_ID="..."

response=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 "$HEALTH_URL")

if [ "$response" != "200" ]; then
    curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_TOKEN}/sendMessage" \
      -d "chat_id=${CHAT_ID}" \
      -d "text=ALERT: Trading bot health check failed (HTTP ${response})"
fi

# Crontab: every minute
* * * * * /usr/local/bin/bot-watchdog.sh
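The same check can be written in Python when the watchdog needs more logic than a shell script comfortably holds. This is a minimal sketch using only the standard library. `probe`, `is_healthy`, and the convention of returning 0 when no response arrives are assumptions for illustration.

```python
import urllib.error
import urllib.request

# Bypass any system proxy: the watchdog talks to the bot directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url: str, timeout_s: float = 10.0) -> int:
    """Return the HTTP status code of url, or 0 if no response was received."""
    try:
        with _opener.open(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered with a 4xx/5xx status, e.g. 503 for a stale heartbeat.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: the bot is unreachable.
        return 0


def is_healthy(status_code: int) -> bool:
    """Only an exact 200 counts as healthy, matching the shell watchdog."""
    return status_code == 200
```

A cron-driven script would call `probe("http://localhost:8080/health")` and send the Telegram alert when `is_healthy` returns False. Like the bash version, it only helps if it runs outside the bot process, and ideally on a different host.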

Automatic restart

systemd, if the bot runs as a systemd service:

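[Unit]
Description=Trading Bot
After=network.target
StartLimitIntervalSec=60
StartLimitBurst=3

[Service]
ExecStart=/usr/bin/python3 /opt/bot/main.py
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target

Restart=on-failure restarts the bot automatically after a crash. StartLimitBurst=3 together with StartLimitIntervalSec=60 allows at most 3 starts per 60 seconds, which protects against a crash loop. In current systemd versions both start limit settings belong in the [Unit] section, not in [Service]. Once the limit is reached the unit stays down, so the external monitoring still has to raise the alert.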

Docker restart policy: --restart=unless-stopped, or --restart=on-failure:3 to give up after 3 failed restarts.
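Crash loop protection amounts to counting start attempts inside a sliding time window. The sketch below shows that logic for a custom supervisor. It illustrates the idea only and is not how systemd or Docker implement it; `RestartLimiter` is a hypothetical name.

```python
from collections import deque
from typing import Deque


class RestartLimiter:
    """Permit at most `burst` starts within any `interval_s`-second window."""

    def __init__(self, burst: int = 3, interval_s: float = 60.0) -> None:
        self.burst = burst
        self.interval_s = interval_s
        self._starts: Deque[float] = deque()

    def allow_start(self, now: float) -> bool:
        """Record a start at time `now` if the limit permits it."""
        # Forget start attempts that have fallen out of the window.
        while self._starts and now - self._starts[0] >= self.interval_s:
            self._starts.popleft()
        if len(self._starts) >= self.burst:
            return False  # crash loop: stop restarting and alert a human
        self._starts.append(now)
        return True
```

With the defaults, starts at 0, 10, and 20 seconds are allowed, a fourth at 30 seconds is refused, and a start at 61 seconds is allowed again because the first attempt has left the window.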

What's included in setup

Setup takes 1-2 days and covers: adding a healthcheck endpoint to the bot (or adapting an existing one), deploying Uptime Kuma or setting up external monitoring, configuring Telegram alerts, a watchdog script, automatic restart through systemd or Docker, and basic Prometheus metrics if they are needed for trading activity analytics.