Development of real-time smart contract exploit detection system
The Euler Finance hack in March 2023 drained $197M over several transactions. The BNB Bridge hack in October 2022 took about $570M in two transactions. In both cases the protocol had time (several blocks, sometimes multiple transactions) to notice the anomaly and stop the next transaction, but no automated detection system was in place.
Real-time smart contract monitoring is a system that analyzes each transaction before or during its inclusion in a block and can initiate a protective response (contract pause, whitelist enforcement, alert) before the attack completes.
There are three time windows for detection:
- Mempool: the transaction has been broadcast but not yet included. This is the earliest window, but it only covers publicly pending transactions.
- Block execution: the transaction is included in a block that is not yet finalized. This window does not exist on chains with instant finality.
- Post-block: the block is finalized. Too late for preventive action; useful only for alerts and post-mortems.
System architecture for monitoring
Mempool-level detection
The earliest detection point is the mempool: the attacker has broadcast the transaction, but it has not yet been included in a block. For classic attacks (not MEV bundles sent through a private mempool) this leaves roughly 1-12 seconds on Ethereum, the time until the next block.
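The size of that window can be estimated from slot timing. The sketch below assumes post-Merge Ethereum mainnet with fixed 12-second slots; the function names are illustrative and not part of any library. The result is an upper bound: the real window is shorter, because the protective transaction still has to propagate and be ordered ahead of the attacker's transaction.

```typescript
// Assumption: post-Merge Ethereum mainnet, fixed 12-second slots.
const BEACON_GENESIS_MS = 1_606_824_023_000; // beacon chain genesis, 2020-12-01 12:00:23 UTC
const SLOT_DURATION_MS = 12_000;

/** Milliseconds until the next slot boundary (an upper bound on the reaction window). */
function msUntilNextSlot(nowMs: number): number {
  const elapsedInSlot = (nowMs - BEACON_GENESIS_MS) % SLOT_DURATION_MS;
  return SLOT_DURATION_MS - elapsedInSlot;
}

/** True if a pipeline with the given worst-case latency can still react within this slot. */
function canReactInTime(nowMs: number, pipelineLatencyMs: number): boolean {
  return msUntilNextSlot(nowMs) > pipelineLatencyMs;
}
```

A detector could call `canReactInTime(Date.now(), 200)` to decide whether a full simulation is still worthwhile or whether only the cheapest checks fit into the remaining time.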
Private mempool limitation. Most MEV attacks on DeFi today go through private mempools (Flashbots, MEV Blocker, etc.): transactions are sent directly to block builders or validators, bypassing the public mempool. Mempool monitoring cannot see such transactions until they are included in a block. This is an inherent limitation of the system.
Nevertheless, many protocol-level attacks (especially multi-step ones: first borrow, then dump, then drain) pass through the public mempool at least partially.
Subscribing to pending transactions over a WebSocket connection:
```typescript
import { ethers } from "ethers";

// Subscribe to pending transactions via WebSocket
const provider = new ethers.WebSocketProvider(ALCHEMY_WS_URL);

provider.on("pending", async (txHash: string) => {
  try {
    const tx = await provider.getTransaction(txHash);
    // The transaction may already be dropped or mined; skip contract creations (no `to`)
    if (!tx || !tx.to) return;

    // Check only transactions to our contracts (set of lowercased addresses)
    if (!MONITORED_CONTRACTS.has(tx.to.toLowerCase())) return;

    const risk = await analyzeTransaction(tx);
    if (risk.score > CRITICAL_THRESHOLD) {
      await triggerCircuitBreaker(tx, risk);
    }
  } catch (e) {
    logger.error("Mempool analysis error", e);
  }
});
```
Commercial Ethereum mempool APIs (Blocknative, bloXroute) provide more reliable mempool access with address filtering. They cost money but are considerably more reliable than a single self-hosted node.
Transaction simulation
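Before scoring risk, the system needs to understand what the transaction actually does, not just what its calldata says. An eth_call against the current state simulates execution without sending anything to the network, and a simulation API such as Tenderly additionally returns a trace with state changes: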
```typescript
async function simulateTransaction(tx: TransactionRequest): Promise<SimulationResult> {
  // Tenderly API for detailed trace with state changes
  const simulation = await tenderly.simulate({
    network_id: "1",
    from: tx.from,
    to: tx.to,
    input: tx.data,
    value: tx.value?.toString() ?? "0",
    save: false,
  });

  return {
    success: simulation.transaction.status,
    gasUsed: simulation.transaction.gas_used,
    stateChanges: simulation.transaction.transaction_info.state_diff,
    events: simulation.transaction.transaction_info.logs,
    balanceChanges: extractBalanceChanges(simulation),
  };
}
```
Tenderly, Alchemy Simulate, and Blocknative provide simulation APIs. Key insight: simulation shows all state changes before execution. If the simulation shows the protocol balance dropping by more than 10% in one transaction, that is an anomaly.
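The 10% rule can be written as a small pure function over balances before and after the simulated transaction. This is a minimal sketch, not the output format of any particular simulation API: the `BalanceMap` shape, the `findLargeOutflows` name, and the convention that a token missing from `after` is unchanged are all assumptions made for illustration. Balances are `bigint` because token amounts in smallest units exceed the safe integer range of `number`.

```typescript
// Token address -> protocol balance in the token's smallest unit
type BalanceMap = Map<string, bigint>;

interface Outflow {
  token: string;
  before: bigint;
  after: bigint;
  dropBps: bigint; // drop in basis points (1 bps = 0.01%)
}

/** Returns tokens whose protocol balance drops by more than thresholdBps (default 1000 = 10%). */
function findLargeOutflows(
  before: BalanceMap,
  after: BalanceMap,
  thresholdBps: bigint = 1000n,
): Outflow[] {
  const outflows: Outflow[] = [];
  for (const [token, balanceBefore] of before) {
    if (balanceBefore === 0n) continue; // nothing to lose; also avoids division by zero
    // Assumption: a token absent from `after` was not touched by the transaction
    const balanceAfter = after.get(token) ?? balanceBefore;
    if (balanceAfter >= balanceBefore) continue; // inflow or no change
    const dropBps = ((balanceBefore - balanceAfter) * 10_000n) / balanceBefore;
    if (dropBps > thresholdBps) {
      outflows.push({ token, before: balanceBefore, after: balanceAfter, dropBps });
    }
  }
  return outflows;
}
```

A non-empty result would feed into the risk score; an exact 10% drop does not trigger, matching the "more than 10%" wording.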
Invariant checking
After simulation, check a set of protocol invariants:
```typescript
interface ProtocolInvariant {
  name: string;
  // Returns true when the invariant holds
  check: (stateBefore: ProtocolState, stateAfter: ProtocolState) => boolean;
  severity: "critical" | "high" | "medium";
}

const INVARIANTS: ProtocolInvariant[] = [
  {
    name: "TVL_DROP_THRESHOLD",
    check: (before, after) => {
      if (before.tvl === 0) return true; // no TVL to lose; avoids division by zero
      const tvlChange = (after.tvl - before.tvl) / before.tvl;
      return tvlChange > -0.10; // not more than 10% drop per transaction
    },
    severity: "critical",
  },
  {
    name: "PRICE_IMPACT_LIMIT",
    check: (before, after) => {
      if (!after.lastSwap) return true;
      return Math.abs(after.lastSwap.priceImpact) < 0.20; // < 20%
    },
    severity: "high",
  },
  {
    name: "BORROW_UTILIZATION",
    check: (before, after) => {
      return after.borrowUtilization < 0.95; // < 95%
    },
    severity: "high",
  },
  {
    name: "FLASH_LOAN_IN_PROGRESS",
    check: (before, after) => {
      // A flash loan by itself is not an exploit, but it matters in combination with other signs
      return !after.hasActiveFlashLoan || after.flashLoanRepaid;
    },
    severity: "medium",
  },
];
```
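One way to turn a list of invariants into the risk score that the mempool handler compares against its threshold is sketched below. The severity weights, the `RiskReport` shape, and the choice to take the maximum weight are assumptions for illustration, not values from the source. The types are redeclared generically so the block stands alone.

```typescript
type Severity = "critical" | "high" | "medium";

interface Invariant<S> {
  name: string;
  check: (before: S, after: S) => boolean; // true means the invariant holds
  severity: Severity;
}

interface RiskReport {
  score: number; // 0..1, the highest weight among violated invariants
  violations: { name: string; severity: Severity }[];
}

// Hypothetical weights; they would need tuning against historical data
const SEVERITY_WEIGHT: Record<Severity, number> = { critical: 1.0, high: 0.6, medium: 0.3 };

function evaluateInvariants<S>(before: S, after: S, invariants: Invariant<S>[]): RiskReport {
  const violations: RiskReport["violations"] = [];
  for (const inv of invariants) {
    let holds: boolean;
    try {
      holds = inv.check(before, after);
    } catch {
      holds = false; // fail closed: a check that throws counts as a violation
    }
    if (!holds) violations.push({ name: inv.name, severity: inv.severity });
  }
  const score = violations.reduce((max, v) => Math.max(max, SEVERITY_WEIGHT[v.severity]), 0);
  return { score, violations };
}
```

Taking the maximum rather than a sum keeps a single critical violation from being diluted and keeps several medium ones from adding up to a pause; a weighted sum is an equally defensible choice if combinations of weak signals should escalate.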
Circuit breaker integration
Detecting an attack is not enough; a mechanism to stop it is also needed. Options:
Pause Guardian. A special address (multisig or automated guardian contract) with the right to call pause() on the protocol. The monitoring system can hold a privileged key to call pause(). Risk: compromise of this key enables a DoS on the protocol. Mitigation: the pause guardian can only pause, not unpause (unpause requires Governor + Timelock).
On-chain circuit breaker. A contract with logic to pause automatically when an invariant is violated:
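```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

abstract contract CircuitBreaker {
    uint256 public constant MAX_TVL_DROP_BPS = 1000; // 10%
    uint256 public lastTVL;
    bool public paused;

    event CircuitBreakerTriggered(uint256 previousTVL, uint256 currentTVL, uint256 dropBps);

    // Protocol-specific TVL calculation, supplied by the inheriting contract
    function getTVL() public view virtual returns (uint256);

    // Runs the function body first, then compares TVL against the previous snapshot
    modifier checkCircuit() {
        _;
        uint256 currentTVL = getTVL();
        // Only compute a drop when TVL decreased; otherwise the subtraction
        // underflows and reverts (for example on every deposit)
        if (lastTVL > 0 && currentTVL < lastTVL) {
            uint256 dropBps = (lastTVL - currentTVL) * 10000 / lastTVL;
            if (dropBps > MAX_TVL_DROP_BPS) {
                paused = true; // blocks subsequent calls; the current call still completes
                emit CircuitBreakerTriggered(lastTVL, currentTVL, dropBps);
            }
        }
        lastTVL = currentTVL;
    }

    function deposit(uint256 amount) external checkCircuit {
        require(!paused, "Circuit breaker active");
        // ... deposit logic
    }
}
```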
An on-chain circuit breaker requires no off-chain infrastructure, but it adds gas overhead to every transaction and increases the attack surface: can getTVL() be manipulated to trigger the circuit breaker artificially?
Defender Relayer (OpenZeppelin Defender). Defender allows configuring automatic actions: when an anomalous event is detected, a Relayer calls pause() from a privileged address. Defender stores the private key in an HSM, and the automation is configured via the UI or in code.
ML-based anomaly detection
Rule-based invariants catch known patterns. ML is better suited to detecting unknown anomalies.
Feature engineering for on-chain transactions
Features for transaction classification:
| Feature | Description | Importance |
|---|---|---|
| gas_used / gas_limit | High gas usage indicates a complex transaction | High |
| value_transferred / pool_tvl | Volume relative to pool liquidity | Critical |
| call_depth | Depth of nested calls | High |
| unique_contracts_touched | How many contracts are called | High |
| flash_loan_amount | Flash loan flag and volume | High |
| time_since_last_tx | Anomalously fast sequential transactions | Medium |
| sender_age | New address, higher suspicion | Medium |
| token_price_delta | Token price change during the transaction | High |
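The features above can be computed from a simulated transaction with a small pure function. The sketch below assumes the simulation and indexing layers have already produced a `TxObservation` record; that record, its field names, and the USD normalization are hypothetical and exist only to make the example self-contained.

```typescript
interface CallFrame {
  to: string;    // callee address
  depth: number; // nesting depth, 1 = top-level call
}

// Hypothetical input record assembled from the simulation trace and an indexer
interface TxObservation {
  gasUsed: number;
  gasLimit: number;
  valueTransferredUsd: number;
  poolTvlUsd: number;
  calls: CallFrame[];
  flashLoanAmountUsd: number; // 0 when no flash loan was taken
  secondsSinceLastTx: number; // since the sender's previous transaction
  senderAgeSeconds: number;   // since the sender's first on-chain activity
  tokenPriceBefore: number;
  tokenPriceAfter: number;
}

interface FeatureVector {
  gasRatio: number;
  valueToTvl: number;
  callDepth: number;
  uniqueContractsTouched: number;
  flashLoanAmountUsd: number;
  timeSinceLastTx: number;
  senderAge: number;
  tokenPriceDelta: number;
}

// Division that returns 0 instead of NaN/Infinity for an empty denominator
function safeRatio(numerator: number, denominator: number): number {
  return denominator > 0 ? numerator / denominator : 0;
}

function extractFeatures(obs: TxObservation): FeatureVector {
  return {
    gasRatio: safeRatio(obs.gasUsed, obs.gasLimit),
    valueToTvl: safeRatio(obs.valueTransferredUsd, obs.poolTvlUsd),
    callDepth: obs.calls.reduce((max, c) => Math.max(max, c.depth), 0),
    uniqueContractsTouched: new Set(obs.calls.map((c) => c.to.toLowerCase())).size,
    flashLoanAmountUsd: obs.flashLoanAmountUsd,
    timeSinceLastTx: obs.secondsSinceLastTx,
    senderAge: obs.senderAgeSeconds,
    tokenPriceDelta: safeRatio(obs.tokenPriceAfter - obs.tokenPriceBefore, obs.tokenPriceBefore),
  };
}
```

Keeping extraction separate from the model makes it possible to reuse the same vector for rule-based checks, logging, and offline training.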
Anomaly detection models
Isolation Forest: works well for multivariate anomaly detection without labeled attack data. It trains on normal transactions and flags outliers.
LSTM Autoencoder: for sequence anomalies, where a series of transactions is anomalous as a whole. Important for multi-step attacks, in which several transactions together constitute the attack.
Gradient Boosting (XGBoost/LightGBM): applicable if labeled attack data exists (known historical exploits). Attacks are rare, so the class imbalance has to be handled, for example with SMOTE oversampling.
Training data: Forta Network, Dune Analytics, and DeBank have historical transaction data. Known exploit transactions (Euler, Ronin, BNB Bridge) form the positive (attack) class; normal trading forms the negative class.
Latency constraint. ML inference must fit in ~200 ms for mempool detection. Simple models (Isolation Forest, logistic regression) fit easily. An LSTM on CPU takes ~50-100 ms per inference. If that is too slow, use GPU inference or quantized models (ONNX Runtime).
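The latency budget can be enforced by choosing, per transaction, the most capable model that still fits into the time left after fetching and simulation. This is a sketch under stated assumptions: the `ModelProfile` shape, the `quality` ranking, and the Isolation Forest latency figure are hypothetical; only the ~200 ms budget and the 50-100 ms LSTM figure come from the text.

```typescript
interface ModelProfile {
  name: string;
  p99LatencyMs: number; // measured worst-case inference latency
  quality: number;      // hypothetical ranking, higher = preferred
}

/** Picks the highest-quality model whose p99 latency fits into the remaining budget. */
function pickModel(remainingBudgetMs: number, models: ModelProfile[]): ModelProfile | null {
  const candidates = models.filter((m) => m.p99LatencyMs <= remainingBudgetMs);
  if (candidates.length === 0) return null; // caller falls back to rule-based invariants only
  return candidates.reduce((best, m) => (m.quality > best.quality ? m : best));
}

const MODELS: ModelProfile[] = [
  { name: "isolation-forest", p99LatencyMs: 5, quality: 1 }, // latency figure is illustrative
  { name: "lstm-autoencoder", p99LatencyMs: 100, quality: 2 },
];

// Example: 200 ms total budget, 120 ms already spent on fetch + simulation
const chosen = pickModel(200 - 120, MODELS);
```

With 80 ms left, only the Isolation Forest fits, so `chosen` is that profile; with no model fitting, the pipeline would rely on the invariant checks alone.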
Alert infrastructure integration
Alert routing
A detected anomaly must reach the right person quickly. Stack:
- PagerDuty / OpsGenie: for critical alerts (potential attack). A phone call to the on-call engineer, even at 3 AM.
- Telegram / Discord bot: for high/medium alerts, posted to a monitoring bot channel.
- Grafana dashboard: real-time metrics such as TVL, transaction volume, price impact, and circuit breaker status.
```typescript
async function routeAlert(alert: Alert) {
  if (alert.severity === "critical") {
    await pagerduty.triggerIncident({
      title: `CRITICAL: ${alert.name} detected`,
      body: formatAlertBody(alert),
      severity: "critical",
    });

    // Auto-initiate pause if confidence > 0.9
    if (alert.confidence > 0.9 && alert.autoActionEnabled) {
      await pauseGuardian.pause(alert.transactionHash);
    }
  }

  // Always send to Discord for audit
  await discord.send(ALERTS_CHANNEL, formatDiscordAlert(alert));

  // Metrics to Prometheus
  metrics.increment("alerts_total", { severity: alert.severity, type: alert.name });
}
```
Forta Network integration
Forta is a decentralized monitoring network. Developers deploy detection bots (Node.js or Python) that receive each transaction and generate alerts. Bot-runner nodes execute the bots and publish the alerts on-chain.
Advantage: no need for your own node infrastructure. Disadvantages: detection is post-block only (no mempool monitoring), and latency depends on Forta's infrastructure.
For a custom protocol, a Forta bot works as an additional layer on top of your own monitoring; redundancy matters.
Production infrastructure
Node infrastructure
Mempool monitoring needs a reliable WebSocket connection to an Ethereum node. A self-hosted archive node (Geth, Reth) on bare metal gives the lowest latency but requires a 2+ TB SSD and ongoing maintenance. Managed providers (Alchemy, QuickNode) are reliable, but their WebSocket endpoints can throttle under high load.
For production, use a dual-provider setup: a primary (e.g., Alchemy) plus a fallback (QuickNode), with automatic switchover when the connection degrades.
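The switchover decision can be kept separate from the network code. The sketch below is one possible health rule, not a provider SDK feature: a provider counts as degraded if it has been silent for too long or if its latest block lags behind the highest block seen from any provider. The class name, thresholds, and provider labels are assumptions.

```typescript
interface ProviderHealth {
  lastBlock: number;  // highest block number reported by this provider
  lastSeenMs: number; // timestamp of its most recent message
}

class ProviderSelector {
  private readonly health = new Map<string, ProviderHealth>();
  private readonly priority: string[];
  private readonly staleAfterMs: number;
  private readonly maxBlockLag: number;

  // priority: provider names in order of preference, e.g. ["alchemy", "quicknode"]
  constructor(priority: string[], staleAfterMs: number = 15_000, maxBlockLag: number = 2) {
    this.priority = priority;
    this.staleAfterMs = staleAfterMs;
    this.maxBlockLag = maxBlockLag;
  }

  /** Call on every new-block notification received from a provider. */
  report(name: string, blockNumber: number, nowMs: number): void {
    this.health.set(name, { lastBlock: blockNumber, lastSeenMs: nowMs });
  }

  /** Returns the most preferred provider that is fresh and in sync, or null if none qualifies. */
  select(nowMs: number): string | null {
    let head = 0;
    for (const h of this.health.values()) head = Math.max(head, h.lastBlock);
    for (const name of this.priority) {
      const h = this.health.get(name);
      if (!h) continue; // never heard from this provider
      const fresh = nowMs - h.lastSeenMs <= this.staleAfterMs;
      const synced = head - h.lastBlock <= this.maxBlockLag;
      if (fresh && synced) return name;
    }
    return null;
  }
}
```

Because the primary is checked first on every call, traffic returns to it automatically once it catches up. A `null` result means both connections are degraded and should itself raise an alert.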
Scalability
When monitoring 10+ protocols on several chains:
- Horizontal scaling: separate worker per chain
- Message queue (Kafka, RabbitMQ) between detector and action layers
- State management: Redis for caching protocol state (TVL, prices) to avoid repeated RPC requests
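The caching idea can be illustrated with an in-memory TTL cache. This is a stand-in for Redis, not a Redis client: in a multi-worker deployment the same get/set-with-expiry pattern would be backed by a shared store. The class name and the injectable clock are assumptions made so the example is self-contained.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAtMs: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without waiting
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAtMs) {
      this.entries.delete(key); // expired: drop it so the caller refetches
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAtMs: this.now() + this.ttlMs });
  }

  /** Returns the cached value, or loads it (for example via RPC) and caches the result. */
  async getOrLoad(key: string, loader: () => Promise<V>): Promise<V> {
    const cached = this.get(key);
    if (cached !== undefined) return cached;
    const value = await loader();
    this.set(key, value);
    return value;
  }
}
```

A TTL of about one block time (12 seconds on Ethereum) is a natural starting point, since protocol state cannot change between blocks; invalidating on each new block header is a stricter alternative.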
| Component | Technology |
|---|---|
| Mempool monitoring | Node.js + ethers.js v6 + WS provider |
| Transaction simulation | Tenderly API / Alchemy Simulate |
| ML inference | Python FastAPI + ONNX runtime |
| Alert routing | PagerDuty + Telegram bot |
| Dashboard | Grafana + Prometheus |
| Pause automation | OpenZeppelin Defender |
| Redundancy | Forta Network bots |
Development timeline
MVP (rule-based detection + alerts + manual pause): 4-6 weeks.
Full system (ML detection + automated circuit breaker + Forta integration + dashboard): 3-5 months.
Important: the monitoring system itself requires a security review. A compromised automated pause guardian can be used for a DoS attack on the protocol, freezing it at a critical moment. Defenses: rate limiting on pause calls, multisig for unpause, and a transparency log of all automated actions.
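Two of these defenses, rate limiting and the transparency log, can be combined in a small off-chain guard placed in front of the pause key. The sketch below is illustrative: the class name, the default of one automated pause per contract per hour, and the log shape are assumptions, and in production the log would go to append-only storage rather than memory.

```typescript
interface AuditEntry {
  atMs: number;
  contract: string;
  triggeredBy: string; // name of the rule or model that requested the pause
  allowed: boolean;
}

class PauseRateLimiter {
  private readonly log: AuditEntry[] = [];
  private readonly maxPauses: number;
  private readonly windowMs: number;

  constructor(maxPauses: number = 1, windowMs: number = 3_600_000) {
    this.maxPauses = maxPauses;
    this.windowMs = windowMs;
  }

  /** Records the request and returns true if an automated pause may be sent now. */
  request(contract: string, triggeredBy: string, nowMs: number): boolean {
    const recentPauses = this.log.filter(
      (e) => e.allowed && e.contract === contract && nowMs - e.atMs < this.windowMs,
    ).length;
    const allowed = recentPauses < this.maxPauses;
    // Denied requests are logged too, so repeated attempts stay visible
    this.log.push({ atMs: nowMs, contract, triggeredBy, allowed });
    return allowed;
  }

  /** Read-only view of every automated pause decision. */
  auditLog(): readonly AuditEntry[] {
    return this.log;
  }
}
```

A denied request should still page a human: it means either a second real incident or a misbehaving (possibly compromised) detector, and both need attention.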







