Gas Price Prediction System Development
The real task: a user wants to execute a transaction and needs to know how much it will cost in 30 minutes, in 2 hours, and in 12 hours. The simple answer "check the current baseFee" doesn't work — it changes every 12 seconds and is useless over longer horizons. You need a system that predicts future gas prices accurately enough to support practical decisions.
How gas pricing works on Ethereum (EIP-1559)
After EIP-1559 (August 2021), gas pricing consists of two components plus a user-set cap:
- baseFee — the algorithmically determined base fee, burned forever. It changes by at most ±12.5% from block to block, depending on whether the previous block was more or less than 50% full (target_gas_used = block_gas_limit / 2).
- maxPriorityFeePerGas (tip) — the miner/validator tip. The user sets it; the market determines the minimum acceptable level.
- maxFeePerGas — the maximum the user is willing to pay per gas. The amount actually charged is baseFee + min(tip, maxFeePerGas - baseFee).
Base fee change formula:
baseFee_new = baseFee_old * (1 + 0.125 * (gas_used - target_gas) / target_gas)
This is key: baseFee is deterministically computed from on-chain data. If you know gas utilization of each block, you can precisely reconstruct historical baseFee and build a model.
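Since the update rule is deterministic, the whole historical baseFee series can be replayed from per-block gas usage alone. A minimal Python sketch (simplified: the actual spec uses a slightly different integer-division order and a minimum increment of 1 wei on increases):

```python
def replay_basefee(initial_basefee: int, gas_used: list[int], gas_limit: int) -> list[int]:
    """Replay the baseFee series from per-block gas usage (simplified EIP-1559 rule)."""
    target = gas_limit // 2
    fees = [initial_basefee]
    for used in gas_used:
        prev = fees[-1]
        # delta is at most +/-12.5% of the previous baseFee
        delta = prev * (used - target) // (target * 8)
        fees.append(max(prev + delta, 1))  # baseFee never drops below 1 wei
    return fees
```

A fully utilized block raises the fee by 12.5%, an empty block lowers it by 12.5%, and a block exactly at target leaves it unchanged.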
Data collection
Minimal dataset for each block:
interface BlockGasData {
  blockNumber: bigint;
  timestamp: number;
  baseFeePerGas: bigint;
  gasUsed: bigint;
  gasLimit: bigint;
  utilizationRate: number; // gasUsed / gasLimit
  // from transactions in the block:
  medianPriorityFee: bigint;
  p25PriorityFee: bigint;
  p75PriorityFee: bigint;
  p95PriorityFee: bigint;
  txCount: number;
  mempoolSizeAtBlock?: number; // if mempool data is available
}
Historical data: to train a model you need at least 3–6 months (covering different market conditions: bull/bear periods, plus events like NFT mints and token launches that create gas spikes). The fastest way to get it is via the Alchemy/QuickNode archive APIs or public datasets (Dune Analytics, the Google BigQuery Ethereum dataset).
Real-time data: a WebSocket subscription to newHeads; for each block, additionally call eth_getBlockByNumber with the second parameter set to true to get full transaction objects (needed for priority fees). On busy networks this is a significant data volume and requires rate-limit-aware polling.
Mempool data — optional, but a valuable signal: pending transactions show "demand pressure" before confirmation. Available via eth_getFilterChanges on a pending-transaction filter, the Mempool.space API, or Blocknative streaming.
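Given the full transaction objects for a block, the priority-fee percentiles in the dataset above can be derived like this (a sketch; the snake_case field names are assumptions, not the exact JSON-RPC keys):

```python
from statistics import median, quantiles

def priority_fee_stats(txs: list[dict], base_fee: int) -> dict:
    """Per-block priority-fee percentiles from full transaction objects.

    Each tx dict is assumed to carry either EIP-1559 fields
    (max_fee_per_gas, max_priority_fee_per_gas) or a legacy gas_price.
    """
    tips = []
    for tx in txs:
        if "max_priority_fee_per_gas" in tx:
            # the effective tip is capped by what remains above the baseFee
            tip = min(tx["max_priority_fee_per_gas"], tx["max_fee_per_gas"] - base_fee)
        else:
            tip = tx["gas_price"] - base_fee  # legacy tx: implied tip
        tips.append(max(tip, 0))
    if len(tips) < 2:  # quantiles needs at least two data points
        return {}
    q = quantiles(tips, n=100, method="inclusive")  # q[k-1] = k-th percentile
    return {"median": median(tips), "p25": q[24], "p75": q[74], "p95": q[94]}
```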
Temporal patterns
Gas prices have pronounced seasonality:
Intraday: activity during UTC 13:00–21:00 (the EU + US business-hours overlap) is consistently higher than during UTC 02:00–10:00 (US night and European early morning). The difference in baseFee can be 3–5x.
Day of week: Friday and weekends see NFT and gaming activity peaks; Monday–Wednesday is dominated by DeFi and institutional operations.
Event spikes: a major NFT mint, a token airdrop claim, a protocol launch — gas can spike 10–50x within minutes. You can't predict a specific spike, but you can detect its onset and quickly update the forecast.
Prediction models
Short-term forecast (1–10 blocks, ~12–120 seconds)
Deterministic model: the next baseFee is computed exactly from the current one plus current utilization. For 5–10 blocks ahead, a Markov chain over historical utilization patterns can be applied.
def predict_next_basefee(current_basefee: int, utilization: float) -> int:
    # utilization = gas_used / gas_limit; the target is 50% of the limit,
    # so the deviation from the target is (utilization - 0.5) / 0.5
    change = 0.125 * (utilization - 0.5) / 0.5  # -0.125 to +0.125
    return int(current_basefee * (1 + change))
This is deterministic for next block. For 5–10 block horizon use Monte Carlo simulation with utilization distribution from historical data.
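A sketch of that Monte Carlo approach: sample per-block utilization from the historical distribution and roll the baseFee update forward, then report percentiles of the terminal value (function and parameter names are assumptions):

```python
import random

def simulate_basefee(current: int, horizon: int, hist_utilization: list[float],
                     n_paths: int = 1000, seed: int = 42) -> dict:
    """Monte Carlo baseFee forecast over `horizon` blocks.

    Each path samples a utilization per block from historical data and
    applies the EIP-1559 update rule.
    """
    rng = random.Random(seed)  # fixed seed for reproducible forecasts
    finals = []
    for _ in range(n_paths):
        fee = current
        for _ in range(horizon):
            u = rng.choice(hist_utilization)
            fee = int(fee * (1 + 0.125 * (u - 0.5) / 0.5))
        finals.append(fee)
    finals.sort()
    return {
        "p10": finals[int(0.10 * n_paths)],
        "p50": finals[n_paths // 2],
        "p90": finals[int(0.90 * n_paths)],
    }
```

In production you would sample utilization conditioned on time of day rather than uniformly over all history.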
Medium-term forecast (10 min – 2 hours)
Determinism ends here, ML begins.
XGBoost / LightGBM work well for tabular data:
- Features: current baseFee, rolling average for 10/30/60 blocks, time of day (sin/cos encoding), day of week, pending tx count in mempool, recent utilization trend
- Target: baseFee after N blocks
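The feature row described above could be assembled like this (a sketch; the feature names and window sizes are assumptions):

```python
import math
from datetime import datetime, timezone

def build_features(basefee_window: list[float], ts: int, pending_txs: int) -> dict:
    """Build one feature row for a gradient-boosting model.

    basefee_window: recent per-block baseFee values (gwei), oldest first.
    ts: unix timestamp of the latest block.
    """
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    hour_angle = 2 * math.pi * (dt.hour + dt.minute / 60) / 24
    dow_angle = 2 * math.pi * dt.weekday() / 7
    return {
        "basefee_now": basefee_window[-1],
        "basefee_ma10": sum(basefee_window[-10:]) / min(len(basefee_window), 10),
        "basefee_trend": basefee_window[-1] - basefee_window[0],
        # sin/cos encoding keeps 23:00 and 00:00 adjacent for the model
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "dow_sin": math.sin(dow_angle),
        "dow_cos": math.cos(dow_angle),
        "pending_txs": pending_txs,
    }
```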
LSTM / Transformer — better capture long-term patterns, but harder to maintain. For practical system, gradient boosting often suffices.
Quality metric: not RMSE, but a practical one — what percentage of the time a user paying the recommended gas price gets into the next block, versus overpays, versus gets stuck.
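That inclusion-oriented metric can be sketched as a simple backtest classifier. Here the per-block "clearing price" (baseFee plus the lowest included tip) and the overpay threshold are assumptions introduced for illustration:

```python
def score_recommendations(recs: list[int], clearing_prices: list[int],
                          overpay_factor: float = 1.5) -> dict:
    """Score recommended maxFeePerGas values against realized clearing prices.

    clearing_prices[i]: the minimum effective gas price that made it
    into block i (baseFee + lowest included tip).
    """
    included = overpaid = stuck = 0
    for rec, clearing in zip(recs, clearing_prices):
        if rec < clearing:
            stuck += 1        # bid below the clearing price: tx waits
        elif rec > clearing * overpay_factor:
            overpaid += 1     # included, but bid well above market
        else:
            included += 1     # included at a reasonable price
    n = len(recs)
    return {"included": included / n, "overpaid": overpaid / n, "stuck": stuck / n}
```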
Long-term forecast (2–48 hours)
At these horizons temporal seasonality dominates. Prophet (Facebook) handles daily and weekly patterns well:
from prophet import Prophet

model = Prophet(
    daily_seasonality=True,
    weekly_seasonality=True,
    changepoint_prior_scale=0.05,
)
model.fit(df[["ds", "y"]])  # ds = timestamp, y = basefee_gwei
future_df = model.make_future_dataframe(periods=24, freq="h")  # 24h horizon
forecast = model.predict(future_df)
Practical accuracy on a 24h horizon: ±30–50% of the median value. That is enough to advise: "tomorrow morning UTC, gas will be significantly lower than now".
Recommendations for specific scenarios
System should convert forecast into actionable recommendations:
interface GasRecommendation {
  scenario: "fast" | "standard" | "economy";
  maxFeePerGas: bigint; // in wei
  maxPriorityFee: bigint; // in wei
  estimatedInclusionTime: number; // seconds
  confidence: number; // 0–1
  usdCostFor21000Gas: number; // for a simple transfer
}
"Economy" scenario: "if not in hurry — wait until UTC 04:00, gasWei will be ~40% of current". Use historical percentiles for hourly segments.
API and integration
Provide prediction results via REST API:
GET /v1/gas/current — current prices + short-term forecast
GET /v1/gas/forecast?hours=24 — forecast for period
GET /v1/gas/recommend?speed=economy — recommendation for scenario
WS /v1/gas/stream — updates every block
Cache the results: current data — TTL 12 s (one block); short-term forecast — TTL 1 min; long-term forecast — TTL 15 min. Store in Redis.
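Before wiring up Redis, the per-endpoint TTLs can be prototyped with a minimal in-process cache (a sketch):

```python
import time

class TTLCache:
    """Minimal in-process TTL cache mirroring the per-endpoint TTLs above."""

    def __init__(self) -> None:
        # key -> (monotonic expiry time, cached value)
        self._store: dict[str, tuple[float, object]] = {}

    def set(self, key: str, value: object, ttl: float) -> None:
        self._store[key] = (time.monotonic() + ttl, value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires, value = entry
        if time.monotonic() >= expires:
            del self._store[key]  # lazy eviction on read
            return None
        return value
```

Usage would mirror the endpoints: `cache.set("gas:current", payload, ttl=12)` for per-block data, `ttl=60` for the short-term forecast, `ttl=900` for the long-term one.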
Realistic development timeline for system with ML forecasting and API: 8–12 weeks.