Archive Node Setup

Setting Up an Archive Node

A regular full node stores only the current blockchain state plus the state for a limited window of recent blocks (128 by default in Geth). This is sufficient for most operations: sending transactions, reading current balances, calling view functions on contracts. But if you need an address's balance at block 15,000,000 or a contract's state from two years ago, a full node returns an error. For that you need an archive node.

An archive node stores the state trie for every block in network history. This requires colossal disk space:

Network            Full Node    Archive Node
Ethereum mainnet   ~1 TB        ~18+ TB (2024)
BSC                ~1.5 TB      ~3+ TB
Polygon PoS        ~600 GB      ~8+ TB
Arbitrum One       ~300 GB      ~4+ TB
Optimism           ~200 GB      ~2+ TB

When You Really Need an Archive Node

Before deploying an archive node, check whether an alternative is enough:

  • For historical data analytics, a The Graph subgraph or Dune Analytics is often enough; both already index historical data
  • For one-time queries, a paid archive RPC works (Alchemy, Infura, QuickNode with archive access)
  • For regular production queries, running your own archive node is justified

You need your own archive node when: high volume of archive queries makes paid RPC expensive, you need low latency, you need query privacy, you need custom tracing (debug_traceTransaction, trace_block).
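
To illustrate the custom tracing mentioned above, here is a minimal sketch of a debug_traceTransaction request. The helper names trace_payload and trace_tx and the default endpoint http://localhost:8545 are assumptions made for this example. callTracer is a built-in tracer in Geth-compatible clients; whether your client and version support it should be confirmed in its documentation.

```shell
#!/usr/bin/env bash
# Build the JSON-RPC body for debug_traceTransaction.
# $1 = transaction hash (0x-prefixed)
trace_payload() {
  local tx_hash="$1"
  printf '{"jsonrpc":"2.0","method":"debug_traceTransaction","params":["%s",{"tracer":"callTracer"}],"id":1}' "$tx_hash"
}

# Send the request to a node that exposes the debug namespace.
# $2 = RPC endpoint (the default below is an assumption)
trace_tx() {
  local tx_hash="$1" rpc_url="${2:-http://localhost:8545}"
  curl -s -X POST "$rpc_url" \
    -H "Content-Type: application/json" \
    -d "$(trace_payload "$tx_hash")"
}
```

Usage: trace_tx <tx-hash> prints the raw JSON response. Tracing old transactions only works if the node still has the state for that block, which is exactly what archive mode provides.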

Ethereum Archive Node (Erigon)

A Geth archive node takes 18+ TB and has historically been slow to sync. Erigon is an alternative implementation that stores archive data far more compactly (~3 TB for Ethereum) thanks to a different storage layout and staged sync:

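# Build Erigon from source (the repository now lives under the erigontech organization)
git clone https://github.com/erigontech/erigon
cd erigon
make erigon

# Or download a release binary
# (release asset names may include the version number; check the releases page if this URL returns 404)
wget https://github.com/erigontech/erigon/releases/latest/download/erigon_linux_amd64.tar.gz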

Configuring Erigon for Archive

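# --prune='' (empty string) = archive mode: keep everything.
# --torrent.download.rate speeds up the initial download via BitTorrent.
# Binding HTTP and metrics to 0.0.0.0 exposes them on every interface;
# restrict access with a firewall or bind to 127.0.0.1 instead.
# Comments must not follow a trailing backslash, so they are kept up here.
erigon \
  --datadir=/data/erigon \
  --chain=mainnet \
  --prune='' \
  --http \
  --http.addr=0.0.0.0 \
  --http.port=8545 \
  --http.api=eth,erigon,web3,net,debug,trace,txpool \
  --ws \
  --ws.port=8546 \
  --torrent.download.rate=512mb \
  --metrics \
  --metrics.addr=0.0.0.0 \
  --metrics.port=6060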

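Key point: --prune='' (an empty value) selects archive mode, where nothing is pruned. --prune=hrtc instead prunes history (h), receipts (r), the transaction index (t) and call traces (c), which gives a full node with aggressive pruning. These flags belong to Erigon 2.x; Erigon 3.x uses --prune.mode (archive, full or minimal) instead, so check erigon --help for the version you run.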

Erigon Sync Stages

Erigon syncs in stages, allowing resumption after interruption:

Stage 1: Headers         (~1 GB, fast)
Stage 2: Block Bodies    (~500 GB, slow)
Stage 3: Senders         (sender recovery from signatures)
Stage 4: Execution       (execute all transactions — longest, 3-7 days)
Stage 5: Hash State      (build state trie)
Stage 6: Intermediate Hashes
Stage 7: History Index   (index of state changes by block)
Stage 8: Log Index
Stage 9: Tx Lookup
Stage 10: Finish

Full Ethereum archive sync on NVMe: 5–14 days depending on hardware.
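
Since the sync runs for days, it helps to poll the node rather than watch logs. The sketch below uses the standard eth_syncing JSON-RPC method; the function names sync_finished and query_syncing and the default endpoint are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Return 0 if an eth_syncing response reports that sync is finished.
# eth_syncing returns "result": false once the node has caught up,
# and an object with progress fields while it is still syncing.
sync_finished() {
  local response="$1"
  printf '%s' "$response" | grep -Eq '"result"[[:space:]]*:[[:space:]]*false'
}

# Query a node for its sync status.
# $1 = RPC endpoint (the default below is an assumption)
query_syncing() {
  local rpc_url="${1:-http://localhost:8545}"
  curl -s -X POST "$rpc_url" \
    -H "Content-Type: application/json" \
    -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'
}
```

Usage: sync_finished "$(query_syncing)" && echo "synced". A node that has not yet found peers can also report false, so it is worth comparing eth_blockNumber against a public block explorer before trusting the result.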

Server Requirements for Ethereum Archive

CPU:  16+ cores (Execution stage is CPU-intensive)
RAM:  64 GB (32 GB is the bare minimum; the node will swap and run slowly)
Disk: 4 TB NVMe SSD (NVMe only; HDDs and SATA SSDs are too slow)
Net:  1 Gbps
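
The requirements above can be checked before starting a multi-day sync. This is a rough preflight sketch, assuming Linux with GNU coreutils (nproc, df) and /proc/meminfo. The function names, the /data mount point, and the exact thresholds are assumptions: RAM and disk minimums are set slightly below 32 GB and 4 TB because the kernel and the filesystem reserve part of the nominal capacity.

```shell
#!/usr/bin/env bash
# Compare a measured integer value against a minimum and print a verdict.
# Returns 0 when the value is sufficient, 1 otherwise.
check_min() {
  local label="$1" actual="$2" minimum="$3"
  if [ "$actual" -ge "$minimum" ]; then
    echo "OK   $label: $actual (minimum $minimum)"
  else
    echo "FAIL $label: $actual (minimum $minimum)"
    return 1
  fi
}

# Run all checks against the mount point that will hold the datadir.
# $1 = mount point (the default /data is an assumption)
preflight() {
  local mount_point="${1:-/data}" failed=0
  local cores ram_gib disk_gb
  cores=$(nproc)
  # MemTotal is reported in kB; convert to whole GiB.
  ram_gib=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
  # -BGB makes df report sizes in units of 10^9 bytes.
  disk_gb=$(df -BGB --output=size "$mount_point" | tail -n 1 | tr -dc '0-9')
  check_min "CPU cores" "$cores" 16 || failed=1
  check_min "RAM (GiB)" "$ram_gib" 30 || failed=1
  check_min "Disk (GB)" "$disk_gb" 3500 || failed=1
  return "$failed"
}
```

Usage: preflight /data prints one line per check and returns non-zero if any check fails. It does not verify that the disk is NVMe; lsblk can show that separately.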

Arbitrum / Optimism Archive (Nitro / OP Stack)

L2 archive nodes are technically simpler and more compact:

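# Arbitrum Nitro archive node.
# Flag names have changed between Nitro releases; verify them with --help for your image.
#   --execution.caching.archive  enables archive mode
#   --init.url                   downloads a snapshot for a quick start; an archive node
#                                needs an archive snapshot, so confirm the current URL in the Arbitrum docs
# Consider pinning a specific image tag instead of relying on "latest".
docker run --rm -it \
  -v /data/arbitrum:/home/user/.arbitrum \
  -p 8547:8547 \
  offchainlabs/nitro-node:latest \
  --chain.id=42161 \
  --parent-chain.connection.url="$L1_RPC_URL" \
  --http.addr=0.0.0.0 \
  --http.api=eth,web3,net,arb,debug \
  --execution.caching.archive \
  --init.url=https://snapshot.arbitrum.io/mainnet/nitro.tar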

Arbitrum provides official snapshots to speed up initial sync.

Reth: Alternative for Ethereum

Reth (Rust Ethereum) is a newer implementation from Paradigm with very fast sync:

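# reth keeps all historical state (archive) by default.
# Do not pass --full: that flag enables pruning and turns the node into a full node.
reth node \
  --datadir /data/reth \
  --chain mainnet \
  --http \
  --http.api "eth,net,web3,debug,trace" \
  --ws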

Reth syncs faster than Erigon on some configurations, but as of 2024 it is less mature in terms of stability.

Monitoring and Health Check

# Check that the node is in archive mode: request a historical balance
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0",
    "method":"eth_getBalance",
    "params":["0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045", "0xF4240"],
    "id":1
  }'
# Block 0xF4240 = 1000000. If the call returns a result (even 0x0), archive mode works.
# If it returns a "missing trie node" error, this is a full node, not an archive node.
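
The same check can be wrapped in small helpers so it can run against any block and any endpoint. This is a sketch only: block_to_hex, classify_response and check_archive are hypothetical names, and the default endpoint is an assumption. Any JSON-RPC error is reported as "error" without distinguishing a pruned node from other failures.

```shell
#!/usr/bin/env bash
# Convert a decimal block number to the 0x-prefixed hex form used in JSON-RPC params.
block_to_hex() {
  printf '0x%X' "$1"
}

# Classify an eth_getBalance response:
#   ok      - the node returned a result (0x0 counts too)
#   error   - the node returned an error (for example "missing trie node")
#   unknown - neither field was found
classify_response() {
  case "$1" in
    *'"error"'*)  echo "error" ;;
    *'"result"'*) echo "ok" ;;
    *)            echo "unknown" ;;
  esac
}

# Ask the node for a historical balance and classify the answer.
# $1 = address, $2 = decimal block number, $3 = RPC endpoint (default is an assumption)
check_archive() {
  local address="$1" block="$2" rpc_url="${3:-http://localhost:8545}"
  local response
  response=$(curl -s -X POST "$rpc_url" \
    -H "Content-Type: application/json" \
    -d "{\"jsonrpc\":\"2.0\",\"method\":\"eth_getBalance\",\"params\":[\"$address\",\"$(block_to_hex "$block")\"],\"id\":1}")
  classify_response "$response"
}
```

Usage: check_archive 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045 1000000 prints ok on an archive node that has finished syncing.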

Prometheus + Grafana Monitoring

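Erigon and Reth export metrics in Prometheus format. Metric names differ between clients and versions, so confirm them against your node's metrics endpoint before building a dashboard. Key metrics to track:

  • erigon_stages_progress: progress of each sync stage
  • chain_head_block: current block
  • p2p_peers: number of peers
  • system_disk_free: free disk space (an archive node grows continuously); this value usually comes from a host-level exporter such as node_exporter rather than from the node itself

Set up an alert for free disk space dropping below 500 GB (system_disk_free < 500 GB). Without it, the node crashes with no warning once the disk fills up.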
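
Until Prometheus alerting is in place, the 500 GB threshold can be checked from cron. This is a fallback sketch, assuming GNU df; the function names free_gb and disk_low are hypothetical, and how the alert is delivered (mail, webhook) is left out.

```shell
#!/usr/bin/env bash
# Print the free space of the filesystem holding a path, in whole GB (10^9 bytes).
free_gb() {
  df -BGB --output=avail "$1" | tail -n 1 | tr -dc '0-9'
}

# Return 0 (alert condition) when free space is below the threshold.
# $1 = free space in GB, $2 = threshold in GB (defaults to 500)
disk_low() {
  local avail_gb="$1" threshold_gb="${2:-500}"
  [ "$avail_gb" -lt "$threshold_gb" ]
}
```

Usage: disk_low "$(free_gb /data/erigon)" && echo "Low disk space on archive node". Exactly 500 GB free does not trigger the alert; only values below the threshold do.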

Snapshot Storage and Backups

Resyncing an archive node from scratch costs several days. Backups are critical:

# Erigon: stop node, create snapshot via cp or rsync
systemctl stop erigon
rsync -avz --progress /data/erigon/ backup-server:/backup/erigon-$(date +%Y%m%d)/
systemctl start erigon

# Or use LVM snapshot if server is on LVM
lvcreate -L100G -s -n erigon-snap /dev/vg0/erigon-data

Incremental backups via rclone to S3-compatible storage are cheaper than making a full copy every time.
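
A minimal sketch of such an rclone backup follows. The function names run and backup_datadir, the DRY_RUN switch, and the remote name in the usage line are assumptions made for this example; the remote must already be configured with rclone config. rclone sync copies only files that differ from the destination, and it also deletes destination files that no longer exist in the source, so point it at a dedicated path.

```shell
#!/usr/bin/env bash
# Run a command, or only print it when DRY_RUN is set (useful for previewing).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Stop the node, sync the datadir to remote storage, then start the node again.
# $1 = local datadir, $2 = rclone destination (remote:path)
backup_datadir() {
  local datadir="$1" remote="$2" status=0
  run systemctl stop erigon
  # Remember a failed sync, but still restart the node afterwards.
  run rclone sync "$datadir" "$remote" --transfers 8 || status=$?
  run systemctl start erigon
  return "$status"
}
```

Usage: DRY_RUN=1 backup_datadir /data/erigon s3remote:node-backups/erigon prints the three commands without executing them; drop DRY_RUN=1 to run the backup for real.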