Multi-blockchain Data Aggregation System Development
The task sounds simple: "collect data from multiple blockchains." In practice, it is one of the most technically complex tasks in Web3 infrastructure. Each network has its own data model, its own finalization logic, its own RPC API, its own rate limits, and its own error semantics. Ethereum produces a block roughly every 12 seconds, Solana delivers ~400 ms slots and counts confirmations differently, and TON has a sharded architecture in which "block" is a loose concept. Collecting all of this into a unified API with consistent data is a nontrivial engineering problem.
The problem of heterogeneity: why you can't just "query all RPC"
Different data models
EVM networks (Ethereum, Arbitrum, Polygon, BSC) share a common model: blocks, transactions, receipts with logs. But there are differences even here:
- Arbitrum adds `l1BlockNumber` and specific system transactions (sequencer batch submissions)
- Optimism/Base have a `depositedTx` type for L1→L2 transactions, which lacks a standard `from`
- zkSync Era uses native account abstraction: there is no distinction between EOAs and contracts, all accounts are contracts
Solana is a completely different paradigm: there is no "transaction invoked a contract method"; instead, a transaction carries instructions that are passed to programs. To decode them you need the ABI's analog, the IDL (Interface Definition Language, in the Anchor format).
UTXO models (Bitcoin, Litecoin) are fundamentally different: there are no account balances, only unspent transaction outputs. An "address balance" is the sum of all unspent outputs locked to that address.
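A minimal sketch of that derived-balance idea, with a simplified `Utxo` shape invented for illustration (a real indexer would track spends via transaction inputs rather than a `spent` flag):

```typescript
// Minimal UTXO model: a balance is derived, not stored.
interface Utxo {
  txid: string;
  vout: number;     // output index within the transaction
  address: string;
  value: bigint;    // satoshis
  spent: boolean;   // simplification: real indexers derive this from tx inputs
}

// "Address balance" = sum of all unspent outputs locked to the address.
function utxoBalance(utxos: Utxo[], address: string): bigint {
  return utxos
    .filter((u) => u.address === address && !u.spent)
    .reduce((acc, u) => acc + u.value, 0n);
}
```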
Different finalization semantics
| Network | Mechanism | Finality |
|---|---|---|
| Ethereum | PoS + Casper FFG | ~15 min (finalized checkpoint) |
| Arbitrum One | Optimistic Rollup | ~7 days (fraud proof window) for L1 finality |
| Polygon PoS | Heimdall checkpoints | ~30 min for Ethereum finality |
| Solana | Tower BFT | ~12-32 slots (~6–16 sec) |
| Bitcoin | PoW | 6 confirmations (~60 min) — conventional standard |
If the system doesn't account for this, its data will be wrong: a transaction will look "final" by confirmation count and then get reorganized away.
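One way to encode the table above is a per-chain finality rule. The thresholds below are illustrative only; a production system would load them from config and, where the chain exposes its own finality signal (e.g. Ethereum's `finalized` block tag), prefer that over raw confirmation counts:

```typescript
type Finality = "unconfirmed" | "safe" | "finalized";

// Illustrative thresholds; real values belong in per-chain config.
const FINALITY_RULES: Record<string, { safe: number; finalized: number }> = {
  ethereum: { safe: 6, finalized: 75 },  // ~75 blocks ≈ 2 epochs ≈ 15 min
  bitcoin: { safe: 1, finalized: 6 },    // 6-confirmation convention
  solana: { safe: 1, finalized: 32 },    // ~32 slots
};

function classifyFinality(chain: string, confirmations: number): Finality {
  const rule = FINALITY_RULES[chain];
  if (!rule || confirmations < rule.safe) return "unconfirmed";
  return confirmations >= rule.finalized ? "finalized" : "safe";
}
```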
Aggregation system architecture
Collector layer (Chain Collectors)
Each collector is an isolated service responsible for one network. Common interface:
```typescript
type Unsubscribe = () => void;

interface ChainCollector {
  getLatestBlock(): Promise<UnifiedBlock>;
  getBlockRange(from: bigint, to: bigint): Promise<UnifiedBlock[]>;
  getTransactionsByAddress(address: string, fromBlock: bigint): Promise<UnifiedTx[]>;
  subscribeNewBlocks(callback: (block: UnifiedBlock) => void): Unsubscribe;
}
```
Unified types normalize each network's specifics:
```typescript
interface UnifiedTx {
  chain: ChainId;
  hash: string;
  blockNumber: bigint;
  timestamp: number;      // unix
  from: string;           // normalized: lowercase hex for EVM, base58 for Solana
  to: string | null;
  value: bigint;          // in smallest units of the native token
  status: 'success' | 'failed' | 'pending';
  finality: 'unconfirmed' | 'safe' | 'finalized';
  raw: unknown;           // original network data
}
```
Node and provider management
Problem: public RPC is unreliable, rate limits are unpredictable, Alchemy/Infura get expensive at scale.
Strategy: tiered provider pool
Primary: Own nodes (Geth+Lighthouse, Reth for archive)
↓ failover
Secondary: Alchemy / QuickNode (premium tier)
↓ failover
Tertiary: Infura / public RPC (only for non-critical requests)
Circuit breaker on each provider: if the error rate exceeds 5% over 60 seconds, or latency exceeds 2× the p99 baseline, remove the provider from rotation and probe it with a health check every 30 seconds.
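A minimal in-process sketch of such a breaker, with an injectable clock for testability. The error threshold and windows come from the numbers above; the minimum sample count is an added assumption (to avoid tripping on one error out of two calls), and the latency check is omitted for brevity:

```typescript
// Trip when the error rate over a sliding 60 s window exceeds 5%,
// then allow a probe request after a 30 s cooldown (half-open state).
class CircuitBreaker {
  private samples: { ts: number; ok: boolean }[] = [];
  private openedAt: number | null = null;

  constructor(
    private readonly now: () => number = Date.now,
    private readonly windowMs = 60_000,
    private readonly errorRateLimit = 0.05,
    private readonly minSamples = 20,
    private readonly cooldownMs = 30_000,
  ) {}

  record(ok: boolean): void {
    const t = this.now();
    this.samples.push({ ts: t, ok });
    this.samples = this.samples.filter((s) => t - s.ts <= this.windowMs);
    if (this.samples.length >= this.minSamples) {
      const errors = this.samples.filter((s) => !s.ok).length;
      if (errors / this.samples.length > this.errorRateLimit) this.openedAt = t;
    }
  }

  // Provider stays in rotation only while the breaker is closed.
  allowRequest(): boolean {
    if (this.openedAt === null) return true;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      this.openedAt = null; // half-open: let the next request probe the provider
      return true;
    }
    return false;
  }
}
```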
For archive data (on Ethereum, historical state older than the last 128 blocks) you need an archive node, which is a separate story. Erigon takes ~3 TB for a full Ethereum archive, Reth slightly less. For most projects it's cheaper to use Alchemy Archive or QuickNode Archive than to maintain their own node.
Normalization and transformation layer
Raw blockchain data is rarely needed as-is. Typical transformations:
Decoding ERC-20 Transfer events
```typescript
const ERC20_TRANSFER_TOPIC =
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";

function decodeTransfer(log: Log): TokenTransfer | null {
  if (log.topics[0] !== ERC20_TRANSFER_TOPIC) return null;
  // ERC-721 Transfer shares the same topic0 but indexes tokenId as a third
  // argument; an ERC-20 Transfer has exactly 3 topics and the amount in `data`.
  if (log.topics.length !== 3 || log.data === "0x") return null;
  return {
    token: log.address,
    from: `0x${log.topics[1].slice(26)}`, // last 20 bytes of the 32-byte topic
    to: `0x${log.topics[2].slice(26)}`,
    amount: BigInt(log.data),
  };
}
```
Token data enrichment: for each log.address you need to know symbol, decimals, USD price. Cache token metadata in Redis with TTL 24h, update prices every 30 sec from CoinGecko/CoinMarketCap.
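An in-memory sketch of that cache with an injectable clock; in production the store would be Redis with a 24h TTL for `symbol`/`decimals` and a short TTL for prices. The `TokenMeta` shape and key format are assumptions for illustration:

```typescript
interface TokenMeta { symbol: string; decimals: number }

// Generic TTL cache with lazy expiry on read.
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  constructor(private readonly now: () => number = Date.now) {}

  set(key: string, value: V, ttlMs: number): void {
    this.store.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.store.delete(key); // expired: drop and report a miss
      return undefined;
    }
    return entry.value;
  }
}
```

Keys would combine chain and contract, e.g. `${chainId}:${log.address}`, so the same contract address on different networks never collides.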
Cross-chain aggregation: if you need to show "total address balance across all networks in USD", you need to normalize different decimals, convert through price feeds, handle wrapped versions of same token (USDC on Ethereum ≠ USDC.e on Arbitrum).
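A sketch of the decimals-normalization step, assuming prices are already fetched and wrapped variants already mapped to one canonical asset. Bigint keeps on-chain amounts exact; only the final USD figure is floating point:

```typescript
// Convert a raw token amount with its decimals into a USD value.
// Number() is fine for display; accounting-grade code needs a decimal library,
// since amounts above 2^53 smallest units lose precision here.
function toUsd(rawAmount: bigint, decimals: number, priceUsd: number): number {
  return (Number(rawAmount) / 10 ** decimals) * priceUsd;
}

// Total address balance across networks. Callers must first map wrapped
// variants (USDC vs USDC.e) onto one canonical asset before summing.
function totalUsd(
  positions: { amount: bigint; decimals: number; priceUsd: number }[],
): number {
  return positions.reduce((acc, p) => acc + toUsd(p.amount, p.decimals, p.priceUsd), 0);
}
```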
Storage layer
For hot data (last 7–30 days): PostgreSQL partitioned by chain_id + date. Indexes on (chain_id, address, block_number) and (chain_id, tx_hash). TimescaleDB hypertables for high-volume tables, with automatic compression of old partitions.
For cold data (archive): ClickHouse — columnar database, order of magnitude more efficient than PostgreSQL for analytical queries over large periods. Query "all USDC transactions > $10k during 2023 across all EVM networks" on 100M+ rows — ClickHouse gives result in seconds, PostgreSQL in minutes.
For address/hash search: ElasticSearch, or plain PostgreSQL. A hash index is sufficient for exact-match lookups; prefix search with LIKE additionally needs a B-tree index with text_pattern_ops.
Reorg handling
This is the trickiest part of the system. Algorithm:
1. Save each block with an `is_canonical = true` flag and its `parent_hash`.
2. A new block with the same `block_number` but a different `hash` signals a potential reorg.
3. Walk back via `parent_hash` until a common ancestor is found.
4. Mark all blocks on the "old" branch as `is_canonical = false` and insert the blocks from the "new" branch.
5. The output API always filters by `is_canonical = true`.
6. Webhooks and downstream systems receive `tx.orphaned` events for revoked transactions.
On Ethereum, reorg depth has rarely exceeded 2 blocks post-Merge. On Polygon PoS, reorgs of 30+ blocks have been observed. Observation buffer: 128 blocks for EVM networks.
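The walk-back step of the algorithm above can be sketched as a pure function over an in-memory block index; the `StoredBlock` shape and hash strings are simplified stand-ins for real 32-byte hashes:

```typescript
interface StoredBlock { hash: string; parentHash: string; number: number }

// Given our stored blocks, the tip of the chain we believed canonical, and an
// incoming conflicting branch (ordered oldest -> newest), return the blocks
// that must be flagged is_canonical = false.
function findOrphanedBlocks(
  byHash: Map<string, StoredBlock>,
  canonicalTipHash: string,
  newBranch: StoredBlock[],
): StoredBlock[] {
  const newBranchHashes = new Set(newBranch.map((b) => b.hash));
  // Common ancestor: the parent of the oldest block on the new branch.
  const forkParent = newBranch[0].parentHash;
  const orphaned: StoredBlock[] = [];
  let cursor = byHash.get(canonicalTipHash);
  while (cursor && cursor.hash !== forkParent && !newBranchHashes.has(cursor.hash)) {
    orphaned.push(cursor);
    cursor = byHash.get(cursor.parentHash);
  }
  return orphaned; // newest -> oldest; each one triggers tx.orphaned events
}
```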
API layer
REST + WebSocket for real-time:
GET /v1/address/{address}/transactions?chains=eth,arb,polygon&limit=50
GET /v1/tx/{chain}/{hash}
GET /v1/address/{address}/token-balances?chains=eth,bsc
WS /v1/subscribe?address={addr}&chains=eth,arb&events=transfer,swap
GraphQL is convenient if clients need query flexibility: one request fetches transactions + balances + token metadata. But it adds backend complexity, namely N+1 query problems that call for batching via the DataLoader pattern.
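The batching idea behind DataLoader fits in a few lines: loads requested within one tick are coalesced into a single batch call. This is a sketch; the real `dataloader` package also handles per-key caching and error propagation:

```typescript
// Coalesce same-tick load() calls into one batchFn invocation.
class Batcher<K, V> {
  private queue: { key: K; resolve: (v: V) => void }[] = [];
  constructor(private readonly batchFn: (keys: K[]) => Promise<V[]>) {}

  load(key: K): Promise<V> {
    return new Promise((resolve) => {
      if (this.queue.length === 0) {
        // Flush on the next microtask so synchronous loads share one batch.
        queueMicrotask(() => this.flush());
      }
      this.queue.push({ key, resolve });
    });
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    const values = await this.batchFn(batch.map((b) => b.key));
    batch.forEach((b, i) => b.resolve(values[i]));
  }
}
```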
Rate limiting: per-API-key, sliding window, separate limits for REST and WebSocket (WebSocket connections are more expensive). Redis + Lua script for atomic increments.
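A single-process sketch of the sliding window; the production version keeps the per-key timestamps in Redis (e.g. a sorted set) and runs the trim + count + insert steps inside one Lua script so concurrent API nodes stay atomic:

```typescript
// In-memory sliding-window limiter keyed by API key.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>(); // apiKey -> request timestamps

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
    private readonly now: () => number = Date.now,
  ) {}

  allow(apiKey: string): boolean {
    const t = this.now();
    const windowStart = t - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(apiKey) ?? []).filter((ts) => ts > windowStart);
    if (recent.length >= this.limit) {
      this.hits.set(apiKey, recent);
      return false;
    }
    recent.push(t);
    this.hits.set(apiKey, recent);
    return true;
  }
}
```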
Monitoring and operations
Critical metrics:
- Collector lag: the difference between the latest block timestamp on the network and its processing time in our system. Alert when lag > 2 min.
- Reorg depth: maximum reorg depth over the last 24h. Alert when depth > 10.
- RPC error rate: per provider and per method. Alert when > 1%.
- Queue depth: if the processor can't keep up with the collector, the queue grows. Alert when depth > 10k messages.
Grafana dashboard with per-chain panels: current block, lag, TPS, error rate.
Stack
| Component | Technology |
|---|---|
| Collectors | Node.js (viem/ethers) + Go for high-load networks |
| Queue | Apache Kafka (high throughput) or RabbitMQ (moderate) |
| Hot storage | PostgreSQL 15 + TimescaleDB |
| Cold storage | ClickHouse |
| Cache | Redis Cluster |
| API | Node.js (Fastify) or Go (Fiber) |
| Monitoring | Prometheus + Grafana + PagerDuty |
| Orchestration | Kubernetes with HPA on collectors |
A realistic MVP timeline (3–4 EVM networks, no archive, REST API): 8–12 weeks. A complete system with 10+ networks, ClickHouse, WebSocket, and monitoring: 5–7 months.