NFT Collection Data Scraping (Floor Price, Volume, Holders)



The OpenSea API returns floor price with a 5-15 minute delay and aggregates data according to its own methodology. For trading bots, analytics platforms and minting dApps that need the real floor, this is unacceptable. The only path to accurate sales, volume and holder data is to read events directly from the blockchain.

Data Sources: Where to Get What

On-chain events

For ERC-721/ERC-1155 collections, all sales are visible via marketplace events. Each marketplace emits its own event:

  • OpenSea Seaport: OrderFulfilled(bytes32 orderHash, address indexed offerer, address indexed zone, address recipient, SpentItem[] offer, ReceivedItem[] consideration) on contract 0x00000000000000ADc04C56Bf30aC9d3c0aAF14dC (Seaport 1.5; other Seaport versions live at different addresses)
  • Blur: OrdersMatched on 0x000000000000Ad05Ccc4F10045630fb830B95127 (the original Blur Exchange contract)
  • LooksRare v2: TakerAsk / TakerBid
  • X2Y2: EvInventory
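
To turn a decoded Seaport OrderFulfilled event into a sale price, the offer and consideration arrays have to be interpreted together. The following is a minimal sketch, assuming the items are already decoded into plain objects and that the order involves a single currency. The names SeaportItem and seaportSalePrice are hypothetical; the item type numbers follow the Seaport ItemType enum.

```typescript
// Seaport ItemType values: 0 = native ETH, 1 = ERC-20, 2 = ERC-721, 3 = ERC-1155.
const ITEM_ERC721 = 2;
const ITEM_ERC1155 = 3;

// Hypothetical shape of an already-decoded SpentItem / ReceivedItem
// (the recipient field of ReceivedItem is not needed here).
interface SeaportItem {
  itemType: number;
  token: string;
  identifier: bigint;
  amount: bigint;
}

const isNft = (i: SeaportItem): boolean =>
  i.itemType === ITEM_ERC721 || i.itemType === ITEM_ERC1155;

// Sums all non-NFT amounts. Assumption: every currency item uses the same token.
const sumCurrency = (items: SeaportItem[]): bigint =>
  items.filter(i => !isNft(i)).reduce((acc, i) => acc + i.amount, 0n);

// Listing fulfilled: the offer holds the NFT and the consideration holds all
// payments (seller proceeds plus fees), so their sum is the gross price.
// Bid accepted: the offer holds the currency, so the offer total is the price.
function seaportSalePrice(offer: SeaportItem[], consideration: SeaportItem[]): bigint {
  return offer.some(isNft) ? sumCurrency(consideration) : sumCurrency(offer);
}
```

Bundle sales and orders that mix several currencies need extra handling that this sketch leaves out.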

Floor price cannot be obtained from these events directly: they show executed orders, not active listings. To get the current floor, either index active listings via a marketplace API or use an aggregator.

Holders and Transfers

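Transfer(address indexed from, address indexed to, uint256 indexed tokenId) is the ERC-721 standard event. The complete ownership graph is built by replaying all Transfer events from the deployment block. Unique holders are the addresses that still own at least one token after the replay; an address that has transferred away every token it received no longer counts, and tokens sent to the zero address are treated as burned.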
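
The replay described above can be sketched as follows. This is an illustration rather than a production indexer: it assumes the events are already decoded and sorted in chain order, and the names TransferEvent, replayOwners and uniqueHolders are hypothetical.

```typescript
interface TransferEvent {
  from: string;
  to: string;
  tokenId: bigint;
}

const ZERO_ADDRESS = "0x0000000000000000000000000000000000000000";

// Replays transfers in chain order and returns tokenId -> current owner.
function replayOwners(events: TransferEvent[]): Map<bigint, string> {
  const owners = new Map<bigint, string>();
  for (const e of events) {
    const to = e.to.toLowerCase();
    if (to === ZERO_ADDRESS) {
      owners.delete(e.tokenId); // burn: the token no longer has an owner
    } else {
      owners.set(e.tokenId, to); // mint or transfer: last recipient wins
    }
  }
  return owners;
}

// Unique holders = distinct addresses that still own at least one token.
function uniqueHolders(owners: Map<bigint, string>): Set<string> {
  return new Set(owners.values());
}
```

Keeping the tokenId-to-owner map, rather than only a set of addresses, also gives per-token ownership and per-holder token counts with no extra pass.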

For ERC-1155, use TransferSingle and TransferBatch. Here ownership is a balance rather than a binary state: balanceOf(address, tokenId).
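
For ERC-1155 the same replay tracks balances instead of owners. The sketch below assumes each TransferBatch event has already been flattened into one record per (id, value) pair. Transfer1155, replayBalances and holdersOf are hypothetical names.

```typescript
interface Transfer1155 {
  from: string;
  to: string;
  id: bigint;
  value: bigint;
}

const ZERO_ADDRESS = "0x0000000000000000000000000000000000000000";

// Returns balances keyed by "address:tokenId"; zero balances are removed.
function replayBalances(events: Transfer1155[]): Map<string, bigint> {
  const balances = new Map<string, bigint>();
  const apply = (addr: string, id: bigint, delta: bigint): void => {
    const a = addr.toLowerCase();
    if (a === ZERO_ADDRESS) return; // mint source or burn sink holds no balance
    const key = `${a}:${id}`;
    const next = (balances.get(key) ?? 0n) + delta;
    if (next === 0n) balances.delete(key);
    else balances.set(key, next);
  };
  for (const e of events) {
    apply(e.from, e.id, -e.value);
    apply(e.to, e.id, e.value);
  }
  return balances;
}

// Addresses with a non-zero balance of the given token id.
function holdersOf(balances: Map<string, bigint>, id: bigint): string[] {
  const suffix = `:${id}`;
  return [...balances.keys()]
    .filter(k => k.endsWith(suffix))
    .map(k => k.slice(0, -suffix.length));
}
```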

Parser Architecture

Stack

ethereum-node (Alchemy/Infura/Quicknode) 
  → ethers.js / viem (event filtering)
    → message queue (Redis Streams / BullMQ)
      → PostgreSQL / ClickHouse (storage)
        → REST/WebSocket API (data delivery)

For historical data, use getLogs with a filter by address and topics[0]. Split the block range into batches of about 2000 blocks, a common eth_getLogs limit (the exact cap varies by RPC provider and plan):

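import { Interface, JsonRpcProvider } from 'ethers'; // ethers v6

// ERC-721 Transfer: all three parameters are indexed
const iface = new Interface([
  'event Transfer(address indexed from, address indexed to, uint256 indexed tokenId)',
]);

async function fetchTransferEvents(
  contract: string,
  fromBlock: number,
  toBlock: number,
  provider: JsonRpcProvider
) {
  const filter = {
    address: contract,
    // ethers v6: topicHash replaces v5's iface.getEventTopic('Transfer')
    topics: [iface.getEvent('Transfer')!.topicHash],
    fromBlock,
    toBlock,
  };
  const logs = await provider.getLogs(filter);
  // parseLog returns null when no ABI fragment matches the log; drop those
  return logs
    .map(log => iface.parseLog(log))
    .filter(parsed => parsed !== null);
}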
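
The batching itself is a small helper. Below is a sketch, assuming inclusive block ranges and the 2000-block batch size mentioned above; blockRanges is a hypothetical name.

```typescript
// Splits the inclusive range [from, to] into consecutive inclusive chunks
// of at most `size` blocks each.
function blockRanges(from: number, to: number, size: number = 2000): Array<[number, number]> {
  if (size <= 0) throw new Error("size must be positive");
  const ranges: Array<[number, number]> = [];
  for (let start = from; start <= to; start += size) {
    ranges.push([start, Math.min(start + size - 1, to)]);
  }
  return ranges;
}
```

Each pair can then be passed to fetchTransferEvents as fromBlock and toBlock. If a provider rejects a range because it returns too many logs, halving that range and retrying is a common fallback.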

For real-time data, use a WebSocket subscription via provider.on(filter, callback), or call eth_subscribe with the logs subscription type directly (for example on Alchemy).

Computing Floor Price

Two approaches:

1. Marketplace API aggregation: request the floor from OpenSea, Blur and LooksRare and take the minimum. The problem is rate limits and caching on the API side.

2. Orderbook indexing: subscribe to order creation and cancellation events. For Seaport these are OrderValidated (creation), OrderCancelled and OrderFulfilled (execution). Build a local orderbook and compute the floor yourself. This is more accurate but harder to maintain across contract updates. Note that most Seaport listings are signed off-chain and never emit OrderValidated, so the local book also has to be fed from the marketplace's listing data.

For most tasks, the first approach with a 60-second cache is enough.
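
The second approach can be sketched as a small in-memory orderbook. This is a simplified illustration: it assumes each order event has already been reduced to an order hash and, for new orders, a price in wei, and it ignores order expiry. OrderEvent and LocalOrderbook are hypothetical names.

```typescript
type OrderEvent =
  | { kind: "validated"; orderHash: string; priceWei: bigint }
  | { kind: "cancelled"; orderHash: string }
  | { kind: "fulfilled"; orderHash: string };

class LocalOrderbook {
  // orderHash -> asking price of every listing that is still active
  private active = new Map<string, bigint>();

  apply(e: OrderEvent): void {
    if (e.kind === "validated") {
      this.active.set(e.orderHash, e.priceWei);
    } else {
      // cancelled or fulfilled: the listing can no longer be bought
      this.active.delete(e.orderHash);
    }
  }

  // Lowest active asking price, or null when there are no listings.
  floor(): bigint | null {
    let min: bigint | null = null;
    for (const price of this.active.values()) {
      if (min === null || price < min) min = price;
    }
    return min;
  }
}
```

The linear scan in floor() is fine for a single collection; a sorted structure becomes worthwhile only with very large books.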
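
A minimal sketch of the cached approach, assuming the per-marketplace floors have already been fetched (the HTTP calls are deliberately left out because they depend on each marketplace's API). TtlCache and lowestFloor are hypothetical names, and the clock is injectable so expiry can be tested.

```typescript
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Takes one floor quote per marketplace (null = no listings or request failed)
// and returns the lowest one, or null if no marketplace returned a quote.
function lowestFloor(quotes: Array<bigint | null>): bigint | null {
  let min: bigint | null = null;
  for (const q of quotes) {
    if (q !== null && (min === null || q < min)) min = q;
  }
  return min;
}
```

With a 60-second TTL, a collection address as the cache key and lowestFloor applied on every cache miss, the upstream APIs see at most one round of requests per collection per minute.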

Storage and Queries

ClickHouse is more efficient than PostgreSQL for time-series NFT data: as a columnar store, it typically runs analytical queries over millions of rows 10-50x faster, depending on the query and hardware. Schema:

Column         Type                     Description
block_number   UInt64                   Event block
tx_hash        FixedString(66)          Transaction hash
contract       FixedString(42)          Collection address
token_id       UInt256                  Token ID
from           FixedString(42)          Seller/sender
to             FixedString(42)          Buyer/recipient
price_wei      UInt256                  Price in wei
marketplace    LowCardinality(String)   Marketplace
timestamp      DateTime                 Block time

Partition by month (toYYYYMM(timestamp)) and use (contract, timestamp) as the sort key.
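
The sort key (contract, timestamp) matches the most common query: volume for one collection over time. The aggregation such a query performs can be sketched in memory as below. This illustrates the grouping logic only and is not a replacement for running it in the database; SaleRow and dailyVolume are hypothetical names, and timestamps are assumed to be Unix seconds.

```typescript
interface SaleRow {
  contract: string;
  priceWei: bigint;
  timestamp: number; // block time, Unix seconds
}

// Sums sale prices per UTC day for one collection.
// Returns a map from "YYYY-MM-DD" to total volume in wei.
function dailyVolume(rows: SaleRow[], contract: string): Map<string, bigint> {
  const wanted = contract.toLowerCase();
  const volume = new Map<string, bigint>();
  for (const row of rows) {
    if (row.contract.toLowerCase() !== wanted) continue;
    const day = new Date(row.timestamp * 1000).toISOString().slice(0, 10);
    volume.set(day, (volume.get(day) ?? 0n) + row.priceWei);
  }
  return volume;
}
```

Prices stay in wei as bigint throughout; converting to ETH is best left to the presentation layer to avoid floating-point loss.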

Solving Typical Problems

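Rate limits: Alchemy Free allows 330 CUPS (compute units per second) and Growth allows 660 CUPS; plan limits change over time, so check the current values. When parsing the history of a large collection (BAYC: 500k+ Transfer events) without throttling, you will get HTTP 429 responses. Implement exponential backoff plus a queue with concurrency control.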
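
Both pieces fit in a few lines. The sketch below uses assumed defaults (500 ms base delay, 30 s cap, 5 attempts) that are illustrative rather than tuned for any provider. backoffDelayMs, withRetry and runLimited are hypothetical names, and for brevity every error is retried, whereas real code would retry only on 429 and transient network errors.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls fn until it succeeds or maxAttempts is reached, sleeping between tries.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 5,
  sleep: (ms: number) => Promise<void> = ms => new Promise(resolve => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}

// Runs tasks with at most `limit` of them in flight; results keep task order.
async function runLimited<T>(tasks: Array<() => Promise<T>>, limit: number): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < tasks.length) {
      const i = next++; // claim the next task index
      results[i] = await tasks[i]();
    }
  };
  const workers: Array<Promise<void>> = [];
  for (let w = 0; w < Math.min(limit, tasks.length); w++) workers.push(worker());
  await Promise.all(workers);
  return results;
}
```

Wrapping each getLogs batch as a task of the form () => withRetry(...) and passing the list to runLimited with a small limit keeps the request rate bounded. Adding random jitter to the delay helps when several workers hit the limit at once.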

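Blockchain reorganizations: events from recent blocks should be marked as "pending" and confirmed only after finality. A fixed depth such as 12 blocks is a heuristic carried over from proof-of-work; on Ethereum PoS, economic finality takes about 2 epochs (64 slots, roughly 13 minutes), and the latest finalized block can be queried from the node with the finalized block tag.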
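
The pending/confirmed split can be sketched as two pure functions. The finalized block number and the canonical block hashes are assumed to come from the node; how they are fetched is left out. IndexedEvent, eventStatus and dropReorged are hypothetical names.

```typescript
interface IndexedEvent {
  blockNumber: number;
  blockHash: string;
}

// An event is confirmed once its block is at or below the finalized block.
function eventStatus(e: IndexedEvent, finalizedBlock: number): "confirmed" | "pending" {
  return e.blockNumber <= finalizedBlock ? "confirmed" : "pending";
}

// Keeps only pending events whose block is still part of the canonical chain.
// `canonicalHashAt` returns the current block hash at a height, or undefined
// if the chain no longer has a block there.
function dropReorged<E extends IndexedEvent>(
  pending: E[],
  canonicalHashAt: (blockNumber: number) => string | undefined,
): E[] {
  return pending.filter(e => canonicalHashAt(e.blockNumber) === e.blockHash);
}
```

After dropReorged removes orphaned events, the affected block range has to be fetched again so that events from the replacement blocks are indexed.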

Wash trading: volume between addresses that pass tokens in a circle distorts the stats. A basic heuristic is to flag trades where from and to are related addresses, for example wallets that received their ETH from the same source.
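
A minimal sketch of that heuristic, assuming a precomputed map from each wallet to the address that first funded it with ETH (building that map requires a separate pass over transaction history and is not shown). Trade, fundedBy and flagWashTrades are hypothetical names, and addresses in the map are assumed to be lowercase.

```typescript
interface Trade {
  from: string; // seller
  to: string;   // buyer
  tokenId: bigint;
}

// Returns one flag per trade: true when seller and buyer look related.
function flagWashTrades(trades: Trade[], fundedBy: Map<string, string>): boolean[] {
  return trades.map(t => {
    const seller = t.from.toLowerCase();
    const buyer = t.to.toLowerCase();
    const sellerFunder = fundedBy.get(seller);
    const buyerFunder = fundedBy.get(buyer);
    const selfTrade = seller === buyer;
    // Both wallets received their first ETH from the same address.
    const sharedFunder = sellerFunder !== undefined && sellerFunder === buyerFunder;
    // One wallet funded the other directly.
    const directFunding = sellerFunder === buyer || buyerFunder === seller;
    return selfTrade || sharedFunder || directFunding;
  });
}
```

A shared funder that is an exchange hot wallet produces false positives, so known exchange addresses are usually excluded from the map. Flagged trades are better stored with a marker than deleted, so that volume can be reported both with and without them.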

Timeline Estimates

A Transfer events parser plus holders tracker takes 1 day. Adding floor price via marketplace API with a cache takes another half day. With historical backfill for a large collection and a dashboard, the total comes to 2-3 days.