Whale Transactions Data Scraping

Scraping Whale Transaction Data

In on-chain analysis, a "whale" is an address whose holdings or transaction volume are large relative to overall market liquidity. A transfer of 50,000 ETH from an exchange wallet to a cold wallet is an information signal and can create price pressure. Monitoring such movements is a practical task for trading systems, risk management, and on-chain analytics.

What Exactly to Track

Not all large transactions are equally informative. Key patterns:

Exchange inflow/outflow: a large transfer to an exchange (inflow) signals a potential sale. A transfer from an exchange (outflow) suggests accumulation or a move to self-custody. Correct interpretation requires a database of exchange addresses.
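
The inflow/outflow distinction can be sketched as a small classifier. This is a minimal illustration, assuming a hypothetical in-memory set `EXCHANGE_ADDRESSES` of lowercase exchange addresses; the function name and the returned labels are illustrative and not taken from any library.

```python
# Hypothetical sample set; in practice it would be loaded from a label database.
EXCHANGE_ADDRESSES = {
    # Address labeled "Binance" in this article's alert sample.
    "0x28c6c06298d514db089934071355e5743bf21d60",
}


def classify_transfer(from_addr: str, to_addr: str,
                      exchanges: set[str] = EXCHANGE_ADDRESSES) -> str:
    """Classify a transfer by which side (if any) is a known exchange address."""
    src_is_exchange = from_addr.lower() in exchanges
    dst_is_exchange = to_addr.lower() in exchanges
    if src_is_exchange and dst_is_exchange:
        return "exchange_internal"  # reshuffle between exchange wallets
    if dst_is_exchange:
        return "exchange_inflow"    # potential sale
    if src_is_exchange:
        return "exchange_outflow"   # accumulation or move to self-custody
    return "unknown"
```

Treating exchange-to-exchange movements as a separate class is a design choice made here so that internal wallet reshuffles are not reported as inflows or outflows.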

Cross-chain bridges: large movements through bridges and cross-chain protocols (Arbitrum bridge, Stargate, LayerZero) signal liquidity moving between networks.

DeFi events: a large liquidity withdrawal from a Uniswap pool, a large loan repayment on Aave, or the opening or closing of a large position on GMX.

Stablecoin mint/burn: Tether and Circle mint USDT/USDC against fiat deposits and burn them on redemption. A large mint signals potential capital inflow to the market.
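
One common way to spot supply changes is the ERC-20 convention of emitting a Transfer from the zero address on mint and to the zero address on burn. The sketch below assumes the token follows that convention; not every contract does (some emit their own issuance and redemption events instead), so the issuer's contract should be checked first. The function name is illustrative.

```python
ZERO_ADDRESS = "0x" + "00" * 20


def classify_supply_event(from_addr: str, to_addr: str) -> str:
    """Return 'mint', 'burn', or 'transfer' for a decoded Transfer event.

    Assumes the token reports supply changes as transfers from/to the
    zero address, which is a convention rather than a guarantee.
    """
    if from_addr.lower() == ZERO_ADDRESS:
        return "mint"
    if to_addr.lower() == ZERO_ADDRESS:
        return "burn"
    return "transfer"
```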

Ethereum: Monitoring via eth_getLogs and WebSocket

Large ERC-20 transfers can be monitored in real time through a WebSocket subscription to Transfer events. Size filtering happens in the application: the transfer amount is a non-indexed event field, so node-level log filters cannot match on it:

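import asyncio
from web3 import AsyncWeb3, WebSocketProvider

# Written against the web3.py v7 API (persistent providers and eth_subscribe).
WHALE_THRESHOLD_USDT = 500_000 * 10**6  # 500k USDT (the token has 6 decimals)
USDT_ADDRESS = "0xdAC17F958D2ee523a2206206994597C13D831ec7"

async def process_whale_transfer(transfer: dict) -> None:
    # Placeholder handler: replace with storage and alerting logic.
    print(transfer)

async def monitor_usdt_whales():
    # Subscriptions need a persistent WebSocket connection.
    async with AsyncWeb3(WebSocketProvider("wss://eth-mainnet.g.alchemy.com/v2/YOUR_KEY")) as w3:
        transfer_topic = w3.to_hex(w3.keccak(text="Transfer(address,address,uint256)"))
        await w3.eth.subscribe("logs", {
            'address': USDT_ADDRESS,
            'topics': [transfer_topic],
        })

        async for message in w3.socket.process_subscriptions():
            event = message['result']
            # The non-indexed amount is ABI-encoded in the data field.
            amount = int.from_bytes(event['data'], 'big')
            if amount < WHALE_THRESHOLD_USDT:
                continue

            # Indexed addresses are left-padded to 32 bytes; keep the last 20.
            await process_whale_transfer({
                'from': w3.to_checksum_address(event['topics'][1][-20:]),
                'to': w3.to_checksum_address(event['topics'][2][-20:]),
                'amount_usdt': amount / 10**6,
                'tx_hash': w3.to_hex(event['transactionHash']),
                'block': event['blockNumber'],
            })

if __name__ == "__main__":
    asyncio.run(monitor_usdt_whales())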
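
For historical data, the same Transfer logs can be pulled with eth_getLogs over a block range. The sketch below is one possible approach, assuming a connected `AsyncWeb3` instance is passed in as `w3`; the helper names and the default window of 2,000 blocks are assumptions, since range limits differ between node providers.

```python
from typing import Iterator

# keccak256("Transfer(address,address,uint256)")
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def block_windows(start: int, end: int, size: int) -> Iterator[tuple[int, int]]:
    """Yield inclusive (from_block, to_block) windows covering [start, end]."""
    for lo in range(start, end + 1, size):
        yield lo, min(lo + size - 1, end)


async def backfill_transfers(w3, token_address: str, start: int, end: int,
                             window: int = 2_000) -> list:
    """Collect Transfer logs for one token, one block window at a time."""
    logs = []
    for from_block, to_block in block_windows(start, end, window):
        logs.extend(await w3.eth.get_logs({
            "address": token_address,
            "topics": [TRANSFER_TOPIC],
            "fromBlock": from_block,
            "toBlock": to_block,
        }))
    return logs
```

Splitting the range keeps each request under provider limits on block span and response size; the returned logs still need the same amount filter as the live stream.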

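Native ETH needs separate logic: fetch each block with eth_getBlockByNumber (full_transactions=True) and filter transactions by value. This sees only top-level transaction values; ETH moved by contracts in internal calls requires trace data: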

async def scan_block_for_whale_eth(w3: AsyncWeb3, block_number: int, threshold_eth: float):
    # w3 is a connected AsyncWeb3 instance
    block = await w3.eth.get_block(block_number, full_transactions=True)
    threshold_wei = w3.to_wei(threshold_eth, 'ether')

    # Keep only transactions whose top-level value meets the threshold
    whale_txns = [
        tx for tx in block.transactions
        if tx['value'] >= threshold_wei
    ]

    return whale_txns

Bitcoin: UTXO Model

Bitcoin has no Transfer events, so large transactions are tracked by monitoring the mempool and new blocks. An example using Bitcoin Core RPC:

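from bitcoinrpc.authproxy import AuthServiceProxy  # python-bitcoinrpc package

# Credentials and port are placeholders for a local Bitcoin Core node
rpc = AuthServiceProxy("http://RPC_USER:RPC_PASSWORD@127.0.0.1:8332")

def find_whale_transactions(block_hash: str, threshold_btc: float):
    # Verbosity 2 returns the block with fully decoded transactions
    block = rpc.getblock(block_hash, 2)
    whale_txns = []

    for tx in block['tx']:
        # Sum of all spendable outputs; this includes change returned to the
        # sender, so it is an upper bound on the value actually transferred
        total_output = sum(
            vout['value']
            for vout in tx['vout']
            # Bitcoin Core reports OP_RETURN outputs with type 'nulldata'
            if vout.get('scriptPubKey', {}).get('type') != 'nulldata'
        )

        if total_output >= threshold_btc:
            whale_txns.append({
                'txid': tx['txid'],
                'total_btc': total_output,
                'outputs': tx['vout'],
                'input_count': len(tx['vin']),
            })

    return whale_txns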
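
Summing every output counts change that goes back to the sender. A rough way to reduce that overcount is sketched below: outputs paying an address that also funded the transaction are skipped. This assumes each input carries a `prevout` object with the spent output's `scriptPubKey`, as returned by `getblock` at verbosity 3 in recent Bitcoin Core versions (an assumption to verify against the node in use). The heuristic only catches change sent to a reused address, and the function name is illustrative.

```python
from decimal import Decimal


def external_output_value(tx: dict) -> Decimal:
    """Sum outputs that do not return to an address seen among the inputs."""
    input_addresses = {
        vin["prevout"]["scriptPubKey"].get("address")
        for vin in tx["vin"]
        if "prevout" in vin  # coinbase inputs have no prevout
    }
    input_addresses.discard(None)

    total = Decimal("0")
    for vout in tx["vout"]:
        script = vout.get("scriptPubKey", {})
        if script.get("type") == "nulldata":          # OP_RETURN output
            continue
        if script.get("address") in input_addresses:  # change back to sender
            continue
        total += Decimal(str(vout["value"]))
    return total
```

Change sent to a fresh address is not detected, so the result remains an estimate rather than the exact amount transferred.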

Labeling: Who is Who

A raw address such as 0x28C6c06298d514Db089934071355E5743bf21d60 carries no meaning on its own. The value comes from labels: a knowledge base that maps addresses to the entities that own them.

Label sources:

  • Arkham Intelligence — commercial database with entity labels
  • Etherscan tags — community-submitted labels, available via API
  • Dune Analytics — community datasets (known exchange addresses, protocols)
  • Custom database — built up during on-chain activity analysis

Typical label database structure:

CREATE TABLE address_labels (
    address TEXT NOT NULL,
    chain TEXT NOT NULL,
    entity_name TEXT,        -- 'Binance', 'Coinbase', 'Jump Trading'
    entity_type TEXT,        -- 'exchange', 'market_maker', 'fund', 'whale'
    confidence SMALLINT,     -- 1-100
    source TEXT,
    verified BOOLEAN DEFAULT FALSE,
    PRIMARY KEY (address, chain)
);
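
A minimal sketch of working with such a table, using an in-memory SQLite database as a stand-in for the production store (an assumption made to keep the example self-contained). The helper names and the rule "replace a label only when the new confidence is higher" are illustrative choices, not part of the schema above.

```python
import sqlite3


def normalize(address: str) -> str:
    # EVM addresses are case-insensitive; Bitcoin base58 addresses are not.
    return address.lower() if address.startswith("0x") else address


def open_label_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE address_labels (
            address TEXT NOT NULL,
            chain TEXT NOT NULL,
            entity_name TEXT,
            entity_type TEXT,
            confidence SMALLINT,
            source TEXT,
            verified BOOLEAN DEFAULT FALSE,
            PRIMARY KEY (address, chain)
        )""")
    return conn


def upsert_label(conn, address, chain, entity_name, entity_type, confidence, source):
    """Insert a label; overwrite an existing one only if confidence is higher."""
    conn.execute("""
        INSERT INTO address_labels
            (address, chain, entity_name, entity_type, confidence, source)
        VALUES (?, ?, ?, ?, ?, ?)
        ON CONFLICT (address, chain) DO UPDATE SET
            entity_name = excluded.entity_name,
            entity_type = excluded.entity_type,
            confidence  = excluded.confidence,
            source      = excluded.source
        WHERE excluded.confidence > address_labels.confidence
    """, (normalize(address), chain, entity_name, entity_type, confidence, source))


def lookup_label(conn, address, chain):
    row = conn.execute(
        "SELECT entity_name FROM address_labels WHERE address = ? AND chain = ?",
        (normalize(address), chain),
    ).fetchone()
    return row[0] if row else None
```

Normalizing addresses before both writes and reads avoids missed lookups caused by checksum casing.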

Aggregation and Storage

Whale events should be stored with context for subsequent analysis:

CREATE TABLE whale_events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    chain TEXT NOT NULL,
    tx_hash TEXT NOT NULL,
    block_number BIGINT,
    block_time TIMESTAMPTZ NOT NULL,
    from_address TEXT NOT NULL,
    to_address TEXT NOT NULL,
    token_address TEXT,      -- NULL for native coin
    amount_raw NUMERIC,
    amount_usd NUMERIC,
    from_label TEXT,
    to_label TEXT,
    event_type TEXT,         -- 'exchange_inflow', 'exchange_outflow', 'defi_exit', etc.
    notified BOOLEAN DEFAULT FALSE
);

CREATE INDEX ON whale_events (block_time DESC);
CREATE INDEX ON whale_events (from_address, block_time DESC);
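
The `amount_usd` column has to be derived from `amount_raw`, the token's decimals, and a price at the time of the event. A minimal sketch of that conversion follows; the function name is illustrative, and where the price comes from (an exchange feed, an oracle, a price API) is left as an assumption outside the example.

```python
from decimal import Decimal


def to_usd(amount_raw: int, decimals: int, price_usd: Decimal) -> Decimal:
    """Convert a raw integer token amount to USD, rounded to cents.

    Decimal is used instead of float so that large raw amounts
    (for example wei values) do not lose precision in the division.
    """
    units = Decimal(amount_raw) / (Decimal(10) ** decimals)
    return (units * price_usd).quantize(Decimal("0.01"))
```

For example, `to_usd(50_000_000 * 10**6, 6, Decimal("1"))` converts a raw USDT amount with 6 decimals at a price of 1 USD.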

Notifications

Real-time alerts go out through a Telegram bot or a Discord webhook. The message should carry as much information as possible in a compact form:

🐋 WHALE ALERT — Ethereum
💰 50,000,000 USDT ($50.0M)
📤 Binance (0x28C6...21d60)
📥 Unknown Wallet (0xF9e...3a14)
🔗 tx: 0x7f8...b2c
⏱ 12 seconds ago | Block 19,847,231
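
A message of this shape could be assembled and sent roughly as follows. This is a sketch under stated assumptions: the function names and parameters are illustrative, the layout loosely mirrors the sample (the "seconds ago" part is omitted), and delivery uses the Telegram Bot API `sendMessage` method with a bot token and chat ID supplied by the caller.

```python
import urllib.parse
import urllib.request


def short_addr(value: str, head: int = 6, tail: int = 5) -> str:
    """Abbreviate a long hex string, e.g. 0x28C6...21d60."""
    return f"{value[:head]}...{value[-tail:]}"


def format_alert(chain: str, amount: float, symbol: str, amount_usd: float,
                 from_label: str, from_addr: str, to_label: str, to_addr: str,
                 tx_hash: str, block_number: int) -> str:
    """Build the alert text from an already labeled whale event."""
    return "\n".join([
        f"🐋 WHALE ALERT: {chain}",
        f"💰 {amount:,.0f} {symbol} (${amount_usd / 1_000_000:.1f}M)",
        f"📤 {from_label} ({short_addr(from_addr)})",
        f"📥 {to_label} ({short_addr(to_addr)})",
        f"🔗 tx: {short_addr(tx_hash, 5, 3)}",
        f"⏱ Block {block_number:,}",
    ])


def send_telegram(bot_token: str, chat_id: str, text: str) -> None:
    """Send a plain-text message through the Telegram Bot API."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    with urllib.request.urlopen(url, data=data, timeout=10) as response:
        response.read()
```

Keeping formatting separate from delivery makes it possible to reuse the same text for a Discord webhook and to update the `notified` flag only after a successful send.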

Thresholds are set per asset and event type and are configurable through an admin interface or an env file.
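
Env-based configuration might look like the sketch below. The `WHALE_THRESHOLD_` prefix, the function name, and the default values for ETH and BTC are assumptions made for illustration; only the 500k USDT figure comes from the monitoring example in this article.

```python
import os
from decimal import Decimal
from typing import Mapping

# Defaults in whole token units. ETH and BTC values are hypothetical.
DEFAULT_THRESHOLDS = {
    "USDT": Decimal("500000"),
    "ETH": Decimal("1000"),
    "BTC": Decimal("100"),
}


def load_thresholds(env: Mapping[str, str] = os.environ,
                    prefix: str = "WHALE_THRESHOLD_") -> dict:
    """Merge defaults with overrides such as WHALE_THRESHOLD_ETH=2500."""
    thresholds = dict(DEFAULT_THRESHOLDS)
    for key, value in env.items():
        if key.startswith(prefix):
            asset = key[len(prefix):].upper()
            thresholds[asset] = Decimal(value)
    return thresholds
```

Passing the environment as a parameter keeps the loader easy to test and lets an admin interface feed the same function from a database row instead.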

Ready-Made Services vs Custom Parser

Whale Alert, Lookonchain, and Arkham offer free and paid tiers with ready-made alerts. A custom parser is justified when custom logic is needed (specific contracts or patterns), when the data feeds a trading system with latency requirements, or when integration with a proprietary label database is required.

Developing a whale transaction monitoring system for ETH and BTC with Telegram alerts, a label database of 5,000+ addresses, and history storage takes 2–3 weeks.