Scraping DeFi Protocol Data (TVL, APY, Pools)
The task looks simple at first: collect TVL and APY across protocols, store them in a database, serve them via an API. In practice, each protocol has its own calculation logic, some data lives only on-chain, some arrives via subgraphs with a delay, and APY changes every block. On top of that, many protocols deploy different versions on different chains with incompatible ABIs.
Data sources and their specifics
The Graph: primary aggregated data source
Most major protocols have official subgraphs: Uniswap, Curve, Aave, Compound, Balancer, Yearn. The Graph Studio allows querying historical and current data via GraphQL.
Problems we face:
Latency. Subgraphs update with a 1-10 minute delay after on-chain events. That rules them out for real-time monitoring; for historical data and dashboards it's fine.
Outdated subgraphs. The Uniswap V2 subgraph has long been unmaintained by the team, so its data may be incomplete. The official Uniswap V3 subgraph periodically lags during high-volume periods.
Pagination. The Graph returns at most 1000 records per query, and skip is capped at 5000, so fetching all Uniswap V3 pools (>50,000) requires cursor pagination on id (the id_gt pattern):
query GetPools($lastId: String) {
  pools(first: 1000, where: { id_gt: $lastId }, orderBy: id) {
    id
    token0 { symbol, decimals }
    token1 { symbol, decimals }
    totalValueLockedUSD
    volumeUSD
    feeTier
  }
}
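The pagination loop around that query can be sketched as follows. The fetchPage callback is an assumption standing in for a graphql-request call with the GetPools query; abstracting it keeps the loop itself testable without network access.

```typescript
// Minimal row shape for the fields this sketch needs.
interface PoolRow {
  id: string;
  totalValueLockedUSD: string;
}

// Generic id_gt cursor pagination: request pages of up to `pageSize`
// rows until a short page signals the end of the data set.
async function fetchAllPools(
  fetchPage: (lastId: string) => Promise<PoolRow[]>,
  pageSize = 1000,
): Promise<PoolRow[]> {
  const all: PoolRow[] = [];
  let lastId = ""; // every pool id sorts after the empty string
  for (;;) {
    const page = await fetchPage(lastId);
    all.push(...page);
    if (page.length < pageSize) break; // short page => no more data
    lastId = page[page.length - 1].id; // next query: id_gt last seen id
  }
  return all;
}
```

Unlike skip-based paging, this pattern has no offset cap and stays O(1) per page on the indexer side.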
A note on TVL in The Graph: the Uniswap V3 subgraph calculates TVL as the sum of token values in USD via its internal price feed. That feed sometimes produces wrong values for illiquid tokens: a pool with $500k of real TVL can show as $50M because of a manipulated price on a single token. Sanity-check against an external source.
On-chain queries for accurate data
For data that must be accurate and fresh, make direct eth_call queries to the contracts:
Aave v3 TVL: per asset, aToken.totalSupply() already reflects accrued interest (scaled supply times the liquidityIndex from Pool.getReserveData(asset)). Repeat for each asset in each market.
Curve APY: gauge.inflation_rate() gives the current CRV emission rate; Minter.minted(gauge, user) tracks what a user has already claimed. Real reward APY = (crv_per_year * crv_price) / gauge_tvl_usd.
Uniswap V3 fee APY: positions' tokensOwed0/1 hold a position's accumulated fees. For a pool-wide estimate, take the delta of pool.feeGrowthGlobal0X128 over a period and divide by active liquidity.
Multicall3 (0xcA11bde05977b3631167028862bE2a173976CA11) is deployed at the same address on all major chains and batches hundreds of eth_call reads into a single RPC request. Instead of 100 separate round trips, one batch. Critical for scraping performance.
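The batching can be sketched as follows. Call3 mirrors the struct taken by Multicall3's aggregate3; the 500-calls-per-batch default is an assumption (one oversized batch can exceed the node's gas cap on eth_call), and the ABI encoding of callData via an ethers.js Interface is left out.

```typescript
// Shape of one entry in Multicall3.aggregate3(Call3[] calldata calls).
interface Call3 {
  target: string;        // contract address to call
  allowFailure: boolean; // true: one reverting call doesn't revert the batch
  callData: string;      // ABI-encoded calldata, e.g. iface.encodeFunctionData(...)
}

// Same address on every major chain.
const MULTICALL3 = "0xcA11bde05977b3631167028862bE2a173976CA11";

// Split an arbitrary list of calls into batches so a single aggregate3
// stays within the node's eth_call gas limit. 500 is a conservative guess.
function batchCalls(calls: Call3[], batchSize = 500): Call3[][] {
  const batches: Call3[][] = [];
  for (let i = 0; i < calls.length; i += batchSize) {
    batches.push(calls.slice(i, i + batchSize));
  }
  return batches;
}
```

Each batch then goes out as one eth_call to the MULTICALL3 address; with allowFailure: true, a single broken pool contract doesn't poison the whole scrape.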
DeFi Llama API
https://api.llama.fi is a public API, no key required, with TVL data across most protocols. Endpoints:
GET /tvl/{protocol} → current TVL
GET /protocol/{protocol} → historical TVL + breakdown
GET https://yields.llama.fi/pools → APY across all pools (~10k records; the yields API lives on a separate host)
/pools is a gold mine: APY already calculated for thousands of pools across chains. But DeFi Llama refreshes only every few minutes; for real-time tasks you still need your own calculation.
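A sketch of post-processing that response. The field names (chain, project, tvlUsd, apy) match what the yields API is assumed to return; verify against live output. The TVL floor is an arbitrary example threshold.

```typescript
// Assumed shape of one entry in the yields /pools response.
interface LlamaPool {
  chain: string;
  project: string;
  symbol: string;
  tvlUsd: number;
  apy: number | null;
}

// Keep only pools worth tracking: a minimum-TVL floor drops the dust
// pools that dominate the ~10k-row response, and null/NaN APYs go too.
function filterPools(pools: LlamaPool[], minTvlUsd = 100_000): LlamaPool[] {
  return pools.filter(
    (p) => p.tvlUsd >= minTvlUsd && p.apy !== null && Number.isFinite(p.apy),
  );
}

// Usage (Node 18+ global fetch):
// const res = await fetch("https://yields.llama.fi/pools");
// const { data } = (await res.json()) as { data: LlamaPool[] };
// const tracked = filterPools(data);
```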
Data collection architecture
Collection layers
Scheduler (cron / event-driven)
├── GraphQL Fetcher (The Graph subgraphs)
├── On-chain Fetcher (Multicall3 + ethers.js)
├── HTTP Fetcher (DeFi Llama, CoinGecko)
└── WebSocket Listener (real-time events)
↓
Normalizer (single format)
↓
TimescaleDB / PostgreSQL
↓
API (REST/GraphQL)
The normalizer is the key component. Each protocol returns data in its own format; normalization maps everything to { protocolId, chainId, poolAddress, tvlUsd, apy, timestamp }. A single schema enables cross-protocol comparison.
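A minimal sketch of that schema plus one adapter. The Uniswap V3 subgraph returns numbers as strings and carries no APY field, so its adapter maps apy to null (fee APY is computed separately); the type and function names here are illustrative, not from any library.

```typescript
// Single normalized record: one schema for every source, so Uniswap,
// Aave and DeFi Llama rows land in the same TimescaleDB hypertable.
interface PoolSnapshot {
  protocolId: string;
  chainId: number;
  poolAddress: string;
  tvlUsd: number;
  apy: number | null; // null when the source provides no yield figure
  timestamp: number;  // unix seconds
}

// Example adapter for one raw Uniswap V3 subgraph row.
function fromUniswapV3Subgraph(
  raw: { id: string; totalValueLockedUSD: string },
  chainId: number,
  timestamp: number,
): PoolSnapshot {
  return {
    protocolId: "uniswap-v3",
    chainId,
    poolAddress: raw.id.toLowerCase(), // canonical casing for joins
    tvlUsd: Number(raw.totalValueLockedUSD),
    apy: null,
    timestamp,
  };
}
```

One adapter per (protocol, source) pair keeps the quirks contained; everything downstream of the normalizer sees only PoolSnapshot.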
APY calculation
APY is Annual Percentage Yield, i.e. with compounding. Most DeFi protocols report APR (no compounding), which needs conversion:
APY = (1 + APR/n)^n - 1, where n is the number of compounding periods per year.
For lending protocols the rate effectively compounds (Aave v3 accrues interest every second via liquidityRate, so per-second compounding applies). For LP positions it doesn't: fees accrue without reinvestment.
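The conversion in code, with the Aave-style per-second case as a special instance (per-second compounding is numerically indistinguishable from continuous compounding, e^APR - 1):

```typescript
// APR -> APY with n compounding periods per year: (1 + APR/n)^n - 1.
function aprToApy(apr: number, periodsPerYear: number): number {
  return (1 + apr / periodsPerYear) ** periodsPerYear - 1;
}

// Aave-style per-second accrual: n = seconds in a year.
const SECONDS_PER_YEAR = 31_536_000;
const aaveApy = (liquidityRate: number) =>
  aprToApy(liquidityRate, SECONDS_PER_YEAR);
```

For example, 10% APR compounded monthly gives roughly 10.47% APY; the gap grows with the rate, which is why skipping this conversion visibly misstates high-yield pools.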
Real APY components for a Uniswap V3 LP position:
- Trading fees APR (depends on volume and position range)
- Liquidity mining rewards (if any incentives)
- Minus IL (historical estimate)
An APY figure that ignores IL misleads users. Show both numbers.
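A rough pool-level estimate of the first component, trading-fee APR, from fields the subgraph query above already returns. This is a sketch: it assumes the Uniswap V3 feeTier convention (3000 = 0.30%, i.e. hundredths of a basis point) and ignores concentration, so an in-range concentrated position earns a multiple of this figure.

```typescript
// Fee APR ≈ (24h volume * fee fraction * 365) / TVL.
// feeTier is in hundredths of a bip: 3000 => 0.003 (0.30%).
function poolFeeApr(
  volume24hUsd: number,
  feeTier: number,
  tvlUsd: number,
): number {
  if (tvlUsd <= 0) return 0; // avoid division by zero on empty pools
  const dailyFeesUsd = volume24hUsd * (feeTier / 1_000_000);
  return (dailyFeesUsd * 365) / tvlUsd;
}
```

A 0.30% pool doing $1M daily volume on $10M TVL comes out around 11% fee APR before IL.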
Error handling and rate limiting
RPC providers enforce rate limits. Alchemy's free tier gives 300 CUPS (compute units per second). One eth_call costs 10-40 CU; a Multicall3 batch costs ~20 CU regardless of how many calls it wraps. Batch as aggressively as possible.
The Graph: the free plan is limited to about 1000 requests per day. Cache with a TTL: most of this data doesn't need refreshing more than every 5 minutes.
Retry with exponential backoff on all HTTP requests, and route failed fetches to a dead letter queue so data isn't lost on transient RPC failures.
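The retry wrapper is small enough to sketch in full; the defaults (5 retries, 200 ms base delay, full jitter) are assumptions to tune per provider.

```typescript
// Retry with exponential backoff and jitter. Wrap every HTTP/RPC fetch
// in this; after maxRetries the error propagates so the caller can push
// the job to a dead letter queue instead of silently dropping it.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 5,
  baseDelayMs = 200,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (attempt === maxRetries) break;
      // 2^attempt growth, scaled by random jitter in [0.5, 1.0) so
      // concurrent workers don't retry in lockstep after an outage.
      const delay = baseDelayMs * 2 ** attempt * (0.5 + Math.random() / 2);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastErr;
}
```

Combined with p-limit for concurrency, this keeps a flaky RPC endpoint from either dropping data or getting hammered harder while it recovers.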
Development stack
TypeScript + Node.js for the scrapers. PostgreSQL + TimescaleDB for time-series storage. Redis for caching intermediate data. Docker Compose for local development.
ethers.js v6 for on-chain interaction, graphql-request for The Graph queries, p-limit for concurrency control (don't overwhelm the RPC).
Timeline estimates
A scraper for 2-3 protocols on one chain with a basic API: 2-3 days. A multi-protocol, multi-chain system with a historical database and normalization: 1-2 weeks, depending on the number of sources and the APY accuracy requirements.