Scraping data from DEX aggregators (1inch, Jupiter)
1inch and Jupiter don't just return the best price — they expose their routing logic through public APIs. This is a valuable data source: which pools are used for a pair, how volume is split between protocols, and what the price impact is at different order sizes. For analytical systems, MEV bots, arbitrage strategies, and research tools, this is raw material.
1inch API: what's really available
Swap API vs Fusion API
Swap API (`/swap/v6.0/{chain}/swap`) — classic aggregation. The response contains:

- `tx` — object with full transaction data
- `protocols` — list of protocols in the route with their shares
- `toAmount` — minimum output amount
For scraping price data without execution, use the `/quote` endpoint: no slippage parameter, no `fromAddress`, it returns only a quote. It creates no RPC load and requires no permissions.
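A minimal sketch of a `/quote` request. The parameter names (`src`, `dst`, `amount`) and the `api.1inch.dev` base URL are assumptions based on the v6.0 Dev Portal docs; the `ONEINCH_API_KEY` env var name is hypothetical — verify both against the current API reference.

```typescript
const BASE = "https://api.1inch.dev/swap/v6.0";

// Pure helper: build the quote URL for a chain and token pair.
export function buildQuoteUrl(
  chainId: number,
  src: string,
  dst: string,
  amount: bigint
): string {
  const params = new URLSearchParams({
    src,
    dst,
    amount: amount.toString(),
  });
  return `${BASE}/${chainId}/quote?${params}`;
}

// Network call: the Dev Portal requires a bearer API key.
export async function fetchQuote(url: string): Promise<unknown> {
  const res = await fetch(url, {
    headers: { Authorization: `Bearer ${process.env.ONEINCH_API_KEY}` },
  });
  if (!res.ok) throw new Error(`1inch quote failed: ${res.status}`);
  return res.json();
}
```

Keeping URL construction separate from the network call makes the request logic unit-testable without hitting the rate-limited API.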
Fusion API (`/fusion/v1.0/{chain}/quote/receive`) uses a different model: RFQ (request for quote) with market maker participation. The routing is opaque and protocols are not fully disclosed, so it is less useful for route scraping.
Rate limits and workarounds
The 1inch public API allows 1 request per second and 500k requests per month for free. For intensive scraping, use the 1inch Dev Portal with a Pro plan, or query the 1inch router directly via contract calls.
A direct call to the 1inch Aggregation Router through `eth_call` gets a quote without an HTTP request and without rate limits. Use calldata from the SDK for simulation:

```typescript
// Encode a swap call. The exact ABI differs between router versions --
// check the deployed contract before relying on this signature.
const calldata = routerContract.interface.encodeFunctionData("swap", [
  executor,
  desc,
  swapData, // renamed from `data` to avoid shadowing the encoded calldata
]);
const result = await provider.call({ to: ROUTER_ADDRESS, data: calldata });
```

But this requires understanding 1inch's internal calldata format, which changes between router versions.
Route parsing
The `/quote` response contains `protocols` — an array of arrays representing the split route:

```json
"protocols": [
  [
    [{"name": "UNISWAP_V3", "part": 60, "fromTokenAddress": "...", "toTokenAddress": "..."}],
    [{"name": "CURVE", "part": 40, ...}]
  ]
]
```
The first level is parallel paths (the volume split); the second level is sequential hops within a path. To build a liquidity graph: normalize protocol names, aggregate by token pairs, and track `part` dynamics over time.
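The normalization step can be sketched as a pure parser. The `RawProtocol` shape mirrors the sample response above (field names assumed from it); the path/hop index semantics follow the text.

```typescript
interface RawProtocol {
  name: string;
  part: number; // percentage of volume through this protocol
  fromTokenAddress: string;
  toTokenAddress: string;
}

interface FlatHop {
  path: number; // index of the parallel path (volume split)
  hop: number;  // index of the sequential hop within the path
  name: string; // normalized protocol name
  part: number;
  pair: string; // "from->to" key for aggregation
}

// Flatten the nested protocols array into one record per (path, hop, protocol).
export function flattenProtocols(protocols: RawProtocol[][][]): FlatHop[] {
  const out: FlatHop[] = [];
  protocols.forEach((path, pathIdx) =>
    path.forEach((hop, hopIdx) =>
      hop.forEach((p) =>
        out.push({
          path: pathIdx,
          hop: hopIdx,
          name: p.name.toLowerCase(), // normalize "UNISWAP_V3" -> "uniswap_v3"
          part: p.part,
          pair: `${p.fromTokenAddress}->${p.toTokenAddress}`,
        })
      )
    )
  );
  return out;
}
```

Flat records with a `pair` key aggregate naturally in SQL (`GROUP BY pair, name`) when tracking how splits shift over time.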
Jupiter API (Solana)
V6 Quote API
```
GET /quote?inputMint=...&outputMint=...&amount=...&slippageBps=50
```
The response includes `routePlan` — the detailed route through Solana AMMs:

```json
"routePlan": [
  {
    "swapInfo": {
      "ammKey": "...",
      "label": "Orca (Whirlpool)",
      "inputMint": "...",
      "outputMint": "...",
      "inAmount": "1000000",
      "outAmount": "998432",
      "feeAmount": "3000",
      "feeMint": "..."
    },
    "percent": 100
  }
]
```
For scraping: `ammKey` is the pool's public key on Solana, so you can request pool state directly via `getAccountInfo`. `label` is the human-readable AMM name.
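A small helper to pull unique pool keys out of a quote response, e.g. to feed into `getAccountInfo` (or a batched `getMultipleAccounts`) for direct pool-state reads. Field names are taken from the sample above.

```typescript
interface RoutePlanStep {
  swapInfo: { ammKey: string; label: string };
  percent: number;
}

// Collect unique pool public keys from routePlan; routes often revisit
// the same pool across different quotes, so deduplicate before fetching.
export function extractPoolKeys(routePlan: RoutePlanStep[]): string[] {
  return [...new Set(routePlan.map((step) => step.swapInfo.ammKey))];
}
```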
Jupiter Price API
Jupiter also provides a `/price?ids=...` endpoint — a bulk price query for up to 100 tokens per request. It returns prices in USDC with liquidity source info. This is not a quote (no slippage), just a reference price, updated every 30 seconds.
To build price history: query `/price` for the needed pairs every 30 seconds and save to TimescaleDB or InfluxDB. That yields ~2880 data points per pair per day.
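One polling tick can be sketched as below. The fetcher is injected so the logic is testable without the network; in production it would wrap the Jupiter `/price` endpoint. The response shape (`{ [id]: { price } }`) is an assumption — check it against the current Price API docs.

```typescript
interface PricePoint {
  ts: number;
  id: string;
  price: number;
}

type PriceFetcher = (ids: string[]) => Promise<Record<string, { price: number }>>;

// One polling tick: fetch prices and map them into rows for time-series storage.
export async function pollOnce(
  ids: string[],
  fetcher: PriceFetcher,
  now: () => number = Date.now
): Promise<PricePoint[]> {
  const data = await fetcher(ids);
  const ts = now();
  return ids
    .filter((id) => data[id] !== undefined) // drop tokens the API didn't price
    .map((id) => ({ ts, id, price: data[id].price }));
}

// Production loop (sketch, saveRows is your storage writer):
// setInterval(() => pollOnce(ids, jupiterFetcher).then(saveRows), 30_000);
```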
Jupiter rate limits: the public API without a key allows 600 requests per minute; Jupiter API Pro allows more. For production systems, a self-hosted Jupiter instance or a partner key is recommended.
Scraper architecture
Data structure
```typescript
interface RouteSnapshot {
  timestamp: number;
  chain: string; // "ethereum" | "solana" | "arbitrum" | ...
  inputToken: string;
  outputToken: string;
  inputAmount: bigint;
  outputAmount: bigint;
  priceImpact: number; // in %
  protocols: ProtocolHop[];
  source: "1inch" | "jupiter";
}

interface ProtocolHop {
  name: string;
  poolAddress: string;
  percentOfRoute: number;
  inputAmount: bigint;
  outputAmount: bigint;
}
```
Request queue and retry
A scraper with multiple token pairs issues parallel requests and quickly hits rate limits. The right architecture: a queue with Bull/BullMQ + Redis and configurable concurrency per source.
Retry with exponential backoff on `429 Too Many Requests`: `delay = Math.min(base * 2^attempt, maxDelay)`. For 1inch: `base = 1000ms`, `maxDelay = 30000ms`.
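The backoff formula and a generic retry wrapper, sketched with the 1inch values from the text. The 429 detection predicate is an assumption — adapt it to your HTTP client's error shape.

```typescript
// Exponential backoff per the formula above; base/maxDelay are the 1inch values.
const BASE_MS = 1000;
const MAX_DELAY_MS = 30000;

export function backoffDelay(
  attempt: number,
  base = BASE_MS,
  maxDelay = MAX_DELAY_MS
): number {
  return Math.min(base * 2 ** attempt, maxDelay);
}

// Generic retry wrapper: retries only on errors the predicate accepts.
export async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 6,
  isRetryable = (e: unknown) => String(e).includes("429") // hypothetical check
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (e) {
      if (attempt + 1 >= maxAttempts || !isRetryable(e)) throw e;
      await new Promise((r) => setTimeout(r, backoffDelay(attempt)));
    }
  }
}
```

In a BullMQ setup, the same policy can instead be expressed declaratively via the job's `attempts` and `backoff` options, keeping retry state in Redis.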
Monitor scraper health with Prometheus metrics: `scraper_requests_total{status="success|error"}` and `scraper_latency_ms`. Alert when the error rate exceeds 10% for 5 minutes.
Storage and queries
TimescaleDB (a PostgreSQL extension) for time series — optimized for `WHERE timestamp BETWEEN ... AND ...` queries with aggregation. For high-frequency route scraping, partition by day.
ClickHouse is an alternative for very high volumes (>10M rows/day): columnar storage gives 10-100x faster analytical queries over large time ranges.
Example monitoring matrix:

| Pair | 1inch chains | Jupiter pools | Frequency |
|---|---|---|---|
| USDC/ETH | Ethereum, Arbitrum, Optimism | — | 1 min |
| SOL/USDC | — | Orca, Raydium | 30 sec |
| BTC/USDC | all EVM | — | 5 min |
Work process
Analysis (0.5 day). List the pairs and order sizes to monitor, the update frequency requirements, and the data usage goals (analytics / trading signal / research).
Development (1-3 days). Scraper service (Node.js/TypeScript) + storage + a basic dashboard or API for data consumption.
Timeline estimates
A scraper for one source (1inch or Jupiter) with PostgreSQL storage — 1-2 days. A multi-source scraper with data normalization, ClickHouse, and an analytical API — 3-5 days.
Cost is calculated individually.







