Parsing Long/Short Ratio Data from Exchanges
The long/short ratio is the ratio of traders (or of position volume) that are long versus short on perpetual futures. It is one of the key indicators for market sentiment analysis: historically, an extremely high long ratio has coincided with local tops, and an overloaded short side with short squeezes. The problem is that each exchange publishes this data in its own format and at its own frequency, and not all of them provide historical data through an official API.
Official APIs: Where Data Already Exists
Start with the official endpoints: they are stable and don't require browser automation.
Binance Futures offers the most complete data, spread across several endpoints:
# Top trader long/short account ratio
GET https://fapi.binance.com/futures/data/topLongShortAccountRatio?symbol=BTCUSDT&period=5m&limit=30
# Top trader long/short position ratio
GET https://fapi.binance.com/futures/data/topLongShortPositionRatio?symbol=BTCUSDT&period=1h&limit=30
# All accounts ratio (retail sentiment)
GET https://fapi.binance.com/futures/data/globalLongShortAccountRatio?symbol=BTCUSDT&period=1h&limit=30
The response is an array of objects: [{symbol, longShortRatio, longAccount, shortAccount, timestamp}]. Historical data is limited: limit accepts at most 500, and period ranges from 5m to 1d. Data older than about 30 days is unavailable via the API, so you need to collect it yourself.
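To make the response shape concrete, here is a minimal sketch that turns such an array into typed records. The LsPoint class and the parse_binance_ratio function are illustrative names, not part of any Binance SDK, and the sample payload uses made-up values. Numeric fields are passed through float() and int() because they may arrive as strings.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LsPoint:
    symbol: str
    long_share: float   # fraction of accounts that are long, 0..1
    short_share: float  # fraction of accounts that are short, 0..1
    ratio: float        # long/short ratio as reported by the exchange
    ts: datetime        # timezone-aware UTC timestamp


def parse_binance_ratio(records: list[dict]) -> list[LsPoint]:
    """Convert raw Binance records into typed points, oldest first."""
    points = [
        LsPoint(
            symbol=r["symbol"],
            long_share=float(r["longAccount"]),
            short_share=float(r["shortAccount"]),
            ratio=float(r["longShortRatio"]),
            ts=datetime.fromtimestamp(int(r["timestamp"]) / 1000, tz=timezone.utc),
        )
        for r in records
    ]
    return sorted(points, key=lambda p: p.ts)


# Illustrative payload (values are made up, not real market data).
sample = [
    {"symbol": "BTCUSDT", "longShortRatio": "1.5", "longAccount": "0.6",
     "shortAccount": "0.4", "timestamp": 1700003600000},
    {"symbol": "BTCUSDT", "longShortRatio": "1.0", "longAccount": "0.5",
     "shortAccount": "0.5", "timestamp": 1700000000000},
]
points = parse_binance_ratio(sample)
print(points[0].ts.isoformat(), points[0].long_share)
```

Sorting by timestamp keeps downstream code independent of the order in which the exchange returns rows.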
Bybit exposes the endpoint /v5/market/account-ratio:
GET https://api.bybit.com/v5/market/account-ratio?category=linear&symbol=BTCUSDT&period=1h&limit=50
OKX exposes /api/v5/rubik/stat/contracts/long-short-account-ratio:
GET https://www.okx.com/api/v5/rubik/stat/contracts/long-short-account-ratio?ccy=BTC&period=1H
OKX doesn't require authentication for public market data endpoints. Rate limit: 20 req/2 sec.
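The Bybit and OKX payloads differ in shape from the Binance one. The sketch below assumes that Bybit wraps its rows in result.list with buyRatio, sellRatio, and timestamp fields, and that OKX returns data as a list of [timestamp, ratio] pairs. Both shapes are assumptions to verify against the current API documentation, and the function names and sample values are illustrative.

```python
def parse_bybit_ratio(payload: dict) -> list[tuple[int, float, float]]:
    """Return (timestamp_ms, long_share, short_share) rows from a Bybit payload."""
    return [
        (int(row["timestamp"]), float(row["buyRatio"]), float(row["sellRatio"]))
        for row in payload["result"]["list"]
    ]


def parse_okx_ratio(payload: dict) -> list[tuple[int, float]]:
    """Return (timestamp_ms, long_short_ratio) rows from an OKX payload."""
    return [(int(ts), float(ratio)) for ts, ratio in payload["data"]]


# Assumed payload shapes with made-up values; check them against live responses.
bybit_sample = {
    "retCode": 0,
    "result": {"list": [
        {"symbol": "BTCUSDT", "buyRatio": "0.55", "sellRatio": "0.45",
         "timestamp": "1700000000000"},
    ]},
}
okx_sample = {"code": "0", "data": [["1700000000000", "1.25"]]}

bybit_rows = parse_bybit_ratio(bybit_sample)
okx_rows = parse_okx_ratio(okx_sample)
print(bybit_rows, okx_rows)
```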
The CoinGlass API aggregates L/S data from Binance, OKX, Bybit, and Bitget in one request. It is paid (from $29/month), but it greatly simplifies working with multiple exchanges.
Collection and Storage
Collect the data on a regular schedule: exchanges keep only limited history, so you need your own database to analyze long periods.
import asyncio
from datetime import datetime, timezone

import asyncpg
import httpx

ENDPOINTS = {
    "binance_top_account": "https://fapi.binance.com/futures/data/topLongShortAccountRatio",
    "binance_global": "https://fapi.binance.com/futures/data/globalLongShortAccountRatio",
    "bybit": "https://api.bybit.com/v5/market/account-ratio",
}


async def collect_ls_ratio(symbol: str, period: str, db: asyncpg.Connection) -> None:
    # Fetch only the most recent data point for this symbol and period.
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            ENDPOINTS["binance_global"],
            params={"symbol": symbol, "period": period, "limit": 1},
            timeout=10.0,
        )
    resp.raise_for_status()  # fail loudly on 4xx/5xx, e.g. HTTP 429
    data = resp.json()[0]
    # Binance timestamps are Unix milliseconds; store them as timezone-aware UTC
    # (this assumes the ts column is TIMESTAMPTZ).
    ts = datetime.fromtimestamp(int(data["timestamp"]) / 1000, tz=timezone.utc)
    await db.execute("""
        INSERT INTO ls_ratio (exchange, symbol, period, long_ratio, short_ratio, ts)
        VALUES ($1, $2, $3, $4, $5, $6)
        ON CONFLICT (exchange, symbol, period, ts) DO NOTHING
    """, "binance", symbol, period,
        float(data["longAccount"]), float(data["shortAccount"]), ts)
ON CONFLICT DO NOTHING protects against duplicates when the same interval is collected again. It relies on a unique index on (exchange, symbol, period, ts).
For storing time-series data, TimescaleDB (a PostgreSQL extension) is a good fit: it provides automatic time partitioning, efficient range queries, and continuous aggregates for downsampling.
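A possible schema for the ls_ratio table is sketched below. The column names follow the INSERT statement above, but the column types, the DDL_STATEMENTS list, and the init_schema helper are assumptions made for illustration. The helper accepts any connection object with an async execute(sql) method, such as an asyncpg connection.

```python
import asyncio

DDL_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS timescaledb",
    """
    CREATE TABLE IF NOT EXISTS ls_ratio (
        exchange    TEXT             NOT NULL,
        symbol      TEXT             NOT NULL,
        period      TEXT             NOT NULL,
        long_ratio  DOUBLE PRECISION NOT NULL,
        short_ratio DOUBLE PRECISION NOT NULL,
        ts          TIMESTAMPTZ      NOT NULL,
        UNIQUE (exchange, symbol, period, ts)
    )
    """,
    # Turn the plain table into a hypertable partitioned by time.
    "SELECT create_hypertable('ls_ratio', 'ts', if_not_exists => TRUE)",
]


async def init_schema(conn) -> None:
    """Run the DDL on a connection exposing an async execute(sql) method."""
    for statement in DDL_STATEMENTS:
        await conn.execute(statement)


class RecordingConnection:
    """Stand-in connection that records statements instead of running them."""

    def __init__(self) -> None:
        self.executed: list[str] = []

    async def execute(self, sql: str) -> None:
        self.executed.append(sql)


fake = RecordingConnection()
asyncio.run(init_schema(fake))
print(len(fake.executed), "statements prepared")
```

The unique constraint includes ts, which matters here because TimescaleDB requires unique indexes on a hypertable to contain the partitioning column.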
Parsing Data Unavailable via API
Some exchanges (Gate.io, Bitfinex) don't publish the L/S ratio through an official API but do show it on a web page. In such cases, you can scrape it with a headless browser via Playwright:
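from typing import Optional

from playwright.async_api import Response, async_playwright

async def capture_ls_response(response: Response, out: dict) -> None:
    # Hypothetical helper: the "long_short" URL fragment and the "longShortRatio"
    # key are placeholders; inspect the page's network traffic for the real ones.
    if "long_short" in response.url and response.ok:
        out["longShortRatio"] = float((await response.json())["longShortRatio"])

async def scrape_gateio_ls(symbol: str) -> Optional[float]:
    ls_data: dict = {}
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            # Intercept XHR responses from the page's internal API
            page.on("response", lambda r: capture_ls_response(r, ls_data))
            await page.goto(f"https://www.gate.io/futures/{symbol}")
            await page.wait_for_timeout(3000)  # let XHR calls finish
        finally:
            await browser.close()
    return ls_data.get("longShortRatio")  # None if nothing was captured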
Browser scraping is unstable, though: markup changes and anti-bot measures (Cloudflare, PerimeterX) appear. In production, use it only as a fallback, and monitor the collection success rate.
Practical Remarks
Rate limiting: when collecting data for 20+ symbols from multiple exchanges, it is easy to hit HTTP 429. Use asyncio.Semaphore to limit concurrent requests and exponential backoff on errors. Binance Futures allows 1200 weight per minute, and each market data request costs 1 weight.
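One way to combine a semaphore with exponential backoff is sketched below. The with_backoff helper and its parameters are illustrative, and the flaky coroutine stands in for a real HTTP call so the example runs offline. The sketch retries on any exception for brevity; real code would more likely retry only on HTTP 429, 5xx responses, and timeouts.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_backoff(
    call: Callable[[], Awaitable[T]],
    semaphore: asyncio.Semaphore,
    retries: int = 5,
    base_delay: float = 0.5,
) -> T:
    """Run call() under the semaphore, retrying with exponential backoff."""
    for attempt in range(retries):
        try:
            async with semaphore:
                return await call()
        except Exception:
            if attempt == retries - 1:
                raise
            # Delay doubles on each attempt; jitter keeps retries from synchronizing.
            delay = base_delay * 2 ** attempt
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("retries must be at least 1")  # only reached if retries <= 0


async def demo() -> list[int]:
    semaphore = asyncio.Semaphore(3)  # at most 3 calls in flight
    failures = {"left": 2}

    async def flaky(n: int) -> int:
        # Stand-in for an HTTP request: the first two calls fail.
        if failures["left"] > 0:
            failures["left"] -= 1
            raise ConnectionError("simulated HTTP 429")
        return n * 2

    tasks = [
        with_backoff(lambda n=n: flaky(n), semaphore, base_delay=0.01)
        for n in range(5)
    ]
    return await asyncio.gather(*tasks)


results = asyncio.run(demo())
print(results)
```

The sleep happens outside the semaphore block, so a task that is waiting to retry does not hold a slot that another request could use.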
Data normalization: Binance returns longAccount as a share (0.65 = 65% long), while OKX returns a ratio (1.86 = 1.86:1 long/short). Normalize to a single format before writing to the DB.
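The two formats convert into each other with simple arithmetic: if r is the long/short ratio, the long share is r / (1 + r), and the inverse is s / (1 - s). A minimal sketch with illustrative function names:

```python
def ratio_to_long_share(ratio: float) -> float:
    """Convert a long/short ratio (e.g. 1.86) into a long share in 0..1."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1.0 + ratio)


def long_share_to_ratio(long_share: float) -> float:
    """Convert a long share in [0, 1) into a long/short ratio."""
    if not 0 <= long_share < 1:
        raise ValueError("long_share must be in [0, 1)")
    return long_share / (1.0 - long_share)


share = ratio_to_long_share(1.86)   # roughly 0.65
ratio = long_share_to_ratio(0.65)   # roughly 1.86
print(round(share, 4), round(ratio, 2))
```

Storing the share (together with the short share) is a reasonable default because it stays bounded, whereas the ratio grows without limit as the short side shrinks.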
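Timezones: convert all timestamps to UTC. Binance returns Unix milliseconds, and so does Bybit. OKX market data responses also carry Unix milliseconds, but encoded as strings, so parse them explicitly rather than assuming a numeric type.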
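Because timestamp encodings vary between exchanges and data sources, a single tolerant parser can simplify the pipeline. The to_utc helper below is an illustrative sketch that accepts Unix milliseconds as a number or digit string, as well as ISO 8601 strings. It treats ISO strings without an offset as UTC, which is an assumption to confirm for each source.

```python
from datetime import datetime, timezone
from typing import Union


def to_utc(value: Union[int, float, str]) -> datetime:
    """Parse Unix milliseconds (number or digit string) or ISO 8601 into UTC."""
    if isinstance(value, str) and not value.strip().isdigit():
        # ISO 8601 string; older Python versions reject a trailing "Z".
        parsed = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        return parsed.astimezone(timezone.utc)
    return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)


from_int = to_utc(1700000000000)
from_str = to_utc("1700000000000")
from_iso = to_utc("2023-11-14T22:13:20Z")
print(from_int.isoformat())
```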







