Scraping Exchange Listings/Delistings Data
A new-listing announcement on Binance has historically moved the listed token 20–100% within the first minutes; a delisting produces a symmetric decline. For trading systems, the speed at which this information arrives converts directly into P&L. Scraping listings is therefore a latency problem, not just a data-collection problem.
Data Sources on Listings
Exchanges announce listings through several channels with different delays:
| Source | Typical delay | Reliability |
|---|---|---|
| Official announcements page | Baseline (primary source) | High |
| Official Telegram/Twitter | Minutes after the page | High |
| REST API (new-markets endpoint) | Often leads the announcement | High |
| RSS feeds | Depends on the exchange | Medium |
| CoinGecko/CoinMarketCap | Aggregator; > 30 min delay | Low |
The fastest way to learn about a listing is often not the official announcement but the appearance of a new trading instrument in the exchange API.
Monitoring API for New Instruments
Most exchanges expose an endpoint that lists all trading pairs. Periodically diffing the current list against the previous one yields the delta, i.e. the new listings:
```python
import asyncio
import aiohttp
from datetime import datetime


class ListingMonitor:
    def __init__(self):
        self.known_symbols: dict[str, set] = {}
        self.poll_interval = 30  # seconds

    async def get_binance_symbols(self, session: aiohttp.ClientSession) -> set:
        async with session.get(
            "https://api.binance.com/api/v3/exchangeInfo",
            timeout=aiohttp.ClientTimeout(total=5),
        ) as resp:
            data = await resp.json()
            return {
                s['symbol']
                for s in data['symbols']
                if s['status'] == 'TRADING'
            }

    async def get_symbols(self, exchange: str, session) -> set:
        # Dispatch to the per-exchange fetcher; register more exchanges here
        fetchers = {'binance': self.get_binance_symbols}
        return await fetchers[exchange](session)

    async def check_for_new_listings(self, exchange: str, session):
        current = await self.get_symbols(exchange, session)
        previous = self.known_symbols.get(exchange, set())

        new_listings = current - previous
        delistings = previous - current

        # Skip the very first poll: with no baseline, every symbol looks "new"
        if previous:
            for symbol in new_listings:
                await self.on_new_listing(exchange, symbol)
            for symbol in delistings:
                await self.on_delisting(exchange, symbol)

        self.known_symbols[exchange] = current

    async def on_new_listing(self, exchange: str, symbol: str):
        event = {
            'type': 'listing',
            'exchange': exchange,
            'symbol': symbol,
            'detected_at': datetime.utcnow().isoformat(),
        }
        await self.notify(event)  # deliver to alerting channels (defined elsewhere)
```
The poll interval balances detection speed against rate limits; 15–30 seconds is a reasonable compromise. Binance futures has a separate endpoint, `/fapi/v1/exchangeInfo`.
Parsing Official Announcement Pages
Exchanges publish text announcements about listings on their sites; parsing these HTML pages provides an additional source:
```python
import re
from datetime import datetime

import aiohttp
from bs4 import BeautifulSoup


async def scrape_binance_announcements(session: aiohttp.ClientSession) -> list:
    url = "https://www.binance.com/en/support/announcement/new-cryptocurrency-listing"
    headers = {
        'User-Agent': 'Mozilla/5.0 (compatible; research bot)',
        'Accept-Language': 'en-US,en;q=0.9',
    }
    async with session.get(url, headers=headers) as resp:
        html = await resp.text()

    soup = BeautifulSoup(html, 'html.parser')
    announcements = []
    for article in soup.select('a[href*="/support/announcement/"]'):
        title = article.get_text(strip=True)
        href = article.get('href')
        # Look for ticker mentions like "(ABC)" in the title
        tickers = re.findall(r'\(([A-Z]{2,10})\)', title)
        if tickers:
            announcements.append({
                'title': title,
                'url': href,
                'tickers': tickers,
                'scraped_at': datetime.utcnow().isoformat(),
            })
    return announcements
```
The problem: Binance and Bybit make heavy use of JavaScript rendering and anti-bot protection (Cloudflare). For JS-heavy pages, the standard solution is headless Chrome via Playwright or Puppeteer:
```python
from playwright.async_api import async_playwright


async def scrape_with_playwright(url: str) -> str:
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto(url, wait_until='networkidle')
        content = await page.content()
        await browser.close()
        return content
```
RSS and Telegram
Some exchanges (Kraken, KuCoin) publish announcements via RSS, the simplest and most reliable source:
```python
import feedparser


def parse_exchange_rss(feed_url: str) -> list:
    feed = feedparser.parse(feed_url)
    listings = []
    for entry in feed.entries:
        if any(word in entry.title.lower()
               for word in ['listing', 'adds', 'new trading pair']):
            listings.append({
                'title': entry.title,
                'link': entry.link,
                'published': entry.published,
            })
    return listings
```
Official exchange Telegram channels can be monitored via Telethon:
```python
import re

from telethon import TelegramClient, events

# api_id / api_hash are obtained from my.telegram.org
client = TelegramClient('session', api_id, api_hash)


@client.on(events.NewMessage(chats=['@binance', '@kucoincom']))
async def handle_announcement(event):
    text = event.message.text or ''
    if 'listing' in text.lower() or 'will list' in text.lower():
        tickers = re.findall(r'\$([A-Z]{2,10})', text)
        await process_listing_announcement(tickers, text, event.date)  # downstream handler

client.start()
client.run_until_disconnected()
```
Storage and Deduplication
A listing can be detected through several channels at once, so deduplication is needed:
```sql
CREATE TABLE listing_events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    exchange TEXT NOT NULL,
    symbol TEXT NOT NULL,
    event_type TEXT NOT NULL,   -- 'listing' | 'delisting' | 'suspension'
    detected_at TIMESTAMPTZ NOT NULL,
    source TEXT NOT NULL,       -- 'api_poll' | 'announcement' | 'rss' | 'telegram'
    raw_data JSONB
);

-- PostgreSQL does not allow expressions in a table-level UNIQUE constraint,
-- so the hour-bucket dedup key goes into a unique expression index:
CREATE UNIQUE INDEX listing_events_dedup_idx
    ON listing_events (exchange, symbol, event_type, date_trunc('hour', detected_at));
```
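On the application side, the same hour-bucket key can filter duplicates before they ever reach the database. A minimal in-memory sketch; the `ListingDeduper` name is my own:

```python
from datetime import datetime


class ListingDeduper:
    """Drop events that repeat within the same (exchange, symbol, type, hour) bucket,
    mirroring the hour-truncated dedup key in the schema above."""

    def __init__(self):
        self._seen: set[tuple] = set()

    def is_new(self, exchange: str, symbol: str,
               event_type: str, detected_at: datetime) -> bool:
        # Truncate to the hour, like date_trunc('hour', detected_at) in SQL
        bucket = detected_at.replace(minute=0, second=0, microsecond=0)
        key = (exchange, symbol, event_type, bucket)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

Keeping the database index anyway is deliberate: with several scraper processes running, an in-memory filter alone cannot guarantee uniqueness.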
Event notifications go out via webhook (Slack, Discord, a Telegram bot, or your own endpoint); for trading systems, a Kafka topic such as `exchange.listings` serves downstream consumers.
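For instance, a Slack incoming webhook accepts a JSON body with a `text` field. A small sketch; the function names and message format are my own:

```python
def format_listing_alert(event: dict) -> dict:
    """Build a Slack-style incoming-webhook payload ({"text": ...});
    Discord webhooks expect {"content": ...} instead."""
    emoji = ":new:" if event["type"] == "listing" else ":no_entry:"
    return {
        "text": f"{emoji} [{event['exchange'].upper()}] {event['type']}: "
                f"{event['symbol']} (detected {event['detected_at']})"
    }


async def send_alert(session, webhook_url: str, event: dict) -> int:
    # session is an aiohttp.ClientSession; returns the HTTP status code
    async with session.post(webhook_url, json=format_listing_alert(event)) as resp:
        return resp.status
```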
Limitations and Accuracy
No single source gives 100% coverage with zero latency. Combining API polling, RSS, and Telegram catches roughly 95% of events within the first 5 minutes on the top-10 exchanges; for smaller exchanges, page parsing is often the only option.
False positives: a symbol appearing in the API does not always mean trading is open; an exchange may add the pair in a pre-trading state. The status field needs checking (e.g. `TRADING` vs `PRE_TRADING` or `PRE_DELIVERING`).
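The status check can be folded into the diff logic. In the sketch below, the status values are examples drawn from Binance spot and futures responses, and other exchanges use different vocabularies, so the sets would need adjusting per venue:

```python
# Example status values; actual names vary by exchange and market type
LIVE = {"TRADING"}
PENDING = {"PRE_TRADING", "PENDING_TRADING", "PRE_DELIVERING"}


def classify(symbol_info: dict) -> str:
    """Map an exchangeInfo-style symbol entry to a coarse lifecycle stage."""
    status = symbol_info.get("status", "")
    if status in LIVE:
        return "live"
    if status in PENDING:
        return "announced"   # visible in the API but not tradable yet
    return "inactive"        # halted, delisted, settling, ...
```

Treating "announced" as its own stage is useful: it is exactly the window between API visibility and the open of trading where a listing strategy positions itself.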
Building a listing-monitoring system for 10–15 exchanges with multi-channel alerting and history storage takes roughly 2–3 weeks.