Scraping Data Storage System Development (TimescaleDB, ClickHouse)

Developing a parsing data storage system (TimescaleDB, ClickHouse)

Raw data from blockchains or exchanges accumulates fast: tens of gigabytes per day for actively parsed sources. Storing it all in a single table in vanilla PostgreSQL means query performance degrades within months. Choosing between TimescaleDB and ClickHouse is choosing between two fundamentally different storage models and query patterns. Both technologies are correct in their own context.

TimescaleDB vs ClickHouse: when to use what

TimescaleDB is a PostgreSQL extension. It adds hypertables (automatic time-based partitioning), continuous aggregates (incrementally maintained materialized views), and compression. You stay in the PostgreSQL ecosystem: standard SQL, ACID transactions, JOINs with regular tables, familiar tooling.

ClickHouse is a columnar OLAP database. Data is stored by column, which gives a huge advantage in aggregations over subsets of columns: GROUP BY and SUM over billions of rows run an order of magnitude faster than in PostgreSQL. Weaknesses: no transactions, expensive UPDATE/DELETE, and JOINs that work differently.

| Criteria | TimescaleDB | ClickHouse |
|---|---|---|
| Query pattern | Complex JOINs, OLTP+OLAP mix | Analytics, aggregations over large ranges |
| Writes | INSERTs in transactions, UPSERT | Batch inserts, eventual deduplication |
| Point reads | Fast (B-tree indexes) | Slower (no efficient point indexes) |
| Analytics | Good | Much faster |
| Updates | Standard UPDATE | Expensive (ReplacingMergeTree) |
| Operational complexity | Moderate | Higher |
| Data volume | Efficient up to ~1 TB | Efficient from ~100 GB upward |

Recommendation for on-chain data parsing:

  • TimescaleDB — if the data feeds product logic (balances, positions, accounts), joins against relational tables, or needs ACID guarantees
  • ClickHouse — if this is an analytics pipeline (trading signals, aggregated stats, historical analysis) whose queries span large date ranges

Production setups often combine both: TimescaleDB for hot operational data plus ClickHouse as the analytics warehouse.

TimescaleDB architecture

Hypertables

The basic concept: a regular PostgreSQL table becomes a hypertable, and under the hood TimescaleDB creates chunks (partitions) along the time dimension. Each chunk is a separate child table; old chunks can be compressed or archived.

-- Create table
CREATE TABLE trades (
  time        TIMESTAMPTZ NOT NULL,
  exchange    TEXT NOT NULL,
  symbol      TEXT NOT NULL,
  price       NUMERIC(20, 8) NOT NULL,
  volume      NUMERIC(20, 8) NOT NULL,
  side        CHAR(4) NOT NULL  -- 'buy' | 'sell'
);

-- Convert to hypertable with 1-day chunks
SELECT create_hypertable('trades', 'time', chunk_time_interval => INTERVAL '1 day');

-- Indexes per chunk (TimescaleDB creates automatically on time)
CREATE INDEX ON trades (symbol, time DESC);

Chunk size is the key tuning parameter. Rule of thumb: a chunk (data plus indexes) should fit in memory so bulk inserts stay fast; for exchange data that usually means chunks of 1-7 days.
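
As a back-of-the-envelope check, the rule above can be turned into arithmetic. A minimal sketch; the ingest rate, row size, index overhead, and memory budget are illustrative assumptions, not measurements:

```python
# Rough sizing sketch: pick a chunk_time_interval so that one chunk
# (data + index overhead) fits in the memory budget. Numbers below are
# illustrative assumptions.

def suggest_chunk_interval_days(rows_per_day: int,
                                bytes_per_row: int,
                                memory_budget_bytes: int,
                                index_overhead: float = 0.5) -> int:
    """Largest whole number of days whose chunk fits in the budget,
    clamped to the 1-7 day range mentioned in the text."""
    bytes_per_day = rows_per_day * bytes_per_row * (1 + index_overhead)
    days = int(memory_budget_bytes // bytes_per_day)
    return max(1, min(days, 7))

# Example: 50M trades/day, ~100 bytes/row, 16 GiB budget for hot chunks
interval = suggest_chunk_interval_days(50_000_000, 100, 16 * 1024**3)
```

With these assumed numbers the helper suggests 2-day chunks; with a much lower ingest rate it clamps to the 7-day ceiling.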

Continuous Aggregates

Continuous aggregates replace expensive realtime GROUP BY queries with incrementally maintained materialized views:

-- OHLCV aggregate by minutes
CREATE MATERIALIZED VIEW trades_1m
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 minute', time) AS bucket,
  symbol,
  exchange,
  first(price, time)  AS open,
  max(price)          AS high,
  min(price)          AS low,
  last(price, time)   AS close,
  sum(volume)         AS volume,
  count(*)            AS trade_count
FROM trades
GROUP BY bucket, symbol, exchange
WITH NO DATA;

-- Update policy: refresh aggregate every 1 minute, starting 2 minutes ago
SELECT add_continuous_aggregate_policy('trades_1m',
  start_offset => INTERVAL '2 minutes',
  end_offset   => INTERVAL '1 minute',
  schedule_interval => INTERVAL '1 minute'
);

A query like SELECT * FROM trades_1m WHERE bucket > NOW() - INTERVAL '1 day' now reads from the materialized view instead of aggregating raw rows on the fly.
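
What the aggregate computes can be sketched in plain Python: bucket trades into one-minute windows and derive OHLCV per bucket and symbol. This is a simplified model of time_bucket with first/last, not TimescaleDB's actual incremental machinery:

```python
from collections import defaultdict

def ohlcv_1m(trades):
    """trades: iterable of (ts_seconds, symbol, price, volume), any order.
    Returns {(bucket_start, symbol): OHLCV dict}."""
    buckets = defaultdict(list)
    for ts, symbol, price, volume in trades:
        buckets[(ts - ts % 60, symbol)].append((ts, price, volume))
    out = {}
    for key, rows in buckets.items():
        rows.sort()  # order by time within the bucket
        prices = [p for _, p, _ in rows]
        out[key] = {
            'open': rows[0][1], 'high': max(prices),
            'low': min(prices), 'close': rows[-1][1],
            'volume': sum(v for _, _, v in rows), 'trade_count': len(rows),
        }
    return out
```

The continuous aggregate does the same work incrementally, touching only buckets whose underlying rows changed since the last refresh.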

Compression

Old data can be compressed with little loss of functionality (the main restriction is that UPDATE/DELETE on compressed chunks are limited):

-- Enable compression sorted by symbol+time (optimal for our queries)
ALTER TABLE trades SET (
  timescaledb.compress,
  timescaledb.compress_orderby = 'time DESC',
  timescaledb.compress_segmentby = 'symbol'
);

-- Auto-compress chunks older than 7 days
SELECT add_compression_policy('trades', INTERVAL '7 days');

Typical compression ratio for exchange data: 10-20x, so 100 GB raw becomes 5-10 GB compressed.

Retention policy

-- Auto-delete data older than 2 years
SELECT add_retention_policy('trades', INTERVAL '2 years');

-- Or (Timescale Cloud): tier chunks older than 30 days to cheaper object storage
SELECT add_tiering_policy('trades', INTERVAL '30 days');

ClickHouse architecture

MergeTree engines

The choice of table engine is critical. For parsed data, the usual options are:

MergeTree — the basic engine, with no special merge-time behavior:

CREATE TABLE trades
(
    time      DateTime64(3),
    exchange  LowCardinality(String),
    symbol    LowCardinality(String),
    price     Decimal(20, 8),
    volume    Decimal(20, 8),
    side      Enum8('buy' = 1, 'sell' = 2)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(time)
ORDER BY (symbol, exchange, time);

In ClickHouse, ORDER BY is simultaneously the primary key (a sparse index) and the physical sort order on disk. Choose it from your query patterns: if most queries filter by (symbol, time), use exactly that ORDER BY.
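
The sparse-index idea can be sketched as follows: ClickHouse keeps one index entry ("mark") per granule of rows (8192 rows by default) and binary-searches the marks to decide which granules to read. A simplified single-column model:

```python
from bisect import bisect_right

# Sketch of a sparse primary index: `marks` holds the first sorting-key
# value of each granule, in ascending order. A range filter only needs
# the granules whose key range can intersect [lo, hi].

def granules_to_read(marks, lo, hi):
    """Half-open range of granule indices that may contain keys in [lo, hi]."""
    first = max(0, bisect_right(marks, lo) - 1)  # last granule starting <= lo
    last = bisect_right(marks, hi)               # first granule starting > hi
    return first, last
```

With marks [0, 100, 200, 300], a filter on keys 150-250 touches only granules 1 and 2; everything else is skipped without reading data.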

ReplacingMergeTree — for deduplication (needed so the same data can be re-processed safely):

CREATE TABLE balances
(
    time      DateTime64(3),
    address   String,
    token     LowCardinality(String),
    balance   Decimal(38, 18),
    block_number UInt64,
    _version  UInt64  -- usually block_number or timestamp for dedup
)
ENGINE = ReplacingMergeTree(_version)
PARTITION BY toYYYYMM(time)
ORDER BY (address, token, time);

When duplicate rows are inserted (same sorting key), the row with the highest _version survives. Deduplication happens during background merges, not immediately; for exact results, query with the FINAL modifier.
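
The merge-time semantics can be modeled in a few lines of Python: for each sorting key, only the row with the highest _version survives. This mirrors what SELECT ... FINAL returns; it is a model, not the actual merge algorithm:

```python
# Model of ReplacingMergeTree(_version) deduplication over the
# (address, token, time) sorting key from the table above.

def replacing_merge(rows):
    """rows: list of dicts with the sorting-key fields and '_version'.
    Returns one surviving row per key: the one with max _version."""
    latest = {}
    for r in rows:
        key = (r['address'], r['token'], r['time'])
        if key not in latest or r['_version'] > latest[key]['_version']:
            latest[key] = r
    return list(latest.values())
```

Re-processing a block range then simply re-inserts rows with a higher _version (e.g. a newer block_number) and the stale copies disappear at the next merge.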

SummingMergeTree — for additive aggregates (volumes, sums, counts):

CREATE TABLE volume_by_hour
(
    hour     DateTime,
    symbol   LowCardinality(String),
    volume   Decimal(20, 8),
    count    UInt64
)
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (symbol, hour);

Batch insert

ClickHouse is optimized for batch inserts; frequent small inserts are an anti-pattern:

import clickhouse_connect

client = clickhouse_connect.get_client(host='localhost')

COLUMNS = ['time', 'exchange', 'symbol', 'price', 'volume', 'side']

# Correct: batch at least 1000-10000 rows per insert
def flush_buffer(rows: list[dict]) -> None:
    if len(rows) < 1000:
        return  # accumulate further

    client.insert(
        'trades',
        [[r[c] for c in COLUMNS] for r in rows],
        column_names=COLUMNS
    )
    rows.clear()

Minimum recommended batch: 1000 rows. Optimal: 10,000-100,000 rows. Frequency: no more than 1-2 times per second per table.
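
One way to satisfy both the size and the frequency limits is a buffer that flushes on whichever threshold is hit first. A minimal sketch; the sink callable is an assumption standing in for a wrapper around client.insert, and a real pipeline would also flush from a timer so a quiet stream doesn't strand rows:

```python
import time

# Size- and age-bounded insert buffer: flush when the batch reaches
# max_rows, or when the oldest buffered row is max_age seconds old.

class InsertBuffer:
    def __init__(self, sink, max_rows=10_000, max_age=1.0):
        self.sink, self.max_rows, self.max_age = sink, max_rows, max_age
        self.rows = []
        self.first_ts = None  # monotonic time of the oldest buffered row

    def add(self, row):
        if self.first_ts is None:
            self.first_ts = time.monotonic()
        self.rows.append(row)
        if (len(self.rows) >= self.max_rows
                or time.monotonic() - self.first_ts >= self.max_age):
            self.flush()

    def flush(self):
        if self.rows:
            self.sink(self.rows)       # one batch insert
            self.rows, self.first_ts = [], None
```

With max_rows=10_000 and max_age=1.0 this stays within the "at most 1-2 inserts per second per table" guideline under load while keeping latency bounded when traffic is light.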

Materialized Views in ClickHouse

ClickHouse materialized views are trigger-based: they fire on every insert rather than on a schedule as in TimescaleDB. Each inserted block is aggregated independently, so the target table usually needs an engine that merges partial aggregates (SummingMergeTree or AggregatingMergeTree):

-- Materialized view for OHLCV aggregation
CREATE MATERIALIZED VIEW trades_1h_mv
TO trades_1h  -- target table
AS
SELECT
    toStartOfHour(time) AS hour,
    symbol,
    exchange,
    argMin(price, time)  AS open,
    max(price)           AS high,
    min(price)           AS low,
    argMax(price, time)  AS close,
    sum(volume)          AS volume
FROM trades
GROUP BY hour, symbol, exchange;

ASOF JOIN for temporal data

ASOF JOIN is a ClickHouse feature well suited to temporal data: a join on the nearest matching value:

-- Attach asset price to each liquidation (nearest price before event)
SELECT 
    l.time,
    l.symbol,
    l.quantity_usd,
    p.price AS price_at_liquidation
FROM liquidations l
ASOF LEFT JOIN prices p
    ON l.symbol = p.symbol AND l.time >= p.time;
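
The semantics of the >= condition above can be modeled per symbol in plain Python: for each event, take the latest price whose timestamp is at or before the event:

```python
from bisect import bisect_right

# Model of ASOF JOIN ... ON l.time >= p.time for one symbol:
# pick the latest price at or before the event time.

def asof_price(price_times, prices, event_time):
    """price_times must be sorted ascending, parallel to prices.
    Returns None when no price exists at or before event_time
    (the LEFT JOIN case)."""
    i = bisect_right(price_times, event_time) - 1
    return prices[i] if i >= 0 else None
```

ClickHouse does effectively this inside the join, using the sorted right-hand side, which is why ASOF JOIN requires the inequality column to be orderable.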

Common practices

Data types. Use LowCardinality(String) for low-cardinality fields (exchange, symbol, side): it gives 2-10x size savings and faster filtering. Use Decimal instead of Float for financial values to avoid precision issues.
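
The Decimal point is easy to demonstrate: binary floats cannot represent most decimal fractions exactly, so rounding errors accumulate when summing trade amounts:

```python
from decimal import Decimal

# Summing ten 0.1-sized amounts: binary float drifts, Decimal is exact.
float_sum = sum([0.1] * 10)             # accumulates binary rounding error
exact_sum = sum([Decimal('0.1')] * 10)  # exact decimal arithmetic
```

Here float_sum is not equal to 1.0 while exact_sum equals Decimal('1.0'); over millions of rows of prices and volumes that drift becomes visible in reconciliation.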

Partitioning. Monthly partitions (toYYYYMM) are the standard for most financial data and allow dropping old partitions without DELETE.

Monitoring. Key metrics:

  • Table and partition sizes
  • Number of parts (in ClickHouse, many small parts indicate an insert-pattern problem)
  • Slow query execution time
  • Insert latency at peak load

Stack

| Component | For TimescaleDB | For ClickHouse |
|---|---|---|
| Version | TimescaleDB 2.x + PostgreSQL 16 | ClickHouse 24+ |
| Client | psycopg3, asyncpg, SQLAlchemy | clickhouse-connect, clickhouse-driver |
| Migrations | Flyway / Liquibase | custom SQL scripts |
| Monitoring | pg_stat_statements + Grafana | system.query_log + Grafana |

Schema design plus TimescaleDB or ClickHouse setup for a specific data volume and set of query patterns takes 1-2 weeks. If the work includes migrating existing data, add about one more week.