Trading Matching Engine Development

We design and develop full-cycle blockchain solutions: from smart contract architecture to launching DeFi protocols, NFT marketplaces and crypto exchanges. Security audits, tokenomics, integration with existing infrastructure.

8+Years of workmore info 900+Completed projectsmore info 100+In house employeesmore info 19+Partnersmore info

Services we offer

Showing 1 of 1All 1306 services

Trading Matching Engine Development

Complex

from 2 weeks to 3 months

Frequently Asked Questions

Blockchain Development Services

Discuss your blockchain project

Free consultation — we will show how blockchain can solve your challenge

Get a quote

We will estimate the budget and timeline for your blockchain project

Blockchain Development Stages

Latest works

B2B ADVANCE company website development
1306
Development of a web application for FEEDME
1218
Website development for BELFINGROUP
920
Development of an online store for the company FURNORO
1147
B2B Advance company logo design
610
Development of a web application for Enviok
885

Show more works

Matching Engine Development

Matching engine is the core of any exchange. It accepts orders, matches buyers with sellers, and generates trades. This is the most performant and most critical component: microsecond delays matter, pricing logic errors are catastrophic. Let's explore the architecture.

What is a Matching Engine

Functions

Order acceptance: new order is validated and directed to the correct order book
Matching: finding compatible orders, generating trades
Order management: partial fills, cancellations, expiry
Event publishing: trades, book updates → downstream systems

All this must happen sequentially (single-threaded matching core) to prevent race conditions. Parallelism is added at the different trading pairs level — each symbol has its own thread/core.

Price-Time Priority (FIFO)

Standard matching algorithm:

Price priority: order with best price executes first
Time priority: at equal price — earlier order has priority

For buys "best price" = higher (willing to pay more). For sells = lower (willing to accept less).

Data Structures

Order Book as Sorted Data Structure

use std::collections::{BTreeMap, VecDeque, HashMap};

// Price as integer: avoid float arithmetic (1.23456789 → 123456789)
// All prices in smallest units × 10^8
type Price = i64;
type Qty = u64;
type OrderId = u64;

#[derive(Clone, Debug)]
pub struct Order {
    pub id: OrderId,
    pub user_id: u64,
    pub symbol: String,
    pub side: Side,
    pub price: Price,           // 0 for market orders
    pub quantity: Qty,
    pub filled: Qty,
    pub order_type: OrderType,
    pub time_in_force: TimeInForce,
    pub created_at: u64,        // nanoseconds since epoch
}

impl Order {
    pub fn remaining(&self) -> Qty {
        self.quantity - self.filled
    }
    pub fn is_filled(&self) -> bool {
        self.filled >= self.quantity
    }
}

#[derive(Debug)]
pub struct PriceLevel {
    pub price: Price,
    pub total_qty: Qty,
    pub orders: VecDeque<OrderId>,  // FIFO
}

pub struct OrderBook {
    pub symbol: String,
    // Bids: DESC order (highest price first) → Reverse for BTreeMap
    pub bids: BTreeMap<std::cmp::Reverse<Price>, PriceLevel>,
    // Asks: ASC order (lowest price first)
    pub asks: BTreeMap<Price, PriceLevel>,
    // Fast lookup by order ID
    pub orders: HashMap<OrderId, Order>,
    // Sequence number for event ordering
    pub sequence: u64,
}

Using BTreeMap instead of HashMap for price levels: O(log n) for insert/delete, but iteration over levels = O(1) for best price. HashMap is O(1) for lookup, but no order. Combination of both is correct solution.

Integer Prices Instead of Floating Point

Floating point arithmetic is forbidden in financial calculations. 0.1 + 0.2 != 0.3 in IEEE 754.

// Wrong: float
let price: f64 = 43750.50;

// Correct: integer with fixed precision
const PRICE_PRECISION: i64 = 100_000_000;  // 10^8
let price: i64 = 4_375_050_000_000;  // 43750.50000000

fn float_to_price(f: f64) -> Price {
    (f * PRICE_PRECISION as f64).round() as Price
}

fn price_to_float(p: Price) -> f64 {
    p as f64 / PRICE_PRECISION as f64
}

Matching Algorithm

Limit Order Matching

(See detailed Rust implementation in previous sections)

The process:

Match against opposite side
Check price crossing
Execute at matched price level (FIFO)
Update filled quantities
Remove filled orders from book
Add remaining to book if not fully filled

Market Order

Executes immediately at best available price. Has no price limit.

Stop Orders

Stop orders are triggers, not active orders in the book. When stop price is reached, a real order is created.

Event Sourcing and Persistence

Event Journal

Matching engine doesn't write directly to database — this would kill performance. Instead: all events written to append-only log (Kafka), downstream consumers asynchronously update database.

Matching Engine → Kafka (trades, order_updates) → DB Writer
                                                  → Balance Updater
                                                  → Market Data Publisher
                                                  → Notification Service

Kafka guarantees ordering within partition (partition by symbol → all events for one symbol strictly ordered).

Snapshot + Replay

On restart matching engine restores state from last snapshot + replay events:

Snapshot created every N minutes or M events. Without snapshots with large event count, replay can take hours.

Performance and Optimizations

Profiling Bottlenecks

Typical matching engine bottlenecks in priority order:

Memory allocation: each new order = heap allocation. Solution: object pool / arena allocator
Serialization: packing events into Kafka. Solution: FlatBuffers / Cap'n Proto instead of JSON
Locking: if multiple threads, any mutex kills throughput. Solution: single-threaded per-symbol + lock-free queues
Cache misses: suboptimal memory layout. Solution: cache-friendly structures (SoA instead of AoS)

LMAX Disruptor Pattern

LMAX Disruptor is lock-free ring buffer for inter-thread communication, developed for high-frequency trading.

Ring buffer on 2^N elements, wrapped sequence number. Producers and consumers use CAS (compare-and-swap) operations instead of mutex. Latency: ~1-2 nanoseconds vs ~100ns for mutex.

Latency Benchmarks

Realistic latency figures for different implementations:

Implementation	P50	P99	P99.9
Python (asyncio)	2ms	15ms	100ms
Go (stdlib)	200µs	2ms	10ms
Java (Disruptor)	50µs	500µs	2ms
Rust (custom)	10µs	100µs	500µs
C++ (HFT grade)	1-5µs	20µs	100µs

P99.9 (worst 0.1% of requests) is particularly important for reputation — that's where trader complaints go.

Testing

Matching engine is covered with unit tests on edge cases:

Partial fill with remainder
FOK with insufficient liquidity
IOC with partial execution
Simultaneous cancellation and fill (race condition)
Stop order triggers at moment of placement
Decimal overflow at extreme values

Property-based testing (fuzzing) — random order sequences generated, invariant checked: total buy volume = total sell volume, balances reconcile.

Matching engine is component written once and maintained for years. Investments in code quality, tests, and documentation pay off many times over.