Trading Matching Engine Development

We design and develop full-cycle blockchain solutions: from smart contract architecture to launching DeFi protocols, NFT marketplaces and crypto exchanges. Security audits, tokenomics, integration with existing infrastructure.
Showing 1 of 1 servicesAll 1306 services
Trading Matching Engine Development
Complex
from 2 weeks to 3 months
FAQ
Blockchain Development Services
Blockchain Development Stages
Latest works
  • image_website-b2b-advance_0.png
    B2B ADVANCE company website development
    1214
  • image_web-applications_feedme_466_0.webp
    Development of a web application for FEEDME
    1161
  • image_websites_belfingroup_462_0.webp
    Website development for BELFINGROUP
    852
  • image_ecommerce_furnoro_435_0.webp
    Development of an online store for the company FURNORO
    1041
  • image_logo-advance_0.png
    B2B Advance company logo design
    561
  • image_crm_enviok_479_0.webp
    Development of a web application for Enviok
    823

Matching Engine Development

Matching engine is the core of any exchange. It accepts orders, matches buyers with sellers, and generates trades. This is the most performant and most critical component: microsecond delays matter, pricing logic errors are catastrophic. Let's explore the architecture.

What is a Matching Engine

Functions

  1. Order acceptance: new order is validated and directed to the correct order book
  2. Matching: finding compatible orders, generating trades
  3. Order management: partial fills, cancellations, expiry
  4. Event publishing: trades, book updates → downstream systems

All this must happen sequentially (single-threaded matching core) to prevent race conditions. Parallelism is added at the different trading pairs level — each symbol has its own thread/core.

Price-Time Priority (FIFO)

Standard matching algorithm:

  1. Price priority: order with best price executes first
  2. Time priority: at equal price — earlier order has priority

For buys "best price" = higher (willing to pay more). For sells = lower (willing to accept less).

Data Structures

Order Book as Sorted Data Structure

use std::collections::{BTreeMap, VecDeque, HashMap};

// Price as integer: avoid float arithmetic (1.23456789 → 123456789)
// All prices in smallest units × 10^8
type Price = i64;
type Qty = u64;
type OrderId = u64;

#[derive(Clone, Debug)]
pub struct Order {
    pub id: OrderId,
    pub user_id: u64,
    pub symbol: String,
    pub side: Side,
    pub price: Price,           // 0 for market orders
    pub quantity: Qty,
    pub filled: Qty,
    pub order_type: OrderType,
    pub time_in_force: TimeInForce,
    pub created_at: u64,        // nanoseconds since epoch
}

impl Order {
    pub fn remaining(&self) -> Qty {
        self.quantity - self.filled
    }
    pub fn is_filled(&self) -> bool {
        self.filled >= self.quantity
    }
}

#[derive(Debug)]
pub struct PriceLevel {
    pub price: Price,
    pub total_qty: Qty,
    pub orders: VecDeque<OrderId>,  // FIFO
}

pub struct OrderBook {
    pub symbol: String,
    // Bids: DESC order (highest price first) → Reverse for BTreeMap
    pub bids: BTreeMap<std::cmp::Reverse<Price>, PriceLevel>,
    // Asks: ASC order (lowest price first)
    pub asks: BTreeMap<Price, PriceLevel>,
    // Fast lookup by order ID
    pub orders: HashMap<OrderId, Order>,
    // Sequence number for event ordering
    pub sequence: u64,
}

Using BTreeMap instead of HashMap for price levels: O(log n) for insert/delete, but iteration over levels = O(1) for best price. HashMap is O(1) for lookup, but no order. Combination of both is correct solution.

Integer Prices Instead of Floating Point

Floating point arithmetic is forbidden in financial calculations. 0.1 + 0.2 != 0.3 in IEEE 754.

// Wrong: float
let price: f64 = 43750.50;

// Correct: integer with fixed precision
const PRICE_PRECISION: i64 = 100_000_000;  // 10^8
let price: i64 = 4_375_050_000_000;  // 43750.50000000

fn float_to_price(f: f64) -> Price {
    (f * PRICE_PRECISION as f64).round() as Price
}

fn price_to_float(p: Price) -> f64 {
    p as f64 / PRICE_PRECISION as f64
}

Matching Algorithm

Limit Order Matching

(See detailed Rust implementation in previous sections)

The process:

  1. Match against opposite side
  2. Check price crossing
  3. Execute at matched price level (FIFO)
  4. Update filled quantities
  5. Remove filled orders from book
  6. Add remaining to book if not fully filled

Market Order

Executes immediately at best available price. Has no price limit.

Stop Orders

Stop orders are triggers, not active orders in the book. When stop price is reached, a real order is created.

Event Sourcing and Persistence

Event Journal

Matching engine doesn't write directly to database — this would kill performance. Instead: all events written to append-only log (Kafka), downstream consumers asynchronously update database.

Matching Engine → Kafka (trades, order_updates) → DB Writer
                                                  → Balance Updater
                                                  → Market Data Publisher
                                                  → Notification Service

Kafka guarantees ordering within partition (partition by symbol → all events for one symbol strictly ordered).

Snapshot + Replay

On restart matching engine restores state from last snapshot + replay events:

Snapshot created every N minutes or M events. Without snapshots with large event count, replay can take hours.

Performance and Optimizations

Profiling Bottlenecks

Typical matching engine bottlenecks in priority order:

  1. Memory allocation: each new order = heap allocation. Solution: object pool / arena allocator
  2. Serialization: packing events into Kafka. Solution: FlatBuffers / Cap'n Proto instead of JSON
  3. Locking: if multiple threads, any mutex kills throughput. Solution: single-threaded per-symbol + lock-free queues
  4. Cache misses: suboptimal memory layout. Solution: cache-friendly structures (SoA instead of AoS)

LMAX Disruptor Pattern

LMAX Disruptor is lock-free ring buffer for inter-thread communication, developed for high-frequency trading.

Ring buffer on 2^N elements, wrapped sequence number. Producers and consumers use CAS (compare-and-swap) operations instead of mutex. Latency: ~1-2 nanoseconds vs ~100ns for mutex.

Latency Benchmarks

Realistic latency figures for different implementations:

Implementation P50 P99 P99.9
Python (asyncio) 2ms 15ms 100ms
Go (stdlib) 200µs 2ms 10ms
Java (Disruptor) 50µs 500µs 2ms
Rust (custom) 10µs 100µs 500µs
C++ (HFT grade) 1-5µs 20µs 100µs

P99.9 (worst 0.1% of requests) is particularly important for reputation — that's where trader complaints go.

Testing

Matching engine is covered with unit tests on edge cases:

  • Partial fill with remainder
  • FOK with insufficient liquidity
  • IOC with partial execution
  • Simultaneous cancellation and fill (race condition)
  • Stop order triggers at moment of placement
  • Decimal overflow at extreme values

Property-based testing (fuzzing) — random order sequences generated, invariant checked: total buy volume = total sell volume, balances reconcile.

Matching engine is component written once and maintained for years. Investments in code quality, tests, and documentation pay off many times over.