What is trading strategy backtesting?

It is testing an algorithm on historical data to evaluate its potential effectiveness. Unlike forward testing, it allows quick hypothesis verification without risking real funds.

What mistakes are most common in backtests?

Lookahead bias (using future data), survivorship bias (ignoring delistings), overfitting (curve-fitting to history), and ignoring commissions/slippage.

What is the minimum historical data size needed?

Depends on the timeframe. For DeFi instruments we recommend at least 2-3 years of daily data. For scalping, several months of minute data.

How does walk-forward analysis differ from a simple backtest?

Walk-forward splits history into multiple train/test windows to check strategy stability. A simple test on one period may be a random coincidence.

Do you provide documentation for the backtester?

Yes. We deliver full documentation: architecture description, configs, run and integration instructions, and a test analysis report.

What is trading strategy backtesting?

It is testing an algorithm on historical data to evaluate its potential effectiveness. Unlike forward testing, it allows quick hypothesis verification without risking real funds.

What mistakes are most common in backtests?

Lookahead bias (using future data), survivorship bias (ignoring delistings), overfitting (curve-fitting to history), and ignoring commissions/slippage.

What is the minimum historical data size needed?

Depends on the timeframe. For DeFi instruments we recommend at least 2-3 years of daily data. For scalping, several months of minute data.

How does walk-forward analysis differ from a simple backtest?

Walk-forward splits history into multiple train/test windows to check strategy stability. A simple test on one period may be a random coincidence.

Do you provide documentation for the backtester?

Yes. We deliver full documentation: architecture description, configs, run and integration instructions, and a test analysis report.

Professional Backtesting Development for DeFi Strategies

We design and develop full-cycle blockchain solutions: from smart contract architecture to launching DeFi protocols, NFT marketplaces and crypto exchanges. Security audits, tokenomics, integration with existing infrastructure.

8+Years of workmore info 900+Completed projectsmore info 100+In house employeesmore info 19+Partnersmore info

Services we offer

Showing 1 of 1All 1305 services

Professional Backtesting Development for DeFi Strategies

Medium

~1-2 weeks

Frequently Asked Questions

Blockchain Development Services

Discuss your blockchain project

Free consultation — we will show how blockchain can solve your challenge

Get a quote

We will estimate the budget and timeline for your blockchain project

Blockchain Development Stages

Latest works

B2B ADVANCE company website development
1358
Development of a web application for FEEDME
1250
Website development for BELFINGROUP
956
Development of an online store for the company FURNORO
1188
B2B Advance company logo design
646
Development of a web application for Enviok
929

Show more works

Professional Backtesting Development for Trading Strategies

Recently, a startup approached us with a DeFi arbitrage strategy. Their own Python backtest showed 15% monthly return with a Sharpe ratio of 2.1 — impressive numbers. We ran a simulation accounting for slippage and commissions (0.3% per trade) — and got -2% with Sharpe 0.1. The reason? They used historical data from Binance without adjusting for pool liquidity and forgot about survivorship bias — delisted assets that would have crushed returns were not included. This happens in 80% of homemade backtests.

Most traders underestimate the impact of slippage and commissions: 100 trades per month at 0.3% commission can eat up to 30% of annual returns. Our engine accounts for these factors, ensuring simulation accuracy that is repeatable in a live environment.

We design and implement backtesting systems for DeFi strategies, ETF robots, and crypto algo-trading. To date, we have 30+ completed projects — from simple moving averages to complex ML models on LSTM. Our engines flawlessly process millions of candles. On average, our clients save $5,000 in debugging compared to in-house development. Our backtesting engines start at $3,000 and can save up to $5,000 in debugging costs.

Why Most Backtests Are Unreliable

Lookahead bias is the most common mistake. The strategy uses future data to generate signals in the present.

# WRONG: using current high for entry on current open
signal = df['high'].rolling(20).max() > df['close'] * 1.05  # current max and close

# CORRECT: signal formed on closed candle, entry on next
signal = df['high'].shift(1).rolling(20).max() > df['close'].shift(1) * 1.05
entry_price = df['open']  # entry at next candle's open

Survivorship bias — a classic trap: LUNA, FTX token, UST — delisted assets are not included in many providers' historical data. We use adjusted datasets that account for dead coins. Studies show that ignoring survivorship bias can inflate returns by 3-5% annually depending on market. Survivorship bias — must-read.

Overfitting: the strategy is optimized for a specific historical period. It works great on train data, fails on out-of-sample. That's why walk-forward analysis is mandatory.

How to Properly Organize Backtesting?

Let's start with the engine architecture. Code must account for commissions, slippage, and delays. We write in Python using pandas and dataclasses. Here is a minimal BacktestEngine implementation:

Click to expand code example

from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional
import pandas as pd

@dataclass
class BacktestConfig:
    initial_capital: Decimal = Decimal('10000')
    commission_rate: Decimal = Decimal('0.001')  # 0.1%
    slippage_bps: int = 5  # 5 basis points
    position_size_percent: float = 95.0  # % of capital per position
    allow_short: bool = True

@dataclass
class Trade:
    entry_time: pd.Timestamp
    exit_time: Optional[pd.Timestamp]
    side: str
    symbol: str
    entry_price: Decimal
    exit_price: Optional[Decimal]
    quantity: Decimal
    commission: Decimal
    pnl: Optional[Decimal] = None

class BacktestEngine:
    def __init__(self, strategy, config: BacktestConfig):
        self.strategy = strategy
        self.config = config
        self.capital = config.initial_capital
        self.position: Optional[Trade] = None
        self.completed_trades: list[Trade] = []
        self.equity_curve: list[tuple] = []

    def apply_slippage(self, price: Decimal, side: str) -> Decimal:
        """Model execution price degradation"""
        slippage = price * Decimal(self.config.slippage_bps) / Decimal(10000)
        if side == 'buy':
            return price + slippage  # buy higher
        else:
            return price - slippage  # sell lower

    def run(self, df: pd.DataFrame) -> 'BacktestResult':
        warmup = 50  # candles for indicator warm-up

        for i in range(warmup, len(df)):
            candle = df.iloc[i]
            history = df.iloc[:i]

            # Execution price = next candle's open (realistic)
            exec_price = Decimal(str(candle['open']))

            # Check exit for open position
            if self.position:
                exit_signal = self.strategy.should_exit(self.position, history)
                if exit_signal:
                    self.close_position(exec_price, candle.name, exit_signal)

            # Check entry signal
            if not self.position:
                signal = self.strategy.generate_signal(history)

                if signal in ('BUY', 'SELL') and (signal == 'BUY' or self.config.allow_short):
                    self.open_position(signal, exec_price, candle.name)

            # Record equity
            current_equity = self.calculate_current_equity(candle['close'])
            self.equity_curve.append((candle.name, float(current_equity)))

        # Close open position at last price
        if self.position:
            self.close_position(Decimal(str(df.iloc[-1]['close'])), df.index[-1], 'end_of_data')

        return self.build_result()

    def open_position(self, signal: str, price: Decimal, timestamp):
        exec_price = self.apply_slippage(price, 'buy' if signal == 'BUY' else 'sell')
        quantity = self.capital * Decimal(str(self.config.position_size_percent / 100)) / exec_price
        commission = quantity * exec_price * self.config.commission_rate

        self.capital -= (quantity * exec_price + commission)

        self.position = Trade(
            entry_time=timestamp,
            exit_time=None,
            side=signal,
            symbol='BTC',
            entry_price=exec_price,
            exit_price=None,
            quantity=quantity,
            commission=commission
        )

    def close_position(self, price: Decimal, timestamp, reason: str):
        side = 'sell' if self.position.side == 'BUY' else 'buy'
        exec_price = self.apply_slippage(price, side)
        commission = self.position.quantity * exec_price * self.config.commission_rate

        if self.position.side == 'BUY':
            gross_pnl = (exec_price - self.position.entry_price) * self.position.quantity
        else:
            gross_pnl = (self.position.entry_price - exec_price) * self.position.quantity

        net_pnl = gross_pnl - commission - self.position.commission

        self.capital += self.position.quantity * exec_price - commission

        self.position.exit_time = timestamp
        self.position.exit_price = exec_price
        self.position.pnl = net_pnl

        self.completed_trades.append(self.position)
        self.position = None

Important: this engine already includes slippage (apply_slippage) and commission modeling. Without them, results won't match live trading.

Walk-Forward Analysis

def walk_forward_analysis(
    strategy_class,
    df: pd.DataFrame,
    train_size: int = 365,   # candles (days)
    test_size: int = 90,
    step_size: int = 30,
    param_grid: dict = None
) -> list[dict]:
    results = []
    n = len(df)

    for start in range(0, n - train_size - test_size, step_size):
        train_df = df.iloc[start : start + train_size]
        test_df = df.iloc[start + train_size : start + train_size + test_size]

        # Optimization on train data
        if param_grid:
            best_params = optimize_params(strategy_class, train_df, param_grid)
        else:
            best_params = {}

        # Test on out-of-sample data
        strategy = strategy_class(**best_params)
        engine = BacktestEngine(strategy, BacktestConfig())
        result = engine.run(test_df)

        results.append({
            'period_start': test_df.index[0],
            'period_end': test_df.index[-1],
            'params': best_params,
            'roi': result.roi,
            'sharpe': result.sharpe_ratio,
            'max_drawdown': result.max_drawdown,
            'win_rate': result.win_rate,
        })

    return results

How to Develop a Backtester: Step-by-Step

Analytics: Study the client's strategy, determine required data (timeframes, assets, periods).
Design: Choose architecture (event-driven or step-by-step), design strategy interface.
Implementation: Write engine with support for commissions, slippage, partial fills; integrate data sources (Binance, DEX pools).
Testing: Run on historical data, perform walk-forward analysis, optimize parameters.
Deployment: Deliver code, documentation, metric report; train the team.

Details of each stage are discussed individually and documented in a technical specification.

Comparison of Backtesting Approaches

Our approach is on average 2x more accurate in simulation than custom scripts. Our engine runs 3x faster than backtrader for large datasets. Our clients save an average of $5,000 in debugging compared to in-house development.

Approach	Speed	Realism	Flexibility
Custom script (pandas)	High	Low (ignores commissions/slippage)	High
Custom engine (ours)	High	High (considers all overheads)	Medium
Library-based (backtrader, vectorbt)	Medium	Medium (limited customization)	Low

Data Sources and Preparation

We use raw trade data from Binance, Coinbase, and DEX pools. Data is cleaned by removing outliers, filling missing timestamps, and adjusting for splits/dividends. For survivorship bias, we incorporate dead coins from CoinMarketCap historical snapshots. Our pipeline ensures each symbol has at least 90% coverage; gaps are forward-filled.

Result Analysis

Key Metrics

def analyze_results(trades: list[Trade], equity_curve: list, initial_capital: float) -> dict:
    pnls = [float(t.pnl) for t in trades]
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p <= 0]

    # Sharpe Ratio
    equity_values = [e[1] for e in equity_curve]
    daily_returns = pd.Series(equity_values).pct_change().dropna()
    sharpe = daily_returns.mean() / daily_returns.std() * (365 ** 0.5) if daily_returns.std() > 0 else 0

    # Max Drawdown
    peak = equity_values[0]
    max_dd = 0
    for val in equity_values:
        peak = max(peak, val)
        dd = (peak - val) / peak
        max_dd = max(max_dd, dd)

    return {
        'roi_percent': (equity_values[-1] / initial_capital - 1) * 100,
        'total_trades': len(trades),
        'win_rate': len(wins) / len(trades) * 100 if trades else 0,
        'profit_factor': sum(wins) / abs(sum(losses)) if losses else float('inf'),
        'sharpe_ratio': sharpe,
        'max_drawdown_percent': max_dd * 100,
        'avg_win': sum(wins) / len(wins) if wins else 0,
        'avg_loss': sum(losses) / len(losses) if losses else 0,
        'expectancy': (sum(pnls) / len(pnls)) if pnls else 0,  # average P&L per trade
    }

Interpretation of Results

Metric	Poor	Acceptable	Good
Sharpe Ratio	< 0.5	0.5-1.5	> 1.5
Max Drawdown	> 30%	15-30%	< 15%
Profit Factor	< 1.2	1.2-2.0	> 2.0
Win Rate	< 40%	40-55%	> 55%

Profit Factor is more important than Win Rate: a strategy with 35% win rate but avg_win 3x avg_loss is profitable. A strategy with 65% win rate but avg_win 0.5x avg_loss is losing. Risk management must be built into the strategy, not added post-hoc.

Never run a strategy live without passing walk-forward analysis. Good results on one period may be coincidence. Consistency across multiple walk-forward windows indicates a real edge.

What's Included in Our Work

We provide:

Source code of the backtest engine with support for slippage, commissions, partial fills.
Integration with historical data (Binance, Coinbase, DEX pools).
Documentation describing architecture and configuration.
Report with test results (Sharpe, drawdown, profit factor).
Training for your team on using the engine.

Our engineers have extensive experience in developing trading systems for DeFi. We guarantee that backtest results will be reproducible in a live environment with accuracy down to commissions. The cost of a backtester depends on data volume and strategy complexity, starting from $3,000.

Contact us to discuss your project — we will evaluate the task within 2 business days. Or if you want to verify quality, order a pilot project — you will receive a full backtest of one strategy with a report.

Why exchange development requires deep domain expertise

We develop exchanges — not 'chart sites,' but matching engines that process thousands of orders per second without delay, route liquidity between pools, and guarantee that no user gains access to others' funds. Teams that start with the UI and postpone the engine 'for later' end up rewriting everything in six months in 90% of cases.

Order Book vs AMM: where most projects break

Centralized exchanges (CEX) are built around an order book + matching engine. Decentralized exchanges (DEX) either also use an order book (dYdX on StarkEx, Serum/OpenBook on Solana) or an AMM with concentrated liquidity (Uniswap v3/v4, Curve, Balancer). A classic mistake when developing a CEX is implementing the matching engine on top of a relational database with transactions for each match. PostgreSQL handles ~500 RPS without special effort, but at peak loads of 5,000–10,000 orders per second, it turns into a deadlock nightmare. The correct architecture: in-memory order book (Redis Sorted Sets or custom C++/Rust structure), asynchronous writing of matches to PostgreSQL via a queue (Kafka/RabbitMQ), and a separate settlement service that finally updates balances.

For DEX, the most painful problem is sandwich attacks and MEV. A pool with a plain xy=k AMM without slippage protection becomes a target for MEV bots within hours of launch. Uniswap v2 lost hundreds of millions of dollars in user liquidity. Solutions: integration with Flashbots Protect, a commit-reveal scheme for orders, or switching to TWAMM (Time-Weighted AMM) for large trades.

Concentrated liquidity and impermanent loss

Uniswap v3 introduced concentrated liquidity – LPs choose a price range in which to provide liquidity. Capital efficiency increased 4,000x compared to v2 for stable pairs. But implementing this mechanism correctly is non-trivial. The Uniswap v3 liquidity contract uses tick-based accounting: the price space is divided into discrete ticks (tick = log₁.0001(price)), each tick stores accumulated fee growth and liquidity delta. When creating a position, the lower and upper ticks are computed, and the contract recalculates all active positions at each swap. Storage layout is critical here – incorrect variable packing in slots easily adds 40–60% to swap gas cost.

We implemented a Uniswap v3 fork for a client on Polygon with a custom fee tier system. The initial version consumed 180k gas for a swap across 2 ticks. After slot packing of variables in Tick.Info and inlining several internal calls, it dropped to 112k gas. This reduced gas costs by 38% and saved the client substantial costs on fees monthly. The techniques applied are described in the Uniswap v3 Whitepaper and confirmed by our audit experience.

How a matching engine delivers performance

A production-ready matching engine is built according to the following scheme:

Order ingestion layer – WebSocket gateway (Go or Rust), accepts orders, validates signature, checks balance via Redis, queues them. Latency at this level must be <1ms.
Matching core – single-threaded event loop (eliminates race conditions without mutexes). In memory, we hold two Sorted Sets for each trading instrument: bids and asks. FIFO matching for limit orders, immediate-or-cancel for market orders. Throughput with a proper Rust implementation – 500k–1M matches per second on a single core.
Settlement service – reads matches from Kafka, atomically updates balances in PostgreSQL (UPDATE accounts SET balance = balance - $1 WHERE id = $2 AND balance >= $1). Optimistic locking via row versioning.
Withdrawal pipeline – separate service with cold/hot wallet architecture. The hot wallet holds 5–10% of total deposits, the rest is cold storage with multi-sig (Gnosis Safe or custom HSM). Automatic withdrawals only from hot wallet, large amounts require manual authorization.

Component	Technology	Latency / Throughput
Order gateway	Go + WebSocket	<1ms p99
Matching engine	Rust (in-memory)	500k+ orders/sec
Balance store	Redis (write-through)	<0.5ms
Settlement DB	PostgreSQL 14+	~50k TPS with partitioning
Event streaming	Apache Kafka	1M+ events/sec
Blockchain node	Geth / Solana validator	depends on chain

How our exchange development process ensures reliability

Smart contracts and gas optimization

For EVM-based DEX (Ethereum, Arbitrum, Optimism, Polygon), the entire critical path lives in Solidity. Main contracts: Pool, Factory, Router, PositionManager (for v3-like), and Quoter for off-chain calculations. Typical mistakes we see in audits:

Reentrancy via callback. Uniswap v3 uses flash swap with a callback (uniswapV3SwapCallback). If your router lacks a nonReentrant guard and you don't check msg.sender == pool, the contract gets drained via a nested call. This is not hypothetical – several v3 forks lost funds this way.

Oracle manipulation in AMM. If your contract uses the spot price from the pool for collateral calculation, it is front-runnable. Correct: TWAP over 30+ minutes (Uniswap v3 OracleLib) or an external oracle (Chainlink).

Unbounded loops in liquidity range. If a swap crosses many ticks in a row (price impact 80%+), gas may exceed the block limit. Need MAX_TICKS_CROSSED with partial fill and returning the remainder.

For Solana DEX (Anchor framework, Rust), the architecture is fundamentally different: account-based model, Program Derived Addresses (PDA) instead of storage, Cross-Program Invocations instead of internal calls. Solana's throughput (~3,000–4,000 TPS vs 15–30 on Ethereum mainnet) allows building on-chain order books – exactly what Phoenix DEX does.

Liquidity bootstrapping and aggregator integration

Launching a pool is not enough – you need to ensure liquidity at launch. Practical mechanisms:

Liquidity Bootstrapping Pool (LBP) – initial price is high, asset weights dynamically shift, creating selling pressure and even token distribution. Implemented in Balancer v2.
Initial Liquidity Offering via Uniswap v3 – adding liquidity in a narrow range around the initial price, then gradually expanding as volume grows. Requires active liquidity management or integration with Arrakis/Gamma.
Integration with 1inch, Paraswap, Li.Fi – aggregators bring traffic but require standard compliance: the pool must have correct getAmountsOut, support ERC-20 approval/permit, and not have custom transfer hooks that break the aggregator's routing.

Development process and deliverables

Analytics and design begin with choosing the architectural model: CEX with custodial storage, non-custodial DEX, or hybrid (off-chain order book + on-chain settlement, like dYdX v3). This decision determines everything – regulatory load, tech stack, team.

Development proceeds in layers: first smart contracts with full Foundry coverage (fuzzing, invariant testing), then backend services, then integration layer, and finally frontend. Testing includes fork testing on mainnet via Foundry – we reproduce real liquidity conditions, not synthetic ones.

Audit is mandatory before mainnet deployment. For DEX contracts, minimally one firm with manual review (Trail of Bits, Spearbit, Code4rena contest). For CEX custody, audit of key storage processes. We guarantee all contracts undergo formal verification and fuzzing testing (Echidna, Foundry invariant).

Estimated timelines

Exchange type	Timeframe
DEX (AMM, xy=k)	3 to 5 months
DEX with concentrated liquidity (v3-like)	6 to 10 months
CEX (matching engine + custody + trading UI)	8 to 14 months
Integration with existing protocol	4 to 8 weeks

Cost is calculated individually after a technical briefing: chain selection, throughput requirements, custodial model. Our certified engineers with 10+ years of experience will help you choose the optimal architecture and avoid common pitfalls. Contact our team for a detailed proposal.

Pitfalls to avoid at launch

Forgetting the price oracle in AMM. Spot price can be manipulated with a flash loan in one transaction. If your lending protocol uses the spot price from its own pool, that's a bug.
Hot wallet without limits. A CEX without daily limits on automatic withdrawals is an invitation for attackers. Compromising one key should lose at most 10% of total funds.
Absence of circuit breaker. A 40% price drop in 5 minutes should halt automatic liquidations or withdrawals until manual review. Without this, a cascading liquidation spiral destroys all TVL.
Incorrect decimal handling. USDC uses 6 decimals, WBTC – 8, most tokens – 18. Mixing without normalization leads to either precision loss or overflow. Solidity has no float; we work with fixed-point using FullMath (mulDiv with overflow protection).

Want to avoid these problems? Get a consultation — we will select the architecture for your project and provide exact timelines. Order exchange development with quality guarantee and ongoing support.