ML/AI Trading Strategy Bot Development
ML in trading is not a magic profit button. It's a statistical tool for finding patterns in data that have predictive power. Most attempts to apply ML to trading fail due to overfitting, lookahead bias, or ignoring transaction costs. Here's how to do it right.
Why ML in Trading is Harder Than It Seems
Fundamental Problems
Non-stationarity: markets change. A pattern that worked in 2020 may not work in 2024. The model is trained on the past and applied to a future drawn from a different distribution.
Low signal-to-noise ratio: financial data is extremely noisy. Most patterns a model finds are noise that happened to look "significant" in the training sample by chance.
Lookahead bias: if feature engineering accidentally uses future data, the model learns information that is not available in real time. The backtest looks fantastic; live trading loses money.
Overfitting: a model with 100 parameters fit on a history of 500 trades is almost certainly overfitted.
Right Approach
- A clear hypothesis about what the model predicts and why it should work
- A correct train/validation/test split with no lookahead
- Simple models as a baseline before complex ones
- Transaction costs included in the backtest
- Walk-forward validation
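A minimal sketch of a leakage-free chronological split for hourly candles. The 70/15/15 proportions and the 24-bar embargo gap (one full label horizon dropped between segments, so labels computed from overlapping future windows cannot leak) are illustrative assumptions:

```python
import numpy as np
import pandas as pd

def chrono_split(df: pd.DataFrame, train_frac: float = 0.7,
                 val_frac: float = 0.15, embargo: int = 24):
    """Chronological split with an embargo gap between segments so that
    labels computed from overlapping future windows cannot leak."""
    n = len(df)
    t_end = int(n * train_frac)
    v_end = t_end + int(n * val_frac)
    train = df.iloc[:t_end]
    val = df.iloc[t_end + embargo:v_end]
    test = df.iloc[v_end + embargo:]
    return train, val, test

idx = pd.date_range("2024-01-01", periods=1000, freq="h")
df = pd.DataFrame({"close": np.random.rand(1000)}, index=idx)
train, val, test = chrono_split(df)
```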
Feature Engineering
Types of Features for Crypto Trading
import pandas as pd
import numpy as np
from ta import trend, momentum, volatility
class FeatureEngineer:
    def generate_features(self, df: pd.DataFrame) -> pd.DataFrame:
        """df contains: open, high, low, close, volume"""
        features = pd.DataFrame(index=df.index)

        # === Technical indicators ===
        # Trend
        features['ema_9'] = trend.EMAIndicator(df.close, 9).ema_indicator()
        features['ema_21'] = trend.EMAIndicator(df.close, 21).ema_indicator()
        macd = trend.MACD(df.close)
        features['macd'] = macd.macd()
        features['macd_signal'] = macd.macd_signal()
        features['adx'] = trend.ADXIndicator(df.high, df.low, df.close).adx()

        # Momentum
        features['rsi_14'] = momentum.RSIIndicator(df.close, 14).rsi()
        features['stoch_k'] = momentum.StochasticOscillator(df.high, df.low, df.close).stoch()
        features['cci'] = momentum.CCIIndicator(df.high, df.low, df.close).cci()

        # Volatility
        features['atr'] = volatility.AverageTrueRange(df.high, df.low, df.close).average_true_range()
        bb = volatility.BollingerBands(df.close)
        features['bb_width'] = (bb.bollinger_hband() - bb.bollinger_lband()) / df.close

        # === Price-derived features ===
        # Returns on different horizons
        for period in [1, 3, 6, 12, 24]:
            features[f'return_{period}h'] = df.close.pct_change(period)

        # Distance from moving averages (normalized)
        for period in [20, 50, 200]:
            ma = df.close.rolling(period).mean()
            features[f'dist_ma_{period}'] = (df.close - ma) / ma

        # === Volume features ===
        features['volume_ratio'] = df.volume / df.volume.rolling(20).mean()
        features['obv'] = (np.sign(df.close.diff()) * df.volume).cumsum()
        features['obv_ratio'] = features['obv'] / features['obv'].rolling(20).mean()

        # === Market microstructure ===
        features['high_low_range'] = (df.high - df.low) / df.close
        features['close_position'] = (df.close - df.low) / (df.high - df.low + 1e-10)

        return features.dropna()
Critically important: a signal for candle t must be computed only from data available before t. In practice that means shifting indicator values back one step:
# Wrong: use current candle close to generate signal for same candle
signal = rsi > 70
# Right: signal uses previous candle data
signal = rsi.shift(1) > 70
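A cheap way to verify there is no lookahead is to perturb the most recent (not-yet-seen) bar and confirm that earlier signals are unchanged; a pandas-only sketch:

```python
import pandas as pd

close = pd.Series([100.0, 101.0, 99.0, 102.0, 103.0, 101.0])

# Signal shifted by one bar: at bar t it only uses data through bar t-1
signal = (close.pct_change() > 0).shift(1, fill_value=False)

# Lookahead check: changing the future-most bar must not change
# any signal the strategy would already have acted on
close2 = close.copy()
close2.iloc[-1] = 50.0
signal2 = (close2.pct_change() > 0).shift(1, fill_value=False)
```

If a feature fails this check, some part of its computation reaches into the future.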
Model Selection
Gradient Boosting (XGBoost / LightGBM)
The best baseline for structured data. It trains fast, is reasonably interpretable via feature importances, and is robust to outliers.
import lightgbm as lgb
from sklearn.model_selection import TimeSeriesSplit
class DirectionPredictor:
    def __init__(self, horizon: int = 4):
        self.horizon = horizon  # predict direction N candles ahead
        self.model = None
        self.feature_cols = None

    def prepare_target(self, df: pd.DataFrame) -> pd.Series:
        """Target: 1 if price grows more than 0.5% over `horizon` periods, else 0.
        The last `horizon` rows have no future price yet; keep them NaN so
        they are dropped during alignment instead of becoming false 0-labels."""
        future_return = df.close.shift(-self.horizon) / df.close - 1
        threshold = 0.005  # 0.5%
        return (future_return > threshold).astype(float).where(future_return.notna())

    def train(self, features: pd.DataFrame, prices: pd.DataFrame):
        y = self.prepare_target(prices)

        # Align indices
        common_idx = features.index.intersection(y.dropna().index)
        X = features.loc[common_idx]
        y = y.loc[common_idx]

        # Chronological split: train on the first 70%, validate on the last 30%
        split = int(len(X) * 0.7)
        X_train, X_test = X.iloc[:split], X.iloc[split:]
        y_train, y_test = y.iloc[:split], y.iloc[split:]

        params = {
            'objective': 'binary',
            'metric': 'auc',
            'learning_rate': 0.05,
            'num_leaves': 31,
            'min_data_in_leaf': 50,
            'feature_fraction': 0.8,
            'bagging_fraction': 0.8,
            'bagging_freq': 5,
            'verbose': -1
        }

        train_data = lgb.Dataset(X_train, label=y_train)
        val_data = lgb.Dataset(X_test, label=y_test)

        self.model = lgb.train(
            params,
            train_data,
            valid_sets=[val_data],
            num_boost_round=500,
            callbacks=[lgb.early_stopping(50), lgb.log_evaluation(100)]
        )
        self.feature_cols = X.columns.tolist()

    def predict_proba(self, features: pd.DataFrame) -> float:
        """Probability of the 'up' class for the most recent row."""
        X = features[self.feature_cols].iloc[-1:]
        return float(self.model.predict(X)[0])
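The labeling logic is worth unit-testing in isolation. This standalone function is a sketch mirroring `prepare_target` (same 4-bar horizon and 0.5% threshold), checking both the threshold comparison and the undefined tail:

```python
import pandas as pd

def make_target(close: pd.Series, horizon: int = 4,
                threshold: float = 0.005) -> pd.Series:
    """1.0 if the forward return over `horizon` bars exceeds `threshold`,
    0.0 otherwise; NaN for the last `horizon` bars, which have no future
    price yet and must be dropped rather than silently labeled 0."""
    future_return = close.shift(-horizon) / close - 1
    return (future_return > threshold).astype(float).where(future_return.notna())

close = pd.Series([100.0, 100.0, 100.0, 100.0, 101.0, 100.0])
y = make_target(close)
# Bar 0 looks 4 bars ahead: 101/100 - 1 = 1.0% > 0.5%  -> label 1
# Bar 1 looks 4 bars ahead: 100/100 - 1 = 0.0%         -> label 0
# Bars 2..5 have no bar 4 steps ahead                  -> NaN
```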
LSTM for Sequence Modeling
If the hypothesis is that the sequence matters (not just an indicator's current value but its trajectory over the last N periods), an LSTM can be useful:
import torch
import torch.nn as nn
class PriceLSTM(nn.Module):
    def __init__(self, input_size: int, hidden_size: int = 64, num_layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
            dropout=0.2
        )
        self.classifier = nn.Sequential(
            nn.Linear(hidden_size, 32),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(32, 1),
            nn.Sigmoid()
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sequence_len, features)
        lstm_out, _ = self.lstm(x)
        last_output = lstm_out[:, -1, :]  # take the last timestep
        return self.classifier(last_output)
In practice, an LSTM rarely outperforms LightGBM on daily or hourly bars. On tick data or order-flow sequences it can be more effective.
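Before the model can consume anything, the flat feature matrix has to be cut into overlapping windows matching the (batch, sequence_len, features) input shape above. A numpy sketch (the window length of 4 is an illustrative assumption):

```python
import numpy as np

def make_windows(X: np.ndarray, seq_len: int) -> np.ndarray:
    """(n, features) -> (n - seq_len + 1, seq_len, features); window i
    covers rows [i, i + seq_len), so each sample sees only its own past."""
    n = X.shape[0]
    return np.stack([X[i:i + seq_len] for i in range(n - seq_len + 1)])

X = np.arange(20, dtype=np.float32).reshape(10, 2)  # 10 timesteps, 2 features
W = make_windows(X, seq_len=4)
# W[0] is rows 0..3, W[-1] is rows 6..9: the most recent window ends
# at the most recent row, with no reach into the future.
```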
Walk-forward Validation
A standard random train/test split is invalid for time series: the model ends up training on data that comes after the periods it is tested on, which is lookahead bias at the data level.
def walk_forward_backtest(
    model_class,
    features: pd.DataFrame,
    prices: pd.DataFrame,
    train_window: int = 365,  # training window length, in rows
    test_window: int = 30,    # test window length, in rows
    step: int = 30            # how far the window slides each iteration
) -> pd.DataFrame:
    results = []
    n = len(features)

    for start in range(0, n - train_window - test_window, step):
        train_end = start + train_window
        test_end = train_end + test_window

        X_train = features.iloc[start:train_end]
        X_test = features.iloc[train_end:test_end]
        p_train = prices.iloc[start:train_end]
        p_test = prices.iloc[train_end:test_end]

        # Train a fresh model on this window only
        model = model_class()
        model.train(X_train, p_train)

        # Test on the immediately following period
        predictions = [model.predict_proba(X_test.iloc[:i + 1]) for i in range(len(X_test))]
        period_results = simulate_trading(predictions, p_test)
        results.append(period_results)

    return pd.concat(results)
Walk-forward validation gives a realistic performance estimate: at every step the model is tested on data it has never seen, exactly as in live trading.
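The loop above hands predictions to `simulate_trading`, which is not defined in this section. A minimal long-only sketch (flat-or-long position, fee charged on every position change; the 0.65 threshold and 0.1% fee are assumptions) might look like:

```python
import numpy as np
import pandas as pd

def simulate_trading(predictions, prices: pd.DataFrame,
                     threshold: float = 0.65, fee: float = 0.001) -> pd.DataFrame:
    """Toy long-only simulator: hold a long position over the next bar
    whenever the predicted probability exceeds `threshold`, and charge
    `fee` on every position change to model transaction costs."""
    pos = (np.asarray(predictions) > threshold).astype(float)
    # Return earned between bar i and bar i+1, booked at bar i
    bar_ret = prices.close.pct_change().shift(-1).fillna(0.0).to_numpy()
    trades = np.abs(np.diff(pos, prepend=0.0))  # 1.0 at each entry/exit
    return pd.DataFrame({"return": pos * bar_ret - trades * fee},
                        index=prices.index)

prices = pd.DataFrame({"close": [100.0, 101.0, 102.0, 101.0]})
res = simulate_trading([0.7, 0.7, 0.5, 0.5], prices)
```

A real simulator also needs slippage, position sizing, and stop logic; this version only captures the shape of the interface.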
Integration into Trading Bot
class MLTradingBot:
    def __init__(self, model: DirectionPredictor, feature_eng: FeatureEngineer,
                 threshold: float = 0.65):
        self.model = model
        self.feature_eng = feature_eng
        self.threshold = threshold  # minimum probability for entry

    async def on_candle(self, candle: Candle):
        features = self.feature_eng.update(candle)
        prob_up = self.model.predict_proba(features)

        if prob_up > self.threshold and not self.has_position():
            await self.open_long()
        elif prob_up < (1 - self.threshold) and not self.has_position():
            await self.open_short()
        elif self.has_position():
            # Exit if the model has become less confident
            current_side = self.position.side
            if current_side == 'long' and prob_up < 0.5:
                await self.close_position("model_signal_weak")
            elif current_side == 'short' and prob_up > 0.5:
                await self.close_position("model_signal_weak")
Important: a threshold of 0.65 means "enter only if the model is at least 65% confident of a rise". This reduces the number of trades but improves their quality. The optimal threshold is determined on validation data.
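Determining that threshold can be as simple as scanning a grid on validation data and keeping the value with the best mean realized return per taken trade. This is a hedged sketch: the grid, the `min_trades` cutoff, and the scoring function are all stand-ins for whatever metric you actually optimize:

```python
import numpy as np

def pick_threshold(probs: np.ndarray, realized: np.ndarray,
                   grid=None, min_trades: int = 20):
    """Scan a threshold grid on *validation* data and keep the value that
    maximizes mean realized return per taken trade; thresholds leaving
    too few trades are skipped as statistically untrustworthy."""
    if grid is None:
        grid = np.arange(0.50, 0.86, 0.05)
    best_t, best_score = None, -np.inf
    for t in grid:
        taken = probs > t
        if taken.sum() < min_trades:
            continue
        score = realized[taken].mean()
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score

# High-confidence signals were profitable, low-confidence ones were not:
probs = np.concatenate([np.full(50, 0.9), np.full(50, 0.52)])
realized = np.concatenate([np.full(50, 0.01), np.full(50, -0.01)])
t, score = pick_threshold(probs, realized)
```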
Key Mistakes
| Mistake | Why Dangerous | Solution |
|---|---|---|
| Lookahead bias in features | Unrealistic backtest | Always shift by 1 period |
| No transaction costs | Losing live | Include 0.1-0.2% per trade |
| Random train/test split | Lookahead at the data level | Walk-forward only |
| Too many features | Overfitting guaranteed | Feature selection, L1 regularization |
| No model retraining | Degradation over time | Retrain every 30-90 days |
An ML bot is not "set and forget": markets drift and models degrade. It requires monitoring model metrics live and periodic retraining.
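One concrete form of that monitoring: track the rolling hit rate of live predictions against realized outcomes and flag the model for retraining when it decays. The window and cutoff here are illustrative assumptions, not tuned values:

```python
import pandas as pd

def needs_retrain(preds: pd.Series, outcomes: pd.Series,
                  window: int = 200, min_hit_rate: float = 0.52) -> bool:
    """Flag for retraining when the rolling hit rate of live predictions
    (rounded probabilities vs realized 0/1 outcomes) drops below
    `min_hit_rate` over the last `window` trades."""
    hits = (preds.round() == outcomes).astype(float)
    rolling_hit_rate = hits.rolling(window).mean()
    recent = rolling_hit_rate.dropna()
    return bool(len(recent) > 0 and recent.iloc[-1] < min_hit_rate)

preds = pd.Series([0.9] * 300)               # model keeps saying "up"
outcomes = pd.Series([1] * 150 + [0] * 150)  # regime flips halfway through
```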