How does AI matching reduce passenger wait time?

The algorithm simultaneously evaluates ETA for all drivers using Haversine distance, traffic speed, and driver load, then assigns the optimal one. Batch matching every 30 seconds reduces average ETA by 15–20%.

What technologies are used for matching?

Python (NumPy, SciPy for Hungarian algorithm), OSRM for routing, HuggingFace Transformers for embeddings, MLOps stack: MLflow, Kubeflow. For real-time, greedy approximation with 5-second granularity.

Can the system be integrated with an existing platform?

Yes, we provide an API service. The architecture is microservice-based and easy to integrate. The standard stack includes FastAPI, Redis for caching, PostgreSQL for logs.

How do you ensure low latency?

The Hungarian algorithm O(n³) for 1000 drivers runs in <500ms. For real-time, we use greedy approximation. Triton Inference Server for GPU model inference.

What matching quality metrics do you track?

Primary: average ETA, match rate (percentage of assigned rides), driver utilization (time with passenger), earnings per hour. We aim for match rate >95% and utilization >70%.

How does AI matching reduce passenger wait time?

The algorithm simultaneously evaluates ETA for all drivers using Haversine distance, traffic speed, and driver load, then assigns the optimal one. Batch matching every 30 seconds reduces average ETA by 15–20%.

What technologies are used for matching?

Python (NumPy, SciPy for Hungarian algorithm), OSRM for routing, HuggingFace Transformers for embeddings, MLOps stack: MLflow, Kubeflow. For real-time, greedy approximation with 5-second granularity.

Can the system be integrated with an existing platform?

Yes, we provide an API service. The architecture is microservice-based and easy to integrate. The standard stack includes FastAPI, Redis for caching, PostgreSQL for logs.

How do you ensure low latency?

The Hungarian algorithm O(n³) for 1000 drivers runs in <500ms. For real-time, we use greedy approximation. Triton Inference Server for GPU model inference.

What matching quality metrics do you track?

Primary: average ETA, match rate (percentage of assigned rides), driver utilization (time with passenger), earnings per hour. We aim for match rate >95% and utilization >70%.

Smart Driver-Passenger Matching for Ridesharing Services

Q: What matching quality metrics do you track?

Primary: average ETA, match rate (percentage of assigned rides), driver utilization (time with passenger), earnings per hour. We aim for match rate >95% and utilization >70%.

We design and deploy artificial intelligence systems: from prototype to production-ready solutions. Our team combines expertise in machine learning, data engineering and MLOps to make AI work not in the lab, but in real business.

8+Years of workmore info 900+Completed projectsmore info 100+In house employeesmore info 19+Partnersmore info

Services we offer

Showing 1 of 1All 1564 services

Smart Driver-Passenger Matching for Ridesharing Services

Medium

~1-2 weeks

Frequently Asked Questions

AI Development Areas

Discuss your AI project

Free consultation — we'll show you how AI can solve your challenge

Get a quote

We'll estimate the budget and timeline for your AI project

AI Solution Development Stages

Latest works

B2B ADVANCE company website development
1360
Development of a web application for FEEDME
1251
Website development for BELFINGROUP
957
Development of an online store for the company FURNORO
1188
B2B Advance company logo design
646
Development of a web application for Enviok
929

Show more works

Smart Driver-Passenger Matching for Ridesharing Services

Driver drives 15 minutes to a passenger, then drives them 5 minutes — familiar? The reason is suboptimal matching. When the algorithm simply assigns the nearest driver, it ignores future demand, driver load, and the possibility of trip pooling. As a result, passengers wait longer, drivers idle, and the platform loses revenue. We are a team of AI/ML engineers with 40+ years of cumulative experience in ridesharing, having completed over 20 matching projects. Our approach combines combinatorial optimization and machine learning, reducing ETA by 30–40% and increasing driver utilization to 72%, while cutting platform operational costs by 25%. Our team brings 40+ years of combined experience, 20+ projects, and over 5 years in the ridesharing market, ensuring proven results.

On one project, we encountered a situation where greedy matching gave a match rate of just 85% and utilization of 55% due to ignoring demand forecasts. After implementing batch matching with a demand heatmap, within 2 weeks match rate grew to 96% and average driver earnings increased by 18% — to $500 per month per driver.

To improve matching quality, we use embeddings to represent requests and drivers in vector space. The matching algorithm considers the dynamic pricing (surge) coefficient to prioritize rides during peak hours.

How We Develop the Matching Algorithm

For batch matching, we use the Hungarian algorithm on a cost matrix computed from ETA, driver quality, and detour coefficient. Here is the full engine code, which we deliver to the client:

Click to expand code

import numpy as np
from scipy.optimize import linear_sum_assignment
from dataclasses import dataclass
from typing import Optional
import heapq

@dataclass
class Driver:
    id: str
    lat: float
    lon: float
    current_passengers: int
    max_passengers: int
    rating: float
    acceptance_rate: float
    vehicle_type: str  # economy, comfort, xl

@dataclass
class RideRequest:
    id: str
    pickup_lat: float
    pickup_lon: float
    dropoff_lat: float
    dropoff_lon: float
    passenger_count: int
    vehicle_preference: str
    max_wait_seconds: int
    surge_accepted: bool

class RideshareMatchingEngine:
    """Driver-passenger matching with multiple criteria"""

    EARTH_RADIUS_KM = 6371.0

    def haversine_distance(self, lat1: float, lon1: float,
                            lat2: float, lon2: float) -> float:
        """Distance in km"""
        dlat = np.radians(lat2 - lat1)
        dlon = np.radians(lon2 - lon1)
        a = (np.sin(dlat/2)**2 +
             np.cos(np.radians(lat1)) * np.cos(np.radians(lat2)) * np.sin(dlon/2)**2)
        return 2 * self.EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

    def estimated_pickup_time(self, driver: Driver, request: RideRequest) -> float:
        """ETA in minutes (simplified via distance; in production: OSRM/Google Maps)"""
        dist_km = self.haversine_distance(
            driver.lat, driver.lon,
            request.pickup_lat, request.pickup_lon
        )
        # Average speed with urban traffic: 20-25 km/h
        return dist_km / 22 * 60

    def compute_match_score(self, driver: Driver,
                             request: RideRequest) -> float:
        """
        Composite score for matching. Minimize ETA + maximize
        utilization + consider preferences and driver quality.
        """
        eta_min = self.estimated_pickup_time(driver, request)

        # Hard constraints
        if driver.vehicle_type != request.vehicle_preference and request.vehicle_preference != 'any':
            if not (request.vehicle_preference == 'economy' and driver.vehicle_type == 'comfort'):
                return -1.0  # Invalid match

        if driver.current_passengers + request.passenger_count > driver.max_passengers:
            return -1.0  # No capacity

        if eta_min > request.max_wait_seconds / 60:
            return -1.0  # Too long to wait

        # Normalize components (lower ETA = higher score)
        eta_score = max(0, 1.0 - eta_min / 10)  # 0 min = 1.0, 10+ min = 0

        # Driver quality
        quality_score = (driver.rating - 4.0) / 1.0 * 0.5 + driver.acceptance_rate * 0.5

        # Detour factor for pool rides (if driver already has passengers)
        if driver.current_passengers > 0:
            detour_factor = 0.7  # Pool ride less attractive
        else:
            detour_factor = 1.0

        return eta_score * 0.55 + quality_score * 0.25 + detour_factor * 0.20

    def batch_match(self, drivers: list[Driver],
                     requests: list[RideRequest]) -> dict:
        """
        Optimal batch matching via Hungarian algorithm.
        Runs every 30 seconds for accumulated requests.
        """
        n_drivers = len(drivers)
        n_requests = len(requests)

        if n_drivers == 0 or n_requests == 0:
            return {'matches': [], 'unmatched_requests': [r.id for r in requests]}

        # Cost matrix (Hungarian minimizes, so invert score)
        cost_matrix = np.full((n_drivers, n_requests), 1000.0)

        for i, driver in enumerate(drivers):
            for j, request in enumerate(requests):
                score = self.compute_match_score(driver, request)
                if score >= 0:
                    cost_matrix[i, j] = 1.0 - score  # Invert for minimization

        # Hungarian algorithm O(n³)
        driver_indices, request_indices = linear_sum_assignment(cost_matrix)

        matches = []
        matched_request_ids = set()

        for d_idx, r_idx in zip(driver_indices, request_indices):
            if cost_matrix[d_idx, r_idx] < 900.0:  # Not a dummy assignment
                matches.append({
                    'driver_id': drivers[d_idx].id,
                    'request_id': requests[r_idx].id,
                    'eta_min': round(self.estimated_pickup_time(drivers[d_idx], requests[r_idx]), 1),
                    'score': round(1.0 - cost_matrix[d_idx, r_idx], 3)
                })
                matched_request_ids.add(requests[r_idx].id)

        unmatched = [r.id for r in requests if r.id not in matched_request_ids]

        return {
            'matches': matches,
            'unmatched_requests': unmatched,
            'match_rate': len(matches) / max(len(requests), 1)
        }


class DriverPositioningAdvisor:
    """Recommends where driver should move for next order"""

    def suggest_repositioning(self, driver: Driver,
                               demand_heatmap: dict,
                               nearby_drivers: list[Driver],
                               radius_km: float = 3.0) -> dict:
        """
        demand_heatmap: {(lat, lon): expected_requests_next_30min}
        Find zone with high demand and low competition.
        """
        best_zone = None
        best_score = -1.0

        for (zone_lat, zone_lon), expected_demand in demand_heatmap.items():
            dist_to_zone = self.haversine_distance(
                driver.lat, driver.lon, zone_lat, zone_lon
            )
            if dist_to_zone > radius_km:
                continue

            # How many drivers already in this zone
            competing_drivers = sum(
                1 for d in nearby_drivers
                if self.haversine_distance(d.lat, d.lon, zone_lat, zone_lon) < 1.0
            )

            # Demand per driver = demand / (drivers + 1)
            demand_per_driver = expected_demand / (competing_drivers + 1)

            # Penalty for relocation distance
            relocation_cost = dist_to_zone / radius_km * 0.3

            score = demand_per_driver - relocation_cost

            if score > best_score:
                best_score = score
                best_zone = (zone_lat, zone_lon, dist_to_zone, expected_demand)

        if best_zone:
            return {
                'suggest': True,
                'target_lat': best_zone[0],
                'target_lon': best_zone[1],
                'distance_km': round(best_zone[2], 1),
                'expected_wait_min': round(best_zone[2] / 22 * 60, 0),  # Travel time
                'expected_demand': best_zone[3]
            }

        return {'suggest': False, 'reason': 'Already in optimal zone'}

    def haversine_distance(self, lat1, lon1, lat2, lon2) -> float:
        dlat = np.radians(lat2 - lat1)
        dlon = np.radians(lon2 - lon1)
        a = np.sin(dlat/2)**2 + np.cos(np.radians(lat1)) * np.cos(np.radians(lat2)) * np.sin(dlon/2)**2
        return 2 * 6371.0 * np.arcsin(np.sqrt(a))

Batch matching every 30 seconds (vs greedy online matching) reduces average ETA by 15–20%. Driver positioning recommendations boost their earnings per hour by 10–15% and improve coverage of high-demand areas. The Hungarian algorithm guarantees globally optimal assignment within the batch.

What's Included

Component	Description
Matching module	Customizable engine with ETA, quality, detour weights. Python code with O(n³) batch matching
Positioning module	Driver recommendations based on demand heatmap and competition
Demand forecast	ML model (XGBoost/LSTM) to predict demand 30 minutes ahead
MLOps pipeline	MLflow for tracking, Kubeflow for orchestration, metric monitoring
Documentation	API specification (OpenAPI), architecture diagram, deployment guide
Team training	2-day workshop on code and operations

Comparison with Classic Approach

Criterion	Standard (greedy)	Ours (batch optimal)
Average ETA	7 min	5.5 min
Match rate	92%	97%
Driver utilization	60%	72%
Overhead per match	2 ms	25 ms
Operational cost per ride	$0.20	$0.05

Our batch optimal algorithm is 1.3 times faster than greedy in terms of ETA, match rate is 1.05 times higher, and utilization is 1.2 times better.

ETA Comparison by Time of Day

Time of day	Greedy algorithm	Batch optimal
Peak (8-10 AM)	10 min	7.5 min
Afternoon	6 min	4.5 min
Evening (6-8 PM)	9 min	6.5 min

How We Forecast Demand

We use an ensemble of XGBoost and LSTM models. Input features include historical order data with 500x500 meter grid coordinates, time of day, day of week, and weather conditions. The model outputs a heatmap of expected request counts per cell for the next 30 minutes. This heatmap feeds the driver positioning and batch matching modules. Example heatmap format:

{
  "(55.751, 37.617)": 12,
  "(55.753, 37.620)": 8
}

Metrics We Track

Beyond ETA and match rate, we monitor economic metrics: average driver earnings per hour, deadhead miles, and passenger satisfaction (trip rating). Our systems reduce platform operational costs by approximately $0.15 per ride due to shorter pickup distances. For a platform processing 1 million rides monthly, that's $150,000 in savings.

Common Implementation Mistakes

Ignoring demand heatmap — uneven load, increased ETA during peak hours.
Lack of ML for demand forecasting — low utilization, drivers waiting in empty zones.
Too frequent recalculation (every 5 sec) — excessive load without quality improvement.
Not accounting for capacity constraints — errors in pool rides.
Neglecting dynamic pricing — platform loses revenue during peak hours.

Implementation Process

Analytics — audit current metrics (ETA, match rate, utilization), analyze historical data, identify bottlenecks.
Design — architecture (microservices: FastAPI, Redis, Kafka), choose package versions.
Implementation — write code with unit tests (coverage > 90%), code review.
Integration — connect via REST/gRPC, set up CI/CD.
Load testing — simulate 10k+ drivers and 100k+ requests, p99 latency < 1s.
Deployment and monitoring — deploy in your environment, Grafana dashboards, alerts.

Timelines and Cost

Estimated timelines range from 3 to 6 weeks depending on data volume and integration complexity. Cost is calculated individually after analysis. With 5+ years in the ridesharing industry and over 20 matching projects completed, our team ensures efficient delivery. Contact us for a consultation — we will assess your data volume and propose a solution within 3–5 days.

We guarantee source code transparency and the ability for your team to make future modifications. Order development of a matching system — we help make matching more efficient and increase your platform's revenue.

Recommender System Development: From Collaborative Filtering to Real-Time Serving

On one e-commerce project with a catalog of 300k SKUs, we boosted CTR from 1.8% to 4.4% — a 2.4x increase. The first leap came from switching from 'popular in the last 7 days' to collaborative filtering; the second from adding content features and re-ranking. The difference between showing popular items and showing personalized recommendations is measurable and significant. Below is the engineering experience that made this possible, along with architectures that actually work in production.

Collaborative Filtering: Matrix Factorization and Neural Approaches

Matrix Factorization is the classic approach for implicit feedback (clicks, views, purchases without explicit ratings). ALS (Alternating Least Squares) from the Implicit library handles user×item matrices with hundreds of millions of non-zero values in minutes on GPU. Latent factors 64–256, regularization λ=0.01–0.1 are starting parameters. Cold start problem: no history for new users or items — pure CF fails; content features or hybrid approach needed.

Neural Collaborative Filtering (NCF) replaces the dot product with a neural network. In practice, the gain over a well-tuned ALS is modest, but NCF is easier to extend with additional features (age, category, time of day). Sequence-aware models (SASRec, BERT4Rec) account for the order of interactions — state-of-the-art for session-based recommendations.

How to Choose Recommender System Architecture?

The answer depends on data, load, and cold start requirements. Below are three main approaches with selection criteria.

Criterion	Collaborative Filtering	Content-Based Filtering	Hybrid (two-stage)
Data required	Interaction history	Item/user features	Both
Cold start	Poor	Works for new items	Partially solved
Diversity (long-tail)	Low, popularity bias	High	Medium–High
Serving latency	<5 ms (precomputed)	<10 ms (FAISS)	20–50 ms
Implementation complexity	Low	Medium	High

Hybrid architecture outperforms pure CF by 20–40% in long-tail coverage — validated on catalogs from 100k SKU.

Content-Based Filtering: When Interaction History is Scarce

Content-based recommends based on item characteristics rather than other users' behavior — solves cold start for new items. Text embeddings via sentence-transformers (multilingual-e5-base, BGE-M3) → similarity search using FAISS IndexFlatIP — query in <5 ms for 100k items. Item2Vec (Word2Vec on view sequences) yields interpretable 'similar items' in a couple hours of training.

Structured features (category, brand, price) are fed through embedding layers or gradient boosting — CatBoost handles categories without manual encoding.

Why Hybrid Models Work Better?

Production systems are almost always two-level. Stage 1 (Retrieval) — fast selection of 100–500 candidates from 300k items using ALS or Two-Tower model with vector search (FAISS, Qdrant). Stage 2 (Ranking) — heavy ranker on LightGBM or neural network with cross-features, time, device, and session context. LightFM is a good starting point for medium scale without heavy infrastructure. Our practice shows: moving from single-stage to two-stage yields a 15–25% accuracy improvement with only 20–30 ms additional latency.

Real-Time Serving: Architecture Under Load

Latency SLA — 50–100 ms at thousands of requests per second. Base recommendations precomputed (batch job hourly) → Redis by user_id → <5 ms. Real-time re-ranking via Kafka for events (clicks, cart adds) → update of context features. Feature serving — Redis with TTL (views in 24 hours, last clicked item). At 10k req/s, we deploy Redis Cluster with replication.

A/B testing is the only reliable way to measure improvements. Offline metrics do not always correlate with online. Kohavi et al., 'Online Controlled Experiments at Large Scale' (KDD 2013) — a must-read for the team. Test on 5–10% of traffic, monitor CTR, conversion, revenue per session. One of our client systems after hybridization increased revenue by 18% over a month of A/B.

Recommender System Development Timeline

The stages and typical time frames are in the table below. Costs are calculated individually based on catalog scale and latency requirements.

Stage	Duration	Result
Data audit and baseline	1–2 weeks	Report with matrix density, cold start zones, 'popular' metrics
Prototype (offline validation)	2–3 weeks	Working model with offline metrics (Recall@k, NDCG)
Production system (two-stage, A/B)	1.5–2.5 months	Low-latency service with monitoring and A/B infrastructure
Team training and documentation	1–2 weeks	Model card, deployment runbook, fine-tuning session

What's Included in Turnkey Development

Data audit — user×item matrix density (typically <0.1%), activity distribution, temporal patterns, cold start statistics.
Baseline — 'popular' as a simple threshold that is often hard to beat.
Iterative improvement — ALS → content features → two-stage → sequence-aware. Each step with A/B.
Serving infrastructure — batch precomputation, Redis, real-time re-ranking, Grafana monitoring.
Documentation — model card with metrics, deployment instructions, feature descriptions.
Team training — session on interpreting results and model fine-tuning.
Support — 1 month post-launch (incident fixes, pipeline tuning).

We are a team with 7+ years of experience in recommender systems, having delivered over 30 projects for e-commerce and media. We guarantee transparent A/B testing and documented metric improvements.

Want to assess the growth potential of your catalog? Contact us for a free data audit. Order recommender system development — first prototype within two weeks.

Example ALS config for implicit feedback

from implicit.als import AlternatingLeastSquares

model = AlternatingLeastSquares(
    factors=64,
    regularization=0.05,
    iterations=15,
    use_gpu=True
)
model.fit(user_item_matrix)

More about the mathematics of recommender systems — in specialized literature.