Hand and Gesture Tracking Setup in AR Games

When a user sees their real hands over AR content and can interact with virtual objects without controllers — that's a different level of immersion. But under the hood this is one of the most technically non-trivial tasks in AR development: tracking 21 points on each hand in real time, recognizing gestures from a continuous stream of poses, and doing all of this under changing lighting, partial hand occlusion, and device movement.

Hardware and software foundation

For mobile AR (iOS/Android), the main path is AR Foundation together with Unity's XR Hands package (XRHandSubsystem), which pairs with AR Foundation 5.x. An important limitation: neither ARKit on iPhone nor ARCore ships built-in hand tracking in the main SDK. On iOS, hand pose estimation comes from Apple's Vision framework (VNDetectHumanHandPoseRequest, iOS 14+). On Android you need either a device with Qualcomm Snapdragon Spaces support, or a third-party solution such as MediaPipe Hands (via plugin or native integration), wired into the XR Hands package through a custom provider.

For headsets without controllers: Meta Quest 2/3 works through OVR Hand components from the Meta XR SDK, or through the OpenXR Hand Tracking extension (XR_EXT_hand_tracking). The second path is preferable for cross-platform projects: the same code works on Quest, Pico, and HoloLens 2 without conditional compilation for each SDK.

HoloLens 2 goes through the Microsoft Mixed Reality Toolkit (MRTK3), which abstracts over WMR hand tracking and already includes ready-made components for pinch, grab, and poke interactions.

Why gesture recognition is harder than it seems

Gesture recognition is not simply "check if the palm is closed". It's classification of temporal pose patterns accounting for transition states and tracking noise.

Problem one: joint jitter. Even a stationary hand produces joint-position fluctuations of ±2–5 mm due to sensor noise. If you detect a "pinch" from the raw thumb and index fingertip positions, the trigger will fire randomly. Solution: a low-pass filter on each joint's position (a Kalman filter, or a simple exponential moving average with alpha ≈ 0.3–0.5) before classification.
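The EMA approach above can be sketched in a few lines. This is an illustrative, engine-agnostic example (not taken from a specific SDK); joint positions are plain (x, y, z) tuples, and the default alpha of 0.4 is an assumed tuning value in the stated range.

```python
class JointSmoother:
    """Low-pass filters each joint's position with an exponential moving average."""

    def __init__(self, alpha: float = 0.4):
        self.alpha = alpha  # 0.3-0.5: higher tracks faster, lower smooths more
        self._state: dict[int, tuple[float, float, float]] = {}

    def update(self, joint_id: int, raw: tuple[float, float, float]):
        prev = self._state.get(joint_id)
        if prev is None:
            smoothed = raw  # first sample: no history to blend with
        else:
            a = self.alpha
            smoothed = tuple(a * r + (1 - a) * p for r, p in zip(raw, prev))
        self._state[joint_id] = smoothed
        return smoothed
```

A per-joint Kalman filter handles fast motion better than EMA, at the cost of extra tuning; EMA is usually a good first pass.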

Problem two: threshold hysteresis. Using one threshold for both entering and exiting a gesture state is a recipe for state flickering. The correct approach: one threshold for activating a gesture (e.g., pinch_distance < 15 mm) and a separate, wider threshold for deactivation (pinch_distance > 25 mm). This is a standard technique, but it is often forgotten.
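A minimal sketch of this hysteresis, using the 15 mm / 25 mm thresholds from the text (distances in millimeters; the class and parameter names are hypothetical):

```python
class PinchDetector:
    """Pinch state with separate enter/exit thresholds to prevent flickering."""

    def __init__(self, enter_mm: float = 15.0, exit_mm: float = 25.0):
        assert exit_mm > enter_mm, "exit threshold must be wider than enter"
        self.enter_mm = enter_mm
        self.exit_mm = exit_mm
        self.active = False

    def update(self, thumb_index_distance_mm: float) -> bool:
        if not self.active and thumb_index_distance_mm < self.enter_mm:
            self.active = True
        elif self.active and thumb_index_distance_mm > self.exit_mm:
            self.active = False
        # Between the two thresholds the previous state is held.
        return self.active
```

The 10 mm dead band between the thresholds absorbs the ±2–5 mm jitter described earlier, so the state can't oscillate frame to frame.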

Problem three: temporal consistency. A gesture must be held for a minimum of N frames (usually 3–5 at 30 fps) before being recognized. This filters out spurious matches during pose transitions.

The XR Interaction Toolkit together with the XR Hands package ships hand-shape and gesture-detection components (XRHandShape and related gesture detectors) that implement part of this logic. But for non-standard gestures (e.g., "draw a circle in the air" or a double pinch) you need custom logic: a state machine or a sequence recognizer with temporal windows.
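As a sketch of such a sequence recognizer, here is a hypothetical "double pinch" detector: two pinch activations within a time window. Timestamps are in seconds, and the 0.6 s window is an assumed tuning value, not from any SDK.

```python
class DoublePinchRecognizer:
    """Fires when two pinch rising edges occur within `window_s` seconds."""

    def __init__(self, window_s: float = 0.6):
        self.window_s = window_s
        self._last_pinch_t: float | None = None
        self._was_pinching = False

    def update(self, pinching: bool, t: float) -> bool:
        """Returns True on the frame the second pinch of a pair lands."""
        fired = False
        rising_edge = pinching and not self._was_pinching
        if rising_edge:
            if (self._last_pinch_t is not None
                    and t - self._last_pinch_t <= self.window_s):
                fired = True
                self._last_pinch_t = None  # consume the pair
            else:
                self._last_pinch_t = t     # remember the first pinch
        self._was_pinching = pinching
        return fired
```

The same rising-edge-plus-window structure generalizes to longer sequences; for path gestures like "circle in the air" a buffer of recent positions plus a shape test replaces the edge detector.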

Integrating hand tracking with game objects

Once tracking is set up, you need to bind hands to game logic. Typical AR interactions: near interaction (touch/press object with hand) and far interaction (ray from palm or finger).

For near interaction, the key is a properly configured Poke Interactor from MRTK, or a custom trigger collider (isTrigger = true) on the fingertip. The main problem: during fast motion a fingertip can travel 5–10 cm in a single frame, tunneling through thin objects without ever firing the trigger. The solution is a sweep test (a SphereCast along the fingertip's movement vector) instead of a simple overlap check.
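The geometry behind the sweep test can be shown engine-agnostically: instead of testing only the fingertip's current position, test the whole segment it traveled this frame. This sketch checks a swept fingertip against a spherical trigger; in Unity the equivalent call is Physics.SphereCast.

```python
import math

def segment_hits_sphere(p0, p1, center, radius: float) -> bool:
    """True if the segment p0->p1 passes within `radius` of `center` (all in meters)."""
    seg = tuple(b - a for a, b in zip(p0, p1))          # frame's movement vector
    to_c = tuple(c - a for a, c in zip(p0, center))
    seg_len2 = sum(s * s for s in seg)
    if seg_len2 == 0.0:              # fingertip did not move this frame
        t = 0.0
    else:                            # parameter of the closest point on the segment
        t = max(0.0, min(1.0, sum(s * c for s, c in zip(seg, to_c)) / seg_len2))
    closest = tuple(a + t * s for a, s in zip(p0, seg))
    return math.dist(closest, center) <= radius
```

A point-in-time check at either endpoint would miss a thin object that sits between them; the swept segment catches it.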

For AR games on mobile devices it's important to account for the hand periodically leaving the camera's field of view. You need to implement graceful degradation: when tracking is lost, interrupt the current interaction cleanly rather than leaving the object hanging in the air at its last known position.

Setup stages

We start by choosing the SDK for the target platforms and devices. This choice determines 80% of the subsequent architecture.

Then comes basic hand tracking setup: importing joint data, skeleton visualization for debugging, and checking tracking quality on the target devices under different lighting conditions.

Next: implementing gestures (starting with pinch as the basic interaction trigger), integration with interactable objects, and fine-tuning filters and thresholds.

Final stage — testing with real users. No developer can predict all the ways people hold their hands.

Task                                              Estimated timeline
Basic hand tracking (pinch + grab)                1–2 weeks
Custom gesture set (5–10 gestures)                2–4 weeks
Complete interaction system without controllers   4–8 weeks

Cost is calculated after analyzing platforms and required interactions.