Face tracking implementation in AR app


Implementing Face Tracking and Recognition in AR Applications

Face tracking in mobile applications is a mature technology with clearly defined capabilities and limitations. ARKit with the TrueDepth camera provides a face depth map with millimeter precision. ARCore Augmented Faces and MediaPipe run on the RGB camera and handle fast motion slightly worse, but work on almost any device. The choice depends on the task, and it is easy to pick the wrong stack.

ARKit Face Tracking: What Exactly We Get

ARFaceTrackingConfiguration requires an iPhone X or newer (TrueDepth front camera). It returns an ARFaceAnchor with:

geometry — ARFaceGeometry with 1220 vertices and 2304 triangles. The face mesh is in real-world scale (meters) and updates ~30 times per second. Each vertex has a fixed index, so you can address the nose tip specifically (vertex ~9), the mouth corners (~37, ~45), or the pupils.
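As a sketch of vertex addressing (the index value is an assumption; inspect the actual mesh topology for your ARKit version before relying on it):

func noseTipWorldPosition(of faceAnchor: ARFaceAnchor) -> simd_float4 {
    // Vertex index is an assumption; verify against ARFaceGeometry
    let noseTipIndex = 9
    let localVertex = faceAnchor.geometry.vertices[noseTipIndex]
    // Vertices are in the anchor's local space; lift into world space
    return faceAnchor.transform * simd_float4(localVertex, 1)
}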

blendShapes — a dictionary of 52 blend-shape coefficients: browDownLeft, eyeBlinkLeft, jawOpen, mouthSmileLeft, etc. Each is a Float from 0 to 1. This is the foundation for face-driven animation (morph targets, 3D avatars) and expression recognition.

leftEyeTransform, rightEyeTransform — position and orientation of each eye, for eye tracking and gaze direction.

func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
    // The face anchor is not guaranteed to be first in the array
    guard let faceAnchor = anchors.compactMap({ $0 as? ARFaceAnchor }).first else { return }

    let blinkLeft = faceAnchor.blendShapes[.eyeBlinkLeft]?.floatValue ?? 0
    let jawOpen = faceAnchor.blendShapes[.jawOpen]?.floatValue ?? 0

    // Control UI with blink/mouth
    if blinkLeft > 0.7 { triggerAction() }
}

ARCore AugmentedFace and MediaPipe

ARCore Augmented Faces (Android only): a 468-point face mesh from an ML model running on the RGB camera. AugmentedFace.RegionType NOSE_TIP, FOREHEAD_RIGHT and FOREHEAD_LEFT expose key points. Fewer points than ARKit and no depth map, but it works on most ARCore-supported Android devices without any special sensor.

MediaPipe Face Landmarker task — the cross-platform option (iOS, Android, Web). 478 points, with images passed in via the MPImage wrapper. Open source, free. For tasks without strict realtime requirements (photo analysis, static filters) it is an excellent choice; for a 30 fps live camera it needs a device with a Neural Engine (iPhone) or a modern Android ML accelerator.
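A minimal setup sketch following the MediaPipe Tasks iOS API (the model file name and bundling are assumptions; check the official Face Landmarker documentation):

import MediaPipeTasksVision

func makeFaceLandmarker() throws -> FaceLandmarker {
    let options = FaceLandmarkerOptions()
    // The bundled model asset name is an assumption
    options.baseOptions.modelAssetPath = Bundle.main.path(forResource: "face_landmarker",
                                                          ofType: "task") ?? ""
    options.runningMode = .image   // use .liveStream for a camera feed
    options.numFaces = 1
    return try FaceLandmarker(options: options)
}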

Classification and Expression Recognition

Basic tasks can be handled on blendShapes without ML:

  • Smile: mouthSmileLeft + mouthSmileRight > 0.5
  • Wink: eyeBlinkLeft > 0.85 with eyeBlinkRight < 0.3
  • Surprise: eyeWideLeft + eyeWideRight > 1.2 together with browInnerUp > 0.5
  • Mouth open: jawOpen > 0.4
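These thresholds translate directly into code; a sketch (the values should be tuned per device and lighting):

func detectExpression(_ shapes: [ARFaceAnchor.BlendShapeLocation: NSNumber]) -> String? {
    // Missing coefficients default to 0
    func v(_ key: ARFaceAnchor.BlendShapeLocation) -> Float {
        shapes[key]?.floatValue ?? 0
    }
    if v(.mouthSmileLeft) + v(.mouthSmileRight) > 0.5 { return "smile" }
    if v(.eyeBlinkLeft) > 0.85 && v(.eyeBlinkRight) < 0.3 { return "wink" }
    if v(.eyeWideLeft) + v(.eyeWideRight) > 1.2 && v(.browInnerUp) > 0.5 { return "surprise" }
    if v(.jawOpen) > 0.4 { return "mouthOpen" }
    return nil
}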

This works for simple triggers: game mechanics, hands-free interface control. For emotion recognition (joy, sadness, anger) you need ML classification on top of blendShapes; Create ML can train a classifier on recorded blendShape sequences.

Recognizing a Specific Person

Face recognition (establishing identity) is a fundamentally different task, not covered by face tracking. For identification: Vision framework VNDetectFaceRectanglesRequest (plus VNDetectFaceLandmarksRequest for alignment) → a face embedding via a CoreML model (ArcFace, FaceNet, or your own), then compare embedding vectors against a database.
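The comparison itself is plain vector math; a sketch using cosine similarity (the acceptance threshold is an assumption and must be calibrated for the chosen model):

func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count)
    var dot: Float = 0, normA: Float = 0, normB: Float = 0
    for i in a.indices {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (normA.squareRoot() * normB.squareRoot() + 1e-8)
}

func isSamePerson(_ probe: [Float], _ reference: [Float]) -> Bool {
    // 0.5 is a placeholder threshold; calibrate per embedding model
    cosineSimilarity(probe, reference) > 0.5
}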

LocalAuthentication with LAContext.evaluatePolicy(.deviceOwnerAuthenticationWithBiometrics) invokes Face ID. This is not an SDK for your own recognition logic; it is system biometry. Using system Face ID for user verification in an app is simpler and more secure than building your own.
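A minimal sketch of the system biometry flow:

import LocalAuthentication

func verifyUser(completion: @escaping (Bool) -> Void) {
    let context = LAContext()
    var error: NSError?
    // Bail out if biometry is unavailable or not enrolled
    guard context.canEvaluatePolicy(.deviceOwnerAuthenticationWithBiometrics, error: &error) else {
        completion(false)
        return
    }
    context.evaluatePolicy(.deviceOwnerAuthenticationWithBiometrics,
                           localizedReason: "Confirm it's you") { success, _ in
        DispatchQueue.main.async { completion(success) }
    }
}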

Latency and Performance

ARKit face tracking + 3D mask + environment occlusion on iPhone 12: roughly 8–12% CPU and ~30% GPU at 60 fps UI. On iPhone XR (A12) consumption is higher, sometimes with thermal throttling on long sessions. Monitor via os_signpost and Instruments → Metal System Trace.
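A sketch of signpost instrumentation around each face-anchor update (the subsystem name is a placeholder):

import os.signpost

let faceLog = OSLog(subsystem: "com.example.ar", category: .pointsOfInterest)

func measuredFaceUpdate(_ work: () -> Void) {
    // Begin/end pairs show up as intervals in Instruments
    let id = OSSignpostID(log: faceLog)
    os_signpost(.begin, log: faceLog, name: "FaceUpdate", signpostID: id)
    work()
    os_signpost(.end, log: faceLog, name: "FaceUpdate", signpostID: id)
}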

Running two cameras simultaneously (front + rear) is possible since ARKit 3 on A12+ devices: ARWorldTrackingConfiguration.userFaceTrackingEnabled adds face anchors to a world-tracking session, and ARFaceTrackingConfiguration.isWorldTrackingEnabled does the reverse. For selfie AR, also check ARFaceTrackingConfiguration.supportedVideoFormats to verify the available resolutions.
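A sketch of picking the highest-resolution supported format:

func bestFaceTrackingConfig() -> ARFaceTrackingConfiguration {
    let config = ARFaceTrackingConfiguration()
    // Formats vary by device; choose the widest available resolution
    if let best = ARFaceTrackingConfiguration.supportedVideoFormats
        .max(by: { $0.imageResolution.width < $1.imageResolution.width }) {
        config.videoFormat = best
    }
    return config
}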

Timeline

Face tracking with basic blendShape triggers (game mechanics, hands-free control): 1–2 weeks. Face tracking + 3D mask/accessories + video recording: 2–4 weeks. Expression recognition via an ML classifier: plus 2–3 weeks. Cost is calculated individually.