AI Frame Interpolation Implementation

AI-Based Video Frame Interpolation

Converting 24 fps to 60 fps, or 30 fps to 120 fps, by duplicating frames causes judder on fast motion. AI frame interpolation instead synthesizes the intermediate frames using optical flow, and the result is smoother than duplication or simple frame blending.
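To see why motion estimation matters, consider the naive baseline: plain cross-fading between two frames. The toy sketch below (illustrative only, operating on a 1-D list of pixel values rather than a real image) shows that blending leaves two faded copies of a moving object instead of one object at its intermediate position:

```python
def blend_midframe(f0, f1, t=0.5):
    """Naive cross-fade between two frames: no motion compensation,
    so a moving object shows up twice at reduced brightness --
    exactly the 'ghosting' that flow-based interpolation avoids."""
    return [int((1 - t) * a + t * b) for a, b in zip(f0, f1)]

# Toy 1-D "frame": a bright pixel moves from position 0 to position 4.
f0 = [255, 0, 0, 0, 0]
f1 = [0, 0, 0, 0, 255]
mid = blend_midframe(f0, f1)
# -> [127, 0, 0, 0, 127]: two half-bright ghosts, instead of the
#    single pixel at position 2 that flow-based warping would produce
```

Optical-flow-based methods estimate where each pixel moved and warp it to the intermediate position, which is why they beat any per-pixel blend.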

RIFE — Practical Tool

RIFE (Real-Time Intermediate Flow Estimation) is the fastest open-source method: on an RTX 3080 at 1080p it runs at roughly 30 frames per second with 2x interpolation.
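That throughput figure translates directly into wall-clock estimates for a clip. The helper below is a back-of-the-envelope sketch (our own, not part of RIFE), assuming the ~30 FPS refers to synthesized frames and ignoring decode/encode overhead:

```python
def estimate_runtime_s(duration_s: float, in_fps: float,
                       multiplier: int, synth_fps: float) -> float:
    """Rough wall-clock estimate: synthesized-frame count / throughput.
    Assumes throughput is quoted in synthesized frames per second
    and ignores video decode/encode time."""
    n_input = duration_s * in_fps
    n_synth = n_input * (multiplier - 1)  # new frames per source frame
    return n_synth / synth_fps

estimate_runtime_s(60, 24, 2, 30)  # -> 48.0 s for a 1-minute 24 fps clip at 2x
```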

import torch
import numpy as np
import cv2
from pathlib import Path

# Load RIFE model (IFNet)
from model.RIFE_HDv3 import Model

def interpolate_video_rife(
    input_path: str,
    output_path: str,
    multiplier: int = 2,    # 2x, 4x, 8x — only powers of two in RIFE
    scale: float = 1.0,     # scale for optical flow (0.5 with weak GPU)
    fp16: bool = True
) -> None:
    device = torch.device('cuda')
    if fp16:
        # Half precision roughly halves VRAM use; the official RIFE
        # inference scripts enable it the same way, before model load
        torch.set_default_tensor_type(torch.cuda.HalfTensor)
    model = Model()
    model.load_model('train_log', -1)
    model.eval()
    model.device()   # RIFE's Model wrapper moves the network to CUDA

    cap = cv2.VideoCapture(input_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w   = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h   = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    out_fps = fps * multiplier
    writer = cv2.VideoWriter(
        output_path,
        cv2.VideoWriter_fourcc(*'mp4v'),
        out_fps, (w, h)
    )

    ret, prev_frame = cap.read()
    while ret:
        ret, curr_frame = cap.read()
        if not ret:
            break

        # Convert to tensors
        I0 = torch.from_numpy(prev_frame).permute(2,0,1).float() / 255.0
        I1 = torch.from_numpy(curr_frame).permute(2,0,1).float() / 255.0

        if fp16:
            I0 = I0.half()
            I1 = I1.half()

        I0 = I0.unsqueeze(0).to(device)
        I1 = I1.unsqueeze(0).to(device)

        # Pad to multiple of 32
        pad_h = (32 - h % 32) % 32
        pad_w = (32 - w % 32) % 32
        I0 = torch.nn.functional.pad(I0, [0, pad_w, 0, pad_h])
        I1 = torch.nn.functional.pad(I1, [0, pad_w, 0, pad_h])

        writer.write(prev_frame)

        # HDv3's inference() only synthesizes the t = 0.5 midpoint, so
        # higher multipliers are built by recursive bisection — which
        # is why only power-of-two multipliers are supported
        def bisect(a, b, n):
            if n <= 1:
                return []
            with torch.no_grad():
                mid = model.inference(a, b, scale=scale)
            return bisect(a, mid, n // 2) + [mid] + bisect(mid, b, n // 2)

        for middle in bisect(I0, I1, multiplier):
            mid_np = (middle[0].float().cpu().permute(1, 2, 0).numpy()
                      * 255).astype(np.uint8)
            writer.write(mid_np[:h, :w])  # crop the padding back off

        prev_frame = curr_frame

    writer.write(prev_frame)
    cap.release()
    writer.release()
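The power-of-two restriction on multiplier (noted in the parameter comment) follows from how the model is used: each call yields only the t = 0.5 frame, and higher rates come from bisecting each half again. A model-free sketch of the timesteps this recursion produces:

```python
def bisection_timesteps(multiplier: int) -> list[float]:
    """Timesteps reachable by repeatedly synthesizing midpoints.
    Each recursion level doubles the frame rate, hence the
    power-of-two restriction on the multiplier."""
    def rec(lo, hi, n):
        if n <= 1:
            return []
        mid = (lo + hi) / 2
        return rec(lo, mid, n // 2) + [mid] + rec(mid, hi, n // 2)
    return rec(0.0, 1.0, multiplier)

bisection_timesteps(2)  # -> [0.5]
bisection_timesteps(4)  # -> [0.25, 0.5, 0.75]
```

Note that frames at t = 0.25 are synthesized from an already-synthesized t = 0.5 frame, so errors compound at 4x and 8x; RIFE v4.x models accept an arbitrary timestep directly and avoid this.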

EMA-VFI for Complex Scenes

RIFE loses quality on scenes with occlusions and nonlinear motion. EMA-VFI (Extracting Motion and Appearance via inter-frame attention) is more accurate but 3–4x slower.

Typical Artifacts and Solutions

Ghosting — a semi-transparent double of a moving object. It occurs on fast motion where optical flow estimation fails. Mitigation: reduce scale or switch to EMA-VFI.

Warping artifacts — distortion of text and sharp edges; RIFE handles on-screen text poorly. Mitigation: mask static regions and leave them uninterpolated.
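One cheap form of that masking is a per-pixel static test: wherever the two source frames are nearly identical, copy the source pixel instead of the interpolated one. A toy sketch on flat pixel lists (protect_static_pixels is a hypothetical helper, not a RIFE API; a real pipeline would work per-region with morphological cleanup of the mask):

```python
def protect_static_pixels(prev_f, interp_f, next_f, thresh=8):
    """Where a pixel barely changes between the two source frames,
    keep the source value instead of the interpolated one -- a cheap
    guard against warped static text and UI overlays.
    Hypothetical helper operating on flat pixel lists."""
    return [p if abs(p - n) <= thresh else m
            for p, m, n in zip(prev_f, interp_f, next_f)]

prev_f   = [200, 200, 10]   # e.g. a static caption plus moving content
next_f   = [200, 205, 90]
interp_f = [190, 150, 50]   # interpolation smeared the static pixels
protect_static_pixels(prev_f, interp_f, next_f)  # -> [200, 200, 50]
```

The first two pixels change by at most 5 levels between source frames, so they are treated as static and restored; the third moved genuinely, so the interpolated value is kept.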

Flickering at shot cuts — RIFE doesn't detect scene changes, so it synthesizes a frame between two unrelated scenes. Preprocessing is required: shot detection via PySceneDetect.

from scenedetect import detect, ContentDetector

def find_scene_cuts(video_path: str, threshold: float = 27.0) -> list[int]:
    """
    Returns frame numbers where a new scene begins.
    The first scene always starts at frame 0, so it is skipped.
    """
    scenes = detect(video_path, ContentDetector(threshold=threshold))
    return [scene[0].get_frames() for scene in scenes[1:]]
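With the cut list in hand, the interpolation loop can fall back to plain frame duplication across a cut instead of synthesizing between unrelated scenes. A minimal sketch (should_interpolate is a hypothetical helper wired around the output of find_scene_cuts):

```python
def should_interpolate(frame_idx: int, cuts: list[int]) -> bool:
    """True if it is safe to synthesize frames between frame_idx and
    frame_idx + 1; False when frame_idx + 1 begins a new scene,
    where the previous frame should simply be duplicated instead."""
    return (frame_idx + 1) not in set(cuts)

cuts = [120, 480]                  # scene-start frames from shot detection
should_interpolate(118, cuts)      # -> True
should_interpolate(119, cuts)      # -> False: frame 120 begins a new scene
```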
Method        Speed (1080p)   Quality     VRAM
RIFE          ~30 FPS (2x)    Very good   6–8 GB
EMA-VFI       ~8 FPS (2x)     Excellent   8–10 GB
DAIN          ~2 FPS (2x)     Excellent   11 GB
Super-SloMo   ~3 FPS (8x)     Good        6 GB
Task                                      Timeline
Basic frame interpolation (2x–4x)         1–2 weeks
Production pipeline with shot detection   3–4 weeks
8x interpolation with quality assurance   6–8 weeks