AI Image Super Resolution and Upscale Implementation

We design and deploy artificial intelligence systems, from prototype to production-ready solutions. Our team combines expertise in machine learning, data engineering, and MLOps so that AI delivers value in real business settings, not just in the lab.

AI Super-Resolution — Image Upscaling

Bicubic interpolation can upscale 4x, but the result is blurred. AI super-resolution recovers fine detail: skin texture, text on signs, fabric structure. The difference shows up in PSNR: on photographs, bicubic scores 28–30 dB, while Real-ESRGAN reaches 32–36 dB.
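For reference, the dB figures above come from the standard PSNR formula. A minimal numpy implementation (note that a gain of ~6 dB corresponds to halving the RMS pixel error, so the gap between bicubic and Real-ESRGAN is substantial):

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

In practice you would compute this between the super-resolved output and the original high-resolution ground truth, as done on benchmark sets like Set5.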

Real-ESRGAN — Practical Standard

import torch
import numpy as np
from PIL import Image
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

def upscale_image(
    image_path: str,
    scale: int = 4,
    model_name: str = 'RealESRGAN_x4plus',  # or 'RealESRGAN_x4plus_anime_6B'
    tile_size: int = 512,    # for large images — tile-based processing
    half_precision: bool = True
) -> np.ndarray:
    """
    tile_size=512 with VRAM 6GB, tile_size=0 (whole image) with VRAM 24GB.
    half=True — FP16, saves ~50% VRAM.
    """
    model = RRDBNet(
        num_in_ch=3, num_out_ch=3,
        num_feat=64, num_block=23, num_grow_ch=32,
        scale=scale
    )
    upsampler = RealESRGANer(
        scale=scale,
        model_path=f'weights/{model_name}.pth',
        model=model,
        tile=tile_size,
        tile_pad=10,      # tile overlap for smooth seams
        pre_pad=0,
        half=half_precision,
        device='cuda'
    )

    img = np.array(Image.open(image_path).convert('RGB'))
    output, _ = upsampler.enhance(img, outscale=scale)
    return output

GFPGAN for Face Restoration

Real-ESRGAN sometimes produces artifacts on faces in portraits. GFPGAN adds a dedicated face-restoration stage on top of SR:

from gfpgan import GFPGANer

def restore_face_photo(
    degraded_image: np.ndarray,
    upscale: int = 2,
    arch: str = 'clean',         # 'clean' | 'RestoreFormer'
    channel_multiplier: int = 2,
    weight: float = 0.5          # 0=pure GFPGAN, 1=without face enhancement
) -> np.ndarray:
    """
    weight=0.5 — compromise between restoration and feature preservation.
    At weight=0 faces look "glossy".
    """
    restorer = GFPGANer(
        model_path='weights/GFPGANv1.4.pth',
        upscale=upscale,
        arch=arch,
        channel_multiplier=channel_multiplier,
        bg_upsampler=None   # can pass RealESRGANer for background
    )

    _, _, restored = restorer.enhance(
        degraded_image,
        has_aligned=False,
        only_center_face=False,
        paste_back=True,
        weight=weight
    )
    return restored

Metrics and Model Comparison

Model            PSNR (Set5, 4x)  SSIM   Speed (1080p→4K)   Application
Bicubic          28.42            0.810  Instant            Baseline
SRCNN            30.48            0.862  Fast               Outdated
ESRGAN           32.73            0.901  ~2 s (RTX 3080)    Photos
Real-ESRGAN x4+  33.98            0.918  ~3 s (RTX 3080)    Photos, text
SwinIR-L         34.97            0.932  ~8 s (RTX 3080)    Maximum quality
GFPGAN v1.4      n/a              n/a    ~4 s (RTX 3080)    Portraits

PSNR is not the only criterion: human perception correlates better with LPIPS (a learned perceptual similarity metric). Real-ESRGAN, despite a lower PSNR than SwinIR, often looks better subjectively because it synthesizes more high-frequency detail.
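A small numpy experiment illustrates why PSNR alone can mislead. Using a synthetic random texture for reproducibility: shifting the image by one pixel is perceptually invisible but devastates PSNR, while added Gaussian noise is clearly visible yet scores much higher.

```python
import numpy as np

def psnr(a: np.ndarray, b: np.ndarray, max_val: float = 255.0) -> float:
    mse = np.mean((a - b) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
# synthetic random texture: a worst case for pixel-wise metrics
img = rng.integers(0, 256, (256, 256)).astype(np.float64)

shifted = np.roll(img, 1, axis=1)          # perceptually identical to img
noisy = img + rng.normal(0, 4, img.shape)  # visibly noisy

print(psnr(img, shifted))  # low: pixel-wise comparison heavily penalizes the shift
print(psnr(img, noisy))    # much higher, despite the visible degradation
```

Perceptual metrics like LPIPS compare deep-network feature maps instead of raw pixels, which makes them far more tolerant of such imperceptible misalignments.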

Batch Processing of Large Volumes

from pathlib import Path

import torch
from PIL import Image
from basicsr.archs.rrdbnet_arch import RRDBNet
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms

class ImageDataset(Dataset):
    """Resizes every image to a fixed size so tensors can be batched.
    Note: Resize((size, size)) distorts aspect ratio; pad instead if
    proportions must be preserved."""
    def __init__(self, image_paths: list[str], size: int = 256):
        self.paths = image_paths
        self.transform = transforms.Compose([
            transforms.Resize((size, size)),
            transforms.ToTensor()
        ])

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = Image.open(self.paths[idx]).convert('RGB')
        return self.transform(img), self.paths[idx]

def batch_upscale_pipeline(
    input_dir: str,
    output_dir: str,
    batch_size: int = 4,   # fits in 12GB VRAM at 256px inputs, FP16
    scale: int = 4
):
    # pathlib's glob does not support brace expansion — filter by suffix
    paths = [str(p) for p in Path(input_dir).iterdir()
             if p.suffix.lower() in {'.jpg', '.jpeg', '.png'}]
    Path(output_dir).mkdir(exist_ok=True)

    # RealESRGANer processes one image at a time; for true batch
    # inference, call the RRDBNet model directly
    model = RRDBNet(
        num_in_ch=3, num_out_ch=3,
        num_feat=64, num_block=23, num_grow_ch=32, scale=scale
    )
    model.load_state_dict(
        torch.load('weights/RealESRGAN_x4plus.pth')['params_ema']
    )
    model.eval().cuda().half()

    loader = DataLoader(ImageDataset(paths), batch_size=batch_size)
    for batch, batch_paths in loader:
        with torch.no_grad():
            out = model(batch.cuda().half()).float().cpu()
        for tensor, src in zip(out, batch_paths):
            out_img = transforms.ToPILImage()(tensor.clamp(0, 1))
            out_img.save(Path(output_dir) / (Path(src).stem + '_4x.png'))

Limitations and Common Issues

  • Texture hallucinations: Real-ESRGAN can invent plausible-looking but non-existent detail, e.g. text on signs. This makes it unsuitable for forensic applications
  • OOM on large images: a 12-megapixel photo at 4x upscale becomes 192 megapixels and will not fit in GPU memory in one pass. Solution: tile_size=512 with tile_pad=10
  • JPEG artifacts: SR amplifies JPEG blockiness. Preprocess with JPEG-aware denoising (nf_denoise from BasicSR)
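The tile-based workaround for OOM can be sketched framework-agnostically: split the image into tiles with overlapping padding, upscale each tile, then crop the padding back out when stitching. A minimal numpy version, where the `process` callback stands in for the SR model (names and signature are illustrative, not a library API):

```python
import numpy as np

def tile_process(img, process, tile=512, pad=10, scale=4):
    """Split img (H, W, C) into tiles with `pad` pixels of overlap,
    apply `process` (which maps a (h, w, C) tile to (h*scale, w*scale, C)),
    crop the padded borders and stitch the results back together."""
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=img.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            # input window extended by `pad` pixels for seamless seams
            y0, y1 = max(y - pad, 0), min(y + tile + pad, h)
            x0, x1 = max(x - pad, 0), min(x + tile + pad, w)
            result = process(img[y0:y1, x0:x1])
            # crop the upscaled padding back out of the result
            ty, tx = (y - y0) * scale, (x - x0) * scale
            th = min(tile, h - y) * scale
            tw = min(tile, w - x) * scale
            out[y * scale:y * scale + th, x * scale:x * scale + tw] = \
                result[ty:ty + th, tx:tx + tw]
    return out
```

Peak memory is now bounded by the padded tile size rather than the full image, which is the same mechanism the `tile` and `tile_pad` arguments of RealESRGANer implement internally.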

Timelines

Task                              Timeline
SR API service (Real-ESRGAN)      1–2 weeks
Fine-tuning on a specific domain  4–6 weeks
Custom SR model from scratch      10–16 weeks