Stream Overlays: Text and Logo on Mobile Stream
Overlaying a logo on a stream seems simple until you realize the overlay must be rendered not to the screen but directly into the video stream before the encoder. UIKit/SwiftUI views don't help here: they draw in the display pipeline, which has nothing to do with the CVPixelBuffer headed to VideoToolbox.
Where to Insert Overlay Correctly
The pipeline looks like this:
AVCaptureVideoDataOutput → CMSampleBuffer → CVPixelBuffer → [overlay] → H.264 encoder → RTMP/SRT
The overlay is applied to the CVPixelBuffer before the encoder. There are two approaches:
Metal (recommended). Create an MTLTexture from the CVPixelBuffer via CVMetalTextureCacheCreateTextureFromImage, render the overlay on top in a Metal render pass, and write the result back to the CVPixelBuffer. Everything runs on the GPU, so CPU load is minimal.
CoreImage. Use the CISourceOverCompositing filter: composite the logo CIImage over a CIImage created from the CMSampleBuffer. The code is simpler, but on iPhone 12 and older at 1080p30 it adds 4–6 ms of frame processing on the CPU, which puts you on the edge of dropped frames.
For production projects that must hold 1080p30 without drops, Metal is the only realistic option.
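That said, for prototypes where the 4–6 ms hit is acceptable, the CoreImage path fits in a few lines. A minimal sketch, assuming the logo is a pre-loaded CIImage with an alpha channel (the CoreImageOverlay name is illustrative, not a framework type):

```swift
import CoreImage
import CoreVideo

final class CoreImageOverlay {
    // Reuse the context: creating a CIContext per frame is expensive.
    private let context = CIContext()
    private let filter = CIFilter(name: "CISourceOverCompositing")!
    private let logo: CIImage   // PNG with alpha, loaded once at session start

    init(logo: CIImage) {
        self.logo = logo
    }

    func apply(to pixelBuffer: CVPixelBuffer) {
        let frame = CIImage(cvPixelBuffer: pixelBuffer)
        filter.setValue(logo, forKey: kCIInputImageKey)             // foreground
        filter.setValue(frame, forKey: kCIInputBackgroundImageKey)  // camera frame
        guard let output = filter.outputImage else { return }
        // Render the composite back into the same buffer the encoder consumes.
        context.render(output, to: pixelBuffer)
    }
}
```

One instance is reused for every frame; only the pixel buffer changes per call.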
Metal Implementation
import Metal
import CoreVideo

final class OverlayRenderer {
    private let device: MTLDevice
    private let commandQueue: MTLCommandQueue
    private var textureCache: CVMetalTextureCache?
    private var overlayTexture: MTLTexture?   // pre-loaded logo

    init?(device: MTLDevice) {
        guard let queue = device.makeCommandQueue() else { return nil }
        self.device = device
        self.commandQueue = queue
        CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &textureCache)
    }

    func apply(to pixelBuffer: CVPixelBuffer) -> CVPixelBuffer {
        guard let cache = textureCache else { return pixelBuffer }
        var cvTexture: CVMetalTexture?
        CVMetalTextureCacheCreateTextureFromImage(
            kCFAllocatorDefault, cache, pixelBuffer, nil,
            .bgra8Unorm,
            CVPixelBufferGetWidth(pixelBuffer),
            CVPixelBufferGetHeight(pixelBuffer),
            0, &cvTexture
        )
        guard let cvTexture,
              let texture = CVMetalTextureGetTexture(cvTexture),
              let commandBuffer = commandQueue.makeCommandBuffer() else {
            return pixelBuffer
        }
        // Render pass targeting `texture`: draw the frame, then blend overlayTexture on top.
        // ...
        commandBuffer.commit()
        commandBuffer.waitUntilCompleted()
        // The texture aliases the buffer's memory, so the buffer is modified in place.
        return pixelBuffer
    }
}
Load the logo (overlayTexture) once at session start from a PNG with an alpha channel. Don't create a UIImage per frame: that's 2–3 ms of allocation overhead per call.
Text Overlays: Separate Problem
Static text (a channel name) is just a Metal texture prepared once via CoreText. The problem is dynamic text: viewer counters, timers, donation messages. It can't be rendered via Metal directly, since Metal only works with geometry and textures, not text.
The solution: build an offscreen CATextLayer, render it with UIGraphicsImageRenderer into a UIImage, and convert that to an MTLTexture. Do this on a background thread, no more than once per 500 ms for counters and on-event for donation messages. Text on screen updates smoothly, and in the stream it appears without noticeable delay.
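The 500 ms throttle is a small piece of pure logic. A sketch, assuming counters are rate-limited while donation events always trigger a refresh (TextRefreshGate is a hypothetical name, not a framework type):

```swift
import Foundation

/// Decides whether a dynamic text texture should be re-rendered.
/// Counters are throttled to one refresh per interval; events always pass.
struct TextRefreshGate {
    let minInterval: TimeInterval   // e.g. 0.5 s for viewer counters
    private(set) var lastRefresh: TimeInterval = -.greatestFiniteMagnitude

    mutating func shouldRefresh(now: TimeInterval, isEvent: Bool = false) -> Bool {
        if isEvent || now - lastRefresh >= minInterval {
            lastRefresh = now
            return true
        }
        return false
    }
}
```

Each refresh that passes the gate kicks off the background CATextLayer render; frames in between keep using the cached MTLTexture.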
Android: Analogous Approach via OpenGL ES / Vulkan
The Android equivalent is SurfaceTexture + OpenGL ES 2.0: the camera renders into a SurfaceTexture, the overlay is blended via GLES20.glBlendFunc(GLES20.GL_SRC_ALPHA, GLES20.GL_ONE_MINUS_SRC_ALPHA), and the result goes to MediaCodec through its input Surface. Vulkan is more powerful, but it is only supported from Android 7+ and requires significantly more boilerplate; it's justified only for complex effects.
Overlay Positioning and Adaptation
Store the logo position as relative coordinates (0.0–1.0 of the frame size), not absolute pixels. This keeps it correct across resolution and orientation changes without recalculating any logic. On rotation to landscape, the overlayRect is recomputed in the Metal render pass automatically.
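The relative-to-pixel conversion is a one-liner per axis. A sketch (pixelRect is an illustrative helper, not from a specific library):

```swift
import Foundation

/// Convert a relative overlay rect (0.0–1.0 of the frame) to pixel coordinates.
func pixelRect(relative: CGRect, frameWidth: Int, frameHeight: Int) -> CGRect {
    CGRect(x: relative.origin.x * CGFloat(frameWidth),
           y: relative.origin.y * CGFloat(frameHeight),
           width: relative.size.width * CGFloat(frameWidth),
           height: relative.size.height * CGFloat(frameHeight))
}
```

The same relative rect (say, a logo in the top-right corner) yields the right pixel position for 1920×1080 landscape and 1080×1920 portrait alike.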
Fade-in/fade-out for donation text is done by varying the alpha applied to the overlay texture between frames: a smooth appearance over 15–20 frames (0.5–0.7 seconds at 30 fps).
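The per-frame alpha for that ramp can be computed directly. A linear sketch (fadeAlpha is illustrative; an easing curve could replace the linear step):

```swift
/// Alpha for frame `frameIndex` of a linear fade-in lasting `fadeFrames` frames.
/// Returns 0.0 at the first frame and 1.0 once the fade completes.
func fadeAlpha(frameIndex: Int, fadeFrames: Int) -> Float {
    guard fadeFrames > 0 else { return 1.0 }
    return min(1.0, max(0.0, Float(frameIndex) / Float(fadeFrames)))
}
```

The result is passed as a uniform to the overlay fragment shader (or as the filter's alpha in the CoreImage path) on each frame.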
Timeline
A static logo plus the Metal pipeline on iOS: 1–1.5 weeks. Dynamic text, overlay animations, and iOS + Android support: 3–4 weeks. Cost is calculated individually.