Audio Source Separation
Source separation extracts individual sound sources from a mixed signal. Applications include music production (stems), speech processing (removing background music), video post-production, and archival restoration.
Main Models
| Model | Separation Type | Quality (SDR) | Speed |
|---|---|---|---|
| Demucs v4 | Vocals/drums/bass/other | 9.0 dB | 1.5× realtime on GPU |
| Spleeter | 2/4/5 stems | 6.8 dB | 100× realtime |
| Open-Unmix | 4 stems | 7.2 dB | 10× realtime |
| BS-RoFormer | SOTA 2024 | 10.1 dB | 0.8× realtime |
SDR (signal-to-distortion ratio) is measured in dB; higher values mean cleaner separation.
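To make the metric concrete, here is a minimal sketch of the simplest SDR definition: the ratio of reference energy to error energy, expressed in dB. The function name `sdr_db` and the toy signals are illustrative assumptions. Published benchmark figures are often computed with more elaborate variants (for example, BSS Eval style metrics), so this sketch is not expected to reproduce the numbers in the table.

```python
import numpy as np

def sdr_db(reference: np.ndarray, estimate: np.ndarray, eps: float = 1e-10) -> float:
    """Basic signal-to-distortion ratio in dB between a true stem and its estimate."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    error_power = np.sum((reference - estimate) ** 2)
    # eps keeps the ratio finite when the estimate is perfect or the reference is silent.
    return float(10.0 * np.log10((signal_power + eps) / (error_power + eps)))

# Toy data: a 440 Hz tone as the "true" stem and two estimates with different residual noise.
rng = np.random.default_rng(0)
t = np.arange(44100) / 44100.0
reference = np.sin(2 * np.pi * 440.0 * t)
clean_estimate = reference + 0.01 * rng.standard_normal(t.shape)
noisy_estimate = reference + 0.1 * rng.standard_normal(t.shape)

print(f"clean estimate: {sdr_db(reference, clean_estimate):.1f} dB")
print(f"noisy estimate: {sdr_db(reference, noisy_estimate):.1f} dB")
```

The estimate with less residual noise scores the higher SDR, which is the sense in which "higher is cleaner".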
Demucs v4 Integration
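import torchaudio
from demucs.pretrained import get_model
from demucs.apply import apply_model
model = get_model("htdemucs")
# "mix.wav" is a placeholder path; the input should be stereo at the model's sample rate (44.1 kHz)
wav, sr = torchaudio.load("mix.wav")
sources = apply_model(model, wav[None])  # wav[None] adds the batch dimension
# returns: drums, bass, other, vocals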
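The separated output is easier to work with once each stem has a name. The sketch below assumes the output is laid out as (batch, source, channel, time) with the sources in the order given in the comment above. It uses a NumPy array as a stand-in for the model output so that it runs without Demucs installed; `split_stems` and `SOURCE_NAMES` are hypothetical helper names, not part of the Demucs API.

```python
import numpy as np

# Source order assumed from the comment above; with a real model this would
# likely come from the model object rather than a hard-coded list.
SOURCE_NAMES = ["drums", "bass", "other", "vocals"]

def split_stems(sources: np.ndarray, names=SOURCE_NAMES) -> dict:
    """Map a (1, source, channel, time) array to {name: (channel, time) array}."""
    if sources.ndim != 4 or sources.shape[0] != 1:
        raise ValueError("expected shape (1, source, channel, time)")
    if sources.shape[1] != len(names):
        raise ValueError("number of sources does not match number of names")
    return {name: sources[0, i] for i, name in enumerate(names)}

# Stand-in for model output: 1 track, 4 sources, stereo, one second at 44.1 kHz.
fake_output = np.zeros((1, 4, 2, 44100), dtype=np.float32)
fake_output[0, 3] = 1.0  # mark the last source so we can see it lands on "vocals"

stems = split_stems(fake_output)
print({name: audio.shape for name, audio in stems.items()})
```

With a real PyTorch tensor, converting via `sources.cpu().numpy()` before calling the helper would be one option.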
Use Cases
Music production: remixing, karaoke, mastering
Content: removing background music before speech-to-text (STT), archival restoration
Post-production: automated dialogue replacement (ADR), music extraction, video localization
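For the remixing and karaoke cases, a backing track can be built by summing the stems with per-stem gains and muting the vocals. This is a sketch under assumptions: stems are held as a dict of name to (channel, time) arrays, `remix` is a hypothetical helper, and the random arrays only stand in for real separated audio.

```python
import numpy as np

def remix(stems: dict, gains: dict) -> np.ndarray:
    """Sum stems with per-stem linear gains; stems missing from `gains` keep gain 1.0."""
    mix = np.zeros_like(next(iter(stems.values())), dtype=np.float64)
    for name, audio in stems.items():
        mix += gains.get(name, 1.0) * audio
    # Scale down only if the sum would clip outside the [-1, 1] range.
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 1.0 else mix

# Stand-in stems: low-level stereo noise, one second at 44.1 kHz.
rng = np.random.default_rng(0)
stems = {
    name: 0.05 * rng.standard_normal((2, 44100))
    for name in ("drums", "bass", "other", "vocals")
}

karaoke = remix(stems, {"vocals": 0.0})               # instrumental: vocals muted
drum_heavy = remix(stems, {"drums": 1.5, "bass": 0.8})  # simple remix with adjusted gains
```

Any separation artifacts in the individual stems carry over into the remix, so results depend on the quality of the underlying model.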
Timeline: Demucs integration takes 1–2 weeks; a full service with a queue and UI takes 3–4 weeks.







