Development of AI-based music and audio generation systems
AI music generation automates the creation of background music, jingles, and sound effects for content, games, and advertising. For projects with basic requirements, it can replace stock-music licensing and studio recording.
Platform Comparison
| Platform | API | Type | Controllability | License |
|---|---|---|---|---|
| Suno v4 | REST (limited) | Song + vocals | Text prompt only | Varies by plan |
| Udio | REST | Song + vocals | High | Commercial |
| MusicGen (Meta) | Self-hosted | Instrumental | High | MIT (code) / CC-BY-NC (weights) |
| AudioCraft | Self-hosted | Music + SFX | High | MIT (code) / CC-BY-NC (weights) |
| Stable Audio | REST / self-hosted | Instrumental | High | Commercial |
AudioCraft / MusicGen — self-hosted
```python
import io

import torch
import torchaudio
from audiocraft.models import MusicGen


class MusicGenerator:
    def __init__(self, model_size: str = "medium"):
        # Sizes: small (300M), medium (1.5B), large (3.3B), melody
        self.model = MusicGen.get_pretrained(f"facebook/musicgen-{model_size}")
        self.model.set_generation_params(
            duration=30,       # seconds (max 30 per pass; longer via chunking)
            temperature=1.0,   # useful range roughly 0.5–1.5
            top_k=250,
            top_p=0.0,         # 0.0 disables nucleus sampling
            cfg_coef=3.0,      # classifier-free guidance: adherence to the prompt
        )

    def generate(
        self,
        description: str,
        duration: int = 30,
        temperature: float = 1.0,
    ) -> bytes:
        self.model.set_generation_params(duration=duration, temperature=temperature)
        wav = self.model.generate(descriptions=[description], progress=True)
        buf = io.BytesIO()
        # MusicGen outputs 32 kHz audio
        torchaudio.save(buf, wav[0].cpu(), sample_rate=32000, format="mp3")
        return buf.getvalue()

    def generate_with_melody(
        self,
        description: str,
        melody_audio: bytes,
        duration: int = 30,
    ) -> bytes:
        """Generate music guided by a reference melody."""
        melody_wav, sr = torchaudio.load(io.BytesIO(melody_audio))
        # Melody conditioning requires the dedicated checkpoint
        model = MusicGen.get_pretrained("facebook/musicgen-melody")
        model.set_generation_params(duration=duration)
        wav = model.generate_with_chroma(
            descriptions=[description],
            melody_wavs=melody_wav.unsqueeze(0),
            melody_sample_rate=sr,
            progress=True,
        )
        buf = io.BytesIO()
        torchaudio.save(buf, wav[0].cpu(), sample_rate=32000, format="mp3")
        return buf.getvalue()
```
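The chunking approach mentioned in the generation-parameter comments (for pieces longer than the ~30-second per-pass limit) needs a schedule of overlapping windows that can later be crossfaded. A minimal sketch of that scheduling logic; `plan_chunks` is a hypothetical helper, not part of audiocraft:

```python
# Sketch: window scheduling for long generations. Each window is at most
# chunk_s seconds; consecutive windows overlap by overlap_s so the
# rendered chunks can be crossfaded into one continuous track.

def plan_chunks(total_s: float, chunk_s: float = 30.0, overlap_s: float = 5.0):
    """Return (start, end) windows covering [0, total_s]."""
    if total_s <= chunk_s:
        return [(0.0, total_s)]
    step = chunk_s - overlap_s
    windows = []
    start = 0.0
    while start + chunk_s < total_s:
        windows.append((start, start + chunk_s))
        start += step
    # Final window is right-aligned so the track ends exactly at total_s
    windows.append((max(0.0, total_s - chunk_s), total_s))
    return windows

plan_chunks(120)
# -> [(0.0, 30.0), (25.0, 55.0), (50.0, 80.0), (75.0, 105.0), (90.0, 120)]
```

Each window would be generated with the same prompt (optionally continuing from the previous chunk's tail) and the overlapping regions blended with a linear or equal-power crossfade.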
Sound Effect Generation (AudioGen)
```python
import io

import torchaudio
from audiocraft.models import AudioGen

sfx_model = AudioGen.get_pretrained("facebook/audiogen-medium")
sfx_model.set_generation_params(duration=5)


def generate_sound_effect(description: str, duration: float = 3.0) -> bytes:
    sfx_model.set_generation_params(duration=duration)
    wav = sfx_model.generate(descriptions=[description])
    buf = io.BytesIO()
    # AudioGen outputs 16 kHz audio
    torchaudio.save(buf, wav[0].cpu(), sample_rate=16000, format="wav")
    return buf.getvalue()

# Example prompts: "forest ambience with birds", "robot beeping", "door creaking"
```
Contextual Applications
| Application | Recommended platform | Prompt parameters |
|---|---|---|
| Background music for videos | MusicGen medium/large | "ambient, {mood}, {tempo}" |
| Advertising jingle | Suno/Udio (with vocals) | Brand-specific prompt |
| Game sound effects | AudioGen | Concrete SFX descriptions |
| Scene-mood music | MusicGen melody | Reference melody + description |
| Podcast intro/outro | Stable Audio | "podcast intro, {genre}, 15 seconds" |
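The templated parameters in the table ({mood}, {tempo}, {genre}) can be filled programmatically so that a service exposes use cases rather than raw prompts. A minimal sketch; the template registry and `build_prompt` helper are hypothetical:

```python
# Sketch: turning the table's prompt templates into a small registry.
# Keys and placeholder names follow the table above.

PROMPT_TEMPLATES = {
    "video_background": "ambient, {mood}, {tempo}",
    "podcast_intro": "podcast intro, {genre}, 15 seconds",
}

def build_prompt(use_case: str, **params: str) -> str:
    """Fill a use-case template; raises KeyError on unknown use cases."""
    return PROMPT_TEMPLATES[use_case].format(**params)

build_prompt("video_background", mood="calm", tempo="slow tempo")
# -> 'ambient, calm, slow tempo'
```

Keeping templates in one place also makes it easy to A/B test prompt wording per use case without touching API callers.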
FastAPI service
```python
from fastapi import FastAPI, Response
from pydantic import BaseModel

app = FastAPI()
music_gen = MusicGenerator("medium")


class MusicRequest(BaseModel):
    description: str
    duration: int = 30
    temperature: float = 1.0


@app.post("/generate/music")
async def generate_music(req: MusicRequest):
    # Generation is GPU-heavy and blocking; in production, offload it to a
    # worker queue instead of running it inside the request handler.
    audio = music_gen.generate(req.description, req.duration, req.temperature)
    return Response(content=audio, media_type="audio/mpeg")
```
Delivery time: a self-hosted MusicGen API takes 1–2 days; a full platform with multiple models, a job queue, and CDN storage takes 2–3 weeks.
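Because generation is expensive, a platform build usually deduplicates identical requests before they reach the queue. A minimal stdlib-only sketch of keying a cache on the request parameters; `generate_cached` and the in-memory dict are hypothetical stand-ins for a real store such as Redis plus CDN:

```python
# Sketch: deduplicating generation requests by hashing their parameters,
# so repeated identical prompts are served from cache.

import hashlib
import json

def request_key(description: str, duration: int, temperature: float) -> str:
    payload = json.dumps(
        {"description": description, "duration": duration, "temperature": temperature},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

_cache: dict[str, bytes] = {}

def generate_cached(description, duration, temperature, generate_fn):
    """Run generate_fn only on a cache miss; return cached bytes otherwise."""
    key = request_key(description, duration, temperature)
    if key not in _cache:
        _cache[key] = generate_fn(description, duration, temperature)
    return _cache[key]
```

`generate_fn` would be `MusicGenerator.generate` in the service above; the same key also makes a natural CDN object name for the rendered file.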