AI Podcast Transcription and Summarization

We design and deploy artificial intelligence systems, from prototype to production-ready solutions. Our team combines expertise in machine learning, data engineering, and MLOps so that AI works in real business settings, not just in the lab.

Automatic podcast transcription supports SEO (text content that search engines can index), accessibility, show notes creation, and content sharing on social media. Whisper large-v3 delivers a word error rate (WER) of 4–8% on clean, studio-quality recordings.

### Basic Pipeline

```python
import whisper
from openai import AsyncOpenAI

async def transcribe_and_summarize_podcast(audio_path: str) -> dict:
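    # Transcription. Loading the model is slow, so a long-running service
    # should load it once and reuse it. Note that transcribe() is blocking.
    model = whisper.load_model("large-v3")
    result = model.transcribe(
        audio_path,
        language="ru",  # language of the audio; omit to let Whisper auto-detect
        task="transcribe",
        verbose=False,
        word_timestamps=True
    )
    transcript = result["text"]
    segments = result["segments"]  # [{start, end, text}, ...]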

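    # Generate show notes with GPT-4o. Each segment is prefixed with its
    # MM:SS start time so the model can quote real timestamps.
    timestamped = "\n".join(
        f"[{int(s['start']) // 60:02d}:{int(s['start']) % 60:02d}] {s['text'].strip()}"
        for s in segments
    )
    client = AsyncOpenAI()
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": "Create show notes for the podcast: a short episode description (3-5 sentences), a bulleted list of key topics, and timestamps for the main topics in MM:SS format."
        }, {
            "role": "user",
            # Only the first 6,000 characters are sent; longer episodes are truncated
            "content": timestamped[:6000]
        }]
    )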

    # Timestamps for key topics
    chapters = extract_chapters(segments)

    return {
        "transcript": transcript,
        "shownotes": response.choices[0].message.content,
        "chapters": chapters,
        "duration_sec": segments[-1]["end"] if segments else 0
    }

def extract_chapters(segments: list) -> list[dict]:
    """Find chapter boundaries from pauses between segments (a simple heuristic, no semantic analysis)."""
    chapters = []
    # Treat pauses longer than 3 seconds as chapter boundaries
    for i in range(1, len(segments)):
        gap = segments[i]["start"] - segments[i-1]["end"]
        if gap > 3.0:
            chapters.append({
                "timestamp": int(segments[i]["start"]),
                "text": segments[i]["text"][:80]
            })
    return chapters
```

### RSS integration for automatic processing

```python
import feedparser
import httpx

async def process_podcast_feed(rss_url: str) -> list[dict]:
    feed = feedparser.parse(rss_url)
    results = []

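    for entry in feed.entries[:5]:  # first 5 entries (usually the newest episodes)
        audio_url = next(
            (enc.href for enc in entry.get("enclosures", [])
             if enc.get("type", "").startswith("audio")),
            None
        )
        if not audio_url:
            continue

        # Podcast hosts often redirect enclosure URLs, so follow redirects
        async with httpx.AsyncClient(follow_redirects=True) as client:
            audio_data = await client.get(audio_url)
            audio_data.raise_for_status()

        # entry.id is often a URL, so replace characters that are unsafe in file names
        safe_id = "".join(c if c.isalnum() else "_" for c in entry.id)
        audio_path = f"/tmp/{safe_id}.mp3"
        with open(audio_path, "wb") as f:
            f.write(audio_data.content)

        result = await transcribe_and_summarize_podcast(audio_path)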
        result["title"] = entry.title
        result["published"] = entry.published
        results.append(result)

    return results
```

Whisper processes one hour of audio in approximately 3-4 minutes on a GPU (RTX 3090) and in about 30-40 minutes on a CPU. For regular podcast processing, cloud inference via the API is sufficient (OpenAI Whisper API: $0.006/min).

Estimated development time: a script that processes a single podcast takes 1-2 days. A service with RSS monitoring and show notes publishing takes 1-2 weeks.
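
The throughput and pricing figures above can be turned into a rough planning estimate. The following is a minimal sketch, assuming that processing time and cost scale linearly with audio length. The names `estimate_processing` and `Estimate`, as well as the example workload, are illustrative and not part of the pipeline above.

```python
from dataclasses import dataclass

# Figures quoted in the text; treat them as rough planning numbers.
GPU_MIN_PER_AUDIO_HOUR = (3.0, 4.0)    # RTX 3090, low/high estimate
CPU_MIN_PER_AUDIO_HOUR = (30.0, 40.0)  # low/high estimate
API_USD_PER_AUDIO_MIN = 0.006          # OpenAI Whisper API price per minute


@dataclass
class Estimate:
    gpu_minutes: tuple[float, float]
    cpu_minutes: tuple[float, float]
    api_cost_usd: float


def estimate_processing(audio_minutes: float) -> Estimate:
    """Scale the per-hour figures linearly to the given audio length (an assumption)."""
    hours = audio_minutes / 60.0
    return Estimate(
        gpu_minutes=tuple(round(m * hours, 1) for m in GPU_MIN_PER_AUDIO_HOUR),
        cpu_minutes=tuple(round(m * hours, 1) for m in CPU_MIN_PER_AUDIO_HOUR),
        api_cost_usd=round(audio_minutes * API_USD_PER_AUDIO_MIN, 2),
    )


if __name__ == "__main__":
    # Hypothetical workload: four 45-minute episodes per month.
    print(estimate_processing(4 * 45))
```

For the hypothetical 180 minutes of audio per month, this gives roughly 9-12 minutes of GPU time, 90-120 minutes of CPU time, or about $1.08 through the API.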