Meeting Transcript Summarization Implementation
A raw meeting transcript is source material, not a deliverable. A 60-minute meeting yields 8,000–12,000 words of text, roughly 80% of which is filler: context-setting, repetition, and conversational back-and-forth. The summarization task is to extract the semantic core in seconds.
Summarizer Architecture
The pipeline receives the transcript (plain text, or structured JSON with speaker labels) and returns a structured summary:
[Transcript]
→ [Preprocessing: chunk by 3000 tokens]
→ [Map: summarize each chunk]
→ [Reduce: synthesize final summary]
→ [Structuring: topics, decisions, next steps]
For meetings up to 30 minutes (under ~6,000 tokens), a single direct prompt suffices; skip map-reduce.
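The chunking and map-reduce flow above can be sketched as follows. The LLM call is abstracted behind a `summarize` callable so the control flow stands on its own; the chunk size and the words-to-tokens heuristic are assumptions, not values from a production deployment.

```python
# Map-reduce summarization sketch. `summarize` wraps the actual LLM call
# (e.g. an OpenAI chat completion); here it is injected as a parameter.
from typing import Callable, List

CHUNK_TOKENS = 3000
TOKENS_PER_WORD = 1.3  # rough heuristic; a real pipeline would use tiktoken


def chunk_transcript(text: str, chunk_tokens: int = CHUNK_TOKENS) -> List[str]:
    """Split the transcript into chunks of roughly chunk_tokens tokens."""
    words = text.split()
    words_per_chunk = int(chunk_tokens / TOKENS_PER_WORD)
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def summarize_transcript(text: str, summarize: Callable[[str], str]) -> str:
    """Map: summarize each chunk. Reduce: synthesize the partial summaries."""
    chunks = chunk_transcript(text)
    if len(chunks) == 1:                       # short meeting: direct prompt
        return summarize(chunks[0])
    partials = [summarize(c) for c in chunks]  # map step
    return summarize("\n\n".join(partials))    # reduce step
```

For very long meetings the reduce step itself can exceed the context window; in that case the same function can be applied recursively to the joined partials.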
Prompt and Output Format
Optimal meeting summarization output format:
## Brief Summary (2–3 sentences)
## Key Topics
## Decisions Made
## Open Questions
## Participants and Their Positions
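A prompt template that enforces this output format might look like the sketch below. The wording of the system prompt is an assumption, not a canonical prompt; building the messages is kept separate from the API call so the template can be inspected offline.

```python
# Hypothetical system prompt enforcing the summary structure above.
SYSTEM_PROMPT = """You are a meeting summarization assistant.
Summarize the transcript strictly in this Markdown structure:

## Brief Summary (2–3 sentences)
## Key Topics
## Decisions Made
## Open Questions
## Participants and Their Positions

Omit small talk and filler. If a section has no content, write "None"."""


def build_messages(transcript: str) -> list:
    """Chat messages for an OpenAI-style /chat/completions request."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Transcript:\n\n{transcript}"},
    ]
```

Pinning the section headings in the system prompt makes the output parseable downstream (e.g. for routing "Decisions Made" into a task tracker).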
Models: GPT-4o-mini for standard meetings (roughly $0.002 per meeting hour), GPT-4o for meetings with dense technical content. Latency: 5–15 seconds for a typical meeting.
Integration with Sources
- Zoom — Zoom AI Companion API, or the cloud recordings API plus Whisper for transcription
- Google Meet — Google Meet API + Speech-to-Text
- Microsoft Teams — Graph API transcripts
- Fireflies.ai / Otter.ai — webhook with ready transcription
The result is saved to Notion, Confluence, Jira, or a corporate wiki via their respective APIs.
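As one concrete target, publishing to Notion can be sketched with its public pages API (the other destinations are analogous). The database ID and the `Name` title property are assumptions about the workspace schema; the 2,000-character split reflects Notion's per-rich-text-element limit.

```python
# Sketch: create a summary page in a Notion database via the public API.
import json
import urllib.request

NOTION_API = "https://api.notion.com/v1/pages"
NOTION_VERSION = "2022-06-28"


def build_notion_page(database_id: str, title: str, summary_md: str) -> dict:
    """Request body for POST /v1/pages creating a page with the summary."""
    return {
        "parent": {"database_id": database_id},
        "properties": {
            # Assumes the database's title property is named "Name".
            "Name": {"title": [{"text": {"content": title}}]},
        },
        "children": [
            {
                "object": "block",
                "type": "paragraph",
                "paragraph": {"rich_text": [{"text": {"content": chunk}}]},
            }
            # Notion caps a single rich_text content string at 2000 chars.
            for chunk in [summary_md[i:i + 2000]
                          for i in range(0, len(summary_md), 2000)]
        ],
    }


def save_to_notion(token: str, database_id: str, title: str, summary: str):
    req = urllib.request.Request(
        NOTION_API,
        data=json.dumps(build_notion_page(database_id, title, summary)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Notion-Version": NOTION_VERSION,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Confluence and Jira differ only in the payload builder and endpoint; keeping the body construction separate from the HTTP call makes each target testable without credentials.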