Implementing Audio Recording in Mobile Applications
Voice messages, a dictaphone, interview recording in a journalism app: the task is the same, but the requirements differ. A voice message needs AAC/M4A at around 32 kbps so the file stays small. A podcast-grade dictaphone needs WAV or FLAC at 44100 Hz, lossless. Start with the audio session; otherwise you will be untangling conflicts later.
Audio Session and Categories
iOS. AVAudioSession is the central object that manages the app's audio. For recording:
let session = AVAudioSession.sharedInstance()
try session.setCategory(.playAndRecord,
                        mode: .default,
                        options: [.defaultToSpeaker, .allowBluetooth])
try session.setActive(true)
.playAndRecord allows simultaneous playback and recording. .allowBluetooth enables recording through Bluetooth headsets such as AirPods. Without .defaultToSpeaker, playback is routed to the earpiece instead of the main speaker.
A problem that shows up in production: another app (navigation, a music player) takes over the audio session. Subscribe to AVAudioSession.interruptionNotification; on .began pause the recording, and on .ended resume only if the options contain .shouldResume.
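The subscription described above might look roughly like this. It is a minimal sketch, not a complete recorder: the class name RecordingInterruptionObserver is hypothetical, and it assumes an already configured AVAudioRecorder is passed in.

```swift
import AVFoundation

/// Hypothetical helper: pauses and resumes a recorder around session interruptions.
final class RecordingInterruptionObserver {
    private var token: NSObjectProtocol?

    init(recorder: AVAudioRecorder) {
        let session = AVAudioSession.sharedInstance()
        token = NotificationCenter.default.addObserver(
            forName: AVAudioSession.interruptionNotification,
            object: session,
            queue: .main
        ) { notification in
            guard
                let info = notification.userInfo,
                let rawType = info[AVAudioSessionInterruptionTypeKey] as? UInt,
                let type = AVAudioSession.InterruptionType(rawValue: rawType)
            else { return }

            switch type {
            case .began:
                // Another client took over the audio session: pause recording.
                recorder.pause()
            case .ended:
                let rawOptions = info[AVAudioSessionInterruptionOptionKey] as? UInt ?? 0
                let options = AVAudioSession.InterruptionOptions(rawValue: rawOptions)
                // Resume only when the system says it is appropriate.
                guard options.contains(.shouldResume) else { return }
                // Reactivate the session before continuing to record.
                try? session.setActive(true)
                recorder.record()
            @unknown default:
                break
            }
        }
    }

    deinit {
        if let token = token {
            NotificationCenter.default.removeObserver(token)
        }
    }
}
```

Keep the observer alive for as long as the recording screen exists. If .ended arrives without .shouldResume, a reasonable choice is to leave the recording paused and let the user resume manually.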
Android. AudioRecord gives direct access to PCM data. MediaRecorder is simpler but offers less control over the format. For most tasks MediaRecorder is enough:
mediaRecorder.setAudioSource(MediaRecorder.AudioSource.MIC)
mediaRecorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4)
mediaRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AAC)
mediaRecorder.setAudioSamplingRate(44100)
mediaRecorder.setAudioEncodingBitRate(128000)
Using MediaRecorder.AudioSource.VOICE_COMMUNICATION instead of MIC enables system-level echo cancellation and noise suppression where the device supports them, which is useful for voice messages.
Sound Level Visualization
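A live amplitude indicator shows the user that recording is actually happening. On iOS, set recorder.isMeteringEnabled = true and call updateMeters() before each read; AVAudioRecorder.averagePower(forChannel: 0) then returns the level in dBFS (from -160 to 0). Convert it to a linear 0..1 value:
recorder.updateMeters() // refresh the meter values before reading
let power = recorder.averagePower(forChannel: 0)
let level = pow(10, power / 20) // dBFS → linear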
Poll with Timer.scheduledTimer(withTimeInterval: 0.05, repeats: true), which gives 20 updates per second, enough for smooth animation.
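The dB-to-level conversion can be isolated into small pure functions, which makes it easy to test without a device. The sketch below shows the linear conversion from the text plus a variation that maps dB directly onto 0...1 above a floor. The function names and the -60 dB floor are assumptions chosen for illustration; they are not part of AVFoundation.

```swift
import Foundation

/// Converts dBFS (as returned by averagePower) to a linear amplitude in 0...1.
func linearLevel(fromDecibels power: Float) -> Float {
    // Clamp to the documented -160...0 range so the result stays within 0...1.
    let clamped = min(max(power, -160), 0)
    return pow(10, clamped / 20)
}

/// Maps dB linearly onto 0...1 between an assumed floor and 0 dB.
/// Quiet speech stays visible, unlike with the purely linear conversion.
func meterLevel(fromDecibels power: Float, floorDB: Float = -60) -> Float {
    let clamped = min(max(power, floorDB), 0)
    return (clamped - floorDB) / -floorDB
}
```

With these definitions, -20 dB becomes roughly 0.1 on the linear scale, while meterLevel places -30 dB at the midpoint of the meter. Which mapping looks better depends on the UI; the linear one tends to hug zero for normal speech.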
On Android, MediaRecorder.getMaxAmplitude() returns the maximum absolute amplitude sampled since the previous call (0–32767). Poll it with Handler.postDelayed every 50 ms.
For Flutter, the record package (pub.dev) provides an onAmplitudeChanged stream.
Waveform During Playback
Analyze the recorded file offline: read the PCM samples, split them into chunks of N samples, and take the RMS of each chunk. The result is an array of float values that you draw with Canvas or Path. On iOS, decoding requires AVAssetReader plus AVAssetReaderTrackOutput with output settings that request kAudioFormatLinearPCM.
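The chunk-and-RMS step is independent of how the samples were decoded. A minimal sketch, assuming the PCM data has already been read into an array of Float samples (the function name rmsEnvelope is hypothetical):

```swift
import Foundation

/// Splits PCM samples into fixed-size chunks and returns the RMS of each chunk.
/// A final partial chunk is included so the tail of the recording is not dropped.
func rmsEnvelope(samples: [Float], chunkSize: Int) -> [Float] {
    guard chunkSize > 0, !samples.isEmpty else { return [] }
    var envelope: [Float] = []
    envelope.reserveCapacity((samples.count + chunkSize - 1) / chunkSize)
    for start in stride(from: 0, to: samples.count, by: chunkSize) {
        let chunk = samples[start..<min(start + chunkSize, samples.count)]
        // Mean of squares, then square root.
        let sumOfSquares = chunk.reduce(Float(0)) { $0 + $1 * $1 }
        envelope.append((sumOfSquares / Float(chunk.count)).squareRoot())
    }
    return envelope
}
```

To get a fixed number of bars, one option is to pick chunkSize as the sample count divided by the desired bar count, then scale each value to the view height when drawing.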
Formats and Compatibility
| Format | Size of 1 minute | Compatibility | Use case |
|---|---|---|---|
| AAC (M4A) | ~480 KB | iOS, Android, Web | voice messages |
| MP3 | ~960 KB | everywhere | general case |
| WAV (PCM) | ~10 MB | everywhere | lossless, dictaphone |
| FLAC | ~3–5 MB | Android native, iOS 11+ | lossless, compact |
| OGG/Opus | ~300 KB | Android, Web | optimal for VoIP |
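The sizes in the table follow from bit rate multiplied by duration. Working backwards, the AAC and MP3 rows match 64 kbps and 128 kbps, and the WAV row matches 16-bit stereo at 44100 Hz; these parameters are inferred from the arithmetic and are not stated in the table. A rough sketch that ignores container overhead, with hypothetical function names:

```swift
import Foundation

/// Approximate size in bytes of a constant-bit-rate compressed stream.
func compressedSize(bitsPerSecond: Int, seconds: Int) -> Int {
    return bitsPerSecond * seconds / 8
}

/// Size in bytes of uncompressed PCM audio (headers not counted).
func pcmSize(sampleRate: Int, bitsPerSample: Int, channels: Int, seconds: Int) -> Int {
    return sampleRate * (bitsPerSample / 8) * channels * seconds
}

let aacMinute = compressedSize(bitsPerSecond: 64_000, seconds: 60)   // 480_000 bytes
let mp3Minute = compressedSize(bitsPerSecond: 128_000, seconds: 60)  // 960_000 bytes
let voiceMinute = compressedSize(bitsPerSecond: 32_000, seconds: 60) // 240_000 bytes
let wavMinute = pcmSize(sampleRate: 44_100, bitsPerSample: 16,
                        channels: 2, seconds: 60)                    // 10_584_000 bytes
```

At the 32 kbps suggested for voice messages, a minute comes to about 240 KB, half the AAC figure in the table.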
Background Recording
If the user minimizes the app, recording should continue. iOS: add audio to UIBackgroundModes in Info.plist and keep the AVAudioSession active. Android: run a ForegroundService of type microphone (android:foregroundServiceType="microphone" in the manifest, required starting with Android 11, API level 30). A notification with a "Stop" button is standard.
Without a ForegroundService, on Android 9+ the system kills the process after several minutes in the background. On iOS without UIBackgroundModes, recording stops about 3 seconds after the app is minimized.
Timeline
Recording with level visualization and saving to a file: 1–2 days. A full dictaphone with waveform, pause, renaming, and background recording: 3–4 days.







