Implementing a Multi-Agent AI System in a Mobile Application
A single agent fails when a task requires specialization. An orchestrator agent, a researcher agent, an executor agent, and a critic agent each do their part, and the coordinator brings the results together. On mobile this is rarely a client-side pattern, but as a backend pattern with a mobile UI it is a working architecture.
When a Multi-Agent System Is Needed
A single agent with 10+ tools degrades: the context window overflows, and the model gets confused about when to use which tool. A multi-agent system splits the task:
- Orchestrator: takes the task from the user, decomposes it into subtasks, and delegates them to specialized agents
- Research Agent: searches and collects information (web search, RAG, databases)
- Action Agent: executes actions (API calls, bookings)
- Critic Agent: verifies results for correctness and safety
A classic mobile-app case is a trip-planning agent. The Orchestrator receives "organize a business trip to Warsaw for 3 days" → the Research Agent searches flights and hotels → the Action Agent books them → the Critic Agent verifies the dates and total cost → the Orchestrator assembles the final plan.
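The decomposed request above can be sketched as a dependency chain of subtasks. This is a minimal illustration: `Subtask` and `decompose_trip_request` are hypothetical names, and a real Orchestrator would produce the decomposition via an LLM call rather than a hardcoded list.

```python
from dataclasses import dataclass, field

@dataclass
class Subtask:
    agent: str                      # which specialized agent runs this step
    description: str
    depends_on: list = field(default_factory=list)

def decompose_trip_request(task_id: str) -> list:
    """Subtasks for 'organize a business trip to Warsaw for 3 days'."""
    return [
        Subtask("research", "find flights and hotels, Warsaw, 3 days"),
        Subtask("action", "book the selected flight and hotel",
                depends_on=["research"]),
        Subtask("critic", "verify dates and total cost",
                depends_on=["action"]),
    ]

plan = decompose_trip_request("trip-2024-warsaw")
```

Execution order follows the dependency chain: research → action → critic.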
Architectural Patterns: Supervisor vs Pipeline
Supervisor (star topology). A central coordinator manages specialized agents. Each agent is a separate LLM instance with its own system prompt. The coordinator sends out tasks and collects results.
Pipeline (sequential). Agents are arranged in a chain: one's output is the next's input. Easier to debug, less flexible.
Blackboard. A shared state repository that agents read from and write to. Suitable for asynchronous parallel work.
For most mobile products, a Supervisor with 2–3 specialized agents on the backend is sufficient.
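The Supervisor pattern can be sketched in a few lines. The specialist functions here are stubs standing in for real LLM-backed agents; the shape of the loop, where every specialist reports back to the coordinator, is the point.

```python
# Stub specialists: in production each would be an LLM call with its own
# system prompt and tool set.
def research_agent(task: str) -> dict:
    return {"agent": "research",
            "result": {"flights": ["LOT123"], "hotels": ["H456"]}}

def action_agent(task: str, research: dict) -> dict:
    flight = research["result"]["flights"][0]
    return {"agent": "action", "result": {"booked_flight": flight}}

def critic_agent(task: str, action: dict) -> dict:
    ok = "booked_flight" in action["result"]
    return {"agent": "critic", "result": {"approved": ok}}

def supervisor(user_task: str) -> dict:
    """Star topology: the coordinator delegates and merges all results."""
    research = research_agent(user_task)
    action = action_agent(user_task, research)
    critic = critic_agent(user_task, action)
    return {"task": user_task, "steps": [research, action, critic]}

result = supervisor("organize a business trip to Warsaw for 3 days")
```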
Communication Between Agents: What to Pass
Agents communicate via structured messages, not raw text. Why this matters: if the Research Agent returns unstructured text, the Action Agent may misinterpret it. Use JSON contracts:
{
  "agent": "research",
  "task_id": "trip-2024-warsaw",
  "status": "completed",
  "result": {
    "flights": [
      {"id": "LOT123", "price": 189, "departure": "2024-04-10T06:30"}
    ],
    "hotels": [
      {"id": "H456", "name": "Marriott Warsaw", "price_per_night": 95}
    ]
  }
}
The Orchestrator knows each agent's contract and doesn't rely on the LLM "understanding" free-form text.
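On the Orchestrator side, validating the contract before acting on a message is a few lines of stdlib code. A sketch: the key names follow the example message above, while `REQUIRED_KEYS` and the set of allowed statuses are assumptions for this illustration.

```python
import json

REQUIRED_KEYS = {"agent", "task_id", "status", "result"}

def parse_agent_message(raw: str) -> dict:
    """Parse and validate an agent message before the Orchestrator acts on it."""
    msg = json.loads(raw)
    missing = REQUIRED_KEYS - msg.keys()
    if missing:
        raise ValueError(f"agent message missing keys: {sorted(missing)}")
    if msg["status"] not in ("completed", "failed", "in_progress"):
        raise ValueError(f"unknown status: {msg['status']}")
    return msg

msg = parse_agent_message(
    '{"agent": "research", "task_id": "trip-2024-warsaw", '
    '"status": "completed", "result": {"flights": []}}'
)
```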
State Management on Mobile Client
A multi-agent process may take 30–120 seconds. The mobile UI must:
- Show the currently active agent and its step
- Provide an option to cancel at any time
- Continue when the app is minimized (push notification on completion)
- On a single agent's error, show the partial result
Android: WorkManager for background orchestration plus LiveData/StateFlow for UI updates. iOS: the BackgroundTasks framework plus Combine/AsyncStream.
WebSocket or Server-Sent Events for real-time step updates works better than long polling. The client subscribes to a task_id and receives events:
event: agent_step
data: {"agent": "research", "step": "Searching Minsk→Warsaw flights", "progress": 0.3}

event: agent_step
data: {"agent": "action", "step": "Booking flight LOT123", "progress": 0.7}

event: task_complete
data: {"task_id": "trip-2024-warsaw", "result": {...}}
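On the server side, each SSE frame is just an `event:` line, a `data:` line, and a blank line terminating the frame. A minimal formatter sketch (the function name is our own):

```python
import json

def sse_frame(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame: event line, data line,
    and the blank line that terminates the frame."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

frame = sse_frame("agent_step",
                  {"agent": "research", "step": "Searching flights",
                   "progress": 0.3})
```

Streaming these frames over a `text/event-stream` response is all the client needs to render per-agent progress.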
Agent Context Isolation
Each agent must have minimal context: only what is needed for its task. Don't pass the Research Agent information about booking tools, and vice versa. Less context means fewer hallucinations and a cheaper call.
Critical: the Critic Agent gets only the final result and verifies it against a checklist (dates are valid, the amount matches the selected options, no contradictions). It is the last barrier before the result is shown to the user.
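The checklist itself can be deterministic code rather than another LLM call. A sketch: field names mirror the example contract earlier in the article, while `nights` and `total` are assumed fields added for this illustration.

```python
from datetime import datetime

def critic_checklist(plan: dict) -> list:
    """Return a list of violations; an empty list means the plan passes."""
    issues = []
    # Check 1: every departure date must be a valid ISO timestamp.
    for flight in plan.get("flights", []):
        try:
            datetime.fromisoformat(flight["departure"])
        except (KeyError, ValueError):
            issues.append(f"invalid departure date in flight {flight.get('id')}")
    # Check 2: the total must match the sum of the selected options.
    expected = (sum(f["price"] for f in plan.get("flights", []))
                + sum(h["price_per_night"] for h in plan.get("hotels", []))
                * plan.get("nights", 1))
    if abs(expected - plan.get("total", expected)) > 0.01:
        issues.append("total does not match selected options")
    return issues

plan = {
    "flights": [{"id": "LOT123", "price": 189, "departure": "2024-04-10T06:30"}],
    "hotels": [{"id": "H456", "price_per_night": 95}],
    "nights": 3,
    "total": 474,
}
issues = critic_checklist(plan)  # 189 + 95 * 3 = 474, so no violations
```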
Cost and Optimization
A multi-agent system multiplies LLM calls. To optimize:
- Specialized agents use cheaper models (GPT-4o-mini, Claude Haiku) for routine tasks
- The Orchestrator and Critic use more powerful models (GPT-4o, Claude Sonnet)
- Cache Research Agent results for similar repeated requests (semantic caching)
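The model split per role can be a simple routing table. Model identifiers below are illustrative placeholders; substitute your provider's current names.

```python
MODEL_FOR_ROLE = {
    "orchestrator": "gpt-4o",      # decomposition and result merging
    "critic": "gpt-4o",            # final verification warrants the stronger model
    "research": "gpt-4o-mini",     # routine retrieval and summarization
    "action": "gpt-4o-mini",       # structured API calls
}

def pick_model(role: str) -> str:
    """Cheap model by default; strong model only where it pays off."""
    return MODEL_FOR_ROLE.get(role, "gpt-4o-mini")
```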
Stages and Timeline
Design the agent topology for the task → define exchange contracts → implement each agent and the Orchestrator → integrate the server-side orchestrator → WebSocket protocol for the client → mobile progress UI → test failures and partial results.
A multi-agent system of 3 agents with a mobile UI takes 6–10 weeks.