Implementing an AI Agent with Actions in a Mobile Application (Booking, Ordering)
An agent that acts rather than just answers carries a different level of responsibility: it books a table, places an order, transfers money. An error in such an agent is not "wrong text"; it is a real action with real consequences. That is why the architecture of action-executing agents differs fundamentally from that of informational ones.
Human-in-the-Loop: Why It's Mandatory
No production agent should execute irreversible actions without explicit user confirmation. This is not overcaution; it is a rule. GPT-4o sometimes interprets "I'd like to order pizza" as a direct command to act rather than an expression of intent.
The correct UX: the agent collects all parameters → shows a summary → waits for the user's confirmation → executes. On mobile, this is implemented with a dedicated tool, request_confirmation:
// Confirmation tool: it does not execute the action, it only requests permission
data class ConfirmationRequest(
    val action: String,            // e.g. "book_restaurant"
    val summary: String,           // e.g. "Table for 2 at Café Minsk, March 26 19:00"
    val details: Map<String, Any>  // all parameters to display
)
// The agent calls this tool as the last step before the action.
// The client shows a Bottom Sheet with the details and a "Confirm" button.
The model must not be able to "skip" confirmation. State this explicitly in the system prompt: "Before any booking or order, MUST call request_confirmation tool and wait for user response."
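A prompt instruction is something the model can still ignore, so it helps to enforce the same rule in code. The sketch below is one possible way to do that, not a prescribed design: every tool call the model emits passes through a gate, and an irreversible call is parked until the UI reports the user's decision. The names ToolCall, GateResult, and ConfirmationGate are hypothetical, and the execute callback stands in for whatever actually performs the booking.

```kotlin
import java.util.UUID

// Hypothetical types for illustration; none of these names come from a real SDK.
data class ToolCall(val name: String, val args: Map<String, String>)

sealed class GateResult {
    data class NeedsConfirmation(val confirmationId: String) : GateResult()
    data class Executed(val output: String) : GateResult()
    data class Rejected(val reason: String) : GateResult()
}

class ConfirmationGate(
    private val irreversibleTools: Set<String>,
    private val execute: (ToolCall) -> String
) {
    // Calls waiting for a user decision, keyed by confirmation ID.
    private val pending = mutableMapOf<String, ToolCall>()

    // Entry point for every tool call emitted by the model.
    fun onModelToolCall(call: ToolCall): GateResult {
        // Harmless tools (search, lookup) run immediately.
        if (call.name !in irreversibleTools) return GateResult.Executed(execute(call))
        // Irreversible tools are parked; the client shows the Bottom Sheet now.
        val id = UUID.randomUUID().toString()
        pending[id] = call
        return GateResult.NeedsConfirmation(id)
    }

    // Called only from the UI, when the user taps "Confirm" or "Cancel".
    fun onUserDecision(confirmationId: String, approved: Boolean): GateResult {
        // remove() makes each confirmation single-use.
        val call = pending.remove(confirmationId)
            ?: return GateResult.Rejected("Unknown or already used confirmation")
        return if (approved) GateResult.Executed(execute(call))
        else GateResult.Rejected("User declined")
    }
}
```

With this arrangement the model has no code path that reaches execute for an irreversible tool; only onUserDecision does, and it is driven by the UI rather than by model output.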
Idempotency and Duplicate Prevention
The user taps "Confirm," the connection drops, and the app does not know whether the order went through. Without idempotency, a retry produces a duplicate order. The solution: generate an idempotency_key (UUID) before the first send and transmit it with every retry:
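import Foundation

// iOS: create the key once and keep it until execution is confirmed.
// Reuse a stored key if one exists, so a retry after relaunch sends the same value.
let defaults = UserDefaults.standard
let idempotencyKey = defaults.string(forKey: "pending_order_key") ?? UUID().uuidString
defaults.set(idempotencyKey, forKey: "pending_order_key")
// Send it in a header (or in the body) of the request to the backend;
// `request` is the URLRequest for the order endpoint.
request.setValue(idempotencyKey, forHTTPHeaderField: "Idempotency-Key")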
Most payment systems support idempotency keys natively (Stripe via the Idempotency-Key header, YooKassa via Idempotence-Key). For your own backend, store the keys in Redis with a 24-hour TTL and return the cached result on retries.
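The backend side of this can be sketched as follows. This is a minimal illustration under stated assumptions: an in-memory map stands in for Redis, the class name IdempotencyStore and its runOnce method are hypothetical, and the result is reduced to a string. It shows the core behavior only: the first request with a given key runs the action, and retries within the TTL get the stored result back.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// In-memory stand-in for Redis; the class and method names are hypothetical.
class IdempotencyStore(
    private val ttlMillis: Long = 24 * 60 * 60 * 1000L,          // 24-hour TTL
    private val now: () -> Long = { System.currentTimeMillis() } // injectable clock
) {
    private data class Entry(val result: String, val expiresAt: Long)

    private val entries = ConcurrentHashMap<String, Entry>()

    // Runs `action` at most once per key within the TTL; retries get the cached result.
    fun runOnce(key: String, action: () -> String): String {
        val current = now()
        // compute() is atomic per key, so two concurrent retries cannot both run the action.
        val entry = entries.compute(key) { _, existing ->
            if (existing != null && existing.expiresAt > current) existing
            else Entry(action(), current + ttlMillis)
        }
        return entry!!.result
    }
}
```

If the action throws, nothing is stored and the next retry runs it again. In this sketch the action executes while the map entry is locked, which is acceptable for an illustration; a Redis-backed version would more likely record an "in progress" marker first.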
Rollback and Partial Failure Handling
Scenario: the agent booked a flight, but the hotel did not respond. It must either cancel the flight or notify the user of the partial success. This is the Saga pattern: each action has a compensating action (a cancellation). The agent must know these compensating actions and be able to call them.
{
  "name": "cancel_flight_booking",
  "description": "Cancels a previously executed flight booking. Use ONLY on explicit user request or when subsequent booking steps fail.",
  "parameters": {
    "type": "object",
    "properties": {
      "booking_id": {"type": "string", "description": "Booking ID returned by the flight booking step"}
    },
    "required": ["booking_id"]
  }
}
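The compensation logic itself can live outside the model, in an orchestrator that the agent loop calls. The following is a minimal sketch of that idea, assuming each step is a pair of callbacks; SagaStep, SagaOutcome, and runSaga are hypothetical names, and a real implementation would also need to handle a compensation that fails.

```kotlin
// Hypothetical types for illustration.
class SagaStep(
    val name: String,
    val action: () -> Unit,      // throws on failure
    val compensate: () -> Unit   // undoes a successfully completed action
)

sealed class SagaOutcome {
    object Completed : SagaOutcome()
    data class RolledBack(val failedStep: String, val compensated: List<String>) : SagaOutcome()
}

fun runSaga(steps: List<SagaStep>): SagaOutcome {
    val done = mutableListOf<SagaStep>()
    for (step in steps) {
        try {
            step.action()
            done.add(step)
        } catch (e: Exception) {
            // Undo the completed steps in reverse order, most recent first.
            val compensated = done.reversed().map { it.compensate(); it.name }
            return SagaOutcome.RolledBack(step.name, compensated)
        }
    }
    return SagaOutcome.Completed
}
```

The RolledBack outcome carries enough information to tell the user which step failed and what was cancelled, which covers the "notify user of partial success" branch as well.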
Order State and Offline Queue
An action may take time: external booking systems sometimes take 3–10 seconds to respond. On mobile, this calls for a progress indicator that describes the current step ("Checking seat availability..."), not just a spinner.
If the app is minimized during execution, use WorkManager on Android to run the task in the background and track its status, or BGTaskScheduler on iOS. The user receives a push notification with the result.
Save the agent state locally in Room/Core Data: current step, parameters, booking IDs. On app restart, restore the state and offer the user the option to continue or cancel.
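The shape of the saved state might look like the sketch below. It is an assumption-heavy illustration: AgentStep, AgentState, StateStore, and resumeOptions are hypothetical names, and InMemoryStateStore only stands in for a Room- or Core Data-backed implementation so that the example stays self-contained.

```kotlin
// Hypothetical model; a real app would map AgentState to a Room entity or a Core Data object.
enum class AgentStep { COLLECTING_PARAMS, AWAITING_CONFIRMATION, EXECUTING, DONE }

data class AgentState(
    val step: AgentStep,
    val params: Map<String, String>,
    val bookingIds: List<String>
)

enum class ResumeOption { CONTINUE, CANCEL }

interface StateStore {
    fun save(state: AgentState)
    fun load(): AgentState?
    fun clear()
}

// Stand-in for persistent storage, used here only to keep the sketch runnable.
class InMemoryStateStore : StateStore {
    private var state: AgentState? = null
    override fun save(state: AgentState) { this.state = state }
    override fun load(): AgentState? = state
    override fun clear() { state = null }
}

// Decides what to offer on app restart: nothing if there is no unfinished work.
fun resumeOptions(state: AgentState?): List<ResumeOption> = when (state?.step) {
    null, AgentStep.DONE -> emptyList()
    else -> listOf(ResumeOption.CONTINUE, ResumeOption.CANCEL)
}
```

Keeping the booking IDs in the saved state is what makes "cancel" possible after a restart, since the compensating calls need them.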
What Must Be Tested
- Double "Confirm" click (race condition)
- External API timeout on payment step
- One service fails while another succeeds (partial transaction)
- Price or availability change between search and booking
- Agent attempting action without confirmation (adversarial prompts)
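As an illustration of the first item, a double tap on "Confirm" can be exercised against a one-shot guard like the one sketched here. ConfirmOnce is a hypothetical helper, not part of any framework; it relies on AtomicBoolean.compareAndSet so that only the first caller gets through, even when taps arrive from different threads.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical helper: lets exactly one "Confirm" tap trigger the action.
class ConfirmOnce {
    private val used = AtomicBoolean(false)

    // Returns true if this call ran the action, false if a previous call already did.
    fun confirm(action: () -> Unit): Boolean {
        // compareAndSet flips false -> true atomically; only one caller can win.
        if (!used.compareAndSet(false, true)) return false
        action()
        return true
    }
}
```

A test for the race condition can start several threads that all call confirm and then assert that the action ran once. Retries after a network failure are a separate path and go through the idempotency key, not through a second confirmation.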
Workflow Stages
Analyze business processes and identify "dangerous" actions → design human-in-the-loop for each → implement idempotency and compensating actions → build the agent loop with state management → build the confirmation and progress UI → run integration tests on edge cases → run load tests.
Timeline: an agent for one action type (e.g., restaurant booking only) takes 3–4 weeks. A multi-domain agent (flight + hotel + transfer) takes 6–10 weeks.







