Context
A regional healthcare group operates 14 outpatient clinics across three states — general practice, physiotherapy, and sports medicine. They have roughly 60 front desk staff across all locations and receive around 380 inbound calls per day.
A call audit (conducted by listening to a 10% random sample of recorded calls) showed:
| Call Type | Share | Avg Handle Time |
|---|---|---|
| New appointment booking | 38% | 4.2 min |
| Appointment rescheduling | 21% | 3.8 min |
| Appointment cancellation | 13% | 2.1 min |
| Directions / parking | 8% | 1.8 min |
| Insurance questions (basic) | 6% | 3.1 min |
| Medical question (nurse required) | 7% | 8.4 min |
| Other / complex | 7% | 6.2 min |
The first five categories — totalling 86% of volume — are information and transactional tasks. They require no clinical judgement. They do require human staff in the current setup.
The problems this was causing:
Abandonment. During peak morning hours (8–11am), average hold time was 6–8 minutes. An estimated 22% of callers hung up before connecting. Many were new patients booking their first appointment — a direct new patient acquisition loss.
After-hours blackout. Clinics closed at 6pm. Any patient wanting to book outside those hours hit an answering machine. Most didn’t leave a message. They called a competitor or used an online booking tool for a different provider.
Capacity bottleneck. The front desk team was capable of handling intake, check-in, insurance verification, and clinical coordination. But scheduling calls ate 60% of their day, leaving those higher-value tasks rushed or ignored.
Challenge
Challenge 1: Natural conversation flow. Appointment scheduling involves conditional dialogue — asking what service is needed, checking availability, handling preference negotiation (“I need a morning slot, Tuesday doesn’t work”), and confirming details. The voice agent needed to handle this naturally, not feel like an IVR menu.
Challenge 2: EHR integration. The group used a legacy Electronic Health Record (EHR) system with a REST API that had limited documentation. Real-time availability lookups and booking writes had to be reliable under load.
Challenge 3: Graceful handoff. Any call that the agent couldn’t handle — clinical questions, complex insurance disputes, distressed patients — had to transfer to a human immediately and seamlessly. Patients should not be left in limbo.
Challenge 4: Compliance. Healthcare communications have specific requirements around call recording consent, PHI handling, and state-specific regulations on automated systems. The entire stack had to be built with HIPAA considerations from the start.
Action
Technology Stack
| Component | Technology | Role |
|---|---|---|
| Voice AI Platform | Vapi | Call handling, STT, TTS routing, webhooks |
| Voice Model | ElevenLabs (custom voice) | Natural-sounding TTS — cloned from a staff member with consent |
| Conversation LLM | Claude (claude-opus-4-5) | Intent understanding, slot extraction, conversation logic |
| Scheduling Logic | Custom Node.js service | Availability lookup, booking write, conflict handling |
| EHR Integration | REST API adapter | Real-time read/write to legacy system |
| Compliance Layer | Call recording consent flow | State-specific disclosures + opt-out handling |
The Conversation Flow Design
We mapped the full scheduling conversation as a state machine before writing a single line of code. The key insight: scheduling conversations always gather the same set of information, but the order in which callers volunteer it and the phrasing they use vary wildly.
PATIENT CALL RECEIVED
│
▼
[1] GREETING + INTENT DETECTION
"Thanks for calling [Clinic]. This is Aria.
How can I help you today?"
Intent: schedule_new | reschedule | cancel |
info_request | clinical → handoff
│
▼
[2] SERVICE IDENTIFICATION (if scheduling)
"What type of appointment are you looking for?
We offer general practice, physiotherapy, and sports medicine."
│
▼
[3] PATIENT LOOKUP
"Can I get your date of birth and last name?"
→ Existing patient: pull record, use name
→ New patient: capture basic details, create placeholder record
│
▼
[4] AVAILABILITY NEGOTIATION
"Our next available physiotherapy slot is [date] at [time].
Does that work for you?"
Handle: "Not that day", "Need morning", "ASAP please"
│
▼
[5] CONFIRMATION + REMINDERS
Confirm details read back, SMS confirmation sent,
reminder preference captured
│
▼
[6] CLOSING
"Is there anything else I can help you with?
We'll see you on [date]. Have a great day."
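One way to reconcile a fixed set of required details with free-form ordering is to track which details are already known and ask only for the next missing one. The sketch below illustrates that idea for a new booking; it is not the production implementation, and the slot names and the `nextStep` helper are assumptions made for this example.

```javascript
// Hypothetical slot order mirroring steps [2] to [5] of the flow above.
const REQUIRED_SLOTS = ["service", "patient", "appointment", "confirmed"];

// Map each missing slot to the step that collects it.
const STEP_FOR_SLOT = {
  service: "SERVICE_IDENTIFICATION",
  patient: "PATIENT_LOOKUP",
  appointment: "AVAILABILITY_NEGOTIATION",
  confirmed: "CONFIRMATION",
};

// Return the next step for a booking call, given whatever the caller
// has already volunteered. Slots may be filled in any order.
function nextStep(slots) {
  const missing = REQUIRED_SLOTS.find(
    (name) => slots[name] == null || slots[name] === false
  );
  return missing ? STEP_FOR_SLOT[missing] : "CLOSING";
}

// A caller who opens with the service and their identity skips two questions.
const step = nextStep({ service: "physiotherapy", patient: { lastName: "Example" } });
console.log(step);
```

Because the next step is derived from what is missing rather than from a fixed position in a script, a caller who gives several details in one sentence is never asked for them again.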
The Prompt Architecture
We used a system prompt that gives Claude the full state context on every turn, rather than trying to implement a rigid FSM:
You are Aria, a scheduling assistant for [Healthcare Group].
You are on a phone call. Respond naturally and conversationally.
Current conversation state:
- Intent identified: {{intent}}
- Patient: {{patient_name || "unknown"}}
- Service requested: {{service || "not yet identified"}}
- Availability shown: {{availability_option || "not yet shown"}}
- Appointment booked: {{booked || false}}
Available appointments (DO NOT fabricate others):
{{availability_json}}
Rules:
- Never ask for more than one piece of information per turn
- If the patient asks a clinical question, say: "That's a great question for one of our
clinicians — let me transfer you now" and set action: handoff_clinical
- If you detect distress or urgency, set action: handoff_urgent immediately
- If the patient wants to speak to a human, set action: handoff_human — no exceptions
- Confirm all appointment details aloud before finalising
- Never mention you are an AI unless directly asked — if asked, confirm honestly
Return: { "response": "...", "action": null | "book" | "handoff_human" |
"handoff_clinical" | "handoff_urgent", "slots": {...} }
EHR Integration
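The EHR’s REST API required OAuth 2.0 authentication and had a 15-request-per-minute rate limit per clinic. We built an availability cache, keyed by clinic and service type, whose entries are treated as stale after 90 seconds. Most availability lookups are served from the cache, so live API calls are limited to cache refreshes and the booking write: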
// Availability lookup: serve from cache, refetch from the EHR on miss or staleness.
// `redis` and `ehrClient` are client instances created elsewhere.
// Redis stores strings, so cache entries are serialized as JSON.
const CACHE_TTL_MS = 90_000;

async function getAvailability(clinicId, serviceType, preferences) {
  const cacheKey = `avail:${clinicId}:${serviceType}`;
  const raw = await redis.get(cacheKey);
  const cached = raw ? JSON.parse(raw) : null;
  if (cached && Date.now() - cached.timestamp < CACHE_TTL_MS) {
    return filterByPreferences(cached.slots, preferences);
  }
  // Cache miss or stale entry: fetch the next 14 days from the EHR
  const slots = await ehrClient.getAvailableSlots({
    clinic: clinicId,
    service: serviceType,
    from: new Date(),
    days: 14,
  });
  await redis.set(cacheKey, JSON.stringify({ slots, timestamp: Date.now() }));
  return filterByPreferences(slots, preferences);
}
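The lookup above calls `filterByPreferences`, which the write-up does not show. The sketch below is one possible version, written to cover the preference types named in the flow ("Not that day", "Need morning", "ASAP please"). The slot shape (`start` as a clinic-local ISO timestamp without a zone offset) and the preference fields (`partOfDay`, `excludeWeekdays`, `limit`) are assumptions for illustration.

```javascript
// Hypothetical slot shape: { start: "2025-03-04T09:30" }, clinic-local time.
// Hypothetical preferences: { partOfDay, excludeWeekdays, limit }.
function filterByPreferences(slots, preferences = {}) {
  const { partOfDay, excludeWeekdays = [], limit = 3 } = preferences;
  return slots
    .filter((slot) => {
      const start = new Date(slot.start);
      if (excludeWeekdays.includes(start.getDay())) return false; // 0 = Sunday
      if (partOfDay === "morning") return start.getHours() < 12;
      if (partOfDay === "afternoon") return start.getHours() >= 12;
      return true;
    })
    // Earliest first, which also covers "ASAP please".
    .sort((a, b) => new Date(a.start) - new Date(b.start))
    .slice(0, limit);
}

// "I need a morning slot, Tuesday doesn't work."
const sampleSlots = [
  { start: "2025-03-04T09:30" }, // Tuesday morning
  { start: "2025-03-05T14:00" }, // Wednesday afternoon
  { start: "2025-03-05T08:15" }, // Wednesday morning
];
const offered = filterByPreferences(sampleSlots, {
  partOfDay: "morning",
  excludeWeekdays: [2],
});
```

Capping the result at a few options keeps the spoken offer short; reading out a full two-week schedule over the phone would be unusable.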
Compliance Handling
At the start of every call, before any PHI collection:
“This call may be recorded for quality and training purposes. If you prefer not to be recorded, say ‘opt out’ at any time and I’ll note that preference.”
All PHI collected during calls is encrypted in transit and at rest. Call recordings are stored with access controls and auto-delete after 90 days, a retention window chosen to limit how long PHI is kept, in line with HIPAA’s minimum necessary principle.
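The consent and retention rules can be expressed as small guard functions that the call-handling code checks before acting. This is a sketch only: the session shape and function names are hypothetical, and treating an opt-out as "do not record" is an assumption about how the noted preference is applied.

```javascript
const RETENTION_DAYS = 90; // retention window stated above
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical call-session shape: { disclosurePlayed: boolean, optedOut: boolean }.
// PHI questions (name, date of birth) are allowed only after the disclosure has played.
function canCollectPhi(session) {
  return session.disclosurePlayed === true;
}

// Assumption: record only if the disclosure has played and the caller has not opted out.
function shouldRecord(session) {
  return session.disclosurePlayed === true && !session.optedOut;
}

// True once a recording has reached the end of the retention window.
function isPastRetention(recordedAt, now = new Date()) {
  return now.getTime() - recordedAt.getTime() >= RETENTION_DAYS * MS_PER_DAY;
}
```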
Handoff Protocol
When the agent triggers a handoff — for clinical questions, distress, or explicit human request — Vapi initiates a warm transfer. Before connecting:
“Of course — let me connect you with one of our team members right now. They’ll be able to help you directly. Please hold for just a moment.”
Hold time: typically 8–15 seconds. The receiving staff member sees a screen-pop with the conversation summary and any slots that were in negotiation.
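The screen-pop can be assembled from the same conversation state the agent already tracks. The sketch below shows one possible payload; every field name is an assumption, since the write-up only says the pop-up contains a conversation summary and any slots under negotiation. It does not use or depict any Vapi API.

```javascript
// Hypothetical builder for the screen-pop shown to the receiving staff member.
// `state` is the conversation state; `reason` is the handoff action that fired.
function buildHandoffSummary(state, reason) {
  const callerTurns = state.callerTurns ?? [];
  const lastTurn = callerTurns.length
    ? callerTurns[callerTurns.length - 1].text
    : "";
  return {
    reason, // "handoff_human" | "handoff_clinical" | "handoff_urgent"
    priority: reason === "handoff_urgent" ? "high" : "normal",
    patient: state.patientName ?? "unknown",
    service: state.service ?? "not yet identified",
    slotsInNegotiation: state.offeredSlots ?? [],
    lastCallerUtterance: lastTurn,
  };
}
```

Carrying over the slots already offered lets staff pick up the negotiation where the agent left off instead of restarting the booking.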
Result
Six months post-launch across all 14 locations:
The voice agent handled 72% of all inbound calls without human involvement. That sits below the 86% of volume identified as automatable, with the gap accounted for by edge cases and patients who prefer to speak to a person.
After-hours bookings increased 340%. The agent handles calls 24/7. Patients who previously reached voicemail at 6:01pm can now book. This has been the single most remarked-upon change in patient satisfaction surveys.
New patient wait time fell from 11 days to 4 days. This is a counterintuitive result: the reduced wait time wasn’t from adding capacity — it was from freeing up front desk staff to actively manage the schedule, fill same-day cancellations, and handle recall outreach for overdue patients. When the scheduling queue is handled by the agent, staff can be proactive rather than reactive.
Estimated $140k/year in front desk labor savings — calculated as the reduction in hours spent on scheduling calls (68% of previous call volume × average handle time × blended hourly rate across 14 locations).
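The savings formula can be written out as a short calculation. The sketch below is illustrative and does not reproduce the original estimate: the write-up does not say which handle time, hourly rate, or number of days per year was used, so weighting the five automatable categories from the audit table is an assumption, and the rate and day count in the usage example are placeholders.

```javascript
// Call mix for the five automatable categories, from the audit table.
const AUTOMATABLE = [
  { type: "New appointment booking", share: 38, handleMin: 4.2 },
  { type: "Appointment rescheduling", share: 21, handleMin: 3.8 },
  { type: "Appointment cancellation", share: 13, handleMin: 2.1 },
  { type: "Directions / parking", share: 8, handleMin: 1.8 },
  { type: "Insurance questions (basic)", share: 6, handleMin: 3.1 },
];

// Share-weighted average handle time in minutes.
function weightedHandleMinutes(mix) {
  const totalShare = mix.reduce((sum, c) => sum + c.share, 0);
  const weighted = mix.reduce((sum, c) => sum + c.share * c.handleMin, 0);
  return weighted / totalShare;
}

// Annual savings = calls/day x automated share x handle time (hours) x rate x days/year.
function estimateAnnualSavings({
  callsPerDay,
  automatedShare,
  handleMinutes,
  hourlyRate,
  daysPerYear,
}) {
  const hoursPerDay = (callsPerDay * automatedShare * handleMinutes) / 60;
  return hoursPerDay * hourlyRate * daysPerYear;
}

const estimate = estimateAnnualSavings({
  callsPerDay: 380, // from the Context section
  automatedShare: 0.68, // from the savings formula above
  handleMinutes: weightedHandleMinutes(AUTOMATABLE),
  hourlyRate: 25, // placeholder, not a figure from the case study
  daysPerYear: 365, // placeholder
});
```

The result is sensitive to the hourly rate and the number of days counted, so both should be replaced with real payroll and operating figures before the number is quoted.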
One material learning: the agent’s handoff to human for clinical questions was occasionally too aggressive in early testing — flagging questions about appointment prep instructions as “clinical” when they were really administrative. We refined the classification prompt with 40 real examples of each category. The false-positive handoff rate dropped from 14% to 3% in the first 30 days.
Voice AI for appointment scheduling is one of the highest-ROI AI applications available right now: the savings are measurable, the conversation flows are bounded, and 24/7 availability is genuinely transformative for businesses that close at 5 or 6pm. The most common mistake is building a glorified IVR. The key is a conversational model that can handle the messy, non-linear reality of how patients actually talk.