Voice Agent on FHIR

A voice agent captures patient input through conversation and turns it into structured data the rest of the system can act on. The right destination for that data is a QuestionnaireResponse: typed, validated, and governed by the same access model as the rest of the deployment.

What you can build

Voice-derived data lands in a typed FHIR resource
QuestionnaireResponse with item links back to the originating Questionnaire. No bespoke schema for voice intake.
Validation runs server-side
The Questionnaire's items, value types, and required flags drive validation on the FHIR backend. The agent does not have to invent its own validation.
Confidence scoring stays alongside the answer
Per-item confidence and source-audio reference can be attached as extensions, supporting clinician review.

Who this is for

Product teams building voice-first patient experiences (intake, screening, follow-up) and clinical informatics leads evaluating where the voice-derived data should land.

Clinical applicability

A voice agent collects pre-visit intake from a patient over the phone. The conversation produces a QuestionnaireResponse linked to the Patient and to the upcoming Encounter. The clinician reviews the structured answers in the EHR before the visit.

Why QuestionnaireResponse is the right shape

QuestionnaireResponse is FHIR's structure for collected, item-by-item answers. It links back to a Questionnaire that defines the questions, value types, and required flags, which gives the voice agent something concrete to validate against.
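
As a rough illustration of that shape, the sketch below maps answers captured in conversation onto a FHIR R4 QuestionnaireResponse represented as a plain dict. The function name, the example Questionnaire URL, and the linkId values are hypothetical; the field names (questionnaire, subject, encounter, item.linkId, item.answer.value[x]) come from the FHIR R4 resource definition.

```python
from typing import Any, Dict


def build_questionnaire_response(
    questionnaire_url: str,
    patient_id: str,
    encounter_id: str,
    answers: Dict[str, Any],
) -> Dict[str, Any]:
    """Map {linkId: captured value} onto a QuestionnaireResponse dict.

    Hypothetical helper for illustration; not part of any product API.
    """

    def to_answer(value: Any) -> Dict[str, Any]:
        # bool is checked before int because bool is a subclass of int.
        if isinstance(value, bool):
            return {"valueBoolean": value}
        if isinstance(value, int):
            return {"valueInteger": value}
        if isinstance(value, float):
            return {"valueDecimal": value}
        return {"valueString": str(value)}

    return {
        "resourceType": "QuestionnaireResponse",
        # Canonical URL of the Questionnaire that defines the items.
        "questionnaire": questionnaire_url,
        "status": "completed",
        "subject": {"reference": f"Patient/{patient_id}"},
        "encounter": {"reference": f"Encounter/{encounter_id}"},
        # Each linkId must match an item in the Questionnaire.
        "item": [
            {"linkId": link_id, "answer": [to_answer(value)]}
            for link_id, value in answers.items()
        ],
    }
```

Keeping the linkId values identical to those in the Questionnaire is what lets the backend match each answer to its expected type and required flag.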

Voice agents that store only free-text transcripts in a DocumentReference lose this structure. The transcript is useful for review but not directly actionable; the structured QuestionnaireResponse is what the rest of the system queries, feeds into clinical decision support, and surfaces to the clinician at the point of care.

Validation and partial responses

Validation against the Questionnaire happens on the Fire Arrow side. The agent submits a QuestionnaireResponse, the server validates against the linked Questionnaire's constraints, and the agent gets a structured error if items are missing or values do not match expected types.
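
On the agent side, handling that structured error might look like the sketch below. It assumes the error arrives as a standard FHIR OperationOutcome, which is the usual FHIR convention but is not confirmed here for this backend; the function name is hypothetical.

```python
from typing import Any, Dict, List


def summarize_issues(outcome: Dict[str, Any]) -> List[str]:
    """Return one readable line per blocking issue in an OperationOutcome.

    Assumption: the server reports validation failures as an
    OperationOutcome with severity, code, diagnostics and expression.
    """
    if outcome.get("resourceType") != "OperationOutcome":
        return []

    lines = []
    for issue in outcome.get("issue", []):
        # Warnings and informational notes do not block submission.
        if issue.get("severity") not in ("error", "fatal"):
            continue
        where = ", ".join(issue.get("expression", [])) or "(no location)"
        detail = issue.get("diagnostics") or issue.get("code", "unknown")
        lines.append(f"{where}: {detail}")
    return lines
```

The agent can use the returned locations to decide which question to re-ask before resubmitting.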

Partial responses are supported: the QuestionnaireResponse.status field takes one of in-progress, completed, amended, entered-in-error, or stopped. A voice agent whose call drops midway can submit an in-progress response that the clinician (or a subsequent call) completes.
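
One possible way for the agent to pick the status at the end of a call is sketched below. The function and parameter names are hypothetical, and the rule (every required item answered and the call ended normally) is an assumed policy rather than anything the backend mandates.

```python
from typing import Set


def choose_status(
    required_link_ids: Set[str],
    answered_link_ids: Set[str],
    call_ended_normally: bool,
) -> str:
    """Pick a QuestionnaireResponse.status value for submission."""
    missing = required_link_ids - answered_link_ids
    if call_ended_normally and not missing:
        return "completed"
    # A dropped call or an unanswered required item leaves the
    # response open for a clinician or a follow-up call to finish.
    return "in-progress"
```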

Confidence scoring and provenance

Per-item confidence (from the speech-to-text or NLU layer) is useful information for a reviewing clinician. It can be attached as an extension on QuestionnaireResponse.item or recorded in a Provenance resource referencing the QuestionnaireResponse.

Source-audio references work similarly. A reviewer can replay the segment that produced a particular answer when the confidence score is low, without surfacing the audio in the routine clinical view.
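
A minimal sketch of the extension approach follows. The two extension URLs are placeholders invented for this example (a real deployment would define and publish its own), as are the helper names and the 0.8 review threshold.

```python
from typing import Any, Dict, List, Optional

# Hypothetical extension URLs, for illustration only.
CONFIDENCE_URL = "https://example.org/fhir/StructureDefinition/voice-confidence"
AUDIO_URL = "https://example.org/fhir/StructureDefinition/voice-source-audio"


def annotate_item(
    item: Dict[str, Any],
    confidence: float,
    audio_uri: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of a QuestionnaireResponse item with confidence
    and, optionally, a pointer to the source audio segment."""
    extensions = [{"url": CONFIDENCE_URL, "valueDecimal": confidence}]
    if audio_uri is not None:
        extensions.append({"url": AUDIO_URL, "valueUri": audio_uri})
    return {**item, "extension": item.get("extension", []) + extensions}


def items_needing_review(
    response: Dict[str, Any], threshold: float = 0.8
) -> List[str]:
    """List the linkIds whose recorded confidence is below the threshold."""
    flagged = []
    for item in response.get("item", []):
        for ext in item.get("extension", []):
            if ext.get("url") == CONFIDENCE_URL and ext.get("valueDecimal", 1.0) < threshold:
                flagged.append(item["linkId"])
                break  # one flag per item is enough
    return flagged
```

A review screen could call items_needing_review to highlight only the uncertain answers and offer audio replay for those, leaving the routine clinical view uncluttered.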

FAQ

Where is the audio stored?

The audio is typically held by the speech infrastructure, often in a region-pinned audio store, and referenced from the QuestionnaireResponse or Provenance via extension. Whether the audio is retained beyond the session is a deployment decision.

What if the patient says something the Questionnaire does not have an item for?

The agent can capture it as an additional free-text item in the QuestionnaireResponse, or as a Communication resource linked to the same Encounter. The clinician sees both during review.
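
The Communication option could look roughly like this. The helper name is hypothetical; status, subject, encounter, and payload.contentString are standard FHIR R4 Communication fields.

```python
from typing import Any, Dict


def build_unprompted_note(
    patient_id: str, encounter_id: str, text: str
) -> Dict[str, Any]:
    """Wrap a statement the Questionnaire has no item for in a
    Communication linked to the same Patient and Encounter."""
    return {
        "resourceType": "Communication",
        "status": "completed",
        "subject": {"reference": f"Patient/{patient_id}"},
        "encounter": {"reference": f"Encounter/{encounter_id}"},
        "payload": [{"contentString": text}],
    }
```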

How does the agent handle ambiguous answers?

Either by re-prompting in the conversation or by submitting the QuestionnaireResponse with a low-confidence flag for clinician review. Both patterns work; the choice depends on the use case and on how interruptive the re-prompt would be.

Can multiple agents call the same backend?

Yes. Each agent runs as its own service identity. Identity filters can scope each agent to specific Questionnaires or specific patient cohorts.

What about authentication for the patient on a voice channel?

Voice authentication runs in the agent's session layer (callback verification, voice biometrics, or knowledge-based questions). The agent then submits the QuestionnaireResponse on the verified patient's behalf using the on-behalf-of identity flow.