Healthcare RAG over FHIR
RAG works well when the retrieval layer respects the same access model the application does. For healthcare that means role-aware retrieval, embedding indexes scoped per role or filtered at query time, and an audit record that captures the documents the model actually saw.
What you can build
- Retrieval respects the access model
Each retrieval call resolves through the same rule chain a regular FHIR read does. Documents the role cannot read are not in the result set.
- Embedding indexes do not leak across roles
Either per-role index scopes or query-time filtering keeps the embedding store from becoming a shadow database that ignores access rules.
- Audit captures retrieval, not just generation
The audit log records the documents fed into the model. Reviewers can reproduce the context the model worked from.
Who this is for
AI engineering teams building retrieval-augmented assistants on top of clinical data, and security architects evaluating the leakage surface of a RAG deployment.
Clinical applicability
A clinician assistant retrieves recent discharge summaries, lab reports, and imaging notes for the patient on the screen. The retrieval is scoped by the clinician's role; the model never sees notes for patients outside the care team.
Why naive RAG breaks the access model
A typical RAG pipeline embeds documents into a vector index and retrieves top-k matches at query time. If the index ignores the access model, a query in the context of a restricted user can still retrieve documents from anywhere in the index. The model sees what the index knows, not what the user is allowed to see.
The fix is to push the access model into the retrieval layer. Either the index is partitioned by role (each role queries its own index), or the retrieval layer filters results against the user's role before returning them, or the documents are tagged with access metadata that the retrieval layer respects.
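The query-time filtering option can be sketched in a few lines. This is a minimal illustration, not Fire Arrow's API: `Candidate`, `can_read`, and the toy ACL table are all hypothetical stand-ins for the real rule chain.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    doc_id: str   # FHIR DocumentReference id
    score: float  # similarity score from the vector index

def can_read(role: str, doc_id: str) -> bool:
    """Hypothetical access check: would a direct FHIR read of this
    DocumentReference succeed for this role? (Toy ACL for illustration.)"""
    acl = {"clinician": {"doc-1", "doc-2"}, "billing": {"doc-2"}}
    return doc_id in acl.get(role, set())

def retrieve(role: str, candidates: list[Candidate], k: int) -> list[Candidate]:
    """Over-fetch from the index, filter against the access model, then
    return top-k. Documents the role cannot read never reach the model."""
    allowed = [c for c in candidates if can_read(role, c.doc_id)]
    return sorted(allowed, key=lambda c: c.score, reverse=True)[:k]

hits = [Candidate("doc-1", 0.91), Candidate("doc-2", 0.88), Candidate("doc-3", 0.80)]
print([c.doc_id for c in retrieve("billing", hits, k=2)])  # ['doc-2']
```

Note the over-fetch: because filtering shrinks the candidate set, the index query should request more than k matches so the post-filter result still fills the context window.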
Role-aware retrieval over FHIR
The access metadata for a FHIR resource is the rule chain that would govern a read. A retrieval layer over FHIR documents can run candidate matches through the same rule chain Fire Arrow would apply to a direct read, and drop the candidates that fail.
DocumentReference is the natural FHIR resource for unstructured clinical content. Indexing the embedding alongside the DocumentReference identifier (and not the document content directly) keeps the access decision authoritative on the Fire Arrow side. The model gets the document only when the role permits it.
Audit, evaluation, and reproducibility
RAG pipelines are easier to evaluate when the retrieval is auditable. The audit log records which DocumentReferences the model received in context for a given session, which makes it possible to reproduce a model output and to investigate model errors against the documents that produced them.
From a compliance angle, the same record satisfies the access-log requirement: the user (and the model acting on their behalf) saw these documents at this time, under this role, through this rule.
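A retrieval audit entry might look like the following. The field names are illustrative, not a fixed schema; the point is that the entry captures session, user, role, rule, and the exact DocumentReference ids placed in context.

```python
import json
from datetime import datetime, timezone

def audit_retrieval(session_id: str, user: str, role: str, rule: str,
                    doc_refs: list[str]) -> str:
    """Emit one audit entry per retrieval call: who saw which
    DocumentReferences, when, under which role and rule."""
    entry = {
        "session": session_id,
        "user": user,
        "role": role,
        "rule": rule,
        "documents": doc_refs,  # exact context the model received
        "at": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(entry)
```

Reproducing a model output then reduces to replaying the `documents` list for that session: re-fetch those exact resources and re-run the generation step.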
FAQ
Where do I store embeddings?
Either alongside FHIR resources in a managed extension or in a separate vector store keyed by the FHIR resource identifier. The choice depends on operational preferences; the access model is enforced through the FHIR layer either way.
Can I index notes that contain identifying data?
Yes, provided the embedding store sits inside the same access boundary as the source data and access decisions are made through the FHIR layer. If the embedding store is queryable independently, the access boundary is broken.
How do I prevent the model from learning across roles?
Per-role retrieval keeps the model's context scoped per session. The model itself does not retain context across sessions unless you fine-tune it; if you do fine-tune, do so on data that is appropriate for the deployment scope of the resulting model.
What about retrieval over external sources?
External sources (medical literature, drug databases) sit outside the access model. They can be retrieved freely; the access model concern is the patient context, not the medical knowledge.
How do I evaluate retrieval quality?
Standard RAG evaluation methods work, with the addition of an access-correctness check: did the retrieval drop the documents the role should not have seen? The audit log makes this verifiable.
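The access-correctness check itself is a set comparison: take the DocumentReference ids the audit log says were retrieved and the set the role is allowed to read, and count the leaks. A minimal sketch, assuming both sets are available from the audit log and the rule chain:

```python
def access_correctness(retrieved: set[str], allowed: set[str]) -> tuple[int, bool]:
    """Return (number of leaked documents, pass/fail). A pipeline passes
    only when zero retrieved documents fall outside the allowed set."""
    leaked = retrieved - allowed
    return len(leaked), not leaked
```

Run alongside the usual relevance metrics; a pipeline with perfect recall still fails the evaluation if this check reports any leaked documents.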