LLM Access Control for Healthcare

An LLM in the request path is a non-human caller with the linguistic flexibility to construct unusual queries. The access model has to assume that flexibility and stay restrictive in spite of it. Fire Arrow gives the LLM (or the runtime hosting it) a scoped identity, an explicit tool surface, and an audit chain that captures both the model and the human it acts for.

What you can build

  • The LLM gets a real identity, not a shared key

    Service accounts with rotation, scope, and audit. No shared API keys hidden inside the runtime.

  • On-behalf-of is explicit, not implicit

    The human identity flows into the session as an additional claim. The role used for adjudication is the human's role, not the model's.

  • Audit covers both identities

    Each request records the model's service identity and the on-behalf-of identity. Reviewers can reconstruct who saw what and through which session.

Who this is for

Security architects, AI engineering leads, and compliance reviewers evaluating the access boundary when an LLM-based assistant is part of a clinical workflow.

Clinical applicability

A clinician inbox assistant uses an LLM to summarize messages and draft replies. The LLM runtime authenticates as a service identity, inherits the clinician's role for the session, and can only see resources inside the clinician's care team scope.

Three access patterns for LLM workflows

Pattern 1: Foreground assistant. The LLM acts on behalf of a human user during an interactive session. The LLM runtime authenticates as a service identity; the user's identity flows into the session. The role used for each request is the user's role. Audit captures both.

Pattern 2: Background agent. The LLM acts as itself for batch or scheduled work. The role is the agent's own role, typically narrower than any human role. Foreground agents and background agents typically have different rule sets.

Pattern 3: Workflow step. The LLM is one node in a multi-step pipeline, called by an upstream service. Identity flows through the pipeline as a chain; each step's audit records its own identity and the originating user where applicable.

What the access model has to handle

An LLM constructs queries from natural-language intent. It will produce search parameters a human would not type, _include and _filter combinations a UI would not generate, and reference traversals that look benign individually but combine into broad reads. The access model has to assume this and remain narrow despite it.

Fire Arrow's standard rule features handle this directly: search-parameter blocklists prevent specific parameters on a per-role basis, property filters strip unauthorized fields before they leave the server, and identity filters can scope a rule by FHIRPath expression on the caller's identity. The model gets the data the role permits and nothing else.
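A sketch of how a search-parameter blocklist and a property filter combine to keep a model-constructed query narrow. The rule shape and the role name are assumptions for the example, not Fire Arrow's configuration syntax.

```python
# Illustrative rule shape; not Fire Arrow's actual configuration syntax.
ROLE_RULES = {
    "summary-agent": {
        "blocked_params": {"_include", "_revinclude", "_filter"},
        "allowed_fields": {"resourceType", "id", "status", "period"},
    },
}

def check_search(role: str, params: dict) -> None:
    """Search-parameter blocklist: reject blocked parameters for this role."""
    blocked = ROLE_RULES[role]["blocked_params"] & params.keys()
    if blocked:
        raise PermissionError(f"blocked search parameters for {role}: {sorted(blocked)}")

def strip_fields(role: str, resource: dict) -> dict:
    """Property filter: drop unauthorized fields before the resource leaves the server."""
    allowed = ROLE_RULES[role]["allowed_fields"]
    return {k: v for k, v in resource.items() if k in allowed}
```

A plain status search passes, an `_include` traversal is rejected outright, and any resource that is returned has its unauthorized fields removed first.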

Audit chain of custody

Each authorized request emits an audit record with the resolved FHIR identity (the role-bearing identity used for adjudication), the operation, the resource, and the rule that matched. When an on-behalf-of identity is present, the record includes both.

For multi-step pipelines the audit chain is reconstructed by combining the records from each step. The session identifier links them so a reviewer can trace a model output back through the calls that produced it.
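Reconstruction amounts to collecting every step's record for one session and ordering them in time. A minimal sketch, assuming a hypothetical record shape; Fire Arrow's actual audit schema may differ.

```python
# Hypothetical audit-record shape: {"session": ..., "timestamp": ..., "service": ...}
def reconstruct_chain(records: list[dict], session_id: str) -> list[dict]:
    """Collect every step's audit record for one session, in time order."""
    steps = [r for r in records if r["session"] == session_id]
    return sorted(steps, key=lambda r: r["timestamp"])
```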

FAQ

Can I use a single API key for all LLM calls?

Technically yes, but the audit chain collapses. Per-session service identities (or on-behalf-of identity flow) keep the access record tied to the originating user, which is what the audit needs to be useful.

How does on-behalf-of work in practice?

The LLM runtime obtains a session token from Fire Arrow that carries both its service identity and the user's identity. The role used for adjudication is the user's role; the audit log captures both. Token issuance is done through the standard authentication flow.
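Decoded, such a session token might carry claims along these lines. The claim names here are assumptions for illustration, not the actual token format.

```python
# Hypothetical decoded session-token claims; claim names are assumptions.
claims = {
    "sub": "svc-llm-inbox",        # the runtime's service identity
    "on_behalf_of": "dr.rivera",   # the clinician the session acts for
    "role": "clinician",           # the human's role, used for adjudication
    "scope": ["inbox.read", "inbox.draft"],
}

def role_for_adjudication(token_claims: dict) -> str:
    # When on-behalf-of is present, the role claim carries the human's role;
    # both "sub" and "on_behalf_of" are written to the audit record.
    return token_claims["role"]
```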

What about prompt-injection attacks?

Prompt injection is a threat to the model and the application logic, not to the access model. A successful injection could cause the model to construct unusual queries; the access model assumes those queries can happen and is narrow enough to limit damage. Property filters and search-parameter blocklists are the relevant defenses.

Can I rate-limit the LLM separately from human traffic?

Yes. Service-identity-based rate limits sit on the deployment side. The LLM identity has its own quota; human users have theirs.
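A per-identity token bucket is one way such a quota split can be implemented; this is a generic sketch, not Fire Arrow's mechanism, and the quota numbers are illustrative.

```python
import time

class Bucket:
    """Minimal token bucket: `rate` tokens per second, up to `burst` capacity."""
    def __init__(self, rate: float, burst: int):
        self.rate, self.capacity = rate, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Separate quotas per identity: the LLM service identity has its own bucket,
# independent of any human user's. Numbers here are illustrative.
limits = {"svc-llm-inbox": Bucket(rate=5, burst=10), "dr.rivera": Bucket(rate=2, burst=5)}
```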

Where does the foreground vs background split actually matter?

In the rule set. Foreground agents inherit human roles and the corresponding human-style rules. Background agents need their own role with rules built for the kind of work they do (a summary-generation role differs from a clinician role even if both touch Encounter resources).