
HIPAA-Compliant RAG: Building AI That Auditors Love

You're building an AI-powered patient portal. The chatbot helps patients understand their test results, schedule appointments, and get answers to questions about their care.

Then your HIPAA Security Officer sends a message:

"OCR just released new guidance on AI systems handling PHI. We need to demonstrate that our chatbot maintains an audit trail of all PHI access. Can you show me what patient data the AI accessed for any given interaction?"

You check your LangChain implementation. There is no audit trail.

The HIPAA Challenge for AI

HIPAA's Security Rule requires "audit controls" — you must track who accessed what PHI and when. When AI is involved, this gets complicated:

| HIPAA Requirement | Traditional Software | AI/RAG Systems |
|---|---|---|
| Access logging | Log database queries | What about context assembly? |
| Minimum necessary | Return only needed fields | How much context did the LLM see? |
| Audit trail | User + timestamp + action | Which documents were retrieved? |
| Breach notification | Know what was exposed | What PHI was in the prompt? |

The core problem: When a RAG system retrieves patient records to answer a question, you need to log exactly what PHI the AI had access to — not just that "the AI responded."
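
Concretely, here is the gap. A minimal sketch (both records are hypothetical, not the output of any particular framework) contrasting a typical application log with the audit record HIPAA actually needs:

# Typical application log: proves a response happened, nothing more
naive_log = {
    "event": "chat_response_generated",
    "user_id": "u-123",
    "timestamp": "2024-03-15T10:22:00Z",
}

# Audit-ready record: identifies the exact PHI the model saw
audit_record = {
    "event": "phi_access",
    "user_id": "u-123",
    "patient_id": "p-456",
    "timestamp": "2024-03-15T10:22:00Z",
    "documents_retrieved": ["lab_result_789", "visit_note_1012"],
    "features_accessed": ["patient_name", "care_team"],
    "token_count": 1480,
}

Only the second record can answer the auditor's question.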

What Auditors Actually Ask

Based on OCR guidance and recent audits, here's what healthcare IT teams face:

1. "Show me what patient data this AI accessed"

If a patient complains that a chatbot response referenced information they didn't expect the AI to know, auditors will want answers:

  • What documents were retrieved?
  • What patient data was in the context?
  • Was access appropriate for the query?

With most RAG frameworks: "We logged that a response was generated."

With Fabra: Complete lineage:

ctx = await patient_support_context(patient_id, query)

# Auditors get:
print(ctx.id)       # "01912345-6789-7abc-def0-123456789abc"
print(ctx.lineage)  # Every feature and document that was retrieved

2. "Can you prove minimum necessary?"

HIPAA's "minimum necessary" standard requires that you only access PHI necessary for the task. For AI systems, this means proving you didn't over-retrieve.

With Fabra: Token budgets enforce minimum necessary:

@context(store, max_tokens=2000)  # Hard limit on context size
async def patient_context(patient_id: str, query: str) -> list[ContextItem]:
    # Only retrieve what's needed
    relevant_records = await search_patient_records(query, patient_id)
    return [
        ContextItem(content=record, priority=i)
        for i, record in enumerate(relevant_records)
    ]

ctx = await patient_context(patient_id, query)
print(ctx.meta["token_usage"])  # Prove you stayed within limits

3. "What if there's a breach?"

If your AI system is compromised, can you determine what PHI was exposed? You need to reconstruct every context that was assembled.

With Fabra: Context replay for breach assessment:

# Retrieve all contexts for a time period
contexts = await store.list_contexts(
    start="2024-01-01",
    end="2024-01-31",
    filter={"patient_id": affected_patient_id}
)

# For each context, see exactly what PHI was included
for ctx in contexts:
    historical = await store.get_context_at(ctx.id)
    print(f"Context {ctx.id}: {historical.lineage}")

Building HIPAA-Compliant RAG

Here's a complete implementation pattern:

from fabra import FeatureStore, context, ContextItem
from fabra.retrieval import retriever

store = FeatureStore()

# 1. Define patient data retrieval with lineage
@retriever(index="patient_records", top_k=5)
async def search_patient_records(query: str, patient_id: str):
    """Search only this patient's records."""
    pass  # Auto-wired to filtered vector search

# 2. Assemble context with minimum necessary limits
@context(store, name="hipaa_patient_context", max_tokens=2000)
async def patient_support_context(
    patient_id: str,
    query: str
) -> list[ContextItem]:
    # Get patient features (all retrievals tracked)
    name = await store.get_feature("patient_name", patient_id)
    care_team = await store.get_feature("care_team", patient_id)

    # Search relevant records (only this patient's)
    records = await search_patient_records(query, patient_id)

    return [
        ContextItem(
            content=f"Patient: {name}",
            priority=0,
            required=True
        ),
        ContextItem(
            content=f"Care Team: {care_team}",
            priority=1
        ),
        ContextItem(
            content=f"Relevant Records:\n{records}",
            priority=2
        ),
    ]

# 3. Log every interaction with audit trail
async def handle_patient_query(patient_id: str, query: str):
    ctx = await patient_support_context(patient_id, query)
    response = await call_llm(ctx.content)

    # HIPAA audit log
    hipaa_audit_log.info(
        "phi_access",
        context_id=ctx.id,  # Full audit trail via this ID
        patient_id=patient_id,
        user_id=get_current_user(),
        access_reason="patient_portal_query",
        phi_categories=extract_phi_categories(ctx.lineage),
        token_count=ctx.meta["token_usage"],
    )

    return response
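
The handler above leans on helpers that Fabra does not provide (get_current_user is whatever your auth layer exposes). A minimal sketch of the other two, assuming structlog for structured logging and a hand-maintained map for PHI categorization (the category names and lineage shape are illustrative):

import structlog

# Structured logger dedicated to PHI access events
hipaa_audit_log = structlog.get_logger("hipaa_audit")

# Illustrative mapping from feature names to PHI categories
PHI_CATEGORY_MAP = {
    "patient_name": "demographics",
    "care_team": "treatment",
    "current_medications": "medications",
    "allergies": "clinical",
}

def extract_phi_categories(lineage) -> list[str]:
    """Map the features recorded in a context's lineage to PHI categories."""
    features = lineage.get("features_used", [])
    return sorted({PHI_CATEGORY_MAP.get(f, "unclassified") for f in features})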

Freshness for Clinical Data

In healthcare, data freshness can be critical. A patient's medication list from last week might not reflect a new prescription.

@context(store, max_tokens=2000, freshness_sla="1h")  # Data must be <1 hour old
async def medication_context(patient_id: str, query: str) -> list[ContextItem]:
    # These features must be fresh
    current_meds = await store.get_feature("current_medications", patient_id)
    allergies = await store.get_feature("allergies", patient_id)

    return [
        ContextItem(content=f"Current Medications: {current_meds}", priority=0, required=True),
        ContextItem(content=f"Allergies: {allergies}", priority=0, required=True),
    ]

async def handle_medication_query(patient_id: str, query: str) -> str:
    ctx = await medication_context(patient_id, query)

    if not ctx.is_fresh:
        # Don't proceed with stale clinical data
        return "Please contact your care team directly for the most current information."

    return await call_llm(ctx.content)

Export for OCR Audits

When OCR comes knocking:

# Export all AI interactions involving PHI for audit period
curl -fsS "http://127.0.0.1:8000/v1/contexts?name=hipaa_patient_context&start=2024-01-01T00:00:00Z&end=2024-12-31T23:59:59Z&limit=10000" \
    > phi_ai_audit_2024.json

# Export specific interaction with full lineage
fabra context export 01912345-6789-7abc-def0-123456789abc \
    --format json
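
Once exported, the JSON can be rolled up however your audit workflow needs. A short sketch, assuming the export is a JSON array of context records shaped like the lineage examples above (the field names are an assumption, not a fixed schema):

import json

with open("phi_ai_audit_2024.json") as f:
    contexts = json.load(f)

# Count PHI access events per patient for the audit summary
access_by_patient: dict[str, int] = {}
for ctx in contexts:
    patient_id = ctx.get("lineage", {}).get("patient_id", "unknown")
    access_by_patient[patient_id] = access_by_patient.get(patient_id, 0) + 1

for patient_id, count in sorted(access_by_patient.items()):
    print(f"{patient_id}: {count} PHI access events")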

The Breach Scenario

Let's walk through a breach notification scenario:

# Scenario: Unauthorized access detected for 48-hour window
breach_start = "2024-03-15T00:00:00Z"
breach_end = "2024-03-16T23:59:59Z"

# 1. Find all contexts assembled during breach window
affected_contexts = await store.list_contexts(
    start=breach_start,
    end=breach_end
)

# 2. For each context, identify affected patients and PHI
affected_patients = set()
phi_accessed = []

for ctx_meta in affected_contexts:
    ctx = await store.get_context_at(ctx_meta.id)

    # Extract patient ID from lineage (skip records without one)
    patient_id = ctx.lineage.get("patient_id")
    if patient_id is not None:
        affected_patients.add(patient_id)

    # Document what PHI was in the context
    phi_accessed.append({
        "patient_id": patient_id,
        "context_id": ctx.id,
        "timestamp": ctx_meta.created_at,
        "features_accessed": ctx.lineage.get("features_used"),
        "documents_accessed": ctx.lineage.get("retrievers_used"),
    })

# 3. Generate breach notification report
print(f"Affected patients: {len(affected_patients)}")
print(f"Total PHI access events: {len(phi_accessed)}")

This is exactly what HIPAA breach notification requires: specific identification of what PHI was accessed.
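
From there, the assessment has to land on someone's desk. A small follow-on sketch that flattens the phi_accessed records built above into a CSV for the notification workflow (the filename and column set are illustrative):

import csv

with open("breach_report_2024-03.csv", "w", newline="") as f:
    writer = csv.DictWriter(
        f,
        fieldnames=[
            "patient_id", "context_id", "timestamp",
            "features_accessed", "documents_accessed",
        ],
    )
    writer.writeheader()
    writer.writerows(phi_accessed)  # one row per PHI access event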

HIPAA Compliance Checklist

| Requirement | Without Fabra | With Fabra |
|---|---|---|
| PHI access logging | Manual, incomplete | Automatic lineage tracking |
| Minimum necessary | Unverified | Token budgets + audit |
| Audit trail | Generic logs | get_context_at(id) |
| Breach assessment | Guess what was exposed | Full context replay |
| Data freshness | Unknown | Freshness SLAs |

Getting Started

pip install "fabra-ai[ui]"

from fabra import FeatureStore, context, ContextItem

store = FeatureStore()

@context(store, name="hipaa_context", max_tokens=2000)
async def build_context(patient_id: str, query: str) -> list[ContextItem]:
    # Your HIPAA-compliant implementation
    pass

# Every call is automatically audit-ready
ctx = await build_context(patient_id, query)
print(f"Audit ID: {ctx.id}")

Full Compliance Guide → | Context Accountability Docs →


Also Need ML Features?

Fabra includes a Feature Store for serving ML features — risk scores, patient similarity, clinical predictions. Same infrastructure, same audit trail.

Feature Store → | Quickstart →


You Might Also Search For

  • "HIPAA AI compliance requirements"
  • "healthcare LLM audit trail"
  • "PHI in AI prompts compliance"
  • "patient chatbot HIPAA"
  • "medical AI explainability"