AI Agent Auditability & Decision Ledger for Healthcare

Tenet AI is the decision ledger platform for clinical AI agents. It captures every care recommendation, replays clinical decisions deterministically, and generates audit trails for HIPAA 45 CFR 164.312, FDA Software as a Medical Device, and EU AI Act compliance — in 2 lines of code.

Why Healthcare Teams Use Tenet AI

Clinical AI agents making diagnostic recommendations, prior authorization decisions, treatment pathway suggestions, and care escalation determinations operate in the highest-scrutiny environment for AI deployment. Patient safety regulators, OCR investigators, and CMS auditors all expect contemporaneous decision records — not retroactively assembled logs. The challenge is that standard LLM observability tools capture traces, not decisions. A trace tells you what the model processed; a decision record tells you what clinical determination was made, which guideline version applied, which patient data drove the outcome, and what the agent chose among available options. These are different evidence layers. HIPAA 45 CFR 164.312(b) requires audit controls that record and examine activity in information systems containing electronic protected health information. For clinical AI, that means a record for every recommendation, not just every API call. Tenet captures full clinical decision provenance via Ghost SDK — under 0.3ms blocking overhead, with ePHI kept inside your VPC. No restructuring of existing agent architecture required. You get HIPAA-ready audit controls in 2 lines of code.

Compliance Coverage for Healthcare AI

The compliance landscape for clinical AI touches five distinct regulatory and assurance frameworks simultaneously. HIPAA 45 CFR 164.312(b) Technical Safeguards require audit controls for every AI system touching electronic protected health information; this is the foundational obligation. The FDA distinguishes AI/ML-Based Software as a Medical Device (SaMD) from clinical decision support that is not a device; for AI/ML SaMD, post-market monitoring obligations require documented records of how the software performs in production. EU AI Act Annex III Category 5 covers AI used for access to essential healthcare services, making clinical AI affecting patient care a high-risk system with full documentation, logging, and human oversight obligations under Articles 11, 12, and 14. SOC 2 Type II CC7.2 (anomaly detection) and CC8.1 (change management) require evidence of AI behavioral monitoring and model version control throughout the audit period. ISO/IEC 42001 Clauses 9 (performance evaluation) and 10 (improvement, including corrective action) require continuous decision quality measurement with documented review. Tenet covers all five frameworks from a single SDK integration, generating the specific evidence artifacts each framework requires without separate tooling for each regulator.

Prior Authorization AI and HIPAA Audit Requirements

Prior authorization automation is the highest-scrutiny AI use case in healthcare right now. CMS finalized the Interoperability and Prior Authorization Final Rule (CMS-0057-F) in January 2024, requiring impacted payers to implement prior authorization APIs and to return decisions within 72 hours for expedited (urgent) requests and seven calendar days for standard (non-urgent) requests. State insurance departments are examining PA AI systems specifically for adverse action documentation compliance: did the AI capture the specific clinical criteria that drove each determination? Every denial, partial approval, and escalation decision must be explainable and reproducible, individually rather than in aggregate. When a patient appeals a prior authorization denial, the payer must produce the specific clinical evidence the AI evaluated, the guideline version applied at the time of the decision, and the reasoning chain that produced the outcome. When an OCR investigator examines a HIPAA complaint tied to a PA determination, they request the audit log for that specific decision. Standard PA workflow logs capture that the AI processed a request; they do not capture what it decided or why. Tenet captures the clinical context at authorization time: diagnosis codes evaluated, guideline version active, clinical criteria weighted, decision outcome, confidence level, and the human review step when a clinician overrides or confirms the AI recommendation. This creates the contemporaneous record that adverse action challenges and regulatory investigations require.
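
To make the list of captured fields concrete, here is a minimal Python sketch of what a per-decision record could look like. It is not Tenet's schema or the Ghost SDK API: the class name, field names, and the fingerprint helper are all hypothetical, chosen only to mirror the fields named above.

```python
from dataclasses import dataclass, asdict, replace
from datetime import datetime, timezone
from typing import Optional
import hashlib
import json


@dataclass(frozen=True)
class PriorAuthDecisionRecord:
    """Hypothetical per-decision record; every field name is illustrative."""
    request_id: str
    diagnosis_codes: tuple                 # diagnosis codes evaluated, e.g. ICD-10
    guideline_version: str                 # guideline version active at decision time
    criteria_weights: dict                 # clinical criteria and their weights
    outcome: str                           # "approved", "denied", "partial", "escalated"
    confidence: float                      # model-reported confidence, 0.0 to 1.0
    decided_at: str                        # ISO 8601 UTC timestamp taken at decision time
    reviewer_id: Optional[str] = None      # clinician who reviewed, if any
    reviewer_action: Optional[str] = None  # "confirmed" or "overridden"

    def fingerprint(self) -> str:
        """SHA-256 over a canonical JSON form, so later edits are detectable."""
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Example values are made up for illustration.
record = PriorAuthDecisionRecord(
    request_id="PA-0001",
    diagnosis_codes=("E11.9",),
    guideline_version="example-guideline-v3",
    criteria_weights={"step_therapy_completed": 0.6, "a1c_above_threshold": 0.4},
    outcome="denied",
    confidence=0.82,
    decided_at=datetime.now(timezone.utc).isoformat(),
)

# A clinician override is stored as a new record; the original stays untouched.
reviewed = replace(
    record,
    outcome="approved",
    reviewer_id="clinician-42",
    reviewer_action="overridden",
)
```

Keeping the record immutable and deriving a fingerprint from its contents is one way to show, on appeal, that the evidence produced is the evidence that existed at decision time.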

Clinical Decision Support and FDA AI Guidance

The FDA's AI/ML action plan distinguishes between locked decision support software (deterministic, fixed algorithm) and adaptive AI/ML-based SaMD (software that learns and changes behavior from real-world data). Most modern clinical AI agents are adaptive — they use LLMs, ensemble models, or continuously updated classifiers that change behavior as the underlying model evolves. For adaptive AI/ML SaMD, the FDA's predetermined change control plan (PCCP) framework requires manufacturers to describe the types of modifications anticipated, the methodology for implementing changes, and the monitoring protocol that verifies changes do not degrade safety or effectiveness. This monitoring obligation has teeth: the FDA expects post-market monitoring data for adaptive SaMD to include evidence of how the software's recommendations changed after a model update. Standard monitoring tools measure aggregate performance metrics. They cannot tell you whether a specific patient population's recommendations shifted after a model version change. Tenet's Verification Replay engine addresses this directly: it re-executes past clinical decision records against the updated model, generating a Semantic Diff that shows exactly which recommendations changed, for which patient types, and by how much — producing the PCCP monitoring evidence the FDA expects without requiring a separate clinical study.
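
The replay idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Verification Replay engine: the record layout ("inputs", "cohort", "recommendation"), the function names, and the stand-in model are hypothetical, the comparison is a plain exact-match diff rather than a semantic one, and the candidate model is assumed to be deterministic for a given input.

```python
from collections import defaultdict
from typing import Callable, Iterable


def replay_diff(records: Iterable[dict], candidate_model: Callable[[dict], str]) -> dict:
    """Re-run stored inputs through a candidate model and group changes by cohort.

    Each record is a hypothetical dict with keys "inputs", "cohort", and
    "recommendation" (the recommendation recorded at decision time).
    """
    summary = defaultdict(lambda: {"total": 0, "changed": 0, "examples": []})
    for rec in records:
        new_recommendation = candidate_model(rec["inputs"])
        bucket = summary[rec["cohort"]]
        bucket["total"] += 1
        if new_recommendation != rec["recommendation"]:
            bucket["changed"] += 1
            bucket["examples"].append((rec["recommendation"], new_recommendation))
    return {
        cohort: {**bucket, "change_rate": bucket["changed"] / bucket["total"]}
        for cohort, bucket in summary.items()
    }


# Synthetic stored decisions, for illustration only.
stored = [
    {"inputs": {"a1c": 8.1, "ckd": True}, "cohort": "diabetes+ckd", "recommendation": "escalate"},
    {"inputs": {"a1c": 8.1, "ckd": False}, "cohort": "diabetes", "recommendation": "routine"},
    {"inputs": {"a1c": 6.4, "ckd": False}, "cohort": "diabetes", "recommendation": "routine"},
]


def updated_model(inputs: dict) -> str:
    # Stand-in for an updated model that no longer escalates on CKD alone.
    return "escalate" if inputs["a1c"] > 9.0 else "routine"


report = replay_diff(stored, updated_model)
```

In this toy data the "diabetes+ckd" cohort shows a change rate of 1.0 while "diabetes" shows 0.0, which is the per-population view that aggregate performance metrics do not provide.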

ePHI Handling and On-Premise Deployment

Healthcare AI deployments routinely process ePHI in the decision context: diagnosis codes, medication history, lab values, prior authorization history, demographic data. Any AI vendor that receives ePHI, even transiently during inference, is a Business Associate under HIPAA and requires a Business Associate Agreement. The HIPAA Omnibus Rule (2013) made BAs directly liable for Security Rule violations, with the same penalty tiers as covered entities. Tier 4 willful neglect penalties start at $50,000 per violation before inflation adjustment, with an annual cap of about $1.9 million. For an AI system making hundreds of PA decisions daily, a logging failure creates violation exposure at each decision event. Tenet's on-premise VPC deployment removes the risk of transmitting ePHI to an external vendor: the Ghost SDK and Reasoning Ledger are deployed inside your network perimeter, so ePHI used in clinical reasoning never traverses external infrastructure. Decision records are stored inside your VPC and queryable via internal API. Because the data never leaves your controlled environment, the ledger sits behind the HIPAA Physical Safeguards you already maintain (workstation security, device and media controls), and the architecture supports the Technical Safeguard requirements (access controls, audit controls, integrity controls, transmission security) at the same time.
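
One common way to back an integrity control for decision records stored inside your own environment is a hash-chained, append-only log. The Python sketch below shows that general technique only; it is not a description of how the Reasoning Ledger is implemented, and the function names and payload fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS
    entry = {
        "prev_hash": prev_hash,
        "payload": payload,
        "entry_hash": _digest(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash; an edited, reordered, or mid-chain-deleted entry fails.

    Truncating the tail is not detected here; that needs an external anchor,
    such as periodically storing the latest hash somewhere else.
    """
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["entry_hash"]
    return True


# Synthetic entries, for illustration only.
ledger: list = []
append_entry(ledger, {"request_id": "PA-0001", "outcome": "denied"})
append_entry(ledger, {"request_id": "PA-0002", "outcome": "approved"})
append_entry(ledger, {"request_id": "PA-0003", "outcome": "escalated"})
intact = verify_chain(ledger)
```

Because each entry's hash depends on everything before it, changing an old decision after the fact breaks verification for that entry and every later one.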

Behavioral Drift in Clinical AI: The Silent Risk

Behavioral drift in clinical AI is when the reasoning behind recommendations changes gradually, without any code deployment, announced model update, or configuration change on your side triggering the shift. Clinical AI agents can drift because the patient mix reaching the context window differs from the population the model was trained on, because the upstream LLM provider silently updates a model, because fine-tuning data from recent cases shifts the distribution, or because prompt templates interact differently with updated model weights. For healthcare, drift creates two simultaneous risks: patient safety exposure (recommendations for a specific patient population shift before the change is detected) and compliance exposure (the documentation your audit trail captured no longer reflects how the AI actually operates). The detection problem is that aggregate metrics hide drift. An agent maintaining 94% clinical guideline adherence overall can simultaneously shift its recommendations for one specific population, such as diabetic patients with comorbidities, without any population-level metric catching it. You need decision-level comparison, not aggregate monitoring. Tenet's Verification Replay engine detects clinical decision drift by re-executing stored decision records against the current agent state, producing a Semantic Diff: exactly which reasoning steps diverged, for which patient types, and in what direction. No code changes are required to detect or diagnose the drift: the Ghost SDK captures the baseline at decision time, and replay uses those captured records.
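
The claim that aggregate metrics hide drift is easy to check with arithmetic. The Python sketch below uses synthetic numbers, not real clinical results, and hypothetical cohort names: overall adherence stays at 94% while one cohort falls from 94% to 76%, because a small improvement in the much larger "other" group offsets it.

```python
from collections import Counter


def adherence_by_cohort(decisions):
    """Return {cohort: adherence rate}, plus an "ALL" entry for the aggregate."""
    total, adherent = Counter(), Counter()
    for cohort, is_adherent in decisions:
        for key in (cohort, "ALL"):
            total[key] += 1
            adherent[key] += int(is_adherent)
    return {key: adherent[key] / total[key] for key in total}


def make(cohort, adherent_count, n):
    """Build n synthetic decisions, adherent_count of them guideline-adherent."""
    return [(cohort, i < adherent_count) for i in range(n)]


# Synthetic data for illustration only.
baseline = make("diabetes+comorbidity", 94, 100) + make("other", 846, 900)
current = make("diabetes+comorbidity", 76, 100) + make("other", 864, 900)

before = adherence_by_cohort(baseline)
after = adherence_by_cohort(current)
for key in sorted(before):
    print(f"{key}: {before[key]:.0%} -> {after[key]:.0%}")
# ALL: 94% -> 94%
# diabetes+comorbidity: 94% -> 76%
# other: 94% -> 96%
```

A dashboard that tracks only the "ALL" row would report no change, which is why the comparison has to happen at the cohort or individual decision level.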