NAIC AI Model Bulletin: What Insurance Underwriting AI Must Document
The NAIC Model Bulletin on the Use of Artificial Intelligence Systems by Insurers establishes five principles, numbered 2 through 6, for insurer AI use. They require accountability documentation, disparate impact testing data, decision-level adverse action explanations, ongoing behavioral monitoring evidence, and oversight controls for third-party AI vendors. Model documentation and aggregate performance metrics do not satisfy these requirements; per-decision audit records do.
NAIC AI Model Bulletin Scope
The NAIC Model Bulletin on AI establishes five principles applicable to insurer use of AI in underwriting, pricing, claims, and customer service. As of 2026, the majority of US states have incorporated the bulletin's principles into market conduct examination frameworks. Principle 2 (Accountability) requires named roles with documented oversight activity. Principle 3 (Compliance) requires regular disparate impact testing with documented methodology and results. Principle 4 (Transparency) requires decision-level explanations for adverse outcomes. Principle 5 (Risk Management) requires baseline measurement, ongoing monitoring, and drift detection documentation. Principle 6 (Third-Party AI Governance) requires that insurers maintain oversight and records even for vendor-supplied AI systems.
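All five principles draw on the same underlying artifact: a per-decision audit record. A minimal sketch of what such a record might capture follows; the field names and class are illustrative, not taken from the bulletin itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class DecisionRecord:
    """One underwriting decision, capturing the fields the principles rely on."""
    decision_id: str
    timestamp: datetime
    model_version: str            # Principle 5: model version provenance
    policy_version: str           # Principle 2: which documented controls applied
    accountable_role: str         # Principle 2: named oversight role
    vendor: Optional[str]         # Principle 6: third-party AI provenance
    rating_factors: dict          # Principle 3: inputs for disparate impact testing
    outcome: str                  # e.g. "approve" / "decline" / coverage tier
    factor_contributions: dict    # Principle 4: per-factor effect on the outcome
    reviewed_by: Optional[str] = None  # Principle 2: human override/confirmation

# Illustrative record for a declined homeowners application
record = DecisionRecord(
    decision_id="UW-2026-000118",
    timestamp=datetime.now(timezone.utc),
    model_version="uw-model-3.2.1",
    policy_version="naic-controls-v4",
    accountable_role="chief-underwriting-officer",
    vendor=None,
    rating_factors={"territory": "TX-214", "credit_tier": "B", "prior_claims": 1},
    outcome="decline",
    factor_contributions={"prior_claims": -0.31, "credit_tier": -0.12},
)
```

Each later section of this document maps to one or more of these fields; a record missing any of them leaves the corresponding principle without examination evidence.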
Principle 3: Disparate Impact Testing Data
Disparate impact testing for insurance AI requires a decision-level dataset. The analysis compares approval rates, premium levels, and coverage terms across demographic groups using rating factor data as proxies where direct demographic data is unavailable. Without per-decision records capturing the inputs used for each underwriting evaluation, insurers cannot conduct the statistical analysis state examiners will request during market conduct examinations. Insurers that maintain only aggregate model metrics cannot respond to examination requests for stratified decision samples.
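As a sketch of the stratified analysis an examiner might request, the following computes approval rates by proxy group from decision-level records and screens the ratios against the four-fifths rule, a common screening convention. The group labels, sample data, and 0.8 flag threshold are illustrative assumptions.

```python
from collections import defaultdict

def approval_rates_by_group(decisions):
    """Stratify decision-level records by proxy group and compute approval rates.

    `decisions` is an iterable of (proxy_group, approved) pairs drawn from
    per-decision audit records.
    """
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        approvals[group] += int(approved)
    return {g: approvals[g] / totals[g] for g in totals}

def adverse_impact_ratios(rates, reference_group):
    """Ratio of each group's approval rate to the reference group's rate.
    Values below 0.8 are a common screening flag (the four-fifths rule)."""
    ref = rates[reference_group]
    return {g: r / ref for g, r in rates.items()}

# Illustrative sample: group A approved 2 of 3, group B approved 1 of 3
sample = [("A", True), ("A", True), ("A", False),
          ("B", True), ("B", False), ("B", False)]
rates = approval_rates_by_group(sample)
ratios = adverse_impact_ratios(rates, reference_group="A")
```

Note that none of this is computable from aggregate model metrics; the stratification exists only because each decision's inputs and outcome were recorded individually.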
Principle 4: Decision-Level Explanation
NAIC Principle 4 distinguishes model-level transparency (how the model generally works) from decision-level transparency (why this specific applicant received this outcome). Adverse action notices under state unfair trade practices acts require factor-level explanation: the specific rating factors that contributed to the adverse decision, their values for this applicant, and how they affected the outcome. Generating factor-level explanations from decision records is deterministic and auditable. Generating them post-hoc from a black-box model is unreliable and produces explanations that may not match the actual decision basis.
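A deterministic explanation generator of this kind can be sketched as follows, assuming each decision record stores per-factor contributions captured at decision time. The record layout and field names are hypothetical.

```python
def adverse_action_factors(record, top_n=3):
    """Build a factor-level explanation for an adverse action notice from a
    stored decision record.

    `record["factor_contributions"]` maps rating factor -> signed contribution
    to the score; negative values pushed toward the adverse outcome. Because
    the contributions were captured at decision time, the explanation is
    deterministic and auditable rather than a post-hoc reconstruction.
    """
    adverse = [(f, c) for f, c in record["factor_contributions"].items() if c < 0]
    adverse.sort(key=lambda fc: fc[1])  # most negative contribution first
    return [
        {"factor": f, "value": record["rating_factors"].get(f), "contribution": c}
        for f, c in adverse[:top_n]
    ]

# Illustrative declined application
record = {
    "rating_factors": {"prior_claims": 2, "credit_tier": "D", "territory": "TX-214"},
    "factor_contributions": {"prior_claims": -0.31, "credit_tier": -0.22, "territory": 0.05},
}
explanation = adverse_action_factors(record)
# prior_claims is listed first (largest adverse contribution); territory is
# excluded because it contributed positively
```

Running the same function over the same record always yields the same explanation, which is what makes the notice defensible in an examination.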
Principle 5: Behavioral Drift Monitoring
NAIC Principle 5 requires post-deployment monitoring for model drift with documented evidence. For insurance AI, relevant behavioral indicators include: approval rate drift by product line and geography, coverage tier distribution shifts, declination rate patterns by ZIP code (redlining signal), override rate increases by product (indicator of systematic AI errors), and model version provenance tracking. Infrastructure metrics (latency, errors, uptime) confirm the system is running; they do not confirm it is producing compliant decisions. Behavioral monitoring from decision records is required.
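A minimal behavioral check along these lines, assuming decision records have already been aggregated into per-window approval rates by product line; the 0.05 threshold, product lines, and rates are illustrative assumptions, and a real program would run the same comparison by geography and for override rates.

```python
def approval_rate_alerts(baseline, current, threshold=0.05):
    """Flag product lines whose approval rate moved more than `threshold`
    (absolute) from the documented baseline window.

    `baseline` and `current` map product line -> approval rate, each computed
    from decision-level records over a fixed window.
    """
    alerts = {}
    for line, base_rate in baseline.items():
        cur = current.get(line)
        if cur is not None and abs(cur - base_rate) > threshold:
            alerts[line] = {"baseline": base_rate, "current": cur,
                            "shift": cur - base_rate}
    return alerts

# Illustrative windows: homeowners approvals dropped 9 points, auto is stable
baseline = {"homeowners": 0.72, "auto": 0.81}
current = {"homeowners": 0.63, "auto": 0.80}
alerts = approval_rate_alerts(baseline, current)
# homeowners is flagged (shift exceeds 0.05); auto is within bounds
```

The baseline itself is part of the Principle 5 evidence: without a documented baseline measurement, a drift alert has nothing defensible to drift from.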
Implementation for NAIC Compliance
Configure TenetClient with policy_version and system_id to attach documented control metadata to each decision record. Capture all rating factors in ctx.snapshot_context() to create the disparate impact test dataset. Include factor-level explanation in ctx.decide() for adverse action notice generation. Record underwriter reviews with tenet.record_override() and tenet.record_confirmation() to satisfy Principle 2 accountability documentation. Configure anomaly detection with approval_rate_shift and geographic_pattern thresholds to satisfy Principle 5 continuous monitoring requirements.
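The configuration steps above might be wired up roughly as follows. The exact TenetClient API is not shown in this document, so the call sequence is given only as comments, and the keys, values, and thresholds in the config are illustrative assumptions about the shapes the document names.

```python
# Illustrative configuration for the decision-record client described above.
# Every key and value here is an assumption about the API's shape, not a
# documented signature.
tenet_config = {
    "policy_version": "naic-controls-v4",   # Principle 2: control metadata on each record
    "system_id": "uw-homeowners-prod",
    "anomaly_detection": {
        "approval_rate_shift": 0.05,        # Principle 5: behavioral threshold
        "geographic_pattern": "zip3",       # Principle 5: redlining-signal granularity
    },
}

# Hypothetical call sequence, per the steps in this section:
#   tenet = TenetClient(**tenet_config)
#   ctx.snapshot_context(rating_factors)        # Principle 3: test dataset inputs
#   ctx.decide(outcome, factor_contributions)   # Principle 4: explanation basis
#   tenet.record_override(decision_id, ...)     # Principle 2: underwriter review
#   tenet.record_confirmation(decision_id, ...)
```

Attaching policy_version and system_id at configuration time means every decision record carries its control metadata without per-call effort, which is what makes the Principle 2 documentation consistent across the portfolio.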