AI Behavioral Drift Detection: How to Know When Your LLM Agent Has Changed
LLM provider updates change agent behavior without notice. Behavioral drift detection compares current output distributions against deployment-time baselines to identify: semantic reasoning drift (cosine similarity of reasoning embeddings), decision rate drift (approval/rejection rate shifts), demographic performance drift (disparate impact ratio changes), confidence distribution drift, and tone/format drift. The same monitoring system satisfies EU AI Act Article 72 post-market monitoring, FINRA algorithm change management, and SR 11-7 ongoing model monitoring simultaneously.
Capturing Behavioral Baselines at Deployment
A behavioral baseline is the expected output distribution of an AI system, captured at the time of a deliberate deployment event. Required baseline components: reasoning embedding distribution (sentence embeddings of the reasoning fields from baseline decisions, defining the semantic space of explanations at deployment), decision rate distribution (the proportion of each action type per decision_type at baseline), confidence distribution (mean, median, and standard deviation of confidence scores), demographic decision rates (per-group rates where applicable), and model version (the exact API version active at baseline). The drift threshold is derived from the baseline itself: compute each baseline decision's cosine similarity to the baseline centroid, then set the threshold at the mean of those similarities minus 2 standard deviations. A current distribution whose centroid similarity falls more than 2σ below that mean triggers an alert. Without a baseline, drift cannot be detected and EU AI Act Article 72 post-market monitoring cannot be satisfied.
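The baseline capture described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `capture_baseline` and the dictionary layout are hypothetical, and the reasoning embeddings are assumed to have already been produced by a sentence-embedding model upstream.

```python
import numpy as np

def capture_baseline(reasoning_embeddings, decisions, confidences, model_version):
    """Snapshot the behavioral baseline at a deliberate deployment event.

    reasoning_embeddings: (n, d) array of sentence embeddings of reasoning fields
    decisions:            list of action labels, e.g. "approve" / "reject"
    confidences:          list of confidence scores in [0, 1]
    model_version:        exact API/model version string active at baseline
    """
    emb = np.asarray(reasoning_embeddings, dtype=float)
    # Unit-normalise rows so cosine similarity reduces to a dot product.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    centroid = emb.mean(axis=0)
    centroid /= np.linalg.norm(centroid)

    # Per-decision similarity to the centroid gives the in-baseline spread;
    # the alert threshold is set at mean - 2 standard deviations.
    sims = emb @ centroid
    threshold = float(sims.mean() - 2 * sims.std())

    actions, counts = np.unique(decisions, return_counts=True)
    return {
        "centroid": centroid,
        "drift_threshold": threshold,
        "decision_rates": dict(zip(actions, counts / counts.sum())),
        "confidence_stats": {
            "mean": float(np.mean(confidences)),
            "median": float(np.median(confidences)),
            "std": float(np.std(confidences)),
        },
        "model_version": model_version,
    }
```

The returned dictionary is the artifact that later drift checks compare against; persisting it alongside the deployment record is what makes the comparison auditable.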
Semantic Drift Detection with Cosine Similarity
Semantic drift is the most important drift dimension for compliance: it detects when AI reasoning has changed independent of output labels. Method: encode current reasoning texts with a sentence transformer model, compute the centroid of current embeddings, compare against the baseline centroid using cosine similarity. When similarity drops below the baseline-derived threshold (mean − 2σ), semantic drift has occurred. This fires independently of decision rate drift — an AI can change its reasoning while maintaining the same approval rate, or change its approval rate while maintaining the same reasoning patterns. Both are compliance-relevant behavioral changes requiring investigation.
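A sketch of the comparison step, under the same assumptions as above: the current reasoning texts have already been encoded by the same sentence transformer used at baseline, and `semantic_drift` is a hypothetical name for the check.

```python
import numpy as np

def semantic_drift(current_embeddings, baseline_centroid, drift_threshold):
    """Compare the centroid of current reasoning embeddings to the baseline.

    Returns the cosine similarity between centroids and whether it has
    dropped below the baseline-derived threshold (mean - 2 sigma).
    """
    emb = np.asarray(current_embeddings, dtype=float)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    centroid = emb.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    similarity = float(centroid @ baseline_centroid)
    return {"similarity": similarity, "drift": similarity < drift_threshold}
```

Because the check operates on the reasoning embeddings rather than the output labels, it fires even when approval rates are unchanged, which is exactly the case the decision-rate monitor cannot see.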
Decision Rate and Demographic Performance Monitoring
Decision rate drift is detected using statistical process control: compute rolling mean decision rates per action type over a 7-day window and alert when rates deviate beyond baseline ± threshold (typically 15 percentage points). Demographic performance drift is the highest-risk drift type: when approval rates for protected attribute groups diverge, it may indicate an emerging disparate impact violation requiring EU AI Act Article 10(3) re-examination, EU AI Act Article 72 investigation, or FINRA algorithm change management response. Monitor per-group rates over rolling windows against baseline; alert when the demographic parity ratio (minority group rate / majority group rate) falls below 0.8 or the disparate impact ratio falls below the regulatory threshold.
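The two checks above can be sketched with plain NumPy. Both function names are hypothetical; the rolling-window assembly (grouping decisions into the 7-day window) is assumed to happen upstream, and the 15-percentage-point tolerance and 0.8 parity floor are the defaults named in the text.

```python
import numpy as np

def decision_rate_alerts(window_decisions, baseline_rates, tolerance=0.15):
    """Flag action types whose rolling-window rate deviates from baseline
    by more than the tolerance (default 15 percentage points)."""
    actions, counts = np.unique(window_decisions, return_counts=True)
    rates = dict(zip(actions, counts / counts.sum()))
    alerts = {}
    for action, base in baseline_rates.items():
        current = float(rates.get(action, 0.0))
        if abs(current - base) > tolerance:
            alerts[action] = {"baseline": base, "current": current}
    return alerts

def demographic_parity_ratio(group_rates, minority_group, majority_group):
    """Minority-group rate divided by majority-group rate; a value below
    0.8 signals potential disparate impact requiring investigation."""
    return group_rates[minority_group] / group_rates[majority_group]
```

In practice the alert payload (baseline vs. current rate per action, parity ratio per group pair) is what feeds the investigation record referenced in the next section.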
Regulatory Compliance from Drift Detection
EU AI Act Article 72 requires post-market monitoring throughout the AI system lifetime — behavioral baseline monitoring with automated drift detection implements this requirement. The monitoring report (current vs. baseline comparison, alerts fired, investigations) satisfies Article 72 documentation. FINRA Regulatory Notices 15-09 and 21-20 require detection of material algorithm changes and documented response — foundation model API updates causing drift constitute material changes; drift detection identifies them retroactively when unannounced. SR 11-7 model risk management requires ongoing monitoring against performance criteria — behavioral baselines are the criteria; drift detection implements continuous monitoring. One technical implementation satisfies all three regulatory frameworks.