# Drift Detection

## What is behavioral drift?
Behavioral drift occurs when an AI model’s outputs change systematically over time in ways that were not intended. Drift can result from:
- Model updates — the model provider silently deploys a new version
- Prompt changes — your team modifies prompts without realizing the downstream impact
- Distribution shift — the nature of incoming queries changes
- Jailbreak accumulation — adversarial inputs gradually shift model behavior
GOVERN Probe’s drift detector establishes a behavioral baseline from your first inferences and continuously compares new inferences against that baseline.
## How drift is measured
The drift detector measures change across five behavioral dimensions:
| Dimension | What’s measured |
|---|---|
| Response length | Mean and variance of response token count |
| Tone | Sentiment distribution (positive/neutral/negative) |
| Format adherence | Structure consistency (markdown usage, list frequency) |
| Topic distribution | Semantic topic clusters across responses |
| Refusal rate | Frequency of model refusals and safety responses |
Each dimension produces a component drift score. The final drift score is a weighted average:
```
drift_score = 0.30 × topic_drift
            + 0.25 × tone_drift
            + 0.20 × length_drift
            + 0.15 × format_drift
            + 0.10 × refusal_drift
```

## Baseline establishment
Drift detection requires a minimum number of inferences to establish a reliable baseline:
```yaml
scoring:
  drift:
    min_baseline_inferences: 100   # default: 100
    baseline_window_hours: 168     # default: 7 days
```

During baseline establishment, drift scores are computed but no alerts fire (`drift_score` is returned as `null` in telemetry until the baseline is ready).
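To make the scoring behavior concrete, here is a minimal sketch of how the weighted combination and the null-until-ready rule might fit together. The helper name, dictionary shape, and component inputs are illustrative, not GOVERN Probe's actual implementation; only the weights and the 100-inference default come from this page.

```python
# Weights from the drift_score formula above.
WEIGHTS = {
    "topic": 0.30,
    "tone": 0.25,
    "length": 0.20,
    "format": 0.15,
    "refusal": 0.10,
}

def drift_score(components: dict, baseline_size: int, min_baseline: int = 100):
    """Combine per-dimension drift scores into one weighted score.

    Returns None while the baseline is still being established,
    mirroring the null drift_score reported in telemetry.
    """
    if baseline_size < min_baseline:
        return None
    # Each component score is assumed to be normalized to [0, 1].
    return sum(WEIGHTS[dim] * components[dim] for dim in WEIGHTS)
```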
The baseline is a rolling window — old inferences age out as new ones arrive. This means the baseline reflects your application’s recent normal behavior, not its behavior from months ago.
## Alert conditions
Drift alerts fire when:
- The baseline is established (min inferences reached)
- The current drift score exceeds the threshold
- The elevated drift persists for at least 3 consecutive inference batches (to reduce noise)
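The persistence rule in the last condition can be sketched as a simple consecutive-batch counter. This class is illustrative; the threshold of 0.25 and the 3-batch persistence default come from the configuration shown on this page.

```python
class DriftAlerter:
    """Illustrative sketch: alert only on sustained drift, to reduce noise."""

    def __init__(self, threshold: float = 0.25, persistence: int = 3):
        self.threshold = threshold
        self.persistence = persistence
        self.streak = 0  # consecutive batches above threshold

    def observe_batch(self, score):
        """Return True when drift has persisted long enough to alert."""
        if score is None or score <= self.threshold:
            # Baseline not yet established, or drift recovered: reset the streak.
            self.streak = 0
            return False
        self.streak += 1
        return self.streak >= self.persistence
```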
```yaml
scoring:
  drift:
    enabled: true
    threshold: 0.25
    alert_persistence_batches: 3
    baseline_window_hours: 168
    min_baseline_inferences: 100
```

## Drift event types
| Event | Description |
|---|---|
| `drift.baseline_established` | Enough inferences collected for baseline |
| `drift.threshold_exceeded` | Drift score crossed threshold for the first time |
| `drift.sustained` | Drift persists for `alert_persistence_batches` consecutive batches |
| `drift.recovered` | Drift score returned below threshold |
| `drift.baseline_reset` | Manual or automatic baseline reset |
## Manual baseline reset
If you intentionally change your prompts or model, reset the baseline so the new behavior becomes the new normal:
```shell
# Via API
curl -X POST https://api.govern.archetypal.ai/v1/probes/probe_xxxx/drift/reset \
  -H "Authorization: Bearer gvn_live_xxxx" \
  -H "Content-Type: application/json" \
  -d '{"reason": "Upgraded to Claude Sonnet 4 — new model version"}'
```

After a reset, the baseline establishment period begins again.
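The same reset call can be made from any HTTP client. The hypothetical helper below just assembles the URL, headers, and body matching the curl example; send it with, e.g., `requests.post(url, headers=headers, json=body)`.

```python
def build_reset_request(probe_id: str, api_key: str, reason: str):
    """Build the pieces of a baseline-reset request (illustrative helper).

    The endpoint path and header names mirror the curl example above.
    """
    url = f"https://api.govern.archetypal.ai/v1/probes/{probe_id}/drift/reset"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {"reason": reason}  # recorded alongside the drift.baseline_reset event
    return url, headers, body
```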
## Dashboard analysis
The GOVERN dashboard plots the drift score as a time series. You can:
- See which dimensions are drifting (topic vs. tone vs. format)
- Compare inference samples from baseline vs. current period
- Annotate drift events with deployment notes
- Set up Slack/PagerDuty alerts for sustained drift