Scoring Modes
Three modes
GOVERN Probe operates in one of three modes, set via SCORING_MODE or the scoring.mode YAML key:
| Mode | Traffic effect | Alert | Use when |
|---|---|---|---|
| log | None (always passes) | No | Starting out, baselining behavior |
| flag | None (always passes) | Yes | Monitoring with human review |
| block | 422 returned when threshold exceeded | Yes | Compliance enforcement |
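
The table can be read as a small decision rule. The sketch below is illustrative only and is not the Probe's actual implementation: `decide_action` is a hypothetical helper, and the `"block"` action string is an assumption (the sample logs on this page show only `pass` and `flag`).

```python
from typing import Dict, List, Tuple


def decide_action(
    mode: str,
    scores: Dict[str, float],
    thresholds: Dict[str, float],
) -> Tuple[str, List[str]]:
    """Return (action, violating_scorers) for one inference.

    Hypothetical helper: mirrors the mode table, not the Probe's real code.
    """
    if mode not in ("log", "flag", "block"):
        raise ValueError(f"unknown scoring mode: {mode!r}")

    # A scorer violates policy when its score exceeds (strictly) its threshold.
    violations = [
        name
        for name, score in scores.items()
        if name in thresholds and score > thresholds[name]
    ]

    # Log mode always passes; flag and block only act when something violated.
    if mode == "log" or not violations:
        return "pass", violations
    return mode, violations  # "flag" or "block" (the latter is an assumed label)


scores = {"security": 0.83, "bias": 0.01}
thresholds = {"security": 0.70, "bias": 0.60}

log_result = decide_action("log", scores, thresholds)
flag_result = decide_action("flag", scores, thresholds)
block_result = decide_action("block", scores, thresholds)
```

In all three calls the same security violation is detected; only the resulting action differs, which is what makes it safe to move from log to flag to block without changing thresholds.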
Log mode

    SCORING_MODE=log

Every inference passes through unchanged. Scores are computed and telemetry is flushed, but no alerts are emitted and no requests are blocked. This is the default mode.
Use log mode to:
- Establish a behavioral baseline (minimum 100 inferences before drift detection is useful)
- Understand your application’s normal score distribution
- Test the Probe without any risk to production traffic
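
One way to get a feel for the normal score distribution is to summarize the Probe's log lines offline. This is a minimal sketch that assumes the logs are JSON lines carrying the score fields shown in this section's sample output; `summarize_scores` and the `DIMENSIONS` tuple are hypothetical names, not part of the Probe.

```python
import json
from statistics import mean
from typing import Dict, Iterable, List

# Score fields taken from the sample log line; assumed to be the full set.
DIMENSIONS = ("security", "bias", "accuracy", "drift", "cost")


def summarize_scores(log_lines: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Collect per-dimension count/min/mean/max from JSON log lines."""
    collected: Dict[str, List[float]] = {dim: [] for dim in DIMENSIONS}
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log record
        if not isinstance(record, dict):
            continue
        for dim in DIMENSIONS:
            value = record.get(dim)
            if isinstance(value, (int, float)):
                collected[dim].append(float(value))
    return {
        dim: {
            "count": len(values),
            "min": min(values),
            "mean": mean(values),
            "max": max(values),
        }
        for dim, values in collected.items()
        if values  # omit dimensions that never appeared
    }


lines = [
    '{"level":"info","msg":"inference scored","mode":"log","security":0.02,'
    '"bias":0.01,"accuracy":0.91,"drift":0.04,"cost":0.08,"action":"pass"}',
    '{"level":"info","msg":"inference scored","mode":"log","security":0.10,'
    '"bias":0.03,"accuracy":0.88,"drift":0.02,"cost":0.05,"action":"pass"}',
    "not a json line",
]
summary = summarize_scores(lines)
```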
Sample log output:

    {"level":"info","msg":"inference scored","mode":"log","security":0.02,"bias":0.01,"accuracy":0.91,"drift":0.04,"cost":0.08,"action":"pass"}

Flag mode

    SCORING_MODE=flag

Every inference passes through unchanged. When any score exceeds its threshold, a FLAG event is emitted to the GOVERN platform. Alerts fire in the dashboard, webhook events are dispatched, and the inference is tagged as a policy concern.
Use flag mode for:
- Production monitoring with human-in-the-loop review
- Soft enforcement: visibility without risk of blocking legitimate traffic
- High-stakes applications where false positives from blocking would be costly
Sample log output:

    {"level":"warn","msg":"inference flagged","mode":"flag","security":0.83,"threshold":0.70,"action":"flag","flag_reason":"credential_exposure","event_id":"evt_01HXYZ"}

Block mode

    SCORING_MODE=block

When any enabled scorer exceeds its threshold, the Probe returns a 422 Unprocessable Entity to the caller instead of the model response. The response body indicates which scorer triggered the block.
Block response:

    HTTP/1.1 422 Unprocessable Entity
    Content-Type: application/json

    {
      "error": {
        "type": "govern_policy_violation",
        "message": "Inference blocked by GOVERN Probe",
        "violations": [
          {
            "scorer": "security",
            "score": 0.87,
            "threshold": 0.70,
            "reason": "potential_credential_exposure"
          }
        ],
        "event_id": "evt_01HXYZ",
        "probe_id": "probe-production-1"
      }
    }

Use block mode for:
- Regulatory compliance environments (EU AI Act, NIST RMF)
- Applications handling sensitive data where policy violations must never pass
- Post-baselining enforcement after flag mode has been tuned
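
Callers behind a Probe in block mode need to tell a policy block apart from other failures. The sketch below parses the 422 body documented above using only the Python standard library; `extract_violations` is a hypothetical helper, and checking `error.type` is a defensive assumption, since a 422 could in principle come from somewhere other than the Probe.

```python
import json
from typing import List, Optional


def extract_violations(status_code: int, body: str) -> Optional[List[dict]]:
    """Return the violations list if the response is a Probe block, else None."""
    if status_code != 422:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # a 422 with a non-JSON body is not a Probe block
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    if error.get("type") != "govern_policy_violation":
        return None
    return error.get("violations", [])


# Body copied from the documented block response.
sample_body = json.dumps({
    "error": {
        "type": "govern_policy_violation",
        "message": "Inference blocked by GOVERN Probe",
        "violations": [{
            "scorer": "security",
            "score": 0.87,
            "threshold": 0.70,
            "reason": "potential_credential_exposure",
        }],
        "event_id": "evt_01HXYZ",
        "probe_id": "probe-production-1",
    }
})

violations = extract_violations(422, sample_body)
for v in violations or []:
    print(f'{v["scorer"]} blocked: score {v["score"]} over threshold '
          f'{v["threshold"]} ({v["reason"]})')
```

Logging the `event_id` alongside the violation would let a reviewer look up the same event in the GOVERN dashboard.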
Per-scorer mode override
You can run different modes per scorer. For example: block on security violations, flag on bias, log accuracy:

    scoring:
      mode: log            # default for all scorers
      security:
        mode: block        # override: block on security violations
        threshold: 0.70
      bias:
        mode: flag         # override: flag on bias
        threshold: 0.60
      accuracy:
        mode: log          # explicitly log (same as default)

Threshold tuning guide
Scores range from 0.0 (not detected) to 1.0 (maximum confidence), and a threshold is a cutoff on that scale. A score of 0.70 means the scorer is 70% confident the dimension has been violated.
| Starting threshold | Meaning |
|---|---|
| 0.90 | Only flag/block on very high confidence findings |
| 0.70 | Standard production default |
| 0.50 | Aggressive; catches more, with more false positives |
| 0.30 | Very aggressive; research/audit only |
Recommended tuning sequence:
- Deploy in log mode with default thresholds (0.70)
- Review 500+ inferences in the GOVERN dashboard
- Identify p95 scores for each dimension in normal traffic
- Set thresholds to p95 + 0.10 buffer
- Switch to flag mode, monitor alerts for 1 week
- Adjust thresholds based on false positive rate
- Switch to block mode when false positive rate is acceptable
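
The "p95 + 0.10 buffer" step can be worked through offline once baseline scores have been collected. This is a minimal sketch under stated assumptions: `percentile` uses the nearest-rank method (the dashboard may compute p95 differently), `suggest_thresholds` is a hypothetical helper, and the cap at 1.0 is an added safeguard that the tuning sequence does not mention.

```python
import math
from typing import Dict, List


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("need at least one score")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_thresholds(
    baseline: Dict[str, List[float]],
    buffer: float = 0.10,
) -> Dict[str, float]:
    """Suggest a per-scorer threshold of p95 + buffer, capped at 1.0."""
    return {
        name: round(min(1.0, percentile(scores, 95) + buffer), 2)
        for name, scores in baseline.items()
    }


# Illustrative baseline data, not real measurements.
baseline = {
    "security": [0.05] * 19 + [0.40],  # one outlier above the 95th percentile
    "bias": [0.95] * 10,               # high scores: suggestion is capped at 1.0
}
suggested = suggest_thresholds(baseline)
```

A suggestion that reaches the 1.0 cap can never be exceeded, so it effectively disables that scorer; treat it as a sign that the baseline for that dimension needs a closer look rather than as a usable threshold.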