
Scoring Modes

Three modes

GOVERN Probe operates in one of three modes, set via the SCORING_MODE environment variable or the scoring.mode YAML key:

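| Mode  | Traffic effect                       | Alert | Use when                           |
|-------|--------------------------------------|-------|------------------------------------|
| log   | None (always passes)                 | No    | Starting out, baselining behavior  |
| flag  | None (always passes)                 | Yes   | Monitoring with human review       |
| block | 422 returned when threshold exceeded | Yes   | Compliance enforcement             |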
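
The table can be read as a small decision function. The sketch below is illustrative Python, not Probe source code: `decide`, `Decision`, and their fields are hypothetical names, and it assumes "exceeds" means strictly greater than the threshold.

```python
from dataclasses import dataclass
from typing import Dict, List

VALID_MODES = ("log", "flag", "block")


@dataclass
class Decision:
    blocked: bool          # True if the caller would receive a 422
    alert: bool            # True if an alert would be emitted
    violations: List[str]  # scorers whose score exceeded its threshold


def decide(mode: str, scores: Dict[str, float],
           thresholds: Dict[str, float]) -> Decision:
    """Map a mode plus scores to the outcome described in the table."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown scoring mode: {mode!r}")
    # Assumption: a scorer without a configured threshold never triggers.
    violations = sorted(
        name for name, score in scores.items()
        if name in thresholds and score > thresholds[name]
    )
    exceeded = bool(violations)
    return Decision(
        blocked=exceeded and mode == "block",
        alert=exceeded and mode in ("flag", "block"),
        violations=violations,
    )
```

With a security score of 0.83 against a 0.70 threshold, this returns no block and no alert in log mode, an alert only in flag mode, and both in block mode.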

Log mode

SCORING_MODE=log

Every inference passes through unchanged. Scores are computed and telemetry is flushed, but no alerts are emitted and no requests are blocked. This is the default mode.

Use log mode to:

  • Establish a behavioral baseline (minimum 100 inferences before drift detection is useful)
  • Understand your application’s normal score distribution
  • Test the Probe without any risk to production traffic

Sample log output:

{"level":"info","msg":"inference scored","mode":"log","security":0.02,"bias":0.01,"accuracy":0.91,"drift":0.04,"cost":0.08,"action":"pass"}

Flag mode

SCORING_MODE=flag

Every inference passes through unchanged. When any score exceeds its threshold, a FLAG event is emitted to the GOVERN platform. Alerts fire in the dashboard, webhook events are dispatched, and the inference is tagged as a policy concern.

Use flag mode for:

  • Production monitoring with human-in-the-loop review
  • Soft enforcement: visibility without risk of blocking legitimate traffic
  • High-stakes applications where false positives from blocking would be costly

Sample log output:

{"level":"warn","msg":"inference flagged","mode":"flag","security":0.83,"threshold":0.70,"action":"flag","flag_reason":"credential_exposure","event_id":"evt_01HXYZ"}

Block mode

SCORING_MODE=block

When any enabled scorer exceeds its threshold, the Probe returns a 422 Unprocessable Entity to the caller instead of the model response. The response body indicates which scorer triggered the block.

Block response:

HTTP/1.1 422 Unprocessable Entity
Content-Type: application/json

{
  "error": {
    "type": "govern_policy_violation",
    "message": "Inference blocked by GOVERN Probe",
    "violations": [
      {
        "scorer": "security",
        "score": 0.87,
        "threshold": 0.70,
        "reason": "potential_credential_exposure"
      }
    ],
    "event_id": "evt_01HXYZ",
    "probe_id": "probe-production-1"
  }
}
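
Callers need to tell a Probe block apart from any other 422 their upstream might return. The following is a minimal client-side sketch, assuming the response body has exactly the shape shown above; `parse_block_response` is a hypothetical helper, not part of any GOVERN SDK.

```python
import json
from typing import List, Optional


def parse_block_response(status: int, body: str) -> Optional[List[dict]]:
    """Return the violations list if this is a Probe block, else None."""
    if status != 422:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # a 422 without a JSON body is not a Probe block
    error = payload.get("error") if isinstance(payload, dict) else None
    # The "type" field is what distinguishes a Probe block from other 422s.
    if not isinstance(error, dict) or error.get("type") != "govern_policy_violation":
        return None
    return error.get("violations", [])
```

A caller could log each violation's `scorer` and `reason` and show the user a generic refusal rather than the raw error body.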

Use block mode for:

  • Regulatory compliance environments (EU AI Act, NIST RMF)
  • Applications handling sensitive data where policy violations must never pass
  • Post-baselining enforcement after flag mode has been tuned

Per-scorer mode override

You can set a different mode for each scorer. For example, block on security violations, flag on bias, and only log accuracy:

config/default.yaml
scoring:
  mode: log            # default for all scorers
  security:
    mode: block        # override: block on security violations
    threshold: 0.70
  bias:
    mode: flag         # override: flag on bias
    threshold: 0.60
  accuracy:
    mode: log          # explicitly log (same as default)
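
The override rule can be sketched as a lookup with a fallback. This is an illustration only: it assumes the YAML has already been parsed into a dictionary with the per-scorer blocks nested under `scoring`, and `effective_mode` is a hypothetical helper name.

```python
from typing import Any, Dict


def effective_mode(scoring: Dict[str, Any], scorer: str) -> str:
    """Return the scorer's own mode if set, otherwise the global default."""
    # "log" is the fallback because it is the documented default mode.
    default = scoring.get("mode", "log")
    override = scoring.get(scorer)
    if isinstance(override, dict):
        return override.get("mode", default)
    return default


# The configuration above, as it might look after YAML parsing.
scoring = {
    "mode": "log",
    "security": {"mode": "block", "threshold": 0.70},
    "bias": {"mode": "flag", "threshold": 0.60},
    "accuracy": {"mode": "log"},
}
```

Here `effective_mode(scoring, "security")` gives `"block"`, while a scorer with no entry of its own, such as `"drift"`, falls back to `"log"`.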

Threshold tuning guide

Scores and thresholds both range from 0.0 (not detected) to 1.0 (maximum confidence). A score of 0.70 means the scorer is 70% confident that the dimension has been violated; a scorer triggers when its score exceeds its threshold.

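| Starting threshold | Meaning                                             |
|--------------------|-----------------------------------------------------|
| 0.90               | Only flag/block on very high-confidence findings    |
| 0.70               | Standard production default                         |
| 0.50               | Aggressive: catches more, with more false positives |
| 0.30               | Very aggressive: research/audit only                |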

Recommended tuning sequence:

  1. Deploy in log mode with default thresholds (0.70)
  2. Review 500+ inferences in the GOVERN dashboard
  3. Identify p95 scores for each dimension in normal traffic
  4. Set each threshold to that dimension's p95 score plus a 0.10 buffer (capped at 1.0, the top of the score range)
  5. Switch to flag mode, monitor alerts for 1 week
  6. Adjust thresholds based on false positive rate
  7. Switch to block mode when false positive rate is acceptable
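
Steps 3 and 4 can be approximated offline from the Probe's JSON log lines. The sketch below is hedged accordingly: it assumes one JSON object per line with score fields named as in the sample log output, uses the nearest-rank method for p95 (the dashboard may compute percentiles differently), and `p95` and `suggest_thresholds` are hypothetical helper names.

```python
import json
from typing import Dict, Iterable, List, Sequence

DIMENSIONS = ("security", "bias", "accuracy", "drift", "cost")


def p95(values: List[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    if not values:
        raise ValueError("p95 requires at least one value")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def suggest_thresholds(log_lines: Iterable[str],
                       dimensions: Sequence[str] = DIMENSIONS,
                       buffer: float = 0.10) -> Dict[str, float]:
    """Suggest p95 + buffer per dimension, capped at 1.0."""
    scores: Dict[str, List[float]] = {d: [] for d in dimensions}
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for d in dimensions:
            if d in record:
                scores[d].append(float(record[d]))
    # Dimensions with no observations are omitted rather than guessed.
    return {d: round(min(1.0, p95(v) + buffer), 2)
            for d, v in scores.items() if v}
```

Run it over at least the 500+ inferences recommended in step 2; with only a handful of samples, p95 is effectively the maximum and the suggested threshold will be noisy.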