Metrics Reference

Metrics endpoint

GOVERN Probe exposes Prometheus-format metrics at /metrics (configurable via HEALTH_METRICS_PATH).

curl http://localhost:4020/metrics
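The response to the command above is Prometheus text exposition format, which is easy to inspect in a script. The following is a minimal Python sketch of a parser for that text. The names `Sample` and `parse_metrics` are hypothetical helpers, not part of GOVERN Probe. The parser makes simplifying assumptions: it skips comment lines, ignores timestamps, and does not unescape backslash sequences in label values.

```python
import re
from typing import Dict, List, NamedTuple


class Sample(NamedTuple):
    name: str
    labels: Dict[str, str]
    value: float


# Matches `name{labels} value` or `name value`; a trailing timestamp is ignored.
_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# Matches one `key="value"` pair; escape sequences are kept as written.
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> List[Sample]:
    """Parse exposition-format text into samples, skipping # HELP / # TYPE lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, raw_labels, raw_value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append(Sample(name, labels, float(raw_value)))
    return samples


# Lines copied from the sample scrape output in this reference.
SAMPLE_TEXT = """\
# TYPE govern_inferences_total counter
govern_inferences_total{model="claude-sonnet-4",provider="anthropic"} 1247
govern_ring_buffer_utilization 0.12
"""

for sample in parse_metrics(SAMPLE_TEXT):
    print(sample.name, sample.labels, sample.value)
```

For production use, a maintained Prometheus client library is a safer choice than a hand-written parser; this sketch is only meant for quick checks.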

Inference metrics

Metric	Type	Description
`govern_inferences_total`	Counter	Total inference requests proxied
`govern_inferences_scored_total`	Counter	Inferences that completed scoring
`govern_inferences_flagged_total`	Counter	Inferences that exceeded a threshold (flag/block mode)
`govern_inferences_blocked_total`	Counter	Inferences blocked (block mode only)
`govern_inferences_unscored_total`	Counter	Inferences where scoring failed

Labels: model, provider, and scorer (scorer is set only on the flagged and blocked counters)
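Because these metrics are counters, their raw values only ever grow (or reset to zero on restart), so useful signals come from the change between two scrapes. The sketch below shows one way to compute the share of inferences flagged between two scrapes. The function names and the numbers in the usage lines are illustrative assumptions, not values produced by GOVERN Probe. In Prometheus itself, the `rate()` and `increase()` functions do this work.

```python
def counter_increase(previous: float, current: float) -> float:
    """Increase of a counter between two scrapes.

    A drop in value is treated as a counter reset (for example a probe
    restart), in which case everything counted since the reset is the increase.
    """
    if current < previous:
        return current
    return current - previous


def flagged_ratio(prev_total: float, cur_total: float,
                  prev_flagged: float, cur_flagged: float) -> float:
    """Share of inferences flagged between two scrapes (0.0 when there was no traffic)."""
    total = counter_increase(prev_total, cur_total)
    if total == 0:
        return 0.0
    return counter_increase(prev_flagged, cur_flagged) / total


# Illustrative values for govern_inferences_total and
# govern_inferences_flagged_total at two consecutive scrapes.
ratio = flagged_ratio(prev_total=1000, cur_total=1247,
                      prev_flagged=10, cur_flagged=23)
print(f"flagged share since last scrape: {ratio:.2%}")
```

The same pattern applies to the blocked and unscored counters. Compare series with matching `model` and `provider` labels so the numerator and denominator describe the same traffic.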

Latency metrics

Metric	Type	Description
`govern_proxy_latency_ms`	Histogram	End-to-end proxy latency in milliseconds (derive p50, p95, p99 from the buckets)
`govern_upstream_latency_ms`	Histogram	Time waiting for upstream model response
`govern_scoring_latency_ms`	Histogram	Time to score the inference (asynchronous; not in the request path)

Score distribution metrics

Metric	Type	Description
`govern_score_security`	Histogram	Distribution of security scores
`govern_score_bias`	Histogram	Distribution of bias scores
`govern_score_accuracy`	Histogram	Distribution of accuracy scores
`govern_score_drift`	Histogram	Distribution of drift scores
`govern_score_cost`	Histogram	Distribution of cost utilization scores

Telemetry metrics

Metric	Type	Description
`govern_telemetry_batches_flushed_total`	Counter	Total batches successfully flushed
`govern_telemetry_events_flushed_total`	Counter	Total events successfully transmitted
`govern_telemetry_dropped_total`	Counter	Events dropped (ring buffer overflow or max retries)
`govern_telemetry_retry_total`	Counter	Flush retry attempts
`govern_ring_buffer_size`	Gauge	Current events in the ring buffer
`govern_ring_buffer_utilization`	Gauge	Ring buffer fill ratio (0.0–1.0)

Token and cost metrics

Metric	Type	Description
`govern_tokens_input_total`	Counter	Total input tokens proxied
`govern_tokens_output_total`	Counter	Total output tokens proxied
`govern_cost_usd_total`	Counter	Estimated total spend in USD
`govern_tokens_per_hour`	Gauge	Rolling hourly token rate
`govern_cost_usd_per_hour`	Gauge	Rolling hourly spend rate
`govern_budget_utilization`	Gauge	Fraction of budget consumed (0.0–1.0)

Health metrics

Metric	Type	Description
`govern_probe_up`	Gauge	1 if probe is running, 0 if not
`govern_upstream_reachable`	Gauge	1 if upstream is reachable, 0 if not
`govern_telemetry_connected`	Gauge	1 if GOVERN platform is reachable, 0 if not
`govern_build_info`	Gauge	Probe version (via labels)

Sample Prometheus scrape output

# HELP govern_inferences_total Total inference requests proxied
# TYPE govern_inferences_total counter
govern_inferences_total{model="claude-sonnet-4",provider="anthropic"} 1247

# HELP govern_proxy_latency_ms End-to-end proxy latency in milliseconds
# TYPE govern_proxy_latency_ms histogram
govern_proxy_latency_ms_bucket{le="1"} 0
govern_proxy_latency_ms_bucket{le="5"} 987
govern_proxy_latency_ms_bucket{le="10"} 1241
govern_proxy_latency_ms_bucket{le="+Inf"} 1247
govern_proxy_latency_ms_sum 3982.4
govern_proxy_latency_ms_count 1247

# HELP govern_ring_buffer_utilization Ring buffer fill percentage
# TYPE govern_ring_buffer_utilization gauge
govern_ring_buffer_utilization 0.12
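A histogram exposes cumulative bucket counts rather than percentiles, so values such as p50 or p95 have to be estimated from the `_bucket` series. In PromQL this is what `histogram_quantile` does. The Python sketch below illustrates the same idea (linear interpolation inside the bucket that contains the target rank) using the `govern_proxy_latency_ms` buckets from the sample output above. The function name `estimate_quantile` is hypothetical, and the sketch assumes the lowest bucket starts at 0.

```python
import math
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` holds (upper bound, cumulative count) pairs sorted by upper
    bound, with the last entry being the +Inf bucket.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0  # assumption: first bucket starts at 0
    for index, (upper_bound, cumulative) in enumerate(buckets):
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # No finite upper edge to interpolate to: report the
                # highest finite bound instead.
                return buckets[index - 1][0] if index > 0 else math.nan
            in_bucket = cumulative - lower_count
            if in_bucket == 0:
                return lower_bound
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cumulative
    return math.nan


# Buckets from the sample scrape output: le="1", "5", "10", "+Inf".
SAMPLE_BUCKETS = [(1.0, 0), (5.0, 987), (10.0, 1241), (math.inf, 1247)]

for q in (0.50, 0.95, 0.99):
    print(f"p{int(q * 100)} is roughly {estimate_quantile(q, SAMPLE_BUCKETS):.2f} ms")
```

With the sample buckets this gives roughly 3.53 ms, 8.89 ms, and 9.87 ms. The estimates are only as precise as the bucket boundaries allow, so a value that falls in a wide bucket carries more uncertainty.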

Grafana dashboard

A pre-built Grafana dashboard is available for import:

# Dashboard ID for Grafana.com
GOVERN Probe Overview: 21847

Alternatively, export the dashboard JSON from the GOVERN dashboard (Settings → Integrations → Grafana → Export Dashboard JSON) and import it into Grafana.