Metrics endpoint
GOVERN Probe exposes Prometheus-format metrics at /metrics (configurable via HEALTH_METRICS_PATH).
curl http://localhost:4020/metrics
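The same request can be scripted. The following is a minimal sketch in Python using only the standard library; `parse_samples` and `fetch_metrics` are hypothetical helper names, not part of GOVERN Probe, and the parser handles only simple `name{labels} value` lines rather than the full exposition format.

```python
import urllib.request


def parse_samples(text):
    """Parse Prometheus text exposition lines into {series: value}.

    Simplified on purpose: skips comment and blank lines, and assumes
    label values contain no spaces and lines carry no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_metrics(url="http://localhost:4020/metrics", timeout=5.0):
    """Fetch the endpoint from the curl example and parse the response."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_samples(resp.read().decode("utf-8"))
```

Calling `fetch_metrics()` against a running probe should return a dictionary keyed by the full series string, labels included. If HEALTH_METRICS_PATH has been changed, pass the adjusted URL.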
Inference metrics
| Metric | Type | Description |
|---|---|---|
| govern_inferences_total | Counter | Total inference requests proxied |
| govern_inferences_scored_total | Counter | Inferences that completed scoring |
| govern_inferences_flagged_total | Counter | Inferences that exceeded a threshold (flag/block mode) |
| govern_inferences_blocked_total | Counter | Inferences blocked (block mode only) |
| govern_inferences_unscored_total | Counter | Inferences where scoring failed |
Labels: model, provider; scorer (flagged and blocked counters only)
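These counters are most useful as ratios of the total. The sketch below is illustrative only: `inference_ratios` is a hypothetical helper, and it assumes the five values have already been summed across labels and were read in the same scrape.

```python
def inference_ratios(total, scored, flagged, blocked, unscored):
    """Return each counter as a fraction of govern_inferences_total.

    Inputs are cumulative counter values; the result is therefore a
    lifetime ratio, not a rate over a recent window.
    """
    names = ("scored", "flagged", "blocked", "unscored")
    values = (scored, flagged, blocked, unscored)
    if total <= 0:
        # Nothing proxied yet: report zeros instead of dividing by zero.
        return {name: 0.0 for name in names}
    return {name: value / total for name, value in zip(names, values)}


# Illustrative numbers, not real probe output.
ratios = inference_ratios(total=1000, scored=950, flagged=40, blocked=10, unscored=50)
```

For a ratio over a recent window, feed in the difference between two scrapes rather than the raw cumulative values.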
Latency metrics
| Metric | Type | Description |
|---|---|---|
| govern_proxy_latency_ms | Histogram | End-to-end proxy latency; p50, p95, and p99 are derived from the buckets |
| govern_upstream_latency_ms | Histogram | Time waiting for upstream model response |
| govern_scoring_latency_ms | Histogram | Time to score the inference (asynchronous, not in the request path) |
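A histogram exposes cumulative bucket counts, so percentiles have to be estimated from them. The sketch below shows one common approach, linear interpolation inside the bucket that contains the target rank, similar in spirit to PromQL's `histogram_quantile`. `estimate_quantile` is a hypothetical helper, and the bucket counts mirror the sample scrape output on this page.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by bound,
    with the last bound being math.inf (the le="+Inf" bucket).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            in_bucket = count - prev_count
            if math.isinf(bound) or in_bucket == 0:
                # No finite upper edge to interpolate toward.
                return prev_bound
            # Assume observations are spread evenly within the bucket.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


latency_buckets = [(1.0, 0), (5.0, 987), (10.0, 1241), (math.inf, 1247)]
p50 = estimate_quantile(0.50, latency_buckets)  # roughly 3.53 ms
p95 = estimate_quantile(0.95, latency_buckets)  # roughly 8.89 ms
```

The estimate is only as fine-grained as the bucket boundaries: any quantile that lands in the `+Inf` bucket is reported as the highest finite bound.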
Score distribution metrics
| Metric | Type | Description |
|---|---|---|
| govern_score_security | Histogram | Distribution of security scores |
| govern_score_bias | Histogram | Distribution of bias scores |
| govern_score_accuracy | Histogram | Distribution of accuracy scores |
| govern_score_drift | Histogram | Distribution of drift scores |
| govern_score_cost | Histogram | Distribution of cost utilization scores |
Telemetry metrics
| Metric | Type | Description |
|---|---|---|
| govern_telemetry_batches_flushed_total | Counter | Total batches successfully flushed |
| govern_telemetry_events_flushed_total | Counter | Total events successfully transmitted |
| govern_telemetry_dropped_total | Counter | Events dropped (ring buffer overflow or max retries exceeded) |
| govern_telemetry_retry_total | Counter | Flush retry attempts |
| govern_ring_buffer_size | Gauge | Current number of events in the ring buffer |
| govern_ring_buffer_utilization | Gauge | Ring buffer fill level as a fraction (0.0–1.0) |
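Two quick checks can be built from these metrics: the share of events lost, and how close the ring buffer is to overflowing. The sketch below is illustrative. Both function names and the `warn_at` and `critical_at` thresholds are assumptions chosen for the example, not GOVERN defaults, and the drop ratio ignores events still waiting in the buffer.

```python
def telemetry_drop_ratio(events_flushed, events_dropped):
    """Fraction of finished events that were dropped instead of flushed."""
    finished = events_flushed + events_dropped
    return events_dropped / finished if finished else 0.0


def buffer_pressure(utilization, warn_at=0.7, critical_at=0.9):
    """Classify govern_ring_buffer_utilization (0.0-1.0) into a status.

    The thresholds are hypothetical; tune them to your flush interval.
    """
    if utilization >= critical_at:
        return "critical"
    if utilization >= warn_at:
        return "warn"
    return "ok"
```

A rising drop ratio together with sustained high utilization suggests the flush path is not keeping up with the event rate.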
Token and cost metrics
| Metric | Type | Description |
|---|---|---|
| govern_tokens_input_total | Counter | Total input tokens proxied |
| govern_tokens_output_total | Counter | Total output tokens proxied |
| govern_cost_usd_total | Counter | Estimated total spend in USD |
| govern_tokens_per_hour | Gauge | Rolling hourly token rate |
| govern_cost_usd_per_hour | Gauge | Rolling hourly spend rate in USD |
| govern_budget_utilization | Gauge | Fraction of budget consumed (0.0–1.0) |
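The `_total` counters can also be turned into rates over a window of your choosing, for example to cross-check the rolling gauges. The sketch below assumes, as is usual for Prometheus counters, that a value lower than the previous sample means the counter restarted from zero. `counter_increase` and `hourly_rate` are hypothetical helper names.

```python
def counter_increase(samples):
    """Total increase across time-ordered counter samples.

    A drop between consecutive samples is treated as a reset to zero,
    so the new value itself counts as the increase since the reset.
    """
    increase = 0.0
    for prev, cur in zip(samples, samples[1:]):
        increase += (cur - prev) if cur >= prev else cur
    return increase


def hourly_rate(samples, window_seconds):
    """Scale the observed increase to a per-hour rate."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return counter_increase(samples) * 3600.0 / window_seconds


# Illustrative govern_cost_usd_total samples spanning 30 minutes,
# with a reset between the second and third sample.
spend_per_hour = hourly_rate([10.0, 12.5, 1.0, 3.0], window_seconds=1800)
```

Any increase that happened between the last sample before a reset and the reset itself is lost, so the result is a lower bound.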
Health metrics
| Metric | Type | Description |
|---|---|---|
| govern_probe_up | Gauge | 1 if probe is running, 0 if not |
| govern_upstream_reachable | Gauge | 1 if upstream is reachable, 0 if not |
| govern_telemetry_connected | Gauge | 1 if GOVERN platform is reachable, 0 if not |
| govern_build_info | Gauge | Probe version (via labels) |
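The three 0/1 gauges map naturally onto a readiness check. The sketch below is a minimal illustration: `health_problems` is a hypothetical helper that takes a dictionary of already-parsed metric values, and it treats a missing gauge as a failure, which is a design choice of this example.

```python
def health_problems(samples):
    """Return a list of human-readable problems; empty means healthy.

    samples: mapping of metric name to its latest value.
    """
    checks = {
        "govern_probe_up": "probe is not running",
        "govern_upstream_reachable": "upstream is unreachable",
        "govern_telemetry_connected": "GOVERN platform is unreachable",
    }
    # A gauge that is absent from the scrape is reported as a problem too.
    return [msg for name, msg in checks.items() if samples.get(name, 0.0) != 1.0]
```

For example, passing `{"govern_probe_up": 1.0, "govern_upstream_reachable": 0.0, "govern_telemetry_connected": 1.0}` yields a single entry describing the unreachable upstream.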
Sample Prometheus scrape output
# HELP govern_inferences_total Total inference requests proxied
# TYPE govern_inferences_total counter
govern_inferences_total{model="claude-sonnet-4",provider="anthropic"} 1247
# HELP govern_proxy_latency_ms End-to-end proxy latency in milliseconds
# TYPE govern_proxy_latency_ms histogram
govern_proxy_latency_ms_bucket{le="1"} 0
govern_proxy_latency_ms_bucket{le="5"} 987
govern_proxy_latency_ms_bucket{le="10"} 1241
govern_proxy_latency_ms_bucket{le="+Inf"} 1247
govern_proxy_latency_ms_sum 3982.4
govern_proxy_latency_ms_count 1247
# HELP govern_ring_buffer_utilization Ring buffer fill percentage
# TYPE govern_ring_buffer_utilization gauge
govern_ring_buffer_utilization 0.12
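To read the histogram lines above: bucket counts are cumulative, so each `le` value counts every observation at or below that bound. The sketch below converts the sample buckets into per-interval counts and derives the mean latency from `_sum` and `_count`. `per_bucket_counts` is a hypothetical helper name; the numbers are copied from the sample output.

```python
import math


def per_bucket_counts(cumulative):
    """Convert cumulative (le, count) pairs into per-interval counts."""
    result = []
    prev = 0
    for bound, count in cumulative:
        result.append((bound, count - prev))
        prev = count
    return result


buckets = [(1.0, 0), (5.0, 987), (10.0, 1241), (math.inf, 1247)]
intervals = per_bucket_counts(buckets)

# Mean latency = govern_proxy_latency_ms_sum / govern_proxy_latency_ms_count
mean_ms = 3982.4 / 1247
```

In this sample, 987 requests took between 1 and 5 ms, 254 took between 5 and 10 ms, and 6 took longer than 10 ms, for a mean of about 3.19 ms.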
Grafana dashboard
A pre-built Grafana dashboard is available for import:
# Dashboard ID for Grafana.com
GOVERN Probe Overview: 21847
Or import from the GOVERN dashboard: Settings → Integrations → Grafana → Export Dashboard JSON.