# Cost Tracking

## What cost tracking monitors
The cost scorer tracks token consumption and spend across all inferences, providing budget alerts before you hit limits.
| Metric | Description |
|---|---|
| Input tokens | Tokens in the prompt (messages + system) |
| Output tokens | Tokens in the model response |
| Total tokens | Input + output |
| Estimated cost | Spend calculated from provider pricing tables |
| Budget burn rate | Projected hourly/daily spend at current rate |
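The budget burn rate in the table above is a simple projection of current spend over a full hour. A minimal sketch (the function name and inputs are illustrative, not the Probe's actual API):

```python
def projected_hourly_spend(spend_so_far_usd: float, elapsed_seconds: float) -> float:
    """Project spend over a full hour from the spend observed so far."""
    if elapsed_seconds <= 0:
        return 0.0
    return spend_so_far_usd * 3600.0 / elapsed_seconds

# Example: $2.50 spent in the first 10 minutes projects to $15/hour.
print(projected_hourly_spend(2.50, 600))  # → 15.0
```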
## Token pricing
The Probe includes a built-in pricing table for common providers. Costs are calculated per inference and accumulated in the GOVERN platform.
| Provider | Model | Input (per 1M) | Output (per 1M) |
|---|---|---|---|
| Anthropic | Claude Sonnet 4 | $3.00 | $15.00 |
| Anthropic | Claude Haiku 4.5 | $0.80 | $4.00 |
| OpenAI | GPT-4o | $2.50 | $10.00 |
| OpenAI | GPT-4o mini | $0.15 | $0.60 |
| Groq | Llama 3.1 70B | $0.59 | $0.79 |
Pricing is updated monthly; `GET /api/govern/probe/policy-sync` returns the latest table.
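Per-inference cost follows directly from the pricing table: tokens times the per-1M rate for each direction. A sketch using an illustrative subset of the table (the `PRICING` dict and function are hypothetical, not part of the Probe):

```python
# Per-1M-token prices from the table above (illustrative subset).
PRICING = {
    "claude-sonnet-4": (3.00, 15.00),  # (input, output) USD per 1M tokens
    "gpt-4o-mini":     (0.15, 0.60),
}

def inference_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one inference: tokens scaled by the per-1M-token price."""
    in_price, out_price = PRICING[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# 10k input + 2k output tokens on Claude Sonnet 4:
print(inference_cost_usd("claude-sonnet-4", 10_000, 2_000))  # → 0.06
```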
## Budget configuration
Set token and spend budgets at the hourly level:
```yaml
scoring:
  cost:
    enabled: true
    budget_tokens_per_hour: 1000000   # 1M tokens/hour
    budget_spend_per_hour_usd: 15.00  # $15/hour
    alert_at_percent: 0.80            # Alert at 80% of budget
    block_at_percent: 1.00            # Block at 100% (optional)
```

When `block_at_percent` is set to 1.00 and `scoring.mode` is `block`, new inferences are rejected once the hourly budget is exhausted. The budget resets at the top of each hour.
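The alert/block thresholds amount to comparing the hourly utilization ratio against the two configured percentages. A minimal sketch of that decision logic (the names here are illustrative, not the Probe's internals):

```python
from enum import Enum

class BudgetAction(Enum):
    ALLOW = "allow"
    ALERT = "alert"
    BLOCK = "block"

def budget_check(spend_usd: float, budget_usd: float,
                 alert_at: float = 0.80, block_at: float = 1.00) -> BudgetAction:
    """Map current hourly spend against the configured thresholds."""
    utilization = spend_usd / budget_usd
    if utilization >= block_at:
        return BudgetAction.BLOCK
    if utilization >= alert_at:
        return BudgetAction.ALERT
    return BudgetAction.ALLOW

print(budget_check(12.00, 15.00))  # 80% of a $15 budget → BudgetAction.ALERT
```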
## Cost score interpretation
The cost score is a budget utilization ratio, not a quality score:
| Score | Meaning |
|---|---|
| 0.00 – 0.50 | Under 50% of hourly budget used |
| 0.50 – 0.80 | 50–80% of budget used (normal) |
| 0.80 – 1.00 | 80–100% of budget used (alert threshold) |
| > 1.00 | Budget exceeded |
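The score bands above can be sketched as a small interpreter over the utilization ratio (a hypothetical helper, assuming a token-denominated budget):

```python
def cost_score(tokens_used: int, budget_tokens: int) -> float:
    """Budget utilization ratio; exceeds 1.0 once the budget is blown."""
    return tokens_used / budget_tokens

def interpret(score: float) -> str:
    """Translate a cost score into the bands from the table above."""
    if score > 1.0:
        return "budget exceeded"
    if score >= 0.80:
        return "alert threshold"
    if score >= 0.50:
        return "normal"
    return "under 50% of budget"

print(interpret(cost_score(900_000, 1_000_000)))  # → alert threshold
```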
## Per-model breakdown
Cost data in the GOVERN dashboard is broken down by model, enabling you to see which models are driving spend:
```
claude-sonnet-4:  $12.40 (68%)
claude-haiku-4.5: $4.20  (23%)
gpt-4o:           $1.60  (9%)
```

## Anomaly detection
The cost scorer also detects unusual spending patterns:
- Spike detection — a single inference using 10x the normal token count
- Model substitution — sudden shift to a more expensive model
- Runaway loops — very high inference frequency in a short window
These are reported as `cost.anomaly` events in the GOVERN platform.
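Spike detection, the first pattern above, reduces to comparing an inference's token count against a multiple of the recent average. A minimal sketch under that assumption (the 10x factor comes from the example above; the function itself is illustrative):

```python
from statistics import mean

def is_token_spike(history: list[int], current: int, factor: float = 10.0) -> bool:
    """Flag an inference whose token count is `factor`x the recent average."""
    if not history:
        return False  # no baseline yet, nothing to compare against
    return current >= factor * mean(history)

recent = [1_200, 900, 1_100, 1_000]          # recent per-inference token counts
print(is_token_spike(recent, 12_000))        # ≥10x the ~1,050 average → True
print(is_token_spike(recent, 2_000))         # within normal range → False
```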
## Environment variable configuration
```shell
SCORING_COST_ENABLED=true
SCORING_COST_BUDGET_TOKENS=1000000
SCORING_COST_BUDGET_USD=15.00
SCORING_COST_ALERT_PERCENT=0.80
```
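A sketch of how these variables might be read into a config dict; the variable names match the list above, but the parsing logic and fallback values are illustrative assumptions, not the Probe's actual loader:

```python
import os

def load_cost_config(env=os.environ) -> dict:
    """Parse the cost-scorer environment variables; fallback values are illustrative."""
    return {
        "enabled": env.get("SCORING_COST_ENABLED", "false").lower() == "true",
        "budget_tokens": int(env.get("SCORING_COST_BUDGET_TOKENS", "1000000")),
        "budget_usd": float(env.get("SCORING_COST_BUDGET_USD", "15.00")),
        "alert_percent": float(env.get("SCORING_COST_ALERT_PERCENT", "0.80")),
    }

cfg = load_cost_config({"SCORING_COST_ENABLED": "true"})
print(cfg["enabled"], cfg["budget_tokens"])  # → True 1000000
```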