Ring Buffer and Batch Flushing

Architecture

GOVERN Probe uses a ring buffer + batch flush pattern to transmit telemetry without adding latency to the inference path.

```
Inference arrives
  │
  ├── Proxy to upstream ──────────────────────────▶ Model API
  │
  ▼ (concurrent, non-blocking)
Score the response
Push event to ring buffer
  └── Ring buffer background thread
        ├── Every 5 seconds (or when batch is full):
        │     Pull up to 50 events
        │     POST /api/govern/probe/telemetry
        │     On success: remove from buffer
        │     On failure: retry up to 3 times
        │     On max retries: drop + log
        └── Buffer full (1000 events):
              Drop oldest events
              Increment govern_telemetry_dropped_total counter
```

Ring buffer

The ring buffer holds scored inference events in memory until they are flushed:

| Parameter | Default | Description |
|---|---|---|
| `ring_buffer_size` | 1000 | Maximum events held before dropping |
| `flush_interval_ms` | 5000 | Flush interval in milliseconds |
| `batch_size` | 50 | Events per flush batch |

When the buffer fills (network outage, GOVERN platform unavailable), the oldest events are dropped. This ensures the Probe never consumes unbounded memory.
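The drop-oldest behavior can be sketched with a bounded deque. This is an illustrative model, not the Probe's actual internals; the names mirror the configuration and metrics described above.

```python
from collections import deque
from threading import Lock

RING_BUFFER_SIZE = 1000

buffer = deque(maxlen=RING_BUFFER_SIZE)  # a bounded deque evicts the oldest entry when full
dropped_total = 0                        # stands in for govern_telemetry_dropped_total
lock = Lock()

def push_event(event):
    """Append a scored event; count the eviction if the buffer is already full."""
    global dropped_total
    with lock:
        if len(buffer) == RING_BUFFER_SIZE:
            dropped_total += 1           # the oldest event is about to be evicted
        buffer.append(event)

# During an outage, pushes keep succeeding; only the oldest events are lost.
for i in range(1005):
    push_event({"event_id": i})
# buffer now holds events 5..1004; dropped_total == 5
```

Because the push is a constant-time append under a short lock, memory stays bounded and the inference path never waits on the network.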

Flush behavior

A flush is triggered by either:

  1. The flush timer fires (default: every 5 seconds)
  2. The batch reaches batch_size events

Whichever occurs first triggers the flush. Under sustained high load, batch fills fire more often than the timer, so flushes happen more frequently.
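The two triggers can be sketched with a background thread that sleeps on an event with a timeout: the timeout is the flush timer, and a full batch sets the event to wake the thread early. This is an illustrative sketch, not the Probe's actual code; `send` here is a local stand-in for the POST to `/api/govern/probe/telemetry`.

```python
import threading
import time
from collections import deque

FLUSH_INTERVAL_MS = 5000
BATCH_SIZE = 50

buffer = deque()
flush_event = threading.Event()    # set early when a full batch is ready
sent_batches = []

def send(batch):
    sent_batches.append(batch)     # stand-in for the telemetry POST

def push(event):
    buffer.append(event)
    if len(buffer) >= BATCH_SIZE:
        flush_event.set()          # batch full: wake the flusher early

def flusher(stop):
    # Wakes on the timer OR on a full batch, whichever comes first.
    while not stop.is_set() or buffer:
        flush_event.wait(timeout=FLUSH_INTERVAL_MS / 1000)
        flush_event.clear()
        while buffer:
            n = min(BATCH_SIZE, len(buffer))
            send([buffer.popleft() for _ in range(n)])

stop = threading.Event()
worker = threading.Thread(target=flusher, args=(stop,), daemon=True)
worker.start()
for i in range(120):
    push({"event_id": i})
time.sleep(0.2)
stop.set()
flush_event.set()                  # final wake so the thread drains and exits
worker.join()
```

With 120 events pushed, the full batches of 50 flush immediately via the early wake-up, and the trailing partial batch is picked up on the next wake.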

Retry logic

Failed flushes are retried with exponential backoff:

| Attempt | Wait |
|---|---|
| 1st retry | 1 second |
| 2nd retry | 2 seconds |
| 3rd retry | 4 seconds |
| Drop | After 3 retries, events are dropped |
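The schedule above (base backoff doubled per retry, drop after three failures) can be sketched as follows. `post_batch` and `flush_with_retries` are hypothetical names for illustration; the defaults match `max_retries` and `retry_backoff_ms`.

```python
import time

MAX_RETRIES = 3
RETRY_BACKOFF_MS = 1000

def flush_with_retries(batch, post_batch, sleep=time.sleep):
    """Try once, then retry up to MAX_RETRIES times with exponential backoff.

    Returns True on success. False means the batch is dropped, which is
    where govern_telemetry_dropped_total would be incremented.
    """
    for attempt in range(1 + MAX_RETRIES):
        if post_batch(batch):
            return True
        if attempt < MAX_RETRIES:
            sleep(RETRY_BACKOFF_MS / 1000 * (2 ** attempt))  # 1s, 2s, 4s
    return False

# Usage: a permanently failing endpoint produces the 1s/2s/4s schedule.
waits = []
ok = flush_with_retries([{"event_id": "evt_1"}],
                        post_batch=lambda b: False,
                        sleep=waits.append)
# ok is False; waits == [1.0, 2.0, 4.0]
```

Injecting `sleep` keeps the backoff testable without real waits.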

Dropped events are counted in govern_telemetry_dropped_total. If this counter is rising, check:

  • Network connectivity to api.govern.archetypal.ai
  • API key validity
  • Probe container outbound firewall rules

Configuration

```yaml
telemetry:
  flush_interval_ms: 5000   # default: 5000
  batch_size: 50            # default: 50
  ring_buffer_size: 1000    # default: 1000
  max_retries: 3            # default: 3
  retry_backoff_ms: 1000    # default: 1000
```

Tuning for high-throughput applications

For applications making >100 inferences per second, increase the ring buffer and batch size:

```yaml
telemetry:
  ring_buffer_size: 5000
  batch_size: 200
  flush_interval_ms: 2000
```

This reduces flush frequency and network overhead. Monitor govern_telemetry_batches_flushed_total and govern_telemetry_events_per_batch_avg to find the optimal balance.
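A back-of-envelope check helps size these values. The workload rate below is an assumption for illustration; the other numbers are the tuned settings above.

```python
rate = 150           # inferences per second (assumed workload)
batch_size = 200
flush_interval_s = 2.0
ring_buffer_size = 5000

# Under load a batch fills every batch_size / rate seconds, so batch-full
# flushes dominate whenever that is shorter than the flush timer.
batch_fill_s = batch_size / rate             # ~1.33 s, under the 2.0 s timer
flushes_per_s = rate / batch_size            # ~0.75 POSTs per second

# During an outage, the buffer absorbs roughly ring_buffer_size / rate
# seconds of traffic before drop-oldest eviction begins.
outage_headroom_s = ring_buffer_size / rate  # ~33 s
```

If the outage headroom is too short for your network's typical blips, grow `ring_buffer_size` rather than the batch size.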

Telemetry payload

Each event in a batch contains:

```json
{
  "event_id": "evt_01HXYZABC",
  "probe_id": "probe-production-1",
  "org_id": "org_xxxx",
  "timestamp": "2026-04-12T14:23:01.432Z",
  "model": "claude-sonnet-4-20250514",
  "provider": "anthropic",
  "latency_ms": 2134,
  "input_tokens": 847,
  "output_tokens": 312,
  "scores": {
    "security": 0.02,
    "bias": 0.01,
    "accuracy": 0.91,
    "drift": 0.04,
    "cost": 0.08
  },
  "action": "pass",
  "flags": [],
  "metadata": {
    "environment": "production",
    "app_version": "2.1.0"
  }
}
```

When telemetry.include_content is true, the payload also includes request_content and response_content fields.