Accuracy Scoring

What the accuracy scorer detects

The accuracy scorer identifies responses where the model makes claims that are not grounded in provided context, or where the response contains patterns consistent with hallucination.

Detection type         What it finds
Hallucination          Confident statements about facts the model cannot know
Grounding failures     Claims not supported by provided documents or context
Fabricated citations   References to non-existent sources, papers, or people
Date/number errors     Specific numeric claims that are implausible or contradicted

Hallucination detection

Hallucination detection runs two complementary checks:

Confidence-claim mismatch

The scorer flags responses where the model expresses high confidence about specific claims (statistics, names, dates, URLs) in contexts where such confidence is unwarranted. Indicators include:

  • Specific numeric claims without hedging (“The report showed 73.2% adoption…”)
  • Named citations with specific publication details
  • URLs and email addresses that appear fabricated
  • Quotes attributed to real people
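The indicators above can be approximated with simple surface heuristics. The sketch below flags sentences that contain a specific numeric claim but no hedging language; the hedge list, regex, and function name are illustrative assumptions, not the Probe's actual implementation.

```python
import re

# Hedge words that soften a claim; their absence near a specific
# figure is one indicator of unwarranted confidence. (Illustrative list.)
HEDGES = {"about", "around", "roughly", "approximately",
          "estimated", "may", "might", "likely"}

# Matches specific numeric claims such as "73.2%" or "4.1 billion".
SPECIFIC_NUMBER = re.compile(r"\d+(?:\.\d+)?\s*(?:%|percent|million|billion)")

def unhedged_numeric_claims(text: str) -> list[str]:
    """Return sentences containing a specific figure but no hedge word."""
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if SPECIFIC_NUMBER.search(sentence):
            words = {w.lower().strip(".,") for w in sentence.split()}
            if not words & HEDGES:
                flagged.append(sentence)
    return flagged
```

On the example from the list above, `"The report showed 73.2% adoption."` is flagged, while `"Roughly 40% of users agreed."` is not, because the hedge word softens the claim.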

Factual plausibility

The scorer runs a lightweight plausibility check on structured claims (years, common facts). Claims that fall far outside known ranges are flagged. This check does not require internet access — it uses an embedded knowledge index.
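A plausibility check of this kind can be pictured as a range lookup against an embedded table. The entries and the `is_plausible` helper below are hypothetical; the Probe's actual knowledge index is larger and not documented here.

```python
# A hypothetical embedded knowledge index: plausible value ranges per
# claim type. Illustrative entries only, not the Probe's real index.
KNOWN_RANGES = {
    "year": (1000, 2100),
    "human_age": (0, 125),
    "percentage": (0.0, 100.0),
}

def is_plausible(claim_type: str, value: float) -> bool:
    """Flag values that fall outside the known range for their type.
    Unknown claim types are accepted (no range to check against)."""
    low, high = KNOWN_RANGES.get(claim_type, (float("-inf"), float("inf")))
    return low <= value <= high
```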

Grounding analysis

When your prompt includes context documents (RAG retrieval, knowledge base chunks), the accuracy scorer verifies that the model’s response is grounded in the provided context rather than in parametric knowledge.

For grounding analysis to work, the Probe needs to identify context documents in the request. It looks for common RAG patterns:

System: You are a helpful assistant. Answer questions based on the following documents:
---
[Document 1]
[content...]
---
[Document 2]
[content...]
---
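Splitting a prompt like the one above into individual context documents can be done by cutting on the delimiter marker. This is a simplified sketch assuming a single `---` marker; the real scorer matches all configured `rag_context_markers`, and the function name is an assumption.

```python
def extract_context_documents(prompt: str, marker: str = "---") -> list[str]:
    """Split a RAG-style prompt into context documents on a delimiter.
    Everything before the first marker is treated as the instruction
    preamble and dropped, as are empty segments."""
    parts = [p.strip() for p in prompt.split(marker)]
    return [p for p in parts[1:] if p]
```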

The scorer computes overlap between claims in the response and facts present in the provided documents. Low overlap combined with high-confidence claims indicates a grounding failure.
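The overlap computation can be sketched at the token level. This is an assumption about mechanism: the real scorer compares extracted claims rather than raw tokens, and the stop-word list here is a placeholder.

```python
def grounding_overlap(response: str, documents: list[str]) -> float:
    """Fraction of content words in the response that also appear in
    the provided documents. Token-level sketch; the actual scorer
    operates on extracted claims, not raw tokens."""
    stop = {"the", "a", "an", "is", "are", "of", "to", "and", "in"}
    resp = {w.lower().strip(".,") for w in response.split()} - stop
    ctx = {w.lower().strip(".,") for w in " ".join(documents).split()} - stop
    if not resp:
        return 1.0  # nothing to ground
    return len(resp & ctx) / len(resp)
```

A result below `grounding_overlap_threshold` (0.40 in the config below), paired with high-confidence claims, would trip the grounding check.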

Configuration for RAG applications:

scoring:
  accuracy:
    enabled: true
    threshold: 0.65
    check_hallucination: true
    check_grounding: true
    grounding_overlap_threshold: 0.40
    rag_context_markers:
      - "---"
      - "[Document"
      - "<context>"
      - "Based on the following"

Score interpretation

Score         Meaning                                             Recommended action
0.00 – 0.30   Well-grounded, consistent response                  Pass
0.31 – 0.50   Low-level accuracy concerns                         Log for review
0.51 – 0.70   Moderate accuracy issues                            Flag for human review
0.71 – 1.00   High-confidence hallucination or grounding failure  Flag or block
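The score bands above translate directly into a dispatch function. A minimal sketch, with action names chosen here for illustration:

```python
def recommended_action(score: float) -> str:
    """Map an accuracy score to the action in the table above.
    Band edges follow the documented ranges; action labels are
    illustrative, not a Probe API."""
    if score <= 0.30:
        return "pass"
    if score <= 0.50:
        return "log"
    if score <= 0.70:
        return "review"
    return "flag_or_block"
```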

Configuration

scoring:
  accuracy:
    enabled: true
    threshold: 0.65
    check_hallucination: true
    check_grounding: true
    check_citations: true
    check_numeric_plausibility: true
    grounding_overlap_threshold: 0.40

Limitations

The accuracy scorer does not have access to external knowledge or the internet. It cannot verify factual claims against current data. Its hallucination detection is based on:

  • Structural patterns in the response (specificity without hedging)
  • Internal consistency of the response
  • Grounding overlap (only when context is provided)
  • A compact embedded knowledge index for common factual ranges

For applications where factual accuracy is critical, supplement the Probe’s accuracy scoring with domain-specific verification layers.