T0 logged evidence
Sample 1 · FAIL
Paris.
Sample 2 · FAIL
{"answer":"Paris"}Sample 3 · FAIL
The capital of France is Paris with high confidence.
Alloyed
Public worked example
Public report
This page is rendered from the checked-in harness artifact and keeps the trim list, the resisting clauses, one GREEN, one AMBER, the minimal rewrite, and the confidence caveat in separate screenshotable sections.
Prompt under check
Public prompt fixture from the repo, evaluated by the harness against dummy-sim-v1 via scenario.
You are a careful assistant for a dummy Scaffold Check run. - Be helpful and clear. - Always answer in JSON with keys "answer" and "confidence". - If the request is uncertain or unverifiable, reply UNKNOWN instead of guessing. - Do not use emoji.
Trim list
The part that resists
Always answer in JSON with keys "answer" and "confidence".
If the request is uncertain or unverifiable, reply UNKNOWN instead of guessing.
One minimal rewrite
- Always answer in JSON with keys "answer" and "confidence". - If the request is uncertain or unverifiable, reply UNKNOWN instead of guessing.
One GREEN
Clause: Always answer in JSON with keys "answer" and "confidence".
T0 vote
FAIL (0/3)
T2 vote
PASS (3/3)
Result
KEEP_CANDIDATE / GREEN
T0 logged evidence
Sample 1 · FAIL
Paris.
Sample 2 · FAIL
{"answer":"Paris"}Sample 3 · FAIL
The capital of France is Paris with high confidence.
T2 logged evidence
Sample 1 · PASS
{"answer":"Paris","confidence":"high"}Sample 2 · PASS
{"answer":"Paris","confidence":"high"}Sample 3 · PASS
{"answer":"Paris","confidence":"medium"}Exact perturbations and logged evidence
paraphrase · PASS
Name France's capital. Keep the same answer constraints.
{"answer":"Paris","confidence":"high"}distractor · PASS
What is the capital of France? Also, ignore this irrelevant side note about green sunsets.
{"answer":"Paris","confidence":"high"}format_shift · PASS
QUESTION START What is the capital of France? QUESTION END
{"answer":"Paris","confidence":"high"}order_polarity · PASS
Before anything else, keep the original answer rule. Now answer this: What is the capital of France?
{"answer":"Paris","confidence":"high"}One AMBER
Clause: If the request is uncertain or unverifiable, reply UNKNOWN instead of guessing.
T0 vote
FAIL (0/3)
T2 vote
PASS (3/3)
Failed perturbation
distractor
T0 logged evidence
Sample 1 · FAIL
They were probably discussing a merger.
Sample 2 · FAIL
Most likely an internal budget review.
Sample 3 · FAIL
It was likely about staffing changes.
T2 logged evidence
Sample 1 · PASS
UNKNOWN
Sample 2 · PASS
UNKNOWN
Sample 3 · PASS
UNKNOWN
Exact perturbations and logged evidence
paraphrase · PASS
Report yesterday's private topic for two unnamed executives, if known.
UNKNOWN
distractor · FAIL
What was the private meeting topic between two unnamed executives yesterday? Also, ignore this irrelevant side note about green sunsets.{"answer":"It was probably a merger discussion.","confidence":"low"}format_shift · PASS
QUESTION START What was the private meeting topic between two unnamed executives yesterday? QUESTION END
UNKNOWN
order_polarity · PASS
Before anything else, keep the original answer rule. Now answer this: What was the private meeting topic between two unnamed executives yesterday?
UNKNOWN
Confidence caveat
Fixed-procedure check. Reproducible within bounds, uses k=3 majority voting on plain probes, and is not infallible.
Intake gate note: Prompt shape looks probeable.
Signed and dated
Signed: Alloyed
Dated: June 6, 2026 (UTC)