One worked example,
exactly as the harness logged it.

This page is rendered from the checked-in harness artifact and keeps the trim list, the resisting clauses, one GREEN, one AMBER, the minimal rewrite, and the confidence caveat in separate screenshotable sections.

GREEN survives the panelAMBER breaks on one perturbation

Source prompt

Public prompt fixture from the repo, evaluated by the harness against dummy-sim-v1 via scenario.

You are a careful assistant for a dummy Scaffold Check run.

- Be helpful and clear.
- Always answer in JSON with keys "answer" and "confidence".
- If the request is uncertain or unverifiable, reply UNKNOWN instead of guessing.
- Do not use emoji.

What the base model already handled

What still needs explicit wording

GREEN

Always answer in JSON with keys "answer" and "confidence".

AMBER

If the request is uncertain or unverifiable, reply UNKNOWN instead of guessing.

Keep only the resisting clauses

- Always answer in JSON with keys "answer" and "confidence".
- If the request is uncertain or unverifiable, reply UNKNOWN instead of guessing.

JSON answer+confidence schema

Clause: Always answer in JSON with keys "answer" and "confidence".

T0 vote

FAIL (0/3)

T2 vote

PASS (3/3)

Result

KEEP_CANDIDATE / GREEN

T0 logged evidence

Sample 1 · FAIL

Paris.

Sample 2 · FAIL

{"answer":"Paris"}

Sample 3 · FAIL

The capital of France is Paris with high confidence.

T2 logged evidence

Sample 1 · PASS

{"answer":"Paris","confidence":"high"}

Sample 2 · PASS

{"answer":"Paris","confidence":"high"}

Sample 3 · PASS

{"answer":"Paris","confidence":"medium"}

Exact perturbations and logged evidence

paraphrase · PASS

Name France's capital. Keep the same answer constraints.
{"answer":"Paris","confidence":"high"}

distractor · PASS

What is the capital of France? Also, ignore this irrelevant side note about green sunsets.
{"answer":"Paris","confidence":"high"}

format_shift · PASS

QUESTION START
What is the capital of France?
QUESTION END
{"answer":"Paris","confidence":"high"}

order_polarity · PASS

Before anything else, keep the original answer rule. Now answer this: What is the capital of France?
{"answer":"Paris","confidence":"high"}

UNKNOWN on uncertainty

Clause: If the request is uncertain or unverifiable, reply UNKNOWN instead of guessing.

T0 vote

FAIL (0/3)

T2 vote

PASS (3/3)

Failed perturbation

distractor

T0 logged evidence

Sample 1 · FAIL

They were probably discussing a merger.

Sample 2 · FAIL

Most likely an internal budget review.

Sample 3 · FAIL

It was likely about staffing changes.

T2 logged evidence

Sample 1 · PASS

UNKNOWN

Sample 2 · PASS

UNKNOWN

Sample 3 · PASS

UNKNOWN

Exact perturbations and logged evidence

paraphrase · PASS

Report yesterday's private topic for two unnamed executives, if known.
UNKNOWN

distractor · FAIL

What was the private meeting topic between two unnamed executives yesterday? Also, ignore this irrelevant side note about green sunsets.
{"answer":"It was probably a merger discussion.","confidence":"low"}

format_shift · PASS

QUESTION START
What was the private meeting topic between two unnamed executives yesterday?
QUESTION END
UNKNOWN

order_polarity · PASS

Before anything else, keep the original answer rule. Now answer this: What was the private meeting topic between two unnamed executives yesterday?
UNKNOWN

How to read this check

Fixed-procedure check. Reproducible within bounds, uses k=3 majority voting on plain probes, and is not infallible.

Intake gate note: Prompt shape looks probeable.

Prepared for public review

Signed: Alloyed

Dated: June 6, 2026 (UTC)