Check types
| type | params | Outcome | Passes when |
|---|---|---|---|
| task_expectations | — | graded | Output satisfies the task's mustMention / mustNotMention expectations. |
| contains | value, caseSensitive? | binary | Output contains the substring. Alias: must_contain. |
| not_contains | value, caseSensitive? | binary | Output does not contain the substring. Alias: must_not_contain. |
| regex | pattern | binary | re.search finds the pattern in the output. |
| equals | value, caseSensitive? | binary | Trimmed output equals the trimmed value. Alias: exact_match. |
| min_length | value | binary | len(output) ≥ value. |
| max_length | value | binary | len(output) ≤ value. |
| json_valid | — | binary | Output parses with json.loads (strict). |
| json_keys | requiredKeys | binary | Output is a JSON object containing every required key. |
| expected_output_schema | — | binary | Output is a JSON object with the required keys from the task's expected output schema. |
Matching for the substring and equality checks (contains, not_contains, equals) is case-insensitive unless caseSensitive: true is set in params. The substring for contains / not_contains may be given as either value or text; for json_keys, the key list may be given as either requiredKeys or keys.
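The matching rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the library's implementation: the function names and the `_fold` helper are hypothetical, and lowercasing both sides is an assumed way of doing the case-insensitive comparison.

```python
import json


def _fold(s: str, case_sensitive: bool) -> str:
    # Assumption: case-insensitive matching lowercases both sides.
    return s if case_sensitive else s.lower()


def check_contains(output: str, params: dict) -> bool:
    # The substring may arrive under `value` or under `text`.
    needle = params["value"] if "value" in params else params["text"]
    cs = params.get("caseSensitive", False)
    return _fold(str(needle), cs) in _fold(output, cs)


def check_equals(output: str, params: dict) -> bool:
    # Both sides are trimmed before they are compared.
    cs = params.get("caseSensitive", False)
    return _fold(output.strip(), cs) == _fold(str(params["value"]).strip(), cs)


def check_json_keys(output: str, params: dict) -> bool:
    # The key list may arrive under `requiredKeys` or under `keys`.
    keys = params.get("requiredKeys", params.get("keys", []))
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    # Only a JSON object (a dict) can satisfy the check.
    return isinstance(data, dict) and all(k in data for k in keys)


print(check_contains("Hello World", {"value": "hello"}))  # True
print(check_contains("Hello World", {"value": "hello", "caseSensitive": True}))  # False
```

Reading the alias in one place keeps each check body identical regardless of which parameter name the config used.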
Outcomes & scoring
- Binary checks return 1.0 on pass, 0.0 on fail.
- Graded checks (task_expectations) return a fraction in [0, 1] for partial credit.
- A verifier's score is the weighted mean of its checks (default weight: 1). Across multiple verifiers, the final score is the plain mean.
- passThreshold (default 1.0) is the score at or above which the output passes. A check marked "required": true makes any failure of it critical: the output cannot pass regardless of the numeric score.
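The scoring rules can be worked through with a small sketch. The result shape used here (dicts with `score`, `passed`, `weight`, `required`) and the function names are assumptions made for illustration; only the arithmetic follows the rules listed above.

```python
def verifier_score(checks: list[dict]) -> float:
    # Weighted mean of the check scores; a missing weight counts as 1.
    total_weight = sum(c.get("weight", 1) for c in checks)
    if total_weight == 0:
        return 0.0  # assumption: an empty or zero-weight verifier scores 0
    weighted = sum(c["score"] * c.get("weight", 1) for c in checks)
    return weighted / total_weight


def final_result(verifiers: list[list[dict]], pass_threshold: float = 1.0):
    # Plain (unweighted) mean across verifiers.
    scores = [verifier_score(checks) for checks in verifiers]
    score = sum(scores) / len(scores)
    # Any failed check marked required is critical and blocks a pass.
    critical = any(
        c.get("required", False) and not c["passed"]
        for checks in verifiers
        for c in checks
    )
    return score, (score >= pass_threshold and not critical)


verifier_a = [
    {"score": 1.0, "passed": True, "weight": 2},
    {"score": 0.0, "passed": False},  # default weight 1
]
verifier_b = [{"score": 0.5, "passed": False}]

score, passed = final_result([verifier_a, verifier_b], pass_threshold=0.5)
# verifier_a = (1.0*2 + 0.0*1) / 3 = 0.667, verifier_b = 0.5, mean = 0.583
print(round(score, 3), passed)  # 0.583 True
```

With a required check that fails, the same function reports a failing output even when the numeric score clears the threshold.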
Note
See Verifiers for the full spec, weighting strategy, and a worked example.
task_expectations
Reads task.metadata.expectations with mustMention and mustNotMention lists. Each entry uses anyOf (a list of acceptable phrases) or a single text phrase, plus a message used as the failure reason. The score is (total − failures) / total across all entries in both lists, so it gives partial credit and a concrete reason for each missed or violated expectation, which is exactly the feedback GEPA reflects on.
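A minimal sketch of that scoring formula follows. The function name is hypothetical, and two details are assumptions rather than documented behavior: phrases are matched as case-insensitive substrings, and an empty expectations block scores 1.0.

```python
def score_expectations(output: str, expectations: dict):
    text = output.lower()  # assumption: case-insensitive substring matching
    failures = []
    total = 0

    def phrases_of(entry: dict) -> list[str]:
        # An entry carries either `anyOf` (a list) or a single `text` phrase.
        return entry.get("anyOf") or [entry["text"]]

    for entry in expectations.get("mustMention", []):
        total += 1
        if not any(p.lower() in text for p in phrases_of(entry)):
            failures.append(entry.get("message", "missing required phrase"))

    for entry in expectations.get("mustNotMention", []):
        total += 1
        if any(p.lower() in text for p in phrases_of(entry)):
            failures.append(entry.get("message", "forbidden phrase present"))

    # (total - failures) / total; assumption: no entries means full credit.
    score = 1.0 if total == 0 else (total - len(failures)) / total
    return score, failures


expectations = {
    "mustMention": [
        {"anyOf": ["refund", "reimbursement"], "message": "Must offer a refund."},
        {"text": "order number", "message": "Must ask for the order number."},
    ],
    "mustNotMention": [
        {"text": "lawsuit", "message": "Must not mention legal action."},
    ],
}
score, reasons = score_expectations("We will issue a reimbursement today.", expectations)
print(reasons)  # ['Must ask for the order number.']
```

Here two of the three entries are satisfied, so the score is 2/3, and the single message in `reasons` names the expectation that was missed.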
Skips & errors
An unknown check type is skipped: it does not fail the output, and the skip is noted in the feedback. A check that raises an exception is treated as a failure, with the exception type included in the reason, so one broken check can never crash a run.
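The skip-and-contain behavior described above might look like the following loop. This is a sketch under stated assumptions: the `registry` mapping, the `run_checks` name, and the result fields (`status`, `reason`) are hypothetical and chosen only to illustrate the control flow.

```python
def run_checks(output: str, checks: list[dict], registry: dict) -> list[dict]:
    results = []
    for check in checks:
        check_type = check["type"]
        fn = registry.get(check_type)
        if fn is None:
            # Unknown type: skip it and note the skip, without failing the output.
            results.append({
                "type": check_type,
                "status": "skipped",
                "reason": f"unknown check type: {check_type}",
            })
            continue
        try:
            passed = bool(fn(output, check.get("params", {})))
            reason = ""
        except Exception as exc:
            # A raising check counts as a failure; the run keeps going.
            passed = False
            reason = f"check raised {type(exc).__name__}"
        results.append({
            "type": check_type,
            "status": "passed" if passed else "failed",
            "reason": reason,
        })
    return results


registry = {"min_length": lambda out, params: len(out) >= params["value"]}
checks = [
    {"type": "min_length", "params": {"value": 3}},
    {"type": "min_length", "params": {}},  # missing `value` raises KeyError
    {"type": "no_such_check"},
]
for result in run_checks("hello", checks, registry):
    print(result["type"], result["status"], result["reason"])
```

Catching the exception per check, rather than around the whole loop, is what lets the remaining checks still run after one of them breaks.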