Instruction Elaboration
Instructions with too few tokens are effectively invisible. Instructions padded with generic filler are weaker than shorter, specific ones. The ideal instruction uses multiple DISTINCT relevant terms — each naming a different concrete aspect of the desired behavior.
Antipatterns
- Terse instruction: "Format code." or "Run tests." — too few distinct tokens to register in context. The diagnostic flags instructions below the minimum token count.
- Padded with filler: "When writing tests in this project's codebase, please ensure that you avoid using mock objects." The filler tokens ("when writing", "please ensure that you") dilute signal without adding distinct terms.
- Repetitive terms instead of diverse ones: "Use `ruff` for linting. `ruff` catches errors. `ruff` runs fast." Repeating the same term does not increase distinctness: the diagnostic measures unique relevant terms, not total word count.
- Generic class names instead of specifics: "Use a testing framework" instead of "Use `pytest` with `@pytest.mark.parametrize` for boundary cases in `tests/`." Named constructs are distinct terms; generic descriptions are not.
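As a rough illustration of what "unique relevant terms, not total word count" means, the Python sketch below counts distinct non-filler terms in an instruction. The tokenizer, the `FILLER_WORDS` list, and the `MIN_DISTINCT_TERMS` threshold are all assumptions made for this example; the diagnostic's actual rules and minimum are not specified here.

```python
import re

# Hypothetical stop list: the real diagnostic's notion of "relevant" may differ.
FILLER_WORDS = {"a", "an", "the", "for", "in", "of", "to", "with", "and", "or", "use"}

# Hypothetical threshold, chosen only for illustration.
MIN_DISTINCT_TERMS = 3


def distinct_terms(instruction: str) -> set[str]:
    """Return the set of unique non-filler terms in an instruction."""
    # Keep characters that appear in identifiers, decorators, and paths together.
    raw_tokens = re.findall(r"[\w@./-]+", instruction.lower())
    # Drop sentence punctuation and trailing path separators.
    cleaned = (token.strip("./-") for token in raw_tokens)
    return {token for token in cleaned if token and token not in FILLER_WORDS}


def is_too_terse(instruction: str, min_terms: int = MIN_DISTINCT_TERMS) -> bool:
    """Flag instructions with fewer distinct terms than the assumed minimum."""
    return len(distinct_terms(instruction)) < min_terms
```

Because the result is a set, the three occurrences of `ruff` in the repetitive example contribute a single term, while `pytest`, `@pytest.mark.parametrize`, and `tests/unit` each count separately.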
Pass / Fail
Pass
Use `pytest` with `@pytest.mark.parametrize` for boundary cases in
`tests/unit/`. Run `uv run poe qa_fast` before committing.
*Do not rely on mocking libraries or test doubles.*
Fail
Run tests.
Fix
Elaborate directives with multiple specific, diverse terms, each naming a different concrete aspect: `pytest`, `@pytest.mark.parametrize`, `tests/unit/`, real database connections, real HTTP endpoints. Naming strengthens a directive you want the model to follow. Prohibitions are the inverse: state the forbidden thing as an abstract category rather than a named construct, because naming a prohibited API anchors the forbidden concept instead of suppressing it. Do not pad with generic filler. Phrases like "when writing tests in this project, please ensure that you avoid using" dilute the signal without adding distinct terms.
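One way to spot padding while revising is to measure how much of an instruction is made up of known filler phrases. The sketch below is only an illustration: `FILLER_PHRASES` is a hypothetical list seeded from the examples in this section, and `filler_ratio` is not part of the diagnostic.

```python
import re

# Hypothetical filler phrases, taken from the examples in this section.
FILLER_PHRASES = ("when writing", "please ensure that you")


def filler_ratio(instruction: str) -> float:
    """Return the fraction of words that belong to a listed filler phrase."""
    total_words = len(instruction.split())
    if total_words == 0:
        return 0.0
    filler_words = 0
    for phrase in FILLER_PHRASES:
        # Count case-insensitive occurrences of the phrase.
        matches = re.findall(re.escape(phrase), instruction, flags=re.IGNORECASE)
        filler_words += len(matches) * len(phrase.split())
    return filler_words / total_words
```

A high ratio suggests the instruction can be shortened without losing any distinct term; a ratio of zero says nothing about whether the remaining terms are specific enough.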
Limitations
Measures token count and term distinctness. Cannot evaluate whether the chosen terms are the most relevant for the intended behavior.
