E
Evaluation
Core Questions
- What criteria determine whether the output is acceptable?
- What scoring scale and thresholds apply?
- What happens if the output falls below the threshold?
- How many revision cycles are permitted?
Micro-Template
Evaluate against: (1) [Criterion] — Score 1-5. (2) [Criterion] — Score 1-5. If any score < [threshold], revise [component] and regenerate. Max [N] cycles.
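The micro-template describes a score, compare, revise loop with a cycle cap. A minimal Python sketch of that control flow follows. The names `evaluate_and_revise`, `score`, and `revise` are hypothetical and are not part of the framework; in practice the two callables would wrap whatever model call or human review produces scores and revisions.

```python
from typing import Callable, Dict, Tuple

Scores = Dict[str, int]  # criterion name -> score on the 1-5 scale


def evaluate_and_revise(
    output: str,
    score: Callable[[str], Scores],          # hypothetical scorer
    revise: Callable[[str, Scores], str],    # hypothetical reviser
    threshold: int = 4,
    max_cycles: int = 3,
) -> Tuple[str, Scores, int]:
    """Revise while any criterion scores below the threshold, up to max_cycles."""
    scores = score(output)
    cycles = 0
    while cycles < max_cycles and any(s < threshold for s in scores.values()):
        # Pass only the failing criteria so the reviser knows what to fix.
        failing = {name: s for name, s in scores.items() if s < threshold}
        output = revise(output, failing)
        scores = score(output)
        cycles += 1
    return output, scores, cycles


# Toy stand-ins for illustration only: each "!" raises the clarity score by one.
def toy_score(text: str) -> Scores:
    return {"clarity": min(5, 2 + text.count("!"))}


def toy_revise(text: str, failing: Scores) -> str:
    return text + "!"


final_text, final_scores, used_cycles = evaluate_and_revise("draft", toy_score, toy_revise)
```

Returning the number of cycles used alongside the final scores makes it possible to tell an output that passed from one that simply ran out of revision cycles.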
Anti-Patterns
| Error | Correction |
|---|---|
| No evaluation criteria defined | Always include at least two measurable criteria for professional outputs. |
| Vague quality language ('make it good') | Replace subjective terms with specific, scorable criteria tied to the Object. |
| No threshold for acceptable quality | Set a minimum score threshold that triggers revision if not met. |
| No iteration mechanism | Define what happens when output fails evaluation: which component to revise and how many cycles are allowed. |
| Evaluating against criteria not reflected in the prompt | Ensure every evaluation criterion maps to a component specified earlier in the prompt. |
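The last anti-pattern in the table can be checked mechanically if each criterion is recorded together with the prompt component it evaluates. The sketch below assumes a simple mapping from criterion name to component label; the function name `unmapped_criteria` and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def unmapped_criteria(criteria: Dict[str, str], components: Set[str]) -> List[str]:
    """Return criteria whose target component is not defined in the prompt."""
    return sorted(name for name, comp in criteria.items() if comp not in components)


# Hypothetical example: the prompt defines components M, O, and T.
prompt_components = {"M", "O", "T"}
criteria_map = {
    "goal alignment": "M",
    "deliverable quality": "O",
    "brand voice": "V",  # no component V exists in this prompt
}

orphans = unmapped_criteria(criteria_map, prompt_components)
```

Any name returned here is a criterion the prompt never asked for, so either the prompt needs a matching component or the criterion should be dropped.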
Self-Check
- ✔ Does each evaluation criterion trace back to a specific component?
- ✔ Is the scoring scale clear and actionable?
- ✔ Would a human reviewer agree these are the right criteria for this task?
Interaction Note
Evaluation closes the loop. It transforms prompt engineering from a one-shot activity into a systematic, iterative process. E criteria should map directly to M (goal alignment), O (deliverable quality), and T (methodological rigor).