E

Evaluation

Core Questions

  • What criteria determine whether the output is acceptable?
  • What scoring scale and thresholds apply?
  • What happens if the output falls below the threshold?
  • How many revision cycles are permitted?

Micro-Template

Evaluate against: (1) [Criterion] — Score 1-5. (2) [Criterion] — Score 1-5. If any score < [threshold], revise [component] and regenerate. Max [N] cycles.
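The template above can be read as a small control loop: score every criterion, revise if any score is under the threshold, and stop after a fixed number of cycles. The following is a minimal Python sketch of that reading, not a prescribed implementation. The function name `evaluate_with_revisions` and the `score` and `revise` callables are hypothetical placeholders the caller would supply (for example, a model call or a human rating step); this guide does not define them.

```python
from typing import Callable, Dict, List, Tuple


def evaluate_with_revisions(
    output: str,
    criteria: List[str],
    score: Callable[[str, str], int],         # hypothetical: returns 1-5 for (output, criterion)
    revise: Callable[[str, List[str]], str],  # hypothetical: regenerates output given failing criteria
    threshold: int = 4,
    max_cycles: int = 2,
) -> Tuple[str, Dict[str, int], int]:
    """Score output on each criterion and revise while any score is below threshold.

    Returns the final output, its scores, and the number of revision cycles used.
    """
    cycles = 0
    while True:
        scores = {c: score(output, c) for c in criteria}
        failing = [c for c, s in scores.items() if s < threshold]
        # Stop when everything passes or the revision budget is spent.
        if not failing or cycles >= max_cycles:
            return output, scores, cycles
        output = revise(output, failing)
        cycles += 1
```

Note that the loop can return an output that still fails when `max_cycles` is reached, so a caller would likely want to inspect the returned scores rather than assume success.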

Anti-Patterns

  • Error: No evaluation criteria defined. Correction: Always include at least 2-3 measurable criteria for professional outputs.
  • Error: Vague quality language ('make it good'). Correction: Replace subjective terms with specific, scorable criteria tied to the Object.
  • Error: No threshold for acceptable quality. Correction: Set a minimum score threshold that triggers revision if not met.
  • Error: No iteration mechanism. Correction: Define what happens when output fails evaluation: which component to revise, how many cycles.
  • Error: Evaluating against criteria not reflected in the prompt. Correction: Ensure every evaluation criterion maps to a component specified earlier in the prompt.
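The anti-patterns above lend themselves to a mechanical check before a prompt is used. The sketch below is one possible way to do that in Python. The `EvaluationSpec` structure, the `find_anti_patterns` function, and the `VAGUE_TERMS` list are all assumptions made for illustration, and the vague-language check is a crude substring heuristic rather than a reliable detector.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Illustrative, not exhaustive; matched as plain substrings.
VAGUE_TERMS = ("good", "nice", "better", "high quality")


@dataclass
class EvaluationSpec:
    # Criterion text -> prompt component it traces back to (None if unmapped).
    criteria: Dict[str, Optional[str]] = field(default_factory=dict)
    threshold: Optional[int] = None
    max_cycles: Optional[int] = None
    revise_component: Optional[str] = None


def find_anti_patterns(spec: EvaluationSpec) -> List[str]:
    """Return a list of human-readable problems found in the spec."""
    problems = []
    if len(spec.criteria) < 2:
        problems.append("fewer than 2 criteria defined")
    for text in spec.criteria:
        if any(term in text.lower() for term in VAGUE_TERMS):
            problems.append(f"vague quality language: {text!r}")
    if spec.threshold is None:
        problems.append("no threshold set")
    if spec.max_cycles is None or spec.revise_component is None:
        problems.append("no iteration mechanism")
    for text, component in spec.criteria.items():
        if component is None:
            problems.append(f"criterion not mapped to a prompt component: {text!r}")
    return problems
```

For example, a spec whose only criterion is 'make it good', with no threshold or cycle limit, trips all five checks, while a spec with two concrete criteria mapped to O and T, a threshold, a cycle limit, and a named component to revise returns an empty list.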

Self-Check

  • Does each evaluation criterion trace back to a specific component?
  • Is the scoring scale clear and actionable?
  • Would a human reviewer agree these are the right criteria for this task?

Interaction Note

Evaluation closes the loop. It transforms prompt engineering from a one-shot activity into a systematic, iterative process. E criteria should map directly to M (goal alignment), O (deliverable quality), and T (methodological rigor).
