Evaluation And Stopping
Keep scoring, feedback, and termination separate inside Mesmer experiments.
Evaluation, feedback, and stopping are distinct responsibilities, each owned by its own operator.
Evaluation
ops.Evaluate records facts. Evaluators produce scores, labels, reasons, or other structured facts about candidate trajectories and target responses.
Use provider-enforced structured output for LLM-backed primitives that need machine-readable output. Free-text LLM calls should stay limited to natural-language outputs.
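A minimal sketch of the "evaluators record facts" idea. The names here (Evidence, evaluate_contains) are illustrative, not the actual Mesmer API: the point is that an evaluator returns a machine-readable record rather than making a runtime decision.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Evidence:
    """A structured fact about one candidate response."""
    evaluator: str   # which evaluator produced this fact
    score: float     # machine-readable score, not a decision
    reason: str      # human-readable justification

def evaluate_contains(response: str, needle: str) -> Evidence:
    # Records a fact about the response; it does not stop the run
    # or mutate any attacker context.
    hit = needle in response
    return Evidence(
        evaluator="contains",
        score=1.0 if hit else 0.0,
        reason=f"substring {needle!r} {'found' if hit else 'absent'}",
    )
```

Because the result is structured, downstream operators can consume it without parsing free text.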
Common response-shape evaluators include:
- evaluators.Contains for exact substring checks.
- evaluators.StartsWith for affirmative-start or prefix-shaped responses.
- evaluators.NotContainsAny for configurable blocked-phrase absence, such as weak refusal-shape checks.
- evaluators.JudgePanel for aggregating multiple response evaluators into one panel result.
JudgePanel is still an evaluator. It writes evidence that later operators can consume; it does not decide runtime termination by itself.
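A sketch of the panel-as-evaluator pattern, under the assumption that member evaluators score in [0, 1]. The judge_panel and PanelResult names are hypothetical stand-ins for the aggregation JudgePanel performs:

```python
from dataclasses import dataclass
from typing import Callable

Evaluator = Callable[[str], float]  # assumed: returns a score in [0, 1]

@dataclass(frozen=True)
class PanelResult:
    scores: dict[str, float]  # per-member verdicts
    mean: float               # one aggregated panel score

def judge_panel(members: dict[str, Evaluator], response: str) -> PanelResult:
    # Aggregates member verdicts into one structured result.
    # Like any evaluator, it only writes evidence; whether the run
    # terminates is decided later by a stop condition.
    scores = {name: fn(response) for name, fn in members.items()}
    return PanelResult(scores=scores, mean=sum(scores.values()) / len(scores))
```

Keeping aggregation inside the evaluator layer means the stop logic sees one panel score but can still inspect the per-member breakdown.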
Stopping
ops.StopWhen consumes facts. Conditions such as conditions.ScoreAtLeast decide when the runtime should stop based on recorded evidence.
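The division of labor can be sketched as follows. This is not the Mesmer runtime; score_at_least is a hypothetical analogue of conditions.ScoreAtLeast, and run_until stands in for the loop that ops.StopWhen participates in:

```python
from typing import Callable

# A condition inspects recorded evidence and returns True to stop.
Condition = Callable[[dict[str, float]], bool]

def score_at_least(key: str, threshold: float) -> Condition:
    # Consumes facts written by evaluators; this is where the
    # stop decision lives, not inside the evaluators themselves.
    def check(evidence: dict[str, float]) -> bool:
        return evidence.get(key, 0.0) >= threshold
    return check

def run_until(step: Callable[[int], dict[str, float]],
              condition: Condition,
              max_steps: int = 10):
    # Minimal loop: each step writes evidence, the condition reads it.
    evidence: dict[str, float] = {}
    for i in range(max_steps):
        evidence.update(step(i))
        if condition(evidence):
            return i, evidence
    return max_steps, evidence
```

The evaluator never calls stop, and the condition never computes scores; each side can be swapped independently.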
Feedback
Feedback is not evaluation. ops.AddFeedback converts observations and evaluations into future attacker context or learning state.
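A sketch of the feedback role, with an assumed evidence shape and a hypothetical add_feedback helper (not the ops.AddFeedback signature): it turns prior evidence into context for the next attempt, and nothing more.

```python
def add_feedback(history: list[str], evidence: dict[str, float]) -> list[str]:
    # Converts recorded evaluations into future attacker context.
    # It neither scores the response nor decides termination.
    if evidence.get("refusal", 0.0) > 0.5:
        note = "Previous attempt was refused; rephrase the request."
    else:
        score = evidence.get("score", 0.0)
        note = f"Previous attempt scored {score:.2f}; keep the framing."
    return history + [note]
```

Because feedback only appends context, replaying a run with the same evidence reproduces the same attacker state.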
Keeping these roles separate makes paper workflows easier to reproduce and inspect.