Not so prompt: Prompt optimization as model selection (2024)
Here's a framework for prompt optimization.

Defining Success: Metrics and Evaluation Criteria

Before collecting any data, establish what success looks like for your specific use case. Choose a primary metric that directly reflects business value: accuracy for classification, F1 for imbalanced datasets, BLEU/ROUGE for generation tasks, or custom domain-specific metrics.
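As a minimal sketch of the metric choice above, the two classification metrics can be computed directly; the helper names here are illustrative, not from the article:

```python
def accuracy(preds, labels):
    # Fraction of exact matches: a sensible primary metric for balanced classification.
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def f1_binary(preds, labels, positive=1):
    # F1 balances precision and recall, which matters when classes are imbalanced
    # and raw accuracy would be dominated by the majority class.
    tp = sum(p == positive and y == positive for p, y in zip(preds, labels))
    fp = sum(p == positive and y != positive for p, y in zip(preds, labels))
    fn = sum(p != positive and y == positive for p, y in zip(preds, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Fixing one of these as the primary metric before collecting data keeps later prompt comparisons honest: every candidate is scored the same way.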
Randomize the order of responses being compared, normalize for length biases, use structured rubrics rather than open-ended judgments, and periodically validate against human evaluation.

Decompose a prompt into components:

- Instruction: the core task description
- Constraints: guardrails and requirements
- Reasoning: chain-of-thought scaffolding or step-by-step guidance
- Schema: output format specifications
- Demonstrations: few-shot examples

Define bounded edit operators that modify these components systematically: rephrasing instructions for clarity, adding or removing constraints, reordering reasoning steps, and swapping demonstration examples.
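A minimal sketch of bounded edit operators over these components, assuming a simple immutable record per prompt (the class and function names are illustrative, not from the article):

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Prompt:
    instruction: str
    constraints: tuple     # guardrails and requirements
    reasoning: tuple       # ordered step-by-step guidance
    schema: str            # output format specification
    demonstrations: tuple  # few-shot examples

def drop_constraint(p, rng):
    # Bounded operator: remove one constraint, if any remain.
    if not p.constraints:
        return p
    i = rng.randrange(len(p.constraints))
    return replace(p, constraints=p.constraints[:i] + p.constraints[i + 1:])

def reorder_reasoning(p, rng):
    # Bounded operator: permute the reasoning steps.
    steps = list(p.reasoning)
    rng.shuffle(steps)
    return replace(p, reasoning=tuple(steps))

def swap_demo(p, pool, rng):
    # Bounded operator: replace one demonstration with a candidate from a pool.
    if not p.demonstrations or not pool:
        return p
    demos = list(p.demonstrations)
    demos[rng.randrange(len(demos))] = rng.choice(pool)
    return replace(p, demonstrations=tuple(demos))
```

Because each operator returns a new prompt rather than mutating the original, a search loop can apply them repeatedly, score each candidate on the primary metric, and keep the best, without ever producing an unbounded or malformed edit.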