This skill helps you build robust LLM evaluation systems by applying direct scoring, pairwise comparisons, rubrics, and bias mitigation.
npx playbooks add skill ken-cavanagh-glean/fieldkit --skill advanced-evaluation
This skill provides a specialized system prompt that configures your AI coding agent as an advanced evaluation expert, with detailed methodology and structured output formats.
Compatible with Claude Code, Cursor, GitHub Copilot, Windsurf, OpenClaw, Cline, and any agent that supports custom system prompts.