This skill helps you evaluate AI security, robustness, and safety using standardized benchmarks across safety, privacy, and adversarial resilience.
npx playbooks add skill pluginagentmarketplace/custom-plugin-ai-red-teaming --skill benchmark-datasets
This skill helps you evaluate AI security, robustness, and safety using standardized benchmarks across safety, privacy, and adversarial resilience.. This skill provides a specialized system prompt that configures your AI coding agent as a benchmark datasets expert, with detailed methodology and structured output formats.
Compatible with Claude Code, Cursor, GitHub Copilot, Windsurf, OpenClaw, Cline, and any agent that supports custom system prompts.
This skill helps you evaluate AI security, robustness, and safety using standardized benchmarks across safety, privacy, and adversarial resilience.