scorable-skills
💡 Summary
The Scorable skills integrate LLM-as-a-Judge evaluators into applications to strengthen the evaluation of LLM outputs.
🎯 Who it's for
🤖 AI take: "Looks capable, but don't let the configuration scare anyone off."
Risk: Medium. Recommended checks: whether it executes shell commands, and whether it makes outbound network requests (SSRF / data exfiltration). Run with least privilege, and audit the code and its dependencies before enabling it in production.
Scorable Skills
Skills for integrating and using Scorable LLM-as-a-Judge evaluators in applications that interact with LLMs.
What these skills do
- scorable-integration: Guides you through integrating Scorable LLM-as-a-Judge evaluators into your codebase.
Installation
npx skills add root-signals/scorable-skills
Usage
The skill automatically activates when you mention evaluation, judges, or Scorable. It works with frameworks like LangChain, PydanticAI, Mastra, and similar agent frameworks.
Examples
Basic integration:
Help me add Scorable evaluation to my chatbot
Framework-specific:
Integrate Scorable judges into my LangChain application
Analysis and setup:
Analyze my codebase for LLM interactions and help me set up Scorable evaluation
Production deployment:
Set up production sampling for Scorable evaluation with 10% coverage
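The production-sampling idea above (evaluating only a fraction of live traffic, e.g. 10%, to bound judge cost) can be sketched as follows. Note that `evaluate_with_judge` is a hypothetical stand-in for a real Scorable judge call, not the actual SDK API:

```python
import random

def evaluate_with_judge(prompt: str, response: str) -> float:
    """Hypothetical placeholder for a Scorable judge call returning a 0-1 score."""
    return 1.0

def maybe_evaluate(prompt: str, response: str, sample_rate: float = 0.10):
    """Evaluate only a sampled fraction of production traffic.

    Returns the judge score for sampled requests, or None when the
    request was skipped (not sampled).
    """
    if random.random() < sample_rate:
        return evaluate_with_judge(prompt, response)
    return None
```

With `sample_rate=0.10`, roughly one in ten responses is sent to the judge; the rest pass through unevaluated, which keeps evaluation overhead predictable under load.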
About Scorable
Scorable is a tool for creating LLM-as-a-Judge based evaluators for safeguarding applications. It generates custom evaluators (judges) that assess LLM outputs for quality, safety, and policy adherence.
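The LLM-as-a-Judge pattern described above can be sketched minimally like this. `JUDGE_PROMPT` and `judge` are illustrative assumptions only; the real Scorable evaluators and their API are generated and managed by the Scorable tool itself:

```python
# Illustrative judge prompt -- not Scorable's actual rubric.
JUDGE_PROMPT = (
    "You are an evaluator. Rate the assistant response below for quality, "
    "safety, and policy adherence on a scale from 0.0 to 1.0. "
    "Reply with the number only.\n\nResponse: {response}"
)

def judge(response: str, llm_call) -> float:
    """Score a response with a judge LLM.

    `llm_call` is any callable that takes a prompt string and returns
    the model's text reply, e.g. a thin wrapper around your LLM client.
    """
    reply = llm_call(JUDGE_PROMPT.format(response=response))
    score = float(reply.strip())
    return max(0.0, min(1.0, score))  # clamp to the valid range
```

The design point is that the judge is itself an LLM call with a fixed rubric, so its scores can be logged alongside the original interaction and aggregated for quality, safety, and policy-adherence monitoring.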
Pros
- Integrates seamlessly with existing frameworks.
- Improves the quality of LLM output evaluation.
- Supports multiple agent frameworks.
Cons
- Limited documentation for advanced features.
- Depends on specific frameworks.
- Production use may require additional setup.
Related skills
Disclaimer: this content is sourced from an open-source GitHub project and is shown here for display and scoring analysis only.
Copyright belongs to the original author, root-signals.
