Co-Pilot (assistive)
Updated a month ago

paiml-mcp-agent-toolkit

paiml
0.1k
paiml/paiml-mcp-agent-toolkit
Agent score: 86

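💡 Summary

PMAT is a comprehensive toolkit for analyzing code quality and generating AI-ready context across multiple programming languages.

🎯 Who it's for

  • Software developers looking to improve code quality
  • DevOps engineers implementing CI/CD pipelines
  • Technical leads assessing project health
  • Data scientists using AI for code analysis
  • Quality assurance teams focused on technical debt

🤖 AI roast: It looks like a heavy hitter, but don't let the configuration scare people off.

Security analysis: medium risk

Risk: Medium. Recommended checks: whether it executes shell or command-line instructions; whether it makes outbound network requests (SSRF, data exfiltration); the scope of file reads and writes, and path-traversal risk. Run it with least privilege, and audit the code and its dependencies before enabling it in production.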

PMAT

Crates.io | Documentation | Tests | Coverage | License: MIT | Rust | DOI

Getting Started | Features | Examples | Documentation


What is PMAT?

PMAT (Pragmatic Multi-language Agent Toolkit) provides everything needed to analyze code quality and generate AI-ready context:

  • Context Generation - Deep analysis for Claude, GPT, and other LLMs
  • Technical Debt Grading - A+ through F scoring with 6 orthogonal metrics
  • Mutation Testing - Test suite quality validation (85%+ kill rate)
  • Repository Scoring - Quantitative health assessment (0-211 scale)
  • Semantic Search - Natural language code discovery
  • MCP Integration - 19 tools for Claude Code, Cline, and AI agents
  • Quality Gates - Pre-commit hooks, CI/CD integration
  • 17+ Languages - Rust, TypeScript, Python, Go, Java, C/C++, and more

Part of the PAIML Stack, following Toyota Way quality principles (Jidoka, Genchi Genbutsu, Kaizen).

Getting Started

Add to your system:

    # Install from crates.io
    cargo install pmat

    # Or from source (latest)
    git clone https://github.com/paiml/paiml-mcp-agent-toolkit
    cd paiml-mcp-agent-toolkit && cargo install --path server

Basic Usage

    # Generate AI-ready context
    pmat context --output context.md --format llm-optimized

    # Analyze code complexity
    pmat analyze complexity

    # Grade technical debt (A+ through F)
    pmat analyze tdg

    # Score repository health
    pmat repo-score .

    # Run mutation testing
    pmat mutate --target src/

MCP Server Mode

    # Start MCP server for Claude Code, Cline, etc.
    pmat mcp

Features

Context Generation

Generate comprehensive context for AI assistants:

    pmat context                          # Basic analysis
    pmat context --format llm-optimized   # AI-optimized output
    pmat context --include-tests          # Include test files

Technical Debt Grading (TDG)

Six orthogonal metrics for accurate quality assessment:

    pmat analyze tdg                        # Project-wide grade
    pmat analyze tdg --include-components   # Per-component breakdown
    pmat tdg baseline create                # Create quality baseline
    pmat tdg check-regression               # Detect quality degradation

Grading Scale:

  • A+/A: Excellent quality, minimal debt
  • B+/B: Good quality, manageable debt
  • C+/C: Needs improvement
  • D/F: Significant technical debt
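
The scale above can be read as an ordered set of grades, which is what a minimum-grade gate needs. The following is a minimal Rust sketch under that reading, not PMAT's implementation: the `Grade` enum, `band`, and `meets_minimum` are hypothetical names, and the variants are limited to the grades the scale lists.

```rust
/// Letter grades in the order the grading scale lists them, best first.
/// Deriving `Ord` makes earlier variants compare as smaller, so in this
/// sketch "at least as good as" means "less than or equal to".
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Grade {
    APlus,
    A,
    BPlus,
    B,
    CPlus,
    C,
    D,
    F,
}

/// Band description copied from the grading scale.
fn band(grade: Grade) -> &'static str {
    match grade {
        Grade::APlus | Grade::A => "Excellent quality, minimal debt",
        Grade::BPlus | Grade::B => "Good quality, manageable debt",
        Grade::CPlus | Grade::C => "Needs improvement",
        Grade::D | Grade::F => "Significant technical debt",
    }
}

/// Hypothetical gate: passes when `actual` is at least as good as `minimum`.
fn meets_minimum(actual: Grade, minimum: Grade) -> bool {
    actual <= minimum
}

fn main() {
    let actual = Grade::BPlus;
    println!("{:?}: {}", actual, band(actual));
    println!("meets minimum grade B: {}", meets_minimum(actual, Grade::B));
}
```

Modelling grades as an ordered enum keeps the comparison a single operator and makes it a compile error to forget a grade in `band`.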

Mutation Testing

Validate test suite effectiveness:

    pmat mutate --target src/lib.rs             # Single file
    pmat mutate --target src/ --threshold 85    # Quality gate
    pmat mutate --failures-only                 # CI optimization

Supported Languages: Rust, Python, TypeScript, JavaScript, Go, C++
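
A mutation threshold is conventionally a lower bound on the kill rate, that is, the share of generated mutants that the test suite detects. The sketch below illustrates that arithmetic only; `MutationRun` and its fields are hypothetical, and treating a run with zero mutants as a failure is an assumption rather than documented PMAT behavior.

```rust
/// Outcome counts from a mutation run (hypothetical struct for illustration).
struct MutationRun {
    killed: u32,
    survived: u32,
}

impl MutationRun {
    /// Percentage of mutants detected by the tests.
    /// Returns `None` when no mutants were generated.
    fn kill_rate(&self) -> Option<f64> {
        let total = self.killed + self.survived;
        if total == 0 {
            None
        } else {
            Some(100.0 * f64::from(self.killed) / f64::from(total))
        }
    }

    /// Assumption: a run without mutants does not pass the gate.
    fn passes(&self, threshold: f64) -> bool {
        self.kill_rate().map_or(false, |rate| rate >= threshold)
    }
}

fn main() {
    let run = MutationRun { killed: 172, survived: 28 };
    println!("kill rate: {:?}", run.kill_rate());
    println!("passes 85% gate: {}", run.passes(85.0));
}
```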

Repository Health Scoring

Evidence-based quality metrics (0-211 scale):

    pmat rust-project-score          # Fast mode (~3 min)
    pmat rust-project-score --full   # Comprehensive (~10-15 min)
    pmat repo-score . --deep         # Full git history

Workflow Prompts

Pre-configured AI prompts enforcing EXTREME TDD:

    pmat prompt --list                # Available prompts
    pmat prompt code-coverage         # 85%+ coverage enforcement
    pmat prompt debug                 # Five Whys analysis
    pmat prompt quality-enforcement   # All quality gates

Git Hooks

Automatic quality enforcement:

    pmat hooks install                     # Install pre-commit hooks
    pmat hooks install --tdg-enforcement   # With TDG quality gates
    pmat hooks status                      # Check hook status

Examples

Generate Context for AI

    # For Claude Code
    pmat context --output context.md --format llm-optimized

    # With semantic search
    pmat embed sync ./src
    pmat semantic search "error handling patterns"
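
Embedding-based search of this kind is commonly implemented by ranking stored vectors by cosine similarity to the query vector. The README does not say how PMAT ranks results, so the sketch below is a generic illustration: `cosine_similarity`, `rank`, and the three-dimensional toy vectors are all hypothetical.

```rust
/// Cosine similarity of two equal-length vectors.
/// Returns `None` for empty, mismatched, or zero-length (all-zero) inputs.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Scores every document against the query and sorts best match first.
/// Documents that cannot be scored are skipped.
fn rank<'a>(query: &[f32], docs: &[(&'a str, Vec<f32>)]) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&'a str, f32)> = docs
        .iter()
        .filter_map(|(name, embedding)| {
            cosine_similarity(query, embedding).map(|score| (*name, score))
        })
        .collect();
    scored.sort_by(|left, right| right.1.total_cmp(&left.1));
    scored
}

fn main() {
    // Toy embeddings; real sentence embeddings have hundreds of dimensions.
    let docs = vec![
        ("retry.rs", vec![1.0_f32, 0.0, 0.0]),
        ("render.rs", vec![0.0, 1.0, 0.0]),
        ("errors.rs", vec![1.0, 1.0, 0.0]),
    ];
    for (name, score) in rank(&[1.0, 0.0, 0.0], &docs) {
        println!("{name}: {score:.3}");
    }
}
```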

CI/CD Integration

    # .github/workflows/quality.yml
    name: Quality Gates
    on: [push, pull_request]
    jobs:
      quality:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: cargo install pmat
          - run: pmat analyze tdg --fail-on-violation --min-grade B
          - run: pmat mutate --target src/ --threshold 80

Quality Baseline Workflow

    # 1. Create baseline
    pmat tdg baseline create --output .pmat/baseline.json

    # 2. Check for regressions
    pmat tdg check-regression \
      --baseline .pmat/baseline.json \
      --max-score-drop 5.0 \
      --fail-on-regression
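
One plausible reading of `--max-score-drop` is that a run fails when the current score falls below the baseline by more than the allowed amount. The sketch below encodes that reading only; `check_regression` and `Verdict` are hypothetical names, and tolerating a drop exactly equal to the limit is an assumption.

```rust
/// Result of comparing a current score against a stored baseline.
#[derive(Debug, PartialEq)]
enum Verdict {
    Ok,
    Regression { drop: f64 },
}

/// Flags a regression when the score fell by more than `max_score_drop`.
/// Assumption: a drop exactly equal to the limit is still tolerated.
fn check_regression(baseline: f64, current: f64, max_score_drop: f64) -> Verdict {
    let drop = baseline - current;
    if drop > max_score_drop {
        Verdict::Regression { drop }
    } else {
        Verdict::Ok
    }
}

fn main() {
    // Illustrative scores, not measurements from any real project.
    match check_regression(88.0, 81.5, 5.0) {
        Verdict::Ok => println!("no regression"),
        Verdict::Regression { drop } => println!("score dropped by {drop}"),
    }
}
```

Improvements produce a negative drop and therefore always pass, so the check only guards against degradation.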

Architecture

pmat/
├── server/           CLI and MCP server
│   ├── src/
│   │   ├── cli/      Command handlers
│   │   ├── services/ Analysis engines
│   │   ├── mcp/      MCP protocol
│   │   └── tdg/      Technical Debt Grading
├── crates/
│   └── pmat-dashboard/  Pure WASM dashboard
└── docs/
    └── specifications/  Technical specs

Quality

| Metric | Value |
|--------|-------|
| Tests | 4600+ passing |
| Coverage | >85% |
| Mutation Score | >80% |
| Languages | 17+ supported |
| MCP Tools | 19 available |

Falsifiable Quality Commitments

Per Popper's demarcation criterion, all claims are measurable and testable:

| Commitment | Threshold | Verification Method |
|------------|-----------|---------------------|
| Context Generation | < 5 seconds for 10K LOC project | time pmat context on test corpus |
| Memory Usage | < 500 MB for 100K LOC analysis | Measured via heaptrack in CI |
| Test Coverage | ≥ 85% line coverage | cargo llvm-cov (CI enforced) |
| Mutation Score | ≥ 80% killed mutants | pmat mutate --threshold 80 |
| Build Time | < 3 minutes incremental | cargo build --timings |
| CI Pipeline | < 15 minutes total | GitHub Actions workflow timing |
| Binary Size | < 50 MB release binary | ls -lh target/release/pmat |
| Language Parsers | All 17 languages parse without panic | Fuzz testing in CI |

How to Verify:

    # Run self-assessment with Popper Falsifiability Score
    pmat popper-score --verbose

    # Individual commitment verification
    cargo llvm-cov --html        # Coverage ≥85%
    pmat mutate --threshold 80   # Mutation ≥80%
    cargo build --timings        # Build time <3min

Failure = Regression: Any commitment violation blocks CI merge.
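
Each commitment pairs a measured value with either an upper or a lower bound, so a merge gate reduces to listing the bounds that do not hold. The sketch below shows that shape; `Bound`, `Commitment`, and `violations` are hypothetical names, the thresholds come from the commitments table, and the measured values in `main` are invented for illustration.

```rust
/// Direction in which a measured value must relate to its threshold.
enum Bound {
    /// Measured value must be strictly less than the limit.
    Below(f64),
    /// Measured value must be greater than or equal to the limit.
    AtLeast(f64),
}

struct Commitment {
    name: &'static str,
    bound: Bound,
}

impl Commitment {
    fn holds(&self, measured: f64) -> bool {
        match self.bound {
            Bound::Below(limit) => measured < limit,
            Bound::AtLeast(limit) => measured >= limit,
        }
    }
}

/// Returns the names of commitments violated by the paired measurements.
fn violations(checks: &[(Commitment, f64)]) -> Vec<&'static str> {
    let mut failed = Vec::new();
    for (commitment, measured) in checks {
        if !commitment.holds(*measured) {
            failed.push(commitment.name);
        }
    }
    failed
}

fn main() {
    // Thresholds from the table; measured values are made up.
    let checks = [
        (Commitment { name: "Test Coverage", bound: Bound::AtLeast(85.0) }, 87.2),
        (Commitment { name: "Mutation Score", bound: Bound::AtLeast(80.0) }, 78.5),
        (Commitment { name: "Binary Size (MB)", bound: Bound::Below(50.0) }, 41.0),
    ];
    let failed = violations(&checks);
    if failed.is_empty() {
        println!("all commitments hold");
    } else {
        println!("blocking merge, violated: {failed:?}");
    }
}
```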

Benchmark Results (Statistical Rigor)

All benchmarks use Criterion.rs with proper statistical methodology:

| Operation | Mean | 95% CI | Std Dev | Sample Size |
|-----------|------|--------|---------|-------------|
| Context (1K LOC) | 127ms | [124, 130] | ±12.3ms | n=1000 runs |
| Context (10K LOC) | 1.84s | [1.79, 1.90] | ±156ms | n=500 runs |
| TDG Scoring | 156ms | [148, 164] | ±18.2ms | n=500 runs |
| Complexity Analysis | 23ms | [22, 24] | ±3.1ms | n=1000 runs |

Comparison Baselines (vs. Alternatives):

| Metric | PMAT | ctags | tree-sitter | Effect Size |
|--------|------|-------|-------------|-------------|
| 10K LOC parsing | 1.84s | 0.3s | 0.8s | d=0.72 (medium) |
| Memory (10K LOC) | 287MB | 45MB | 120MB | - |
| Semantic depth | Full | Syntax only | AST only | - |

See docs/BENCHMARKS.md for complete statistical analysis.
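
For readers unfamiliar with the columns in the benchmark table, the sketch below computes a mean, a sample standard deviation, and a 95% confidence interval from raw timings. It uses the textbook normal approximation and is not claimed to be the method behind the published figures; `Summary` and `summarize` are hypothetical names, and the samples are invented.

```rust
/// Summary statistics for a set of timing samples.
struct Summary {
    mean: f64,
    std_dev: f64,
    ci95: (f64, f64),
}

/// Computes mean, sample standard deviation, and a normal-approximation
/// 95% confidence interval for the mean.
fn summarize(samples: &[f64]) -> Option<Summary> {
    let n = samples.len();
    if n < 2 {
        return None; // the sample standard deviation needs two or more points
    }
    let n_f = n as f64;
    let mean = samples.iter().sum::<f64>() / n_f;
    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n_f - 1.0);
    let std_dev = variance.sqrt();
    // 1.96 is the two-sided 95% quantile of the standard normal distribution.
    let half_width = 1.96 * std_dev / n_f.sqrt();
    Some(Summary {
        mean,
        std_dev,
        ci95: (mean - half_width, mean + half_width),
    })
}

fn main() {
    // Invented timings in milliseconds.
    let samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0];
    if let Some(s) = summarize(&samples) {
        println!(
            "mean={:.2} sd={:.2} ci95=[{:.2}, {:.2}]",
            s.mean, s.std_dev, s.ci95.0, s.ci95.1
        );
    }
}
```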

ML/AI Reproducibility

PMAT uses ML for semantic search and embeddings. All ML operations are reproducible:

Random Seed Management:

  • Embedding generation uses fixed seed (SEED=42) for deterministic outputs
  • Clustering operations use fixed seed (SEED=12345)
  • Seeds documented in docs/ml/REPRODUCIBILITY.md
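
Fixing a seed makes a pseudo-random sequence repeatable, which is the property the list above relies on. The sketch below demonstrates it with SplitMix64, a small public-domain generator chosen here only because it fits in a few lines; the README does not say which generator PMAT uses, and `draw` is a hypothetical helper.

```rust
/// SplitMix64 pseudo-random generator (illustrative stand-in, not PMAT's RNG).
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Draws `count` values from a generator started at `seed`.
fn draw(seed: u64, count: usize) -> Vec<u64> {
    let mut rng = SplitMix64::new(seed);
    (0..count).map(|_| rng.next_u64()).collect()
}

fn main() {
    // The same seed reproduces the same sequence on every run.
    println!("repeatable: {}", draw(42, 5) == draw(42, 5));
    // A different seed gives a different sequence.
    println!("distinct seeds differ: {}", draw(42, 5) != draw(12345, 5));
}
```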

Model Artifacts:

  • Pre-trained models from HuggingFace (all-MiniLM-L6-v2)
  • Model versions pinned in Cargo.toml
  • Hash verification on download

Dataset Sources

PMAT does not train models but uses these data sources for evaluation:

| Dataset | Source | Purpose | Size |
|---------|--------|---------|------|
| CodeSearchNet | GitHub/Microsoft | Semantic search benchmarks | 2M functions |
| PMAT-bench | Internal | Regression testing | 500 queries |

Data provenance and licensing documented in docs/ml/REPRODUCIBILITY.md.

Sovereign Stack

PMAT is built on the PAIML Sovereign Stack - pure-Rust, SIMD-accelerated libraries:

| Library | Purpose | Version |
|---------|---------|---------|
| aprender | ML library (text similarity, clustering, topic modeling) | 0.24.0 |
| trueno | SIMD compute library for matrix operations | 0.11.0 |
| trueno-graph | GPU-first graph database (PageRank, Louvain, CSR) | 0.1.7 |
| trueno-rag | RAG pipeline with VectorStore | 0.1.8 |
| trueno-db | Embedded analytics database | 0.3.10 |
| trueno-viz | Terminal graph visualization | 0.1.17 |
| trueno-zram-core | SIMD LZ4/ZSTD compression (optional) | 0.3.0 |
| pmat | Code analysis toolkit | 2.213.4 |

Key Benefits:

  • Pure Rust (no C dependencies, no FFI)
  • SIMD-first (AVX2, AVX-512, NEON auto-detection)
  • 2-4x speedup on graph algorithms via ap
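
Five-dimension analysis

  • Clarity: 9/10
  • Innovation: 8/10
  • Practicality: 9/10
  • Completeness: 8/10
  • Maintainability: 9/10

Pros and cons

Pros

  • Supports many programming languages
  • Comprehensive analysis features
  • Integrates well with CI/CD workflows
  • High test coverage and reliability

Cons

  • Learning curve may be steep for beginners
  • Installation requires the Rust toolchain
  • Limited documentation for advanced features
  • Performance may vary on large codebases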


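Disclaimer: This content comes from an open-source GitHub project and is provided for display and scoring analysis only.

Copyright belongs to the original author, paiml.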