by HeadyZhang
Static security scanner for LLM agents — prompt injection, MCP config auditing, taint analysis. 49 rules mapped to OWASP Agentic Top 10 (2026). Works with LangChain, CrewAI, AutoGen.
```shell
# Add to your Claude Code skills
git clone https://github.com/HeadyZhang/agent-audit
```

Find security vulnerabilities in your AI agent code before they reach production.
AI agents are not just chatbots. They execute code, call tools, and touch real systems, so one unsafe input path can become a production incident.
Untrusted input that reaches subprocess or eval becomes command execution. If your team ships agent features, owns CI security gates, or operates MCP servers and tool integrations, this is a high-probability risk surface rather than an edge case. If agent code can trigger tools, commands, or external systems, you likely need this check before every merge.
Agent Audit catches these issues before deployment with an analysis core designed for agent workflows today: tool-boundary taint tracking, MCP configuration auditing, and semantic secret detection, with room to extend into learning-assisted detection over time.
Think of it as security linting for AI agents, with 53 rules mapped to the OWASP Agentic Top 10 (2026).
```shell
pip install agent-audit
agent-audit scan ./your-agent-project

# Show only high+ findings
agent-audit scan . --severity high

# Fail CI when high+ findings exist
agent-audit scan . --fail-on high
```
`--severity` controls what is reported; `--fail-on` controls when the command exits with code 1.
Sample report output:
```
╭──────────────────────────────────────────────────────────────────────────────╮
│ Agent Audit Security Report                                                  │
│ Scanned: ./your-agent-project                                                │
│ Files analyzed: 2                                                            │
│ Risk Score: 8.4/10 (HIGH)                                                    │
╰──────────────────────────────────────────────────────────────────────────────╯

BLOCK -- Tier 1 (Confidence >= 90%) -- 16 findings

AGENT-001: Command Injection via Unsanitized Input
  Location: agent.py:21
  Code: result = subprocess.run(command, shell=True, capture_output=True, text=True)

AGENT-010: System Prompt Injection Vector in User Input Path
  Location: agent.py:13
  Code: system_prompt = f"You are a helpful {user_role} assistant..."

AGENT-041: SQL Injection via String Interpolation
  Location: agent.py:31
  Code: cursor.execute(f"SELECT * FROM users WHERE name = '{query}'")

AGENT-031: MCP Sensitive Env Exposure
  Location: mcp_config.json:1
  Code: env: {"API_KEY": "sk-a***"}

... and 15 more

Summary:
  BLOCK: 16 | WARN: 2 | INFO: 1
  Risk Score: =========================----- 8.4/10 (HIGH)
```
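The AGENT-001 finding above flags `shell=True` with interpolated input. A minimal sketch of the usual remediation, argument-list form with no shell (the function name and `echo` command are illustrative, not from the tool):

```python
import subprocess

def safe_run(user_input: str) -> str:
    # Argument-list form: user_input is passed as a single literal argv entry,
    # so shell metacharacters like "; rm -rf /" are never interpreted.
    result = subprocess.run(
        ["echo", user_input], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Compare with the flagged pattern: `subprocess.run(command, shell=True)` hands the whole string to `/bin/sh`, so any attacker-controlled fragment becomes executable shell.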
Validation snapshot (as of 2026-02-19, v0.16 benchmark set): 94.6% recall, 87.5% precision, 0.91 F1, with 10/10 OWASP Agentic Top 10 coverage across 9 open-source targets.
Details: Benchmark Results | Competitive Comparison
| Category | What goes wrong | Example rule |
|----------|----------------|--------------|
| Injection attacks | User input flows to exec(), subprocess, SQL | AGENT-001, AGENT-041 |
| Prompt injection | User input concatenated into system prompts | AGENT-010 |
| Leaked secrets | API keys hardcoded in source or MCP config | AGENT-004, AGENT-031 |
| Missing input validation | @tool functions accept raw strings without checks | AGENT-034 |
| Unsafe MCP servers | No auth, no version pinning, overly broad permissions | AGENT-005, AGENT-029, AGENT-030, AGENT-033 |
| MCP tool poisoning | Hidden instructions or data exfiltration in tool descriptions | AGENT-056, AGENT-057 |
| MCP tool shadowing | Multiple servers register identical tool names to override behavior | AGENT-055 |
| MCP rug pull / drift | Server tools change after initial security audit | AGENT-054 |
| No guardrails | Agent runs without iteration limits or human approval | AGENT-028, AGENT-037 |
| Unrestricted code execution | Tools run eval() or shell=True without sandboxing | AGENT-035 |
| Source map leakage | Debug artifacts (.map, .pdb) included in published agent packages | AGENT-110 |
| Sub-agent privilege escalation | Child agents inherit parent's full tool set without restriction | AGENT-112 |
| Delegation without auth | Cross-agent delegation without identity verification | AGENT-113 |
| Auto-approve all tools | Agent auto-approves tool execution without safety classification | AGENT-117 |
| HITL bypass | Human-in-the-loop approval bypassed via delegation or self-modification | AGENT-118 |
| Trace suppression | AI attribution removed from git commits, logs, or outputs | AGENT-119 |
| Config hooks poisoning | Malicious hooks in .claude/settings.json, .cursor/, .mcp.json (CVE-2025-59536) | AGENT-120 |
Full coverage of all 10 OWASP Agentic Security categories. Framework-specific detection for LangChain, CrewAI, AutoGen, and AgentScope. See all rules ->
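For the SQL interpolation pattern in the injection category (AGENT-041), the standard fix is a parameterized query. A minimal `sqlite3` sketch; the table and function names are illustrative:

```python
import sqlite3

def find_user(conn: sqlite3.Connection, name: str) -> list:
    # Placeholder binding: the driver treats `name` as a value, never as SQL,
    # so a payload like "' OR '1'='1" matches nothing instead of dumping rows.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()
```

The same principle applies to any driver: keep the query text static and pass user input only through the placeholder mechanism.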
Agent Audit is available as an OpenClaw skill on ClawHub:
npx clawhub@latest install agent-audit-scanner
Once installed, the scanner can be invoked directly from your OpenClaw agent.
The scanner covers all 10 OWASP Agentic AI threat categories and has been validated against 18,899 ClawHub skills at 80% precision.
MCP config auditing covers `mcp.json` / `claude_desktop_config.json`, checking for secrets, auth gaps, and supply chain risks.

```shell
# Scan a project
agent-audit scan ./my-agent

# JSON output for scripting
agent-audit scan ./my-agent --format json

# SARIF output for GitHub Code Scanning
agent-audit scan . --format sarif --output results.sarif

# Only fail CI on critical findings
agent-audit scan . --fail-on critical

# Inspect a live MCP server (read-only, never calls tools)
agent-audit inspect stdio -- npx -y @modelcontextprotocol/server-filesystem /tmp
```
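The `--format json` output can feed custom CI gates. A sketch of tallying findings by severity; the field names (`findings`, `severity`) are assumptions about the JSON schema, so check them against your actual output:

```python
import json

def count_by_severity(report_json: str) -> dict:
    # Walk the (assumed) findings array and tally each severity level.
    counts = {}
    for finding in json.loads(report_json).get("findings", []):
        sev = finding.get("severity", "unknown")
        counts[sev] = counts.get(sev, 0) + 1
    return counts
```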
Track only new findings across commits:
```shell
# Save current state as baseline
agent-audit scan . --save-baseline baseline.json

# Only report new findings not in baseline
agent-audit scan . --baseline baseline.json --fail-on-new
```
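Under the hood, a baseline diff of this kind typically fingerprints each finding and reports only fingerprints absent from the saved set. A sketch with hypothetical field names (`rule`, `file`, `code`), not the tool's actual internals:

```python
import hashlib

def fingerprint(finding: dict) -> str:
    # Line numbers shift between commits, so the key is rule + file + snippet.
    key = f"{finding['rule']}|{finding['file']}|{finding['code']}"
    return hashlib.sha256(key.encode()).hexdigest()

def new_findings(current: list, baseline: list) -> list:
    # Keep only findings whose fingerprint is not in the baseline set.
    known = {fingerprint(f) for f in baseline}
    return [f for f in current if fingerprint(f) not in known]
```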
```yaml
name: Agent Security Scan
on: [push, pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: HeadyZhang/agent-audit@v1
        with:
          path: '.'
          fail-on: 'high'
          upload-sarif: 'true'
```
| Input | Description | Default |
|-------|-------------|---------|
| path | Path to scan | . |
| format | Output format: terminal, json, sarif, markdown | sarif |
| severity | Minimum severity to report | low |
| fail-on | Exit with error at this severity | high |
| baseline | Baseline file for incremental scanning | - |
| upload-sarif | Upload SARIF to GitHub Security tab | true |
Evaluated on Agent-Vuln-Bench (19 samples across 3 vulnerability categories), compared against Bandit and Semgrep:
| Tool | Recall | Precision | F1 |
|------|-------:|----------:|---:|
| agent-audit | 94.6% | 87.5% | 0.91 |
| Bandit 1.8 | 29.7% | 100% | 0.46 |
| Semgrep 1.x | 27.0% | 100% | 0.43 |
Per-category recall:

| Category | agent-audit | Bandit | Semgrep |
|----------|:-----------:|:------:|:-------:|
| Set A -- Injection / RCE | 100% | 68.8% | 56.2% |
| Set B -- MCP Configuration | 100% | 0% | 0% |
| Set C -- Data / Auth | 84.6% | 0% | 7.7% |
Neither Bandit nor Semgrep can parse MCP configuration files -- they achieve 0% recall on agent-specific configuration vulnerabilities (Set B).
Full evaluation details: Benchmark Results | Competitive Comparison
Source Files (.py, .json, .yaml, .env, ...)
|
+-- PythonScanner ---- AST Analysis ---- Dangerous Patterns
| | Tool Metadata
| +-- TaintTracker --------------- Source->Sink Reachability