mantishack (Mantis Hack) is an open-source AI Agents skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by deonmenezes. It has 383 GitHub stars.

Is mantishack safe to use?

Yes. mantishack passed SkillsLLM's automated security scan (a dependency vulnerability audit plus prompt-injection heuristics) with no high-severity issues. You can read the full report in the Security Report section on this page.

How do I install mantishack?

Clone the repository with "git clone https://github.com/deonmenezes/mantishack" and add it to your Claude Code skills directory. The Quick start section covers building the CLI from source.

What programming language is mantishack written in?

mantishack is primarily written in Rust. It is open-source under deonmenezes on GitHub, so you can review or fork the full source.

Are there alternatives to mantishack?

Yes. SkillsLLM lists many other AI Agents skills you can browse and compare side by side. Open the AI Agents category from the badge at the top of this page, or use the Related Skills and comparison links further down to weigh mantishack against similar tools.

Mantishack

Name: mantishack
Author: deonmenezes

stalk - wait - strike - hold

Ethically hack and discover vulnerabilities in any software with the power of AI.

mantishack.com

What is Mantishack?

Mantishack is Mantis AI: an autonomous vulnerability-discovery agent built on top of OpenAI's Codex CLI (Rust, Apache-2.0), rebranded and wired end-to-end as an offensive-AppSec harness. It runs a staged detect-then-validate pipeline over a codebase - recon, detect, reachability, attacker-simulation validation, chaining, gated exploitation, fixing, and reporting - with every finding owned by a tool, not tracked in prose.

The core bet is the same one this project has always made: detection is commodity, validation precision is the product. What survives attacker-simulation is real; everything else gets rejected with a cited roadblock.

It is not polished software. It is a working harness with real gaps (see "Project history" and MANTIS.md for what's wired vs. what still needs external binaries or a running target), usable in the field, rough in the corners.

Quick start

Build from source

# Clone the repo
git clone https://github.com/deonmenezes/mantishack.git
cd mantishack

# Build the CLI (Rust toolchain required, see codex-rs/rust-toolchain.toml)
cd codex-rs
cargo build --release -p codex-cli

# Run it from the repo root so the project-scoped .codex/config.toml resolves
cd ..
./codex-rs/target/release/codex

Install the capability-catalog binaries

The MCP tool servers under .codex/mcp-servers/ are pure Node.js and need no install of their own, but the scanners they wrap degrade to available: false until you install the underlying binaries:

pip install semgrep bandit
brew install trufflehog trivy z3 ast-grep    # or your platform's equivalent
go install github.com/google/osv-scanner/v2/cmd/osv-scanner@latest
# CodeQL CLI: https://github.com/github/codeql-cli-binaries

Nothing is fabricated when a binary is missing - each server reports plainly that the tool isn't installed rather than inventing findings.
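
The report-until-installed behaviour is easy to picture in code. The sketch below is a Python illustration only, not the project's Node.js server code: the function name scanner_status and every field other than available are assumptions made for the example.

```python
import shutil


def scanner_status(binary: str) -> dict:
    """Report whether a scanner binary is on PATH, without inventing findings.

    Hypothetical helper: only the `available` flag comes from the README.
    """
    path = shutil.which(binary)
    if path is None:
        # Missing tool: say so plainly and return an empty findings list.
        return {
            "tool": binary,
            "available": False,
            "findings": [],
            "note": f"{binary} is not installed",
        }
    return {"tool": binary, "available": True, "path": path}


if __name__ == "__main__":
    # Binary names taken from the install commands above.
    for name in ["semgrep", "bandit", "trufflehog", "trivy", "z3", "ast-grep", "osv-scanner"]:
        print(scanner_status(name))
```

Running it before and after the install commands above gives a quick picture of which servers will report available: false.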

What's wired

| Layer | Where | What it does |
| --- | --- | --- |
| Capability servers (MCP) | `.codex/mcp-servers/` | `semgrep_scan`, `codeql_create_database`/`codeql_analyze`, `osv_scan`, `trufflehog_scan`, `bandit_scan`, `trivy_scan` - SAST/SCA/secrets, all report-until-installed. |
| Program-analysis substrate | `.codex/mcp-servers/program-analysis/` | `source_sink_scan` (heuristic, works now), `ast_grep_scan` (structural search), `smt_check_reachability` (z3-backed path-condition satisfiability). |
| Findings spine | `.codex/mcp-servers/findings/` | Tool-owned finding state: `finding_create/update/get/list`. Enforces "no proof -> no confirm" and "reject cites a roadblock" in code, not prose. Works now, zero deps. |
| Evidence | `.codex/mcp-servers/http-audit/` | Turns a captured HTTP exchange into a bounded, redacted evidence pack with a stable request-ref hash. Works now, zero deps. |
| Injection defense | `.codex/mcp-servers/canary/` | Decoy tools that alert if called - a tripwire for prompt injection or hallucinated tool use. Works now, zero deps. |
| Agent catalog | `.codex/agents/*.toml` | The full pipeline as `spawn_agent` roles: `recon`, `context-enrich`, `detector`, `reachability`, `validator`, `verifier-balanced`/`brutalist`/`final`, `chain-builder`, `exploiter` (gated), `fixer`, `reporter`, `orchestrator`. |
| Knowledge | `.codex/skills/*/SKILL.md` | Playbooks tying each tool/agent into the findings lifecycle: `mantis-pipeline` (master playbook), `semgrep-triage`, `codeql-audit`, `osv-dependency-scan`, `secrets-scan`, `detection-breadth`, `program-analysis`, `findings-spine`, `http-evidence`, `canary-tripwire-response`. |
| Identity | `codex-rs/core/*_prompt.md`, `codex-rs/protocol/`, `codex-rs/models-manager/` | The agent's system prompts, rebranded to Mantis AI's authorized-vulnerability-discovery mission and safety contract. |
| TUI | `codex-rs/tui/` | Mantis mascot ASCII boot animation (with a blink) and a green accent theme. |

See MANTIS.md for the full map, the "how to extend" recipes, and the prioritized roadmap of what's next (recon/DAST toolchain, injection confirmers, OOB, fuzzing, CVE intel - each a report-until-installed MCP server on the existing pattern).
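
To make the Evidence layer concrete, here is a rough Python sketch of what "a bounded, redacted evidence pack with a stable request-ref hash" could look like. It is not the http-audit server's code: the field names, the header list, the size bound, and the choice to hash only method, URL, and body are all assumptions for illustration.

```python
import hashlib
import json

# Assumed values for the sketch; the real server may use different ones.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key"}
MAX_BODY_CHARS = 2048


def request_ref(method: str, url: str, body: str) -> str:
    """Stable short hash: identical requests always map to the same ref."""
    canonical = json.dumps(
        {"method": method.upper(), "url": url, "body": body}, sort_keys=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def redact_headers(headers: dict) -> dict:
    """Replace the values of credential-bearing headers."""
    return {
        name: ("[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value)
        for name, value in headers.items()
    }


def evidence_pack(method, url, headers, body, status, response_body) -> dict:
    """Bundle one HTTP exchange with secrets removed and bodies bounded."""
    return {
        "request_ref": request_ref(method, url, body),
        "request": {
            "method": method.upper(),
            "url": url,
            "headers": redact_headers(headers),
            "body": body[:MAX_BODY_CHARS],
        },
        "response": {
            "status": status,
            "body": response_body[:MAX_BODY_CHARS],
            "truncated": len(response_body) > MAX_BODY_CHARS,
        },
    }
```

Leaving headers out of the hash means the ref stays the same when a session token rotates, which is one plausible reading of "stable"; the real server may define it differently.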

How the pipeline works

Findings move through one lifecycle, owned by the findings service, never hand-edited in prose:

candidate -> (confirmed | rejected); confirmed -> exploited -> fixed -> verified

Detect is generous. SAST/SCA/secrets scanners plus LLM reasoning surface candidates at high recall. Do not self-censor false positives here - that's Validate's job.
Validate is ruthless. Attacker-simulation reasons as an attacker with attacker-only capabilities: is the pattern real or noise, what does an attacker need to reach it, does the path actually exist, can it be reached from outside. A candidate only becomes confirmed with reachability evidence attached - the findings service refuses the transition otherwise. A rejected candidate must cite the specific roadblock (auth gate, sanitizer at the sink, provably-unreachable path, self-only harm) - "seems safe" isn't accepted.
Chain, exploit, fix, verify follow only for confirmed findings, with exploitation double-gated and off by default.
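
The two rules the findings service enforces ("no proof -> no confirm" and "reject cites a roadblock") amount to a small state machine. The Python sketch below illustrates the idea only; it is not the findings server's implementation. The Finding class, the transition function, and the transition table are hypothetical, and letting a confirmed finding go straight to fixed is an assumption based on exploitation being off by default.

```python
from dataclasses import dataclass, field

# Allowed moves, read off the lifecycle described above.
# Assumption: confirmed -> fixed is allowed because exploitation is optional.
TRANSITIONS = {
    "candidate": {"confirmed", "rejected"},
    "confirmed": {"exploited", "fixed"},
    "exploited": {"fixed"},
    "fixed": {"verified"},
    "rejected": set(),
    "verified": set(),
}


@dataclass
class Finding:
    title: str
    state: str = "candidate"
    evidence: list = field(default_factory=list)
    roadblock: str = ""


def transition(finding: Finding, new_state: str, *, evidence: str = "", roadblock: str = "") -> Finding:
    """Move a finding to new_state, refusing moves that break the rules."""
    if new_state not in TRANSITIONS[finding.state]:
        raise ValueError(f"illegal transition {finding.state} -> {new_state}")
    if new_state == "confirmed":
        # No proof -> no confirm.
        if not evidence.strip():
            raise ValueError("confirm requires reachability evidence")
        finding.evidence.append(evidence)
    if new_state == "rejected":
        # "Seems safe" is not accepted: a concrete roadblock is mandatory.
        if not roadblock.strip():
            raise ValueError("reject must cite a specific roadblock")
        finding.roadblock = roadblock
    finding.state = new_state
    return finding
```

Because the checks raise before any state is changed, a refused transition leaves the finding exactly where it was.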

Establish scope and authorization first. Work only on targets you own or are explicitly authorized to test. If authorization for active/exploit testing is unclear, restrict the run to read-only static analysis until scope is established.

Z3 SMT integration

The program-analysis MCP server wraps z3 for reachability (smt_check_reachability): construct the path condition for a candidate source-to-sink flow as an SMT-LIB2 script, hand it to the tool, and get back sat (an attacker-controlled assignment reaches the sink - proceed to Validate), unsat (provably unreachable - reject, citing the unsat result as the roadblock), or unknown (proves nothing either way). It degrades to available: false if z3 isn't installed, same as every other scanner server.
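
As a rough illustration of that flow, the Python sketch below builds a toy SMT-LIB2 path condition, feeds it to the z3 command-line binary over stdin, and maps the verdict to the next pipeline step. It does not call smt_check_reachability and is not that tool's implementation; the variable names in the script, the helper functions, and the returned fields are assumptions.

```python
import shutil
import subprocess

# Toy path condition (made up for the example): the sink is reached only when
# the attacker-controlled length exceeds 64 and a feature flag is enabled.
PATH_CONDITION = """\
(declare-const user_len Int)
(declare-const flag_enabled Bool)
(assert (>= user_len 0))
(assert (> user_len 64))
(assert flag_enabled)
(check-sat)
"""

ACTIONS = {
    "sat": "proceed to Validate",
    "unsat": "reject, citing the unsat result as the roadblock",
    "unknown": "inconclusive, proves nothing either way",
}


def next_step(verdict: str) -> str:
    """Map a solver verdict to the pipeline action described above."""
    return ACTIONS.get(verdict.strip(), "unrecognized solver output")


def check_reachability(script: str) -> dict:
    """Run z3 on an SMT-LIB2 script, or report that z3 is unavailable."""
    if shutil.which("z3") is None:
        return {"available": False}
    proc = subprocess.run(
        ["z3", "-in", "-smt2"],  # read an SMT-LIB2 script from stdin
        input=script,
        capture_output=True,
        text=True,
        timeout=30,
    )
    tokens = proc.stdout.split()
    verdict = tokens[0] if tokens else "unknown"
    return {"available": True, "verdict": verdict, "next_step": next_step(verdict)}


if __name__ == "__main__":
    print(check_reachability(PATH_CONDITION))
```

The toy condition is satisfiable (user_len = 65 with the flag enabled, for instance), so a working z3 install should answer sat. Adding a contradictory assertion such as (assert (< user_len 10)) turns it into an unsat case.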

Using a different model

Mantis AI inherits Codex CLI's provider-agnostic model routing - configure providers, API keys, and per-model settings the same way you would for upstream Codex CLI (see codex-rs/config.md and codex-rs/model-provider/). There's no separate "orchestration vs. analysis" split: one model runs the harness loop, calling the MCP tools and (when multi-agent mode is enabled) spawning the role agents in .codex/agents/.

Architecture

The invariant that governs everything (also documented in MANTIS.md):

Tool = code - an MCP server handler that does something. Wired in .codex/config.toml [mcp_servers.*], implemented under .codex/mcp-servers/<name>/server.js.
Agent = prompt - a system prompt + model tier + tool/permission policy. A TOML role file under .codex/agents/<name>.toml, auto-discovered and offered as a spawn_agent agent type.
Skill = knowledge - reference text the model reads. .codex/skills/<name>/SKILL.md.
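
Because each kind of component lives at a fixed path, the tool/agent/skill split can be read straight off the file layout. The Python helper below is a hypothetical illustration of that convention, not something shipped in the repository; the function name classify is made up for the example.

```python
from pathlib import PurePosixPath


def classify(path: str) -> str:
    """Return 'tool', 'agent', 'skill', or 'other' for a repo-relative path."""
    parts = PurePosixPath(path).parts
    if len(parts) >= 2 and parts[0] == ".codex":
        # Tool = code: .codex/mcp-servers/<name>/server.js
        if parts[1] == "mcp-servers" and parts[-1] == "server.js":
            return "tool"
        # Agent = prompt: .codex/agents/<name>.toml
        if parts[1] == "agents" and parts[-1].endswith(".toml"):
            return "agent"
        # Skill = knowledge: .codex/skills/<name>/SKILL.md
        if parts[1] == "skills" and parts[-1] == "SKILL.md":
            return "skill"
    return "other"


if __name__ == "__main__":
    for p in [
        ".codex/mcp-servers/findings/server.js",
        ".codex/agents/validator.toml",
        ".codex/skills/mantis-pipeline/SKILL.md",
        "codex-rs/tui/",
    ]:
        print(p, "->", classify(p))
```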

The harness (Codex CLI, Rust - codex-rs/) runs the loop; it does zero security by itself. All actual security logic lives in the capability layer, the agent prompts, and the skills.

codex-rs/                the Codex CLI harness (sessions, sandboxing, TUI, MCP client)
.codex/config.toml        project-scoped MCP server registration
.codex/mcp


Frequently Asked Questions

What is mantishack?