AI agent for penetration testing. Like Claude Code, but for security. Open source, MCP-native, works with any LLM.
# Add to your Claude Code skills
git clone https://github.com/FrancescoStabile/numasec

Most AI security tools still feel like one of two bad ideas.
The first is a scanner that dumps 500 findings and walks away.
The second is a chat wrapper that sounds confident and cannot actually do the work.
numasec is the third thing.
It is a security focused agent environment for the terminal. It can switch between specialist agents, keep engagement memory on disk, drive a real browser, send raw HTTP, run the tools already on your machine, and hand you back a scoped engagement instead of a pile of vibes.
Open numasec, run /pwn https://target, let it create the operation, pick the right play, route to the right agent, keep notes in numasec.md, and package the result when you are done.
If Claude Code is an engineer, numasec is an operator.
This is the shape of numasec right now: a terminal-native security stack with real agents, real workflows, and enough built-in machinery to feel like a small operating environment instead of a prompt wrapper.
| Layer | What is actually there |
|---|---|
| Primary agents | security, pentest, appsec, osint, hacking |
| Subagents | general, explore |
| Internal helpers | compaction, title, summary |
| Operation kinds | pentest, appsec, osint, hacking, bughunt, ctf, research |
| Built in plays | web-surface, network-surface, appsec-triage, osint-target, ctf-warmup |
| Built in skills | passive-osint, forensics-kit |
| Core palette | 30 built in tools spanning shell, files, browser, HTTP, scanner, crypto, net, methodology, CVE, remediation, and sharing |
That is the part people immediately get when they see numasec for the first time: it is not one prompt with a terminal attached. It is a small security operating system.
security is the generalist. pentest is the engagement driver for recon, exploitation, and reporting. appsec is for code review and application assessment. osint is for intelligence gathering and forensics. hacking is for CTFs, exploit development, and reverse engineering.
Kinds and agents are intentionally not the same thing. A bughunt operation boots into pentest. A ctf operation boots into hacking. A research operation boots into security. That lets the workflow feel natural without multiplying agents just for branding.
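The split between kinds and agents can be pictured as a small lookup table. This is a minimal sketch, not numasec's actual code: the bughunt, ctf, and research rows come from the text above, while the identity rows for the other four kinds and the names `DEFAULT_AGENT` and `default_agent` are assumptions made for illustration.

```python
# Hypothetical sketch of kind -> default agent routing.
# Only the bughunt, ctf, and research rows are stated in this README;
# the identity rows are an assumption.
DEFAULT_AGENT = {
    "pentest": "pentest",    # assumed
    "appsec": "appsec",      # assumed
    "osint": "osint",        # assumed
    "hacking": "hacking",    # assumed
    "bughunt": "pentest",    # from the README
    "ctf": "hacking",        # from the README
    "research": "security",  # from the README
}


def default_agent(kind: str) -> str:
    """Return the agent an operation of this kind boots into."""
    try:
        return DEFAULT_AGENT[kind]
    except KeyError:
        raise ValueError(f"unknown operation kind: {kind!r}") from None
```

Seven kinds resolve to four agents here, which is the point: a new kind is one more row in the table, not one more agent.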
general is the parallel worker. Hand it a noisy or multi step task and let it go wide without trashing the main thread.
explore is the fast codebase scout. It is tuned for finding files, tracing patterns, and answering "where does this live?" questions quickly.
Under the hood, numasec also uses helper agents for compaction, title generation, and summaries so long sessions stay usable.
The built in plays are not marketing props. They are actual reusable pipelines:
- web-surface: crawl, inspect JavaScript, light dir fuzz, passive subdomain enumeration
- network-surface: port scan, service probe, banner collection
- appsec-triage: repository triage for vuln patterns and focus areas
- osint-target: passive target profiling for domains, emails, or handles
- ctf-warmup: quick artifact triage using the forensics skill

The two embedded skills ship in the binary and are always available:

- passive-osint: subdomains, wayback, email and account enumeration without touching the target
- forensics-kit: incident triage workflow for suspicious files or challenge artifacts

You can add more with your own SKILL.md files. numasec discovers them from skill directories and folds them into the agent context when needed.
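Skill discovery of this kind can be sketched in a few lines. This README does not say which directories numasec scans or how skills are laid out inside them, so the `<dir>/<name>/SKILL.md` layout, the first-directory-wins rule, and the function name `discover_skills` below are all assumptions.

```python
from pathlib import Path


def discover_skills(skill_dirs):
    """Map skill name -> path of its SKILL.md.

    Assumes each skill lives at <skill_dir>/<name>/SKILL.md (hypothetical
    layout). Directories that do not exist are skipped silently.
    """
    found = {}
    for root in skill_dirs:
        root = Path(root)
        if not root.is_dir():
            continue
        for skill_file in sorted(root.glob("*/SKILL.md")):
            # On a name clash the directory listed first wins (assumed rule).
            found.setdefault(skill_file.parent.name, skill_file)
    return found
```

A caller would pass its own list of directories and read each returned file only when the agent actually needs that skill.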
The core palette is where numasec stops feeling like a demo.
It has normal file and code tools, but also the security primitives you actually want: bash, httprequest, browser, scanner, crypto, net, vault, interact, methodology, cve, play, pwn_bootstrap, doctor, opsec, share, remediate.
That means the agent can reason and act in the same place. Open Chromium. Replay an authenticated request. Crawl an app. Parse scope. Look up ATT&CK or PTES offline. Generate a handoff archive. Move from finding to patch.
If you already have nmap, sqlmap, ffuf, nuclei, gobuster, Burp, or your own binaries on PATH, numasec can drive those too through the shell.
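To see which of those binaries a shell would actually find, a PATH lookup is enough. The sketch below is independent of numasec and uses only Python's standard library. The tool list mirrors the command-line tools named above (Burp is left out because it is not usually launched as a bare command), and the function name `tools_on_path` is made up for this example.

```python
import shutil

# Command-line tools named in this README.
TOOLS = ["nmap", "sqlmap", "ffuf", "nuclei", "gobuster"]


def tools_on_path(names=TOOLS):
    """Return {tool: absolute path, or None if it is not on PATH}."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in tools_on_path().items():
        print(f"{name:10} {path or 'missing'}")
```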
Every real engagement in numasec becomes an Operation.
An operation is a real file on disk at .numasec/operation/<slug>/numasec.md. It is auto loaded as system instruction every time that engagement is active. Scope, target, findings, attempts, dead ends, todo items: the running agent reads the same notebook you do.
That one design choice changes the product completely. Close the laptop on Friday, come back on Monday, reopen the operation, and the agent is still standing in the same room.
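The notebook-on-disk idea is simple enough to sketch. Only the path layout `.numasec/operation/<slug>/numasec.md` comes from this README. The function names, the prompt format, and the way the notes are appended are assumptions, not numasec's implementation.

```python
from pathlib import Path


def operation_notebook(workspace, slug):
    """Path of one operation's notebook, following the layout in this README."""
    return Path(workspace) / ".numasec" / "operation" / slug / "numasec.md"


def build_system_prompt(base_prompt, workspace, slug):
    """Append the operation notebook to a base prompt (hypothetical format).

    If the notebook does not exist yet, the base prompt is returned unchanged.
    """
    notebook = operation_notebook(workspace, slug)
    if not notebook.is_file():
        return base_prompt
    notes = notebook.read_text(encoding="utf-8")
    return f"{base_prompt}\n\n# Operation notebook ({slug})\n{notes}"
```

Because the file is re-read every time the prompt is built, anything you or the agent wrote on Friday is part of the context on Monday with no extra state to restore.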
The sidebar keeps the run readable while work is happening.
/opsec strict turns scope into an actual guardrail. HTTP, browser, and shell activity that falls outside declared scope gets blocked before it leaves the tool.
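A scope guardrail of this sort boils down to a check that runs before any request is sent. The sketch below shows the idea for URLs only. How numasec parses scope and what it matches on is not described here, so the host-or-subdomain rule and the names `in_scope` and `guard` are assumptions.

```python
from urllib.parse import urlsplit


def in_scope(url, scope_hosts):
    """True if the URL's host is a scoped host or a subdomain of one.

    Subdomain matching is an assumed policy, not documented numasec behavior.
    """
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False
    for allowed in scope_hosts:
        allowed = allowed.lower().lstrip(".")
        if not allowed:
            continue
        if host == allowed or host.endswith("." + allowed):
            return True
    return False


def guard(url, scope_hosts):
    """Return the URL unchanged, or raise before anything leaves the tool."""
    if not in_scope(url, scope_hosts):
        raise PermissionError(f"out of scope: {url}")
    return url
```

Matching on a leading dot matters: `evil-target.example` must not pass just because it ends with `target.example`.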
| Command | What it does |
|---|---|
| /pwn https://target | classify target, create operation, pick play, start work |
| /operations | switch between saved engagements |
| /agents | open the agent picker |
| /mode pentest | jump straight into a specialist |
| /play web-surface https://x | run a reusable pipeline |
| /teach | turn on narrated, tutorial style tool use |
| /doctor | audit your local setup |
| /opsec strict | lock the engagement to declared scope |
| /share --sign | create a redacted handoff archive, optionally signed |
| /remediate OBS-001 | turn an observation into patch advice |
| /review | review the current repo like an appsec engineer |
| /models | switch model without leaving the TUI |
There is a lot more in the TUI, but those are the commands that tell the story fast.
It keeps memory on disk instead of pretending the context window is enough.
It separates specialist agents from operation kinds instead of forcing everything into one assistant voice.
It treats tools as first class primitives instead of accessories.
It can teach while it works. /teach turns the whole session into narrated operator mode, which is perfect for demos, live training, or recorded walkthroughs.
It works with the models you already use. Anthropic, OpenAI, Google, xAI, Bedrock, GitHub Models, OpenRouter, Ollama, and any OpenAI compatible endpoint can sit behind the same TUI.
npm i -g numasec
numasec
Docker:
docker run -it --rm -v "$PWD:/work" -w /work numasec/numasec:latest
From source:
git clone https://github.com/FrancescoStabile/numasec.git
cd numasec
bun install
cd packages/numasec
bun run build
numasec has its own browser, HTTP, scanner, crypto, net, CVE, and methodology tooling built in. It still gets better when your machine already has the usual security binaries installed.
# Debian / Kali / Ubuntu
apt install nmap sqlmap ffuf gobuster nikto
# macOS
brew install nmap sqlmap ffuf gobuster nikto
# Headless browser for the built in browser tool
npx playwright install chromium
Run /doctor any time. It checks runtime, workspace, vault mode, CVE bundle, and which external tools are present or missing.
numasec
Then:
/pwn http://localhost:3000
numasec classifies the target, creates and activates an operation, picks the matching play, and starts with the right default agent.
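The classification step can be imagined as a few cheap checks on the target string. The rules below are guesses based on the play descriptions in this README (web-surface for web apps, network-surface for port scanning, osint-target for domains, emails, or handles). They are not numasec's real logic, and `pick_play` is a hypothetical name.

```python
import ipaddress
from urllib.parse import urlsplit


def pick_play(target: str) -> str:
    """Guess a built in play from the shape of the target (assumed rules)."""
    # An http(s) URL looks like a web application.
    if urlsplit(target).scheme in ("http", "https"):
        return "web-surface"
    # A bare IP address or CIDR block looks like a network target.
    try:
        ipaddress.ip_network(target, strict=False)
        return "network-surface"
    except ValueError:
        pass
    # Anything else (domain, email, handle) falls back to passive profiling.
    return "osint-target"
```

Under these assumed rules, `/pwn http://localhost:3000` would land on web-surface, while a CIDR block would land on network-surface.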
If you want persistent project context outside a specific operation, drop a numasec.md or .numasec.md file in the project root. numasec loads it automatically next time.
Example:
# Target: internal-api.corp.com
- Base: