The Rails audit skill for AI coding agents, ranked by blast radius. 🚂☢️
# Add to your Claude Code skills
git clone https://github.com/nuke-on-rails/nuke-on-rails
nuke-on-rails is an open-source AI agent skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by nuke-on-rails. It has 52 GitHub stars.
Clone the repository with "git clone https://github.com/nuke-on-rails/nuke-on-rails" and add it to your Claude Code skills directory (see the Installation section above). nuke-on-rails ships a SKILL.md manifest, so compatible agents can discover and load it automatically.
A full project health audit for Rails apps: what to refactor, what's vulnerable, and in what order to attack it. Three deterministic engines do the scanning; you are the judge. The output is one single list, prioritized by impact, never tool sections stapled together.
Respond in the user's language. Write the step announcements and the entire final report in whatever language the user writes in — and that means every label too: severity words (Critical/High/Medium/Low), section titles (SCOREBOARD, FIX NOW, FIX NEXT, BIGGEST STRUCTURAL MULTIPLIER, RULED OUT, NOTES), and finding fields (Problem/Solution/Files/Technical details). The English in these instructions and in templates/_OUTPUT.md is the authoring language only — never emit an English label into a non-English report. Only fixed tokens stay as-is: the brand NUKE ON RAILS, emoji, file paths, code identifiers, CVE ids.
This audit is a recon sweep — and it takes minutes, so announce each step as a dry field-radio line and the user watches the campaign advance. The conceit: you are the recon team scouting the codebase's own walls for the breaches a real attacker would use; the enemy is the debt and the holes, never the user. The lines below are authored in English — translate them to the user's language, one short line per step:
☢️ NUKE ON RAILS: operation underway
🔭 Recon: scouting the terrain (Rails? Ruby? git history?)…
🎖️ Mobilizing: arming & firing the engines…
🎯 Targets: marking where to concentrate fire (churn × complexity)…
⚔️ Probing the defenses: triaging N alerts, bringing the arsenal…
🧱 Inspecting the walls: reviewing the most battered files…
📋 Briefing the command: consolidating the field report…

The visible tool calls fill in the rest. The campaign voice lives in these announcements (and the TL;DR framing) only; it must never bleed into the findings, which stay sober and literal.
Rails app (config/application.rb exists, or the rails gem is in the Gemfile): run the full audit.

Ruby version trap. A .ruby-version (or Gemfile-pinned Ruby) often names a version not installed on this machine. This is common in unfamiliar or AI-generated repos, and it makes rubycritic and brakeman abort before producing output. The engines do static analysis and don't need the app's exact Ruby. If the pinned version is missing, run them under an available Ruby (e.g. set a temporary .ruby-version, or invoke via rbenv local <installed> / a system Ruby) rather than failing, and note the substitution in the report. Never install the pinned Ruby just to satisfy the lock.
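The substitution decision can be sketched in a few lines of Ruby. This is a minimal sketch, not part of the skill: the helper name `ruby_substitution` is hypothetical, and it assumes you already have a list of the Ruby versions installed on the machine (for example from your version manager).

```ruby
# Hypothetical helper: decide whether the engines can run under the pinned Ruby
# or need a substitute. `installed` is a list of version strings available on
# this machine; how that list is obtained is left to the caller.
def ruby_substitution(pinned, installed)
  pinned = pinned.to_s.strip.sub(/\Aruby-/, "") # .ruby-version may read "ruby-3.3.1"
  return { use: pinned, substituted: false } if installed.include?(pinned)

  # Static analysis does not need an exact match, so take the newest installed Ruby.
  fallback = installed.max_by { |version| Gem::Version.new(version) }
  { use: fallback, substituted: true,
    note: "pinned #{pinned} not installed, ran under #{fallback}" }
end

result = ruby_substitution("ruby-3.4.9\n", ["3.2.2", "3.3.1"])
puts result[:note]
```

The `note` string is the piece that belongs in the report's opening banner, so the reader knows the engines did not run under the app's own Ruby.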
Zero-dependency principle: the skill brings its own tools. Never touch the app's Gemfile.
gem list rubycritic -i || gem install rubycritic
gem list brakeman -i || gem install brakeman # Rails only
gem list bundler-audit -i || gem install bundler-audit
gem list ruby_audit -i || gem install ruby_audit
Then run them all and capture machine-readable output (commands validated on a real Rails 8 app):
OUT=$(mktemp -d)
rubycritic app --format json --no-browser --path "$OUT/rubycritic" # --path keeps output files out of the audited repo
brakeman --format json --quiet -o "$OUT/brakeman.json" # Rails only
bundle-audit update # update the db in a separate step:
bundle-audit check --format json > "$OUT/bundler-audit.json" # --update mixed in pollutes the JSON on stdout
(cd "$OUT" && ruby-audit check) # run OUTSIDE the app dir — inside a bundled
# project the executable fails to load (Ruby 3.4+);
# make sure the app's Ruby version is still active
If an engine fails, degrade gracefully: report which engine was skipped and why, and continue with the others.
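Graceful degradation might look like the following. It is a minimal sketch under stated assumptions: `run_engines` is a hypothetical name, and the lambdas are stand-ins for the real shell invocations so the example runs anywhere.

```ruby
# Run each engine independently; a failure is recorded, never fatal.
# `engines` maps a name to a callable. In a real run each callable would
# shell out to the tool; here they are stand-ins.
def run_engines(engines)
  results = {}
  skipped = {}
  engines.each do |name, runner|
    begin
      results[name] = runner.call
    rescue StandardError => e
      skipped[name] = e.message # keep the reason for the report
    end
  end
  { results: results, skipped: skipped }
end

outcome = run_engines(
  "rubycritic"    => -> { { modules: 42 } },
  "brakeman"      => -> { raise "aborted: pinned Ruby not installed" },
  "bundler-audit" => -> { { advisories: 3 } }
)

outcome[:skipped].each { |name, why| puts "skipped #{name}: #{why}" }
puts "ran: #{outcome[:results].keys.join(', ')}"
```

Keeping the skip reason next to the engine name is what lets the report say which engine was skipped and why.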
Use rubycritic's churn × complexity data to pick the hotspots: files that are both complex and change often. Those are the files you read deeply. Never review the codebase uniformly — a complex file nobody touches is a lower priority than a moderately complex file that changes every week.
Churn needs git history — verify it before trusting the quadrant. rubycritic derives churn from git log, so a shallow clone (--depth 1) or a freshly-initialized repo gives every file the same churn (1 or 0) and the quadrant silently collapses to noise — common exactly in the unfamiliar-repo case this skill targets. If churn is uniform across modules, say so and fall back to ranking by complexity and smell density (rubycritic wraps Reek/Flay/Flog — use the per-module smells and complexity, not just the score). When possible, unshallow first (git fetch --unshallow).
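The hotspot selection and its fallback can be illustrated as below. This is a sketch, not rubycritic's own logic: the Hash keys (`:path`, `:churn`, `:complexity`, `:smells`) and the fallback formula are assumptions made for the example, and you would map the engine's real output onto them yourself.

```ruby
# Each module is a plain Hash. The keys used here are assumptions for this
# sketch, not a documented rubycritic schema.
def rank_hotspots(modules, top: 3)
  uniform_churn = modules.map { |m| m[:churn] }.uniq.size <= 1

  scored = modules.map do |m|
    score =
      if uniform_churn
        # Churn carries no signal (shallow clone, fresh repo):
        # rank by complexity plus smell count instead.
        m[:complexity] + m[:smells]
      else
        m[:churn] * m[:complexity]
      end
    m.merge(score: score)
  end

  { churn_reliable: !uniform_churn,
    hotspots: scored.sort_by { |m| -m[:score] }.first(top).map { |m| m[:path] } }
end

modules = [
  { path: "app/models/order.rb",                    churn: 30, complexity: 40.0,  smells: 6 },
  { path: "app/services/legacy_importer.rb",        churn: 1,  complexity: 300.0, smells: 20 },
  { path: "app/controllers/payments_controller.rb", churn: 18, complexity: 55.0,  smells: 9 }
]

puts rank_hotspots(modules, top: 2)[:hotspots]
```

With real churn, the very complex but untouched importer drops out of the top two. If every module had the same churn, it would rank first, and `churn_reliable` would be `false` so the report can say so.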
For every Brakeman warning and every bundler-audit CVE:
Treat Brakeman's confidence level (High/Medium/Weak) as a prior, not a verdict: a Weak warning you confirm by reading the code outranks a High warning on an unreachable path. And if config/brakeman.ignore exists, re-triage every silenced warning — in an unreviewed codebase, an ignore file often means "made CI pass", not "verified safe".
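One way to picture "a prior, not a verdict" is a score in which the verdict from reading the code outweighs the scanner's confidence. The weights and the `:verdict` field below are hypothetical and chosen only to make the ordering visible; they are not values the skill prescribes.

```ruby
# Confidence is a starting weight; the verdict from reading the code dominates.
# All weights are illustrative assumptions.
CONFIDENCE_PRIOR = { "High" => 3, "Medium" => 2, "Weak" => 1 }.freeze
VERDICT_WEIGHT   = { confirmed: 10, unverified: 0, unreachable: -10 }.freeze

def triage_score(warning)
  CONFIDENCE_PRIOR.fetch(warning[:confidence]) + VERDICT_WEIGHT.fetch(warning[:verdict])
end

warnings = [
  { type: "SQL Injection",   confidence: "High",   verdict: :unreachable },
  { type: "Mass Assignment", confidence: "Weak",   verdict: :confirmed },
  { type: "Redirect",        confidence: "Medium", verdict: :unverified }
]

warnings.sort_by { |w| -triage_score(w) }.each do |w|
  puts "#{triage_score(w).to_s.rjust(3)}  #{w[:type]} (#{w[:confidence]}, #{w[:verdict]})"
end
```

The confirmed Weak warning lands at the top and the unreachable High warning at the bottom, which is the ordering the paragraph above asks for.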
For dependency and Ruby-version CVEs, apply arsenal/cve.md to the bundler-audit and ruby_audit output — it covers deduping, severity priors, reachability triage, insecure gem sources, and the network cross-checks (day-zero and second-opinion) that close the advisory database's lag and coverage gaps.
Then bring the security arsenal to the routes file, the sensitive controllers and models, and the production config:
- arsenal/authorization.md: IDOR, missing authorization, attack surface
- arsenal/authentication.md: auth stack, Devise config, custom strategies, sessions
- arsenal/secrets.md: committed keys and hardcoded credentials
- arsenal/hardening.md: TLS, CSP, CSRF config, mounted dashboards, uploads
- arsenal/api.md: JSON over-exposure, CORS, rate limiting, webhooks (skip if the app has no JSON endpoints)
- arsenal/cryptography.md: encryption oracles, hand-rolled crypto, weak hashing, plaintext sensitive columns
- arsenal/logging.md: sensitive data in logs, missing audit trail on security-critical events
- arsenal/ai.md: prompt injection, LLM output rendered as XSS, PII leaked to model APIs, over-powered tool-use (skip if the app makes no LLM calls)
- arsenal/ci-cd.md: pipeline security in .github/workflows/ / .gitlab-ci.yml / Jenkinsfile: pull_request_target running fork code, untrusted ${{ }} in shells, unpinned actions, long-lived cloud keys (skip if the repo has no CI config)

They cover what Brakeman can't reach. The arsenal covers those areas; it does not guarantee them. Be explicit about that distinction in the report.
Apply arsenal/code-quality.md — the thermo-nuclear standards, translated to the Rails idiom — to the hotspot files from Step 2. That check is the default quality bar for this skill: ambitious structural findings, not cosmetic nits.
Then apply arsenal/migrations.md to the recent migrations in db/migrate/ (read against db/schema.rb for table sizes) — the availability weapon no engine owns. A migration that locks or rewrites a large, traffic-heavy table, or drops/renames a column ahead of the code that uses it, is a scheduled outage, not a code smell: rank it accordingly.
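To give a feel for the kind of pattern this review looks for, here is a rough line-based sketch. The two heuristics and the name `risky_migration_lines` are assumptions for illustration, not the contents of arsenal/migrations.md, and a regex cannot know table size or traffic, so every hit still needs a human judgment.

```ruby
# Hypothetical heuristics: flag migration lines that deserve a closer look.
RISKY_PATTERNS = {
  "index built without algorithm: :concurrently" =>
    /\badd_index\b(?!.*algorithm:\s*:concurrently)/,
  "column dropped or renamed (is the code change deployed first?)" =>
    /\b(remove_column|rename_column)\b/
}.freeze

def risky_migration_lines(source)
  source.each_line.with_index(1).flat_map do |line, number|
    RISKY_PATTERNS.select { |_reason, pattern| line.match?(pattern) }
                  .keys
                  .map { |reason| { line: number, reason: reason } }
  end
end

# The migration is only a string here; nothing is executed against a database.
migration = <<~MIGRATION
  class AddStatusIndexToOrders < ActiveRecord::Migration[8.0]
    def change
      add_index :orders, :status
      add_index :orders, :user_id, algorithm: :concurrently
      remove_column :orders, :legacy_code
    end
  end
MIGRATION

risky_migration_lines(migration).each { |hit| puts "line #{hit[:line]}: #{hit[:reason]}" }
```

On this sample the sketch flags lines 3 and 5 and leaves the concurrent index alone.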
Also apply arsenal/architecture.md to the app/ layer folders and the hotspots — dependency direction and cycles, which rubycritic's per-file scores miss: a model reaching up into the web layer, two namespaces that reference each other, or Zeitwerk name/path drift.
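Mutual references between namespaces are a graph problem, so a small sketch with Ruby's standard `tsort` library shows the idea. The dependency graph is hand-written here and the class name `DependencyGraph` is hypothetical; extracting the real edges from source is outside the scope of this example.

```ruby
require "tsort"

# Namespace dependency graph: each key references the namespaces in its value.
class DependencyGraph
  include TSort

  def initialize(edges)
    @edges = edges
  end

  def tsort_each_node(&block)
    @edges.each_key(&block)
  end

  def tsort_each_child(node, &block)
    @edges.fetch(node, []).each(&block)
  end

  # A strongly connected component with more than one member is a cycle.
  # Self-references (a component of size one) are not reported by this sketch.
  def cycles
    strongly_connected_components.select { |component| component.size > 1 }
  end
end

graph = DependencyGraph.new(
  "Billing"   => ["Accounts"],
  "Accounts"  => ["Billing", "Core"],
  "Reporting" => ["Billing"],
  "Core"      => []
)

graph.cycles.each { |cycle| puts "cycle: #{cycle.sort.join(' <-> ')}" }
```

Here Billing and Accounts reference each other, so they come back as one cycle, while Reporting and Core do not.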
And apply arsenal/activerecord.md to app/models/ and the query-heavy hotspots — the correctness/integrity of the ActiveRecord calls themselves (a side effect in after_save racing the transaction, where(...).first with no order, has_many without dependent:), which the structural weapons don't judge.
And apply arsenal/jobs.md to app/jobs/ and the enqueue sites — background-job safety the engines can't model: a non-idempotent job that repeats its side effect on retry (double-charge), secrets/PII in job arguments (persisted in the queue store and shown on the dashboard), and records passed instead of ids.
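The retry problem is easiest to see in a stripped-down example. The sketch below is plain Ruby with no job framework loaded; `ChargeJob` and `PAYMENTS` are hypothetical names, and the Hash stands in for a table that would have a unique index on the order id.

```ruby
# Plain-Ruby stand-in for a background job. PAYMENTS plays the role of a
# table with a unique index on order_id.
PAYMENTS = {}

class ChargeJob
  # Takes an id, not a record: the queue stores only a small, non-sensitive value.
  def perform(order_id)
    return :already_charged if PAYMENTS.key?(order_id) # idempotency guard

    PAYMENTS[order_id] = { charged_at: Time.now }
    :charged
  end
end

job = ChargeJob.new
puts job.perform(42) # first attempt
puts job.perform(42) # retry after a timeout: no second charge
```

In a real app, a check-then-write guard like this can still race between two workers, so the durable protection would be the database constraint. The in-memory check only illustrates the shape of the fix.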
This section governs what goes in the report and how it's judged — language, tone, ranking, and the closing summary. The visual structure it's rendered in — terminal-native plaintext, no markdown — lives in templates/_OUTPUT.md; the report must match that skeleton. Write the whole report in the user's language (see the top of this file).
Open with a short status banner — just write it, don't print a "Status banner" label. A few lines of orientation: project type and Rails/Ruby versions (note any substitution, e.g. running under a different Ruby), whether git history made the churn quadrant reliable, the engines that ran with headline counts, and one honest line on coverage (weapons cover, they don't guarantee). Open the TL;DR as a dry field report to the command, in keeping with the recon-campaign voice of the step announcements (English example, for tone only: "Field report: the position isn't on fire — but the back gate is unlocked, and that's where they get in."). Write it natively in the user's language so it actually lands; never translate the example literally. The campaign voice lives in the banner/TL;DR framing only; every finding stays sober and precise.
Right after the banner, open the body with the severity scoreboard — the triaged counts on one inline line, count before the label (never a markdown table; see templates/_OUTPUT.md):
🔴 1 Critical 🟠 4 High 🟡 6 Medium 🟢 5 Low
Dependency-risk totals and any end-of-life flag stay in the opening banner — don't repeat them here.
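Rendering the scoreboard from triaged findings could look like this. It is a minimal sketch: the `:severity` key and the choice to print zero counts are assumptions, and the labels would be translated to the user's language before rendering.

```ruby
SEVERITIES = [["Critical", "🔴"], ["High", "🟠"], ["Medium", "🟡"], ["Low", "🟢"]].freeze

# Count first, label second, all on one line.
# Printing a zero for an empty severity is an assumption of this sketch.
def scoreboard(findings)
  counts = findings.map { |finding| finding[:severity] }.tally
  SEVERITIES.map { |label, icon| "#{icon} #{counts.fetch(label, 0)} #{label}" }.join(" ")
end

findings = { "Critical" => 1, "High" => 4, "Medium" => 6, "Low" => 5 }
           .flat_map { |severity, n| Array.new(n) { { severity: severity } } }

puts scoreboard(findings)
```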
Then the findings, as one list ranked by impact:
Keep each finding tight, scannable, and in plain human language. Problem and Solution read for a non-expert stakeholder and an exhausted senior — everyday words, no method names or line refs. Lead with a one-line headline: severity + what it is, in plain language (no finding number, no effort estimate). Then the plain-language problem and the concrete fix. Push the technical specifics — method names, line refs, the reachable exploit path, and any churn/complexity metrics — into an optional Technical details field (see templates/_OUTPUT.md); a finding with nothing concrete to prove omits it. Show code only when a few lines make the point faster than words. No multi-paragraph exploit essays: a senior should grasp each finding in seconds and know the next move. When an issue recurs, cite the 2–5 strongest file:line locations and note "+ ~N similar", not every instance. Write the way Rails reads — friendly, direct, no ceremony. Don't print scaffolding labels ("Status banner", "Findings"); let the report flow.
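The "+ ~N similar" convention for recurring issues is simple enough to sketch. The helper name `cite_locations` is hypothetical, and the sketch assumes the locations arrive already sorted, strongest first.

```ruby
# Cite the strongest few locations and summarise the rest.
def cite_locations(locations, limit: 5)
  shown = locations.first(limit)
  rest  = locations.size - shown.size
  line  = shown.join(", ")
  rest.positive? ? "#{line} + ~#{rest} similar" : line
end

locations = %w[
  app/controllers/orders_controller.rb:14
  app/controllers/invoices_controller.rb:22
  app/controllers/refunds_controller.rb:9
  app/controllers/reports_controller.rb:31
]

puts cite_locations(locations, limit: 2)
```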
Close with the plan so the reader leaves with a next move (the scoreboard already opened the body, up top):
If the scanners raised things you ruled out, say so — but as one compact line near the end (name them, a one-word reason each: false positive / not reachable / dev-only / wrong target), never a verbose section. The reader should know the noise was checked, not missed.
The report is sensitive. It enumerates live, confirmed vulnerabilities and their exploit paths — it belongs with the people who can fix the app, not a public issue tracker or an open channel. If findings are published anywhere shared or public, redact the security specifics (exploit path, credential location) first.
The whole thing reads like a plan a principal engineer would hand you, not a tool dump.
Nuke on Rails is an open-source skill for AI coding agents (Claude Code, Cursor, Codex, and more), not a gem you add to your Gemfile. It audits your Rails app the way a principal engineer would: what to refactor, what's vulnerable, and in what order to fix it.
Instead of stapling separate tool reports together, it returns a single list, ranked by impact. An IDOR in your payments controller outranks a fat model; a high-churn fat model outranks a theoretical warning.
Scanners list problems. Nuke on Rails decides the order.
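As an illustration of "one list, ranked by impact", the sketch below merges findings from different sources and sorts them once. The field names and the tie-break order (severity, then confirmed before theoretical, then churn) are assumptions made for the example; the actual judgment is made by the agent reading the code, not by a sort key.

```ruby
SEVERITY_RANK = { "Critical" => 0, "High" => 1, "Medium" => 2, "Low" => 3 }.freeze

# One merged list: severity first, then confirmed before theoretical,
# then higher churn. The tie-break order is an assumption of this sketch.
def rank(findings)
  findings.sort_by do |finding|
    [SEVERITY_RANK.fetch(finding[:severity]), finding[:confirmed] ? 0 : 1, -finding[:churn]]
  end
end

findings = [
  { title: "Theoretical redirect warning",   severity: "Medium", confirmed: false, churn: 2 },
  { title: "Fat Order model, changes weekly", severity: "Medium", confirmed: true,  churn: 30 },
  { title: "IDOR in payments controller",     severity: "High",   confirmed: true,  churn: 12 }
]

rank(findings).each_with_index { |finding, i| puts "#{i + 1}. #{finding[:title]}" }
```

The IDOR comes first, the high-churn fat model second, and the theoretical warning last, matching the ordering described above.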
Nuke on Rails ships through the skills CLI. You'll need Node.js.
1. Install the skills CLI:
npm install -g skills
2. Add Nuke on Rails (from your project root):
skills add nuke-on-rails/nuke-on-rails
It works across agents: Claude Code, Cursor, Codex, Gemini CLI, Warp, and more.
3. Run it inside your agent:
/nuke-on-rails
Zero setup beyond that. It installs its own tools, detects Rails vs. plain Ruby, runs everything, and hands you the plan. It never touches your Gemfile.
4. Update when you want the latest checks and fixes:
skills update nuke-on-rails
You can, and it'll find something. But "review my Rails app" gives a different, shallower answer every time and skips everything a deterministic scanner catches. The difference:
| | Asking an agent to "review my code" | Nuke on Rails |
|---|---|---|
| Scanning | The model eyeballs whatever files it happens to read | Brakeman parses 100% of the AST; bundler-audit and ruby_audit check every locked gem |
| Reproducible | A different answer every run | Deterministic engines plus a fixed methodology |
| Where it looks | Wherever the model wanders, until context runs out | Churn × complexity picks the hotspots that actually matter |
| CVEs & EOL | Bounded by the training cutoff; can't know yesterday's CVE | Live advisory DB, day-zero web cross-checks, end-of-life detection |
| False positives | Confidently reports plausible-but-wrong issues | Every security finding adversarially verified; unprovable ones flagged "theoretical" |
| Coverage | Whatever it remembers to check that day | A fixed OWASP Top 10 arsenal, every run |
| Output | A wall of prose | One list ranked by impact, with a fix-now plan |
The LLM still does the part it's good at: reading code paths, explaining exploits, judging severity. It just doesn't do it alone, from memory, and unprioritized.
Coverage maps to the OWASP Top 10 2025. Each area is a weapon in the arsenal: a plain-markdown check the audit applies on top of the scanners.
- … permit!, role escalation, nested attributes, raw-Hash bypass
- … after_save that race the transaction (belongs in after_commit)
- … where(...).first with no order (non-deterministic results)
- … has_many without dependent:, and polymorphic associations with no FK integrity
- … count > 0 / present? for boolean checks; all.each over large tables
- … raw / html_safe (stored XSS the scanners can't see), or piped into eval / SQL / system
- … render json: leaking token digests, role flags, PII)
- … redirect_uri, state, and scope flaws
- … params, render, *Controller)
- … Foo.new(x).call) and inconsistent component entry points
- … billing/charge.rb not defining Billing::Charge)
- … secure / httponly / SameSite)
- … none alg, no expiry)
- … pull_request_target running a fork's code with the repo's secrets (RCE / exfiltration)
- … ${{ }} (PR title, branch) interpolated into a run: shell (script injection)
- … GITHUB_TOKEN permissions
- … force_ssl / HSTS off; backing-service traffic (Postgres, Redis) in cleartext
- … puts / logger dumps, unscrubbed error-tracker breadcrumbs)
- … null: false without default, non-concurrent indexes, type changes, foreign keys validated in one shot)