nuke-on-rails

Name: nuke-on-rails
Author: nuke-on-rails

Pending

The Rails audit skill for AI coding agents, ranked by blast radius. 🚂☢️

52 stars

0 forks

Added 6/23/2026

Installation

# Add to your Claude Code skills
git clone https://github.com/nuke-on-rails/nuke-on-rails


SKILL.md

README.md

Frequently Asked Questions

What is nuke-on-rails?

nuke-on-rails is an open-source AI agent skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by nuke-on-rails. It is a Rails audit skill for AI coding agents that ranks its findings by blast radius. It has 52 GitHub stars.

Is nuke-on-rails safe to use?

nuke-on-rails's catalog security scan is still queued. You can run an instant dependency and prompt-injection check now with the "Scan for vulnerabilities" button above.

How do I install nuke-on-rails?

Clone the repository with "git clone https://github.com/nuke-on-rails/nuke-on-rails" and add it to your Claude Code skills directory (see the Installation section above). nuke-on-rails ships a SKILL.md manifest, so compatible agents can discover and load it automatically.

Are there alternatives to nuke-on-rails?

Yes. SkillsLLM lists many other AI Agents skills you can browse and compare side by side. Open the AI Agents category from the badge at the top of this page, or use the Related Skills and comparison links further down to weigh nuke-on-rails against similar tools.


name: nuke-on-rails
description: Full health and security audit for Rails apps, the review a principal engineer would do. Runs rubycritic, Brakeman, bundler-audit and ruby_audit, triages every finding with the LLM as judge, brings an OWASP Top 10 arsenal of checks for what scanners miss, and returns one impact-ranked action plan. Use for a Rails project audit, a security and health check, "nuke on rails", or a deep review of a vibecoded Rails app.
disable-model-invocation: true

Nuke on Rails

A full project health audit for Rails apps: what to refactor, what's vulnerable, and in what order to attack it. Three deterministic engines do the scanning; you are the judge. The output is one single list, prioritized by impact, never tool sections stapled together.

Respond in the user's language. Write the step announcements and the entire final report in whatever language the user writes in — and that means every label too: severity words (Critical/High/Medium/Low), section titles (SCOREBOARD, FIX NOW, FIX NEXT, BIGGEST STRUCTURAL MULTIPLIER, RULED OUT, NOTES), and finding fields (Problem/Solution/Files/Technical details). The English in these instructions and in templates/_OUTPUT.md is the authoring language only — never emit an English label into a non-English report. Only fixed tokens stay as-is: the brand NUKE ON RAILS, emoji, file paths, code identifiers, CVE ids.

This audit is a recon sweep — and it takes minutes, so announce each step as a dry field-radio line and the user watches the campaign advance. The conceit: you are the recon team scouting the codebase's own walls for the breaches a real attacker would use; the enemy is the debt and the holes, never the user. The lines below are authored in English — translate them to the user's language, one short line per step:

☢️ NUKE ON RAILS — operation underway
🔭 Recon — scouting the terrain (Rails? Ruby? git history?)…
🎖️ Mobilizing — arming & firing the engines…
🎯 Targets — marking where to concentrate fire (churn × complexity)…
⚔️ Probing the defenses — triaging N alerts, bringing the arsenal…
🧱 Inspecting the walls — reviewing the most battered files…
📋 Briefing the command — consolidating the field report…

The visible tool calls fill in the rest. The campaign voice lives in these announcements (and the TL;DR framing) only — it must never bleed into the findings, which stay sober and literal.

Step 0 — Detect the project

Rails app (config/application.rb exists, or the rails gem is in the Gemfile): run the full audit.
Plain Ruby (Gemfile or gemspec, no Rails): graceful degradation — rubycritic and bundler-audit run, Brakeman is skipped. Say so explicitly in the report header.
Neither: stop and tell the user this skill audits Ruby/Rails projects.
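The three detection rules above can be sketched in a few lines of Ruby. This is a minimal illustration, not part of the skill itself: the `detect_project` name and its return symbols are hypothetical, and the Gemfile check is a simple regex rather than a real Gemfile parse.

```ruby
# Classify a directory as :rails, :ruby, or :none, following the Step 0 rules.
# (Hypothetical helper for illustration; the skill does not ship this function.)
def detect_project(dir)
  gemfile = File.join(dir, "Gemfile")
  gemfile_text = File.exist?(gemfile) ? File.read(gemfile) : ""

  # Rails: config/application.rb exists, or the Gemfile declares the rails gem.
  rails_in_gemfile = gemfile_text.match?(/^\s*gem\s+["']rails["']/)
  has_app_config = File.exist?(File.join(dir, "config", "application.rb"))
  return :rails if has_app_config || rails_in_gemfile

  # Plain Ruby: a Gemfile or a gemspec, but no Rails.
  has_gemspec = !Dir.glob(File.join(dir, "*.gemspec")).empty?
  return :ruby if File.exist?(gemfile) || has_gemspec

  # Neither: the audit should stop here.
  :none
end
```

Calling `detect_project(Dir.pwd)` from the project root would then select the full audit, the degraded run without Brakeman, or the early stop.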

Ruby version trap. A .ruby-version (or Gemfile-pinned Ruby) often names a version not installed on this machine — common in unfamiliar or AI-generated repos, and it makes rubycritic and brakeman abort before producing output. The engines do static analysis and don't need the app's exact Ruby. If the pinned version is missing, run them under an available Ruby (e.g. set a temporary .ruby-version, or invoke via rbenv local <installed> / a system Ruby) rather than failing — and note the substitution in the report. Never install the pinned Ruby just to satisfy the lock.
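One way to picture the substitution rule is a small helper that picks the Ruby to run the engines under and produces the note for the report. The `choose_engine_ruby` name and the shape of its return value are assumptions made for this sketch; the list of installed versions is passed in (it could come from something like `rbenv versions --bare`) and is assumed to be non-empty.

```ruby
# Pick the Ruby to run the static-analysis engines under.
# pinned:    contents of .ruby-version (may be nil or carry a "ruby-" prefix)
# installed: version strings available on this machine (assumed non-empty)
# Hypothetical helper; it never installs anything, it only chooses and notes.
def choose_engine_ruby(pinned, installed)
  wanted = pinned.to_s.strip.sub(/\Aruby-/, "")
  return { ruby: wanted, note: nil } if installed.include?(wanted)

  # Pinned version is missing (or nothing is pinned): use the newest installed Ruby.
  fallback = installed.max_by { |v| Gem::Version.new(v) }
  note = wanted.empty? ? nil : "pinned Ruby #{wanted} not installed; engines ran under #{fallback}"
  { ruby: fallback, note: note }
end
```

The `note` string is what would surface in the report header when a substitution happened.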

Step 1 — Install and run the engines

Zero-dependency principle: the skill brings its own tools. Never touch the app's Gemfile.

gem list rubycritic -i || gem install rubycritic
gem list brakeman -i || gem install brakeman          # Rails only
gem list bundler-audit -i || gem install bundler-audit
gem list ruby_audit -i || gem install ruby_audit

Then run them all and capture machine-readable output (commands validated on a real Rails 8 app):

OUT=$(mktemp -d)
rubycritic app --format json --no-browser --path "$OUT/rubycritic"   # --path keeps output files out of the audited repo
brakeman --format json --quiet -o "$OUT/brakeman.json"               # Rails only
bundle-audit update                                                  # update the db in a separate step:
bundle-audit check --format json > "$OUT/bundler-audit.json"         # --update mixed in pollutes the JSON on stdout
(cd "$OUT" && ruby-audit check)                                      # run OUTSIDE the app dir — inside a bundled
                                                                     # project the executable fails to load (Ruby 3.4+);
                                                                     # make sure the app's Ruby version is still active

If an engine fails, degrade gracefully: report which engine was skipped and why, and continue with the others.
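Graceful degradation can be sketched as a loop that records each failure instead of aborting. In this illustration every engine is a callable that returns its output or raises; the `run_engines` name and that calling convention are assumptions. Failure is modeled as an exception rather than a non-zero exit status, because scanners commonly exit non-zero simply because they found issues.

```ruby
# Run every engine, keep the outputs that succeeded, and record which engines
# were skipped and why. `engines` maps a name to a callable that returns the
# engine's output or raises (hypothetical convention for this sketch).
def run_engines(engines)
  ran = {}
  skipped = {}
  engines.each do |name, job|
    begin
      ran[name] = job.call
    rescue StandardError => e
      # Keep going: one broken engine must not sink the whole audit.
      skipped[name] = "#{e.class}: #{e.message}"
    end
  end
  { ran: ran, skipped: skipped }
end
```

The `skipped` hash is what feeds the "which engine was skipped and why" line in the report.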

Step 2 — Decide where to spend context (churn × complexity)

Use rubycritic's churn × complexity data to pick the hotspots: files that are both complex and change often. Those are the files you read deeply. Never review the codebase uniformly — a complex file nobody touches is a lower priority than a moderately complex file that changes every week.

Churn needs git history — verify it before trusting the quadrant. rubycritic derives churn from git log, so a shallow clone (--depth 1) or a freshly-initialized repo gives every file the same churn (1 or 0) and the quadrant silently collapses to noise — common exactly in the unfamiliar-repo case this skill targets. If churn is uniform across modules, say so and fall back to ranking by complexity and smell density (rubycritic wraps Reek/Flay/Flog — use the per-module smells and complexity, not just the score). When possible, unshallow first (git fetch --unshallow).
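The uniform-churn fallback can be illustrated as follows. This sketch works on plain hashes with `:path`, :`churn`, `:complexity`, and `:smells` keys, which are stand-ins chosen here and not rubycritic's exact JSON schema; multiplying churn by complexity is also just one simple reading of the quadrant.

```ruby
# Rank modules for deep reading. When every module has the same churn (shallow
# clone, fresh repo), the churn axis carries no signal, so fall back to
# complexity and smell count. Hash keys are illustrative assumptions.
def rank_hotspots(modules)
  churn_values = modules.map { |m| m[:churn] }.uniq

  if churn_values.size <= 1
    ranked = modules.sort_by { |m| [-m[:complexity], -m[:smells]] }
    { churn_reliable: false, ranked: ranked }
  else
    ranked = modules.sort_by { |m| -(m[:churn] * m[:complexity]) }
    { churn_reliable: true, ranked: ranked }
  end
end
```

The `churn_reliable` flag is the piece worth carrying into the report banner, so the reader knows which ranking was used.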

Step 3 — Triage security findings (you are the judge, not the scanner)

For every Brakeman warning and every bundler-audit CVE:

Read the actual code path. Confirm the finding is reachable in this app, or kill it as a false positive.
Explain the exploit path for confirmed findings: who can trigger it, from where, with what impact.
Adversarial verification before it enters the report. Security findings are held to a higher bar than quality findings: a weak quality finding gets ignored; a false security claim burns trust. If you cannot articulate the exploit path, downgrade it to "theoretical" — never present it as confirmed.

Treat Brakeman's confidence level (High/Medium/Weak) as a prior, not a verdict: a Weak warning you confirm by reading the code outranks a High warning on an unreachable path. And if config/brakeman.ignore exists, re-triage every silenced warning — in an unreviewed codebase, an ignore file often means "made CI pass", not "verified safe".
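To make "a prior, not a verdict" concrete, here is a small ordering sketch: reachability confirmed by reading the code sorts first, and Brakeman's confidence only breaks ties. The `TriagedWarning` struct and its fields are hypothetical names for this example, not Brakeman's output format.

```ruby
# A warning after manual triage. `reachable` is the judge's verdict from
# reading the code path; `confidence` is Brakeman's own label.
TriagedWarning = Struct.new(:check, :confidence, :reachable, keyword_init: true)

CONFIDENCE_PRIOR = { "High" => 0, "Medium" => 1, "Weak" => 2 }.freeze

# Confirmed-reachable warnings come first; confidence is only a tiebreaker.
def triage_order(warnings)
  warnings.sort_by do |w|
    [w.reachable ? 0 : 1, CONFIDENCE_PRIOR.fetch(w.confidence, 3)]
  end
end
```

With this ordering a Weak warning on a confirmed path lands above a High warning on an unreachable one.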

For dependency and Ruby-version CVEs, apply arsenal/cve.md to the bundler-audit and ruby_audit output — it covers deduping, severity priors, reachability triage, insecure gem sources, and the network cross-checks (day-zero and second-opinion) that close the advisory database's lag and coverage gaps.

Then bring the security arsenal to the routes file, the sensitive controllers and models, and the production config:

arsenal/authorization.md — IDOR, missing authorization, attack surface
arsenal/authentication.md — auth stack, Devise config, custom strategies, sessions
arsenal/secrets.md — committed keys and hardcoded credentials
arsenal/hardening.md — TLS, CSP, CSRF config, mounted dashboards, uploads
arsenal/api.md — JSON over-exposure, CORS, rate limiting, webhooks (skip if the app has no JSON endpoints)
arsenal/cryptography.md — encryption oracles, hand-rolled crypto, weak hashing, plaintext sensitive columns
arsenal/logging.md — sensitive data in logs, missing audit trail on security-critical events
arsenal/ai.md — prompt injection, LLM output rendered as XSS, PII leaked to model APIs, over-powered tool-use (skip if the app makes no LLM calls)
arsenal/ci-cd.md — pipeline security in .github/workflows/ / .gitlab-ci.yml / Jenkinsfile: pull_request_target running fork code, untrusted ${{ }} in shells, unpinned actions, long-lived cloud keys (skip if the repo has no CI config)

These checks cover what Brakeman can't reach. The arsenal covers those areas; it does not guarantee they are clean. Be explicit about that distinction in the report.

Step 4 — Quality review of the hotspots

Apply arsenal/code-quality.md — the thermo-nuclear standards, translated to the Rails idiom — to the hotspot files from Step 2. That check is the default quality bar for this skill: ambitious structural findings, not cosmetic nits.

Then apply arsenal/migrations.md to the recent migrations in db/migrate/ (read against db/schema.rb for table sizes) — the availability weapon no engine owns. A migration that locks or rewrites a large, traffic-heavy table, or drops/renames a column ahead of the code that uses it, is a scheduled outage, not a code smell: rank it accordingly.

Also apply arsenal/architecture.md to the app/ layer folders and the hotspots — dependency direction and cycles, which rubycritic's per-file scores miss: a model reaching up into the web layer, two namespaces that reference each other, or Zeitwerk name/path drift.

And apply arsenal/activerecord.md to app/models/ and the query-heavy hotspots — the correctness/integrity of the ActiveRecord calls themselves (a side effect in after_save racing the transaction, where(...).first with no order, has_many without dependent:), which the structural weapons don't judge.

And apply arsenal/jobs.md to app/jobs/ and the enqueue sites — background-job safety the engines can't model: a non-idempotent job that repeats its side effect on retry (double-charge), secrets/PII in job arguments (persisted in the queue store and shown on the dashboard), and records passed instead of ids.

Step 5 — The report (output)

This section governs what goes in the report and how it's judged — language, tone, ranking, and the closing summary. The visual structure it's rendered in — terminal-native plaintext, no markdown — lives in templates/_OUTPUT.md; the report must match that skeleton. Write the whole report in the user's language (see the top of this file).

Open with a short status banner — just write it, don't print a "Status banner" label. A few lines of orientation: project type and Rails/Ruby versions (note any substitution, e.g. running under a different Ruby), whether git history made the churn quadrant reliable, the engines that ran with headline counts, and one honest line on coverage (weapons cover, they don't guarantee). Open the TL;DR as a dry field report to the command, in keeping with the recon-campaign voice of the step announcements (English example, for tone only: "Field report: the position isn't on fire — but the back gate is unlocked, and that's where they get in."). Write it natively in the user's language so it actually lands; never translate the example literally. The campaign voice lives in the banner/TL;DR framing only; every finding stays sober and precise.

Right after the banner, open the body with the severity scoreboard — the triaged counts on one inline line, count before the label (never a markdown table; see templates/_OUTPUT.md):

🔴 1 Critical   🟠 4 High   🟡 6 Medium   🟢 5 Low

Dependency-risk totals and any end-of-life flag stay in the opening banner — don't repeat them here.
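As an illustration of the one-line scoreboard format, the sketch below renders the counts with the count before the label. The `scoreboard` function and its `labels:` parameter are assumptions for this example; the labels are passed in so they can be supplied in the user's language.

```ruby
SEVERITIES = [
  ["🔴", :critical],
  ["🟠", :high],
  ["🟡", :medium],
  ["🟢", :low]
].freeze

ENGLISH_LABELS = { critical: "Critical", high: "High", medium: "Medium", low: "Low" }.freeze

# Render the triaged counts on one inline line, count before label.
# Pass translated labels for non-English reports.
def scoreboard(counts, labels: ENGLISH_LABELS)
  SEVERITIES
    .map { |emoji, key| "#{emoji} #{counts.fetch(key, 0)} #{labels.fetch(key)}" }
    .join("   ")
end
```

A missing severity defaults to a count of 0, so the line keeps its four slots.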

Then the findings, as one list ranked by impact:

Confirmed exploitable security findings (an IDOR in a payments controller outranks everything).
CVEs actually reachable from the app's usage.
High-churn × high-complexity quality hotspots (a fat model that changes weekly outranks a theoretical warning).
Theoretical security warnings that survived triage but lack a demonstrated exploit path.
Remaining quality findings, worst first.
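The five-tier ordering above amounts to a two-key sort: tier first, severity within the tier. The sketch below is one possible encoding; the `Finding` struct, the tier symbols, and the severity tiebreak are illustrative assumptions.

```ruby
# Tiers in report order, highest impact first (names are illustrative).
TIER_RANK = %i[
  confirmed_exploit
  reachable_cve
  hotspot
  theoretical
  quality
].each_with_index.to_h.freeze

SEVERITY_RANK = { critical: 0, high: 1, medium: 2, low: 3 }.freeze

Finding = Struct.new(:title, :tier, :severity, keyword_init: true)

# One list, ranked by tier and then by severity within the tier.
def rank_findings(findings)
  findings.sort_by { |f| [TIER_RANK.fetch(f.tier), SEVERITY_RANK.fetch(f.severity)] }
end
```

Using `fetch` makes an unknown tier or severity fail loudly instead of sorting silently into the wrong place.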

Keep each finding tight, scannable, and in plain human language. Problem and Solution read for a non-expert stakeholder and an exhausted senior — everyday words, no method names or line refs. Lead with a one-line headline: severity + what it is, in plain language (no finding number, no effort estimate). Then the plain-language problem and the concrete fix. Push the technical specifics — method names, line refs, the reachable exploit path, and any churn/complexity metrics — into an optional Technical details field (see templates/_OUTPUT.md); a finding with nothing concrete to prove omits it. Show code only when a few lines make the point faster than words. No multi-paragraph exploit essays: a senior should grasp each finding in seconds and know the next move. When an issue recurs, cite the 2–5 strongest file:line locations and note "+ ~N similar", not every instance. Write the way Rails reads — friendly, direct, no ceremony. Don't print scaffolding labels ("Status banner", "Findings"); let the report flow.
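The recurring-issue rule (cite a few locations, summarize the rest) can be sketched as a tiny formatter. The `cite_locations` name is hypothetical, and the input is assumed to be already sorted strongest first.

```ruby
# Cite the strongest few file:line locations and summarize the remainder.
# `locations` is assumed to be sorted strongest first.
def cite_locations(locations, max: 5)
  shown = locations.first(max)
  remaining = locations.size - shown.size
  line = shown.join(", ")
  remaining.positive? ? "#{line} + ~#{remaining} similar" : line
end
```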

Close with the plan so the reader leaves with a next move (the scoreboard already opened the body, up top):

Fix now — ranked by leverage: the most risk or debt removed for the least change, not impact alone. A confirmed critical and a high-impact quick win (a version bump, a one-line config) both belong here; a broad, high-risk structural change drops to Fix next even when impactful. Don't estimate time — how long a fix takes depends on the team and their tooling; judge by scope and blast radius, and describe that in words. Tiebreakers: anything that unblocks other work (a verification baseline, characterization tests) floats up; a high-confidence security finding floats above an equivalent-leverage non-security one; prefer fixes with a clean way to verify them. Each a tight, parallel line — subject then the move — the "if you do nothing else" list.
Fix next — the remaining high/medium, grouped tersely.
Biggest structural multiplier — one line naming the single refactor that removes the most risk or debt at once.

If the scanners raised things you ruled out, say so — but as one compact line near the end (name them, a one-word reason each: false positive / not reachable / dev-only / wrong target), never a verbose section. The reader should know the noise was checked, not missed.

The report is sensitive. It enumerates live, confirmed vulnerabilities and their exploit paths — it belongs with the people who can fix the app, not a public issue tracker or an open channel. If findings are published anywhere shared or public, redact the security specifics (exploit path, credential location) first.

The whole thing reads like a plan a principal engineer would hand you, not a tool dump.

One command. Every risk in your Rails app, ranked by impact.

What it is

Nuke on Rails is an open-source skill for AI coding agents (Claude Code, Cursor, Codex, and more), not a gem you add to your Gemfile. It audits your Rails app the way a principal engineer would: what to refactor, what's vulnerable, and in what order to fix it.

Instead of stapling separate tool reports together, it returns a single list, ranked by impact. An IDOR in your payments controller outranks a fat model; a high-churn fat model outranks a theoretical warning.

Scanners list problems. Nuke on Rails decides the order.

Quick Start

Nuke on Rails ships through the skills CLI. You'll need Node.js.

1. Install the skills CLI:

npm install -g skills

2. Add Nuke on Rails (from your project root):

skills add nuke-on-rails/nuke-on-rails

It works across agents: Claude Code, Cursor, Codex, Gemini CLI, Warp, and more.

3. Run it inside your agent:

/nuke-on-rails

Zero setup beyond that. It installs its own tools, detects Rails vs. plain Ruby, runs everything, and hands you the plan. It never touches your Gemfile.

4. Update when you want the latest checks and fixes:

skills update nuke-on-rails

Why not just ask the agent to "review my code"?

You can, and it'll find something. But "review my Rails app" gives a different, shallower answer every time and skips everything a deterministic scanner catches. The difference:

	Asking an agent to "review my code"	Nuke on Rails
Scanning	The model eyeballs whatever files it happens to read	Brakeman parses 100% of the AST; bundler-audit and ruby_audit check every locked gem
Reproducible	A different answer every run	Deterministic engines plus a fixed methodology
Where it looks	Wherever the model wanders, until context runs out	Churn × complexity picks the hotspots that actually matter
CVEs & EOL	Bounded by the training cutoff; can't know yesterday's CVE	Live advisory DB, day-zero web cross-checks, end-of-life detection
False positives	Confidently reports plausible-but-wrong issues	Every security finding adversarially verified; unprovable ones flagged "theoretical"
Coverage	Whatever it remembers to check that day	A fixed OWASP Top 10 arsenal, every run
Output	A wall of prose	One list ranked by impact, with a fix-now plan

The LLM still does the part it's good at: reading code paths, explaining exploits, judging severity. It just doesn't do it alone, from memory, and unprioritized.

What it catches

Coverage maps to the OWASP Top 10 2025. Each area is a weapon in the arsenal: a plain-markdown check the audit applies on top of the scanners.

Records loaded by id without ownership scoping (the canonical payments / orders / invoices case)
Authorization missing where authentication exists (logged-in is not allowed-to)
Mass assignment: permit!, role escalation, nested attributes, raw-Hash bypass
Records leaked through form dropdowns and serializers
Cross-tenant leaks in multi-tenant apps; routes exposing actions that shouldn't be public
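The first item in that list is easiest to see side by side. The sketch below uses an in-memory array as a stand-in for the database so it runs without Rails; in a real app the two shapes typically look like `Invoice.find(params[:id])` versus `current_user.invoices.find(params[:id])`. The `Invoice` struct and both helper names are made up for illustration.

```ruby
Invoice = Struct.new(:id, :user_id, :total)

# In-memory stand-in for the invoices table.
INVOICES = [
  Invoice.new(1, 10, 100),
  Invoice.new(2, 20, 250)
].freeze

# Vulnerable shape: any logged-in user can load any invoice by guessing an id.
def find_unscoped(id)
  INVOICES.find { |invoice| invoice.id == id }
end

# Safe shape: the lookup is restricted to records the current user owns.
def find_scoped(current_user_id, id)
  INVOICES.find { |invoice| invoice.user_id == current_user_id && invoice.id == id }
end
```

User 10 asking for invoice 2 gets another user's record from `find_unscoped` and `nil` from `find_scoped`.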

Side effects in after_save that race the transaction (belongs in after_commit)
where(...).first with no order (non-deterministic results)
has_many without dependent:, and polymorphic associations with no FK integrity
count > 0 / present? for boolean checks; all.each over large tables

Prompt injection: user input or retrieved (RAG) content fed to the model as if it were instructions
LLM output rendered with raw / html_safe (stored XSS the scanners can't see), or piped into eval / SQL / system
PII and secrets sent into prompts to third-party model APIs without redaction
Over-powered tool/function-calling: SSRF, DB, or shell reach with no allowlist or human-in-the-loop
No rate or cost ceiling on LLM-backed endpoints (DoS and wallet-drain)

JSON over-exposure (render json: leaking token digests, role flags, PII)
Missing pagination (table dump and self-DoS); CORS wildcard with credentials; tokens in query strings
Exception leakage; unverified webhooks
GraphQL introspection and unbounded query depth/complexity
XXE and entity expansion
OAuth redirect_uri, state, and scope flaws

Dependency direction: a model or service reaching up into the web layer (params, render, *Controller)
Dependency cycles: two namespaces that reference each other and became one unit
One-shot object ceremony (Foo.new(x).call) and inconsistent component entry points
Zeitwerk name/path drift (billing/charge.rb not defining Billing::Charge)

Devise misconfig: user enumeration, no lockout, sessions that never expire, weak password policy
Session fixation and missing cookie flags (secure / httponly / SameSite)
Timing attacks and type-juggling on token and credential lookups
Tokens stored in plaintext or without expiry; rate-limit / throttle bypass
Custom Warden strategy bugs, scope confusion, impersonation gaps; JWT pitfalls (none alg, no expiry)

Non-idempotent jobs that repeat the side effect on retry (double-charge)
Secrets / PII in job arguments (persisted in the queue store, shown on the dashboard)
Records passed instead of ids (stale data, deserialization failures)
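A retry-safe job usually guards its side effect with an idempotency key and takes ids rather than records. The plain-Ruby sketch below shows the idea without a queue library; `ChargeJob`, the ledger array, and the in-memory set of processed keys are all hypothetical stand-ins. It is single-threaded: a real implementation would need the check-and-mark step to be atomic, for example through a unique database index.

```ruby
require "set"

# A job whose side effect runs at most once per idempotency key.
class ChargeJob
  def initialize(ledger:, processed: Set.new)
    @ledger = ledger       # stand-in for the payment provider
    @processed = processed # stand-in for a durable store of handled keys
  end

  # Takes an id and a key, never a full record or a secret.
  def perform(order_id, idempotency_key)
    return :skipped if @processed.include?(idempotency_key)

    @ledger << order_id    # the side effect that must not repeat on retry
    @processed << idempotency_key
    :charged
  end
end
```

Running `perform` twice with the same key charges once; the second call returns `:skipped`.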

pull_request_target running a fork's code with the repo's secrets (RCE / exfiltration)
Untrusted ${{ }} (PR title, branch) interpolated into a run: shell (script injection)
Third-party actions pinned to a moving tag instead of a SHA (supply chain)
Long-lived cloud keys as CI secrets instead of OIDC; over-broad GITHUB_TOKEN permissions

Fat models
Callback-driven workflows
Rug concerns
Spaghetti branching
N+1 queries
The churn × complexity hotspots

force_ssl / HSTS off; backing-service traffic (Postgres, Redis) in cleartext
CSP missing or disabled; CSRF skipped on cookie-authenticated actions; host-header injection
Unauthenticated mounted dashboards (Sidekiq, PgHero, Flipper)
Debug / console gems shipped to production (a remote-code-execution surface)
Stack traces to users, unsafe uploads, stored XSS via markdown rendering
SSRF: a user-supplied URL fetched server-side (cloud metadata, internal services)

Encryption oracles (one crypto routine reused for trust tokens and user data)
Hand-rolled crypto instead of Rails primitives; static IVs; unauthenticated cipher modes
Weak password hashing (MD5/SHA); sensitive columns (CPF, SSN, bank, health) stored in plaintext

Known CVEs in your gems and in the Ruby version itself
JavaScript dependency advisories
Insecure or unpinned gem sources
End-of-life Ruby or Rails (a critical compliance finding even with zero open CVEs)

Sensitive data in logs (filter gaps, puts / logger dumps, unscrubbed error-tracker breadcrumbs)
PII sent to third-party and LLM calls
No audit trail on login, payment, privilege, and admin actions
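For the first item, the usual Rails mechanism is `config.filter_parameters`; the plain-Ruby sketch below only illustrates the underlying idea of scrubbing by key name before anything reaches a log line. The `scrub` function and the key pattern are assumptions for this example and are far from an exhaustive list of sensitive keys.

```ruby
# Keys whose values should never be logged (illustrative, not exhaustive).
SENSITIVE_KEYS = /password|token|secret|ssn|cpf|card/i

# Recursively replace values under sensitive keys before logging a payload.
def scrub(value, key = nil)
  case value
  when Hash
    value.to_h { |k, v| [k, scrub(v, k)] }
  when Array
    value.map { |v| scrub(v, key) }
  else
    key.to_s.match?(SENSITIVE_KEYS) ? "[FILTERED]" : value
  end
end
```

A filter gap in this model is simply a sensitive key the pattern does not match, which is the kind of gap the audit looks for.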

Schema changes that lock or rewrite large tables (null: false without default, non-concurrent indexes, type changes, foreign keys validated in one shot)
Data backfills inside DDL migrations
Deploy-ordering hazards: columns dropped or renamed ahead of the code that uses them (expand/contract)
Irreversible, non-rollback-safe migrations; missing indexes on foreign