by fcakyon
PhD Research Skills for Claude Code: paper reproduction, experiment design, paper review, result comparison and more.
# Add to your Claude Code skills
git clone https://github.com/fcakyon/phd-skills
Catch AI mistakes before they cost weeks of compute. Reproduce papers from arxiv. Debug runs evidence-first. Compare experiments at the right epoch. Launch with discipline.
Built by Fatih Cagatay Akyon (1500+ citations, 7 patents) after 300+ Claude Code sessions, dozens of critical AI mistakes caught the hard way, and thousands of hours of PhD research. Every guardrail in this plugin traces to a real mistake.
Claude Code is powerful, but it makes research-specific mistakes that cost weeks of compute:
- rm -rf on a path it had hallucinated from memory, lost local checkpoints

Other plugins give you more commands. This plugin gives you guardrails.
claude plugin marketplace add fcakyon/phd-skills
claude plugin install phd-skills@phd-skills
The plugin works correctly the moment it is installed. Optional: run /phd-skills:setup for a 30-second tour of what was auto-detected and to opt into extras (notifications, allowlist, LaTeX).
Open Claude Code in your project directory, then:
- /phd-skills:reproduce arxiv 2508.12345: reproduce a paper from arxiv URL through replication runs
- "why is my loss diverging?": the debug skill auto-triggers, runs evidence-first probes
- "compare run alpha to baseline": the compare skill auto-triggers, aligns at the same epoch
- "launch the new training run": the launch skill auto-triggers, runs the pre-flight checklist
- /loop 30m check experiment logs, notify me if metrics beat the baseline or if loss starts to diverge

Notifications (task completion, background agents) forward to ntfy / Slack / email after /phd-skills:setup.
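
To illustrate what "aligns at the same epoch" can mean in practice, here is a minimal Python sketch. It is not the plugin's implementation: the log format (a dict mapping epoch to metric value), the function name `compare_at_common_epoch`, and the sample numbers are all assumptions made up for this example. The idea is to compare two runs only at the latest epoch both have actually reached, rather than comparing a partially trained run against a finished baseline.

```python
from typing import Dict, Optional, Tuple

# Hypothetical log format (an assumption): {epoch: metric_value} for each run.
RunLog = Dict[int, float]


def compare_at_common_epoch(
    run: RunLog, baseline: RunLog
) -> Optional[Tuple[int, float, float]]:
    """Return (epoch, run_value, baseline_value) at the latest epoch both runs logged."""
    common = set(run) & set(baseline)
    if not common:
        return None  # no shared epoch, so no fair comparison is possible
    epoch = max(common)
    return epoch, run[epoch], baseline[epoch]


# Made-up numbers: "alpha" is still training, the baseline has finished.
alpha = {1: 0.52, 2: 0.61, 3: 0.66}
baseline = {1: 0.50, 2: 0.60, 3: 0.64, 4: 0.68, 5: 0.70}

print(compare_at_common_epoch(alpha, baseline))  # (3, 0.66, 0.64)
```

Comparing alpha's epoch-3 value against the baseline's final epoch-5 value would make alpha look worse than it is at equal training budget, which is the kind of mistake the compare skill is described as guarding against.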
| Command | What it does |
| ----------------------------------------------------------- | ---------------------------------------------------------- |
| /phd-skills:xray | Audit paper against code and data (5 parallel dimensions) |
| /phd-skills:factcheck | Verify BibTeX entries and cited claims against DBLP |
| /phd-skills:gaps <topic> | Literature gap analysis with web confirmation |
| /phd-skills:fortify [venue] | Select strongest ablations + anticipate reviewer questions |
| /phd-skills:setup | Auto-detection tour + optional extras |
| /phd-skills:help | Show all features at a glance |
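
As a rough idea of what checking a BibTeX entry against DBLP could involve, the Python sketch below builds a query URL for DBLP's public publication search endpoint and compares titles after normalization. How /phd-skills:factcheck actually works is not described here, so treat this as an assumption-laden illustration: the helper names `dblp_query_url`, `normalize_title`, and `titles_match` are hypothetical, and the example title is a placeholder. No network request is made.

```python
import re
from urllib.parse import urlencode

# DBLP's publication search endpoint; q, format, and h (max hits) are its query parameters.
DBLP_SEARCH = "https://dblp.org/search/publ/api"


def dblp_query_url(title: str, max_hits: int = 5) -> str:
    """Build a DBLP search URL for a title (hypothetical helper, no request is sent)."""
    params = {"q": title, "format": "json", "h": max_hits}
    return f"{DBLP_SEARCH}?{urlencode(params)}"


def normalize_title(title: str) -> str:
    """Lowercase, drop BibTeX braces and punctuation, collapse whitespace."""
    title = re.sub(r"[{}]", "", title).lower()
    title = re.sub(r"[^a-z0-9 ]+", " ", title)
    return " ".join(title.split())


def titles_match(bib_title: str, dblp_title: str) -> bool:
    """True when two titles are equal after normalization."""
    return normalize_title(bib_title) == normalize_title(dblp_title)


print(dblp_query_url("An Example Paper Title"))
print(titles_match("{A}n Example Paper Title", "An example paper title."))  # True
```

Exact matching after normalization is deliberately strict. A real check would likely also compare authors, year, and venue, and tolerate small title differences.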
| When you say... | Skill activates |
| ------------------------------------------------- | ------------------- |
| "reproduce this arxiv paper" | Reproduce |
| "why is X failing / diverging / OOMing" | Debug |
| "compare run A to baseline" | Compare |
| "launch a new training run" / "kick off training" | Launch |
| "design an ablation study" | Experiment Design |
| "find related papers on X" | Literature Research |
| "check if my numbers match the code" | Paper Verification |
| "review my methods section for consistency" | Paper Writing |
| "analyze dataset bias" | Dataset Curation |
| "prepare code for open-source release" | Research Publishing |
| "what will reviewers ask about this?" | Reviewer Defense |
| "setup latex for CVPR" | LaTeX Setup |
| Agent | What it does | Special |
| ------------------------------------------------------------- | -------------------------------------------------------------------- | ------------------------------------------------------------- |
| paper-auditor | Cross-checks paper claims vs code and data | Runs in isolated worktree, remembers patterns across sessions |
| experiment-analyzer | Analyzes results from wandb / neptune / tensorboard / mlflow / local | Hands off to compare and debug skills for discipline |
| What it catches |
| ------------------------------------------------------------------------------ |
| Conclusions reviewed against actual artifacts by a fresh-context research peer |
| In-place edits to git-tracked source over SSH |
| Unverified commands or paths in outbound teammate messages |
| Project-internal jargon shapes in commits and docs |
| Timezone tokens that do not match the system clock |
| Pre-flight checklist on long ML training launches |
| Fabricated paths in destructive commands (rm / mv / dd / force-push) |
| Missing citation verification when editing .tex/.bib |
| LaTeX compilation errors after .tex edits |
| Unreviewed generated images/figures |
| Research state loss before context overflow |
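
The "fabricated paths in destructive commands" check can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the plugin's hook: the function name `missing_paths` is hypothetical, only rm and mv are handled (dd operands and force-push are out of scope), and every non-flag argument is treated as a path. The point is simply to confirm a path exists on disk before a destructive command touches it.

```python
import os
import shlex
from typing import List

# Assumed subset of destructive commands covered by this sketch.
DESTRUCTIVE = {"rm", "mv"}


def missing_paths(command: str) -> List[str]:
    """Return path arguments of a destructive command that do not exist on disk."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] not in DESTRUCTIVE:
        return []  # not a command this sketch guards
    # Treat every non-flag argument as a path.
    args = [t for t in tokens[1:] if not t.startswith("-")]
    if tokens[0] == "mv" and len(args) > 1:
        args = args[:-1]  # the mv destination may legitimately not exist yet
    return [a for a in args if not os.path.exists(os.path.expanduser(a))]


# A path recalled "from memory" that is not actually on disk gets flagged.
print(missing_paths("rm -rf /definitely/not/a/real/path_xyz"))
```

A non-empty result is a signal to stop and verify the path (for example by listing the parent directory) instead of running the command.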
| | phd-skills | flonat/claude-research | Others |
| ----------------------------------------- | --------------------------- | ---------------------------- | ------------ |
| Commands to learn | 6 | 39 | 13-20 |
| Research integrity hooks | 11 (agent + 10 auto-detect) | 1 | 0 |
| Paper reproduction (arxiv to runs) | Yes (7-stage skill) | No | No |
| Paper-code consistency audit | 5-dimension parallel | Read-only, no code cross-ref | None |
| Experiment monitoring + SSH notifications | Yes (ntfy / slack / email) | No | No |
| External dependencies | None | npm + pip + MCP servers | MCP required |
| Install time | 30 seconds | 10+ minutes | Varies |
MIT. Use it, fork it, adapt it to your research.