Auto-claude-code-research-in-sleep

by wanshuiyin

ARIS ⚔️ (Auto-Research-In-Sleep) — Claude Code skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation via Codex MCP

298 stars
32 forks
Added 3/12/2026
Tags: AI Agents, ai-research, ai-tools, aris, autonomous-agent, claude, claude-code, claude-code-skills, codex, deep-learning, gpt, idea-generation, llm, machine-learning, mcp, mcp-server, ml-research, openai, paper-review, paper-writing, research-automation
Installation
# Add to your Claude Code skills
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep
README.md

Auto-claude-code-research-in-sleep (ARIS ⚔️)

🌙 Let Claude Code do research while you sleep. Wake up to find your paper scored, weaknesses identified, experiments run, and narrative rewritten — autonomously.

Custom Claude Code skills for autonomous ML research workflows. These skills orchestrate cross-model collaboration — Claude Code drives the research while an external LLM (via Codex MCP) acts as a critical reviewer. 🔀 Also supports alternative model combinations (e.g., GLM + GPT, GLM + MiniMax) — no Claude API required.

💭 Why not self-play with a single model? Using Claude Code subagents or agent teams for both execution and review is technically possible, but tends to fall into local minima — the same model reviewing its own patterns creates blind spots. Claude Code's strength is fast, fluid execution; Codex (GPT-5.4 xhigh) is slower but more deliberate and rigorous in critique. These complementary styles — speed × rigor — produce better outcomes than either model talking to itself.

📈 Score Progression (Real Run)

A real overnight 4-round run on an ML research project, from borderline reject to submission-ready:

| Round | Score | What Happened |
|-------|-------|---------------|
| Initial | 5.0/10 | Borderline reject |
| Round 1 | 6.5/10 | Added standard metrics, discovered metric decoupling |
| Round 2 | 6.8/10 | Key claim failed to reproduce, pivoted narrative |
| Round 3 | 7.0/10 | Large seed study killed main improvement claim |
| Round 4 | 7.5/10 ✅ | Diagnostic evidence solidified, submission ready |

The loop autonomously ran 20+ GPU experiments, rewrote the paper's narrative framing, and killed claims that didn't hold up — all without human intervention.
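
The control flow of such a review loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the skills' actual implementation: `Review`, `review_loop`, `target_score`, and `max_rounds` are hypothetical names, and the `review` and `revise` callables stand in for the external reviewer and for Claude Code respectively. The demo replays the scores from the table above through a scripted stub reviewer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Review:
    score: float            # reviewer's score out of 10
    weaknesses: List[str]   # issues the reviewer wants addressed


def review_loop(
    draft: str,
    review: Callable[[str], Review],        # hypothetical: external reviewer model
    revise: Callable[[str, Review], str],   # hypothetical: executor applying fixes
    target_score: float = 7.5,              # assumed stopping threshold
    max_rounds: int = 4,                    # assumed round budget
) -> Tuple[str, List[float]]:
    """Alternate review and revision until the score reaches the target."""
    result = review(draft)                  # initial assessment, before any revision
    history = [result.score]
    for _ in range(max_rounds):
        if result.score >= target_score:
            break
        draft = revise(draft, result)       # executor addresses the weaknesses
        result = review(draft)              # reviewer re-scores the revision
        history.append(result.score)
    return draft, history


# Demo with stubs: the scripted scores are the ones reported in the table above.
scripted_scores = iter([5.0, 6.5, 6.8, 7.0, 7.5])


def stub_review(text: str) -> Review:
    return Review(score=next(scripted_scores), weaknesses=["placeholder weakness"])


def stub_revise(text: str, feedback: Review) -> str:
    return text + " [revised]"


final_draft, score_history = review_loop("draft v0", stub_review, stub_revise)
print(score_history)
```

Keeping the reviewer and the executor behind separate callables mirrors the cross-model split described earlier: either side can be swapped for a different model without touching the loop itself.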

💡 Idea Discovery (New)

Don't have a concrete idea yet? Just give a research direction — /idea-creator handles the rest:

  1. 📚 Survey the landscape (recent papers, open problems, recurring limitations)
  2. 🧠 Brainstorm 8-12 concrete ideas via GPT-5.4 xhigh
  3. 🔍 Filter by feasibility, compute cost, and quick novelty search
  4. 🛡️ Validate top ideas with deep novelty check + devil's advocate review
  5. 🧪 Pilot top 2-3 ideas in parallel on different GPUs (30 min - 2 hr each)
  6. 🏆 Rank by empirical signal — ideas with positive pilot results rise to the top

The output is a ranked IDEA_REPORT.md with hypotheses, pilot results, reviewer objections, and a suggested execution order. Ideas that fail are documented too, so the same dead ends are not explored again later.
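
The filter-then-rank logic of the steps above can be sketched as follows. This is a minimal illustration under assumptions, not the format /idea-creator actually produces: the `Idea` fields, `rank_ideas`, `render_report`, the report headings, and the choice to sort unpiloted ideas as a neutral 0.0 are all hypothetical, and the idea titles are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Idea:
    title: str
    feasible: bool                # passed the feasibility and compute-cost filter
    novel: bool                   # passed the novelty check
    pilot_gain: Optional[float]   # hypothetical pilot metric delta; None if not piloted


def rank_ideas(ideas: List[Idea]) -> Tuple[List[Idea], List[Idea]]:
    """Split ideas into ranked survivors and rejects that are still documented."""
    survivors = [i for i in ideas if i.feasible and i.novel]
    rejected = [i for i in ideas if not (i.feasible and i.novel)]
    # Positive pilot signal rises to the top; unpiloted ideas count as 0.0 (assumption).
    ranked = sorted(
        survivors,
        key=lambda i: i.pilot_gain if i.pilot_gain is not None else 0.0,
        reverse=True,
    )
    return ranked, rejected


def render_report(ranked: List[Idea], rejected: List[Idea]) -> str:
    """Render a small markdown report; the section names are illustrative."""
    lines = ["# Idea Report", "", "## Ranked ideas"]
    for position, idea in enumerate(ranked, start=1):
        signal = "not piloted" if idea.pilot_gain is None else f"{idea.pilot_gain:+.2f}"
        lines.append(f"{position}. {idea.title} (pilot signal: {signal})")
    lines += ["", "## Rejected ideas"]
    lines += [f"- {idea.title}" for idea in rejected]
    return "\n".join(lines)


# Placeholder ideas, made up purely for illustration.
candidates = [
    Idea("Idea A", feasible=True, novel=True, pilot_gain=0.10),
    Idea("Idea B", feasible=True, novel=True, pilot_gain=-0.20),
    Idea("Idea C", feasible=False, novel=True, pilot_gain=None),
]
ranked, rejected = rank_ideas(candidates)
print(render_report(ranked, rejected))
```

Listing rejected ideas in their own section reflects the point made above that failed ideas stay on record rather than being silently dropped.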


🔄 Workflows

These skills compose into a full research lifecycle. The two workflows can be used independently or chained together:

  • Exploring a new area (e.g., writing a survey)? Start with Workfl...