AutoR takes a research goal, runs a fixed 8-stage pipeline with Claude Code, and requires explicit human approval after every stage before the workflow can continue.
# Add to your Claude Code skills
git clone https://github.com/AutoX-AI-Labs/AutoR

AutoR is not a chat demo, not a generic agent framework, and not a markdown-only research toy.
It is a research execution loop: goal -> literature -> hypothesis -> design -> implementation -> experiments -> analysis -> paper -> dissemination, with explicit human control at every stage and real artifacts on disk.
Most AI research demos stop at "the model wrote a plausible summary."
AutoR is built around a harder standard: the system should leave behind a run directory that another person can inspect, resume, audit, and critique.
| AutoR does | Why it matters |
| --- | --- |
| Fixed 8-stage research workflow | The system behaves like a real research process instead of a free-form chat loop. |
| Mandatory human approval after every stage | AI executes; humans retain control at high-leverage decision points. |
| Full run isolation under runs/<run_id>/ | Prompts, logs, stage outputs, code, figures, and papers are all auditable. |
| Draft -> validate -> promote for stage summaries | Half-finished summaries do not silently become official stage records. |
| Artifact-aware validation | Later stages must produce data, results, figures, LaTeX, PDF, and review assets, not just prose. |
| Resume and redo-stage support | Long runs are recoverable and partially repeatable. |
| Stage-local conversation continuation | Refinement improves the current stage instead of constantly resetting context. |
| Venue-aware writing stage | Stage 07 can target lightweight conference or journal-style paper packaging without pretending to be a full submission system. |
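
The "draft -> validate -> promote" row can be pictured with a minimal sketch. Everything named here is an assumption made for illustration: the `draft.md` / `final.md` file names, the required headings, and the `validate_summary` / `promote_if_valid` helpers are hypothetical and do not describe AutoR's actual layout or API.

```python
import tempfile
from pathlib import Path
from typing import List

# Hypothetical required headings; AutoR's real validation rules may differ.
REQUIRED_HEADINGS = ("# Summary", "# Artifacts")


def validate_summary(text: str) -> List[str]:
    """Return a list of problems; an empty list means the draft is valid."""
    problems = [f"missing heading: {h}" for h in REQUIRED_HEADINGS if h not in text]
    if not text.strip():
        problems.append("draft is empty")
    return problems


def promote_if_valid(draft: Path, final: Path) -> List[str]:
    """Copy the draft to the final path only when validation passes."""
    text = draft.read_text(encoding="utf-8")
    problems = validate_summary(text)
    if not problems:
        final.write_text(text, encoding="utf-8")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    stage_dir = Path(tmp)
    draft, final = stage_dir / "draft.md", stage_dir / "final.md"

    # A half-finished draft fails validation and is not promoted.
    draft.write_text("# Summary\nhalf-finished", encoding="utf-8")
    first_attempt = promote_if_valid(draft, final)
    promoted_after_first = final.exists()

    # A complete draft passes and becomes the official record.
    draft.write_text("# Summary\ndone\n# Artifacts\nresults.json", encoding="utf-8")
    second_attempt = promote_if_valid(draft, final)
    promoted_after_second = final.exists()
```

The point of the separation is that a reader of the final file never sees a summary that failed validation; the draft can be rewritten any number of times first.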
Approved stage summaries are appended to runs/<run_id>/memory.md; failed attempts are not.

AutoR already has a full example run used throughout the repository: runs/20260330_101222.
That run produced:
Highlighted outcomes from that run:
- AGSNv2 reached 36.21 ± 1.08 on Actor

AutoR is designed for terminal-first execution, but the interaction layer is not limited to raw logs and plain prompts. The current UI supports banner-style startup, colored stage panels, parsed Claude event streams, wrapped markdown summaries, and a menu-driven approval loop suitable for demos and recordings.
The example run is interesting not because the AI was left alone, but because the human intervened at critical moments:
That is the intended shape of AutoR: AI handles execution load; humans steer the research when direction actually matters.
PATH for real runs

python main.py
python main.py --goal "Your research goal here"
python main.py --fake-operator --goal "Smoke test"
python main.py --model sonnet
python main.py --model opus
python main.py --venue neurips_2025
python main.py --venue nature
python main.py --venue jmlr
If --venue is omitted, AutoR defaults to neurips_2025.
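
For readers who want to see how flags like these fit together, here is a minimal `argparse` sketch. The flag names and the `neurips_2025` default are taken from the commands in this section; the `build_parser` function, the help strings, and the absence of any `choices` restriction are assumptions, and this is not a copy of AutoR's own `main.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--goal", help="research goal text")
    parser.add_argument("--fake-operator", action="store_true",
                        help="smoke-test mode, as in the example command")
    parser.add_argument("--model", help="model alias, e.g. sonnet or opus")
    # The README states that an omitted --venue falls back to neurips_2025.
    parser.add_argument("--venue", default="neurips_2025")
    parser.add_argument("--resume-run", help="run id, or 'latest'")
    parser.add_argument("--redo-stage", help="stage identifier, e.g. 03")
    return parser


# Parse two of the example invocations from an explicit argument list.
smoke = build_parser().parse_args(["--fake-operator", "--goal", "Smoke test"])
resumed = build_parser().parse_args(["--resume-run", "latest", "--venue", "jmlr"])
```

Passing an explicit list to `parse_args` keeps the sketch testable without touching `sys.argv`.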
python main.py --resume-run latest
python main.py --resume-run 20260329_210252 --redo-stage 03
Valid stage identifiers include 03, 3, and 03_study_design.
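
Accepting `03`, `3`, and `03_study_design` for the same stage implies some normalization step. The sketch below shows one way that could work; `normalize_stage` is a hypothetical helper, and only the eight stage names come from this README.

```python
STAGES = [
    "01_literature_survey",
    "02_hypothesis_generation",
    "03_study_design",
    "04_implementation",
    "05_experimentation",
    "06_analysis",
    "07_writing",
    "08_dissemination",
]


def normalize_stage(identifier: str) -> str:
    """Map '3', '03', or '03_study_design' to the full stage name."""
    token = identifier.strip()
    if token in STAGES:
        return token
    if token.isdecimal():
        number = int(token)  # int("03") and int("3") both give 3
        if 1 <= number <= len(STAGES):
            return STAGES[number - 1]
    raise ValueError(f"unknown stage identifier: {identifier!r}")


resolved = [normalize_stage(s) for s in ("03", "3", "03_study_design")]
```

Rejecting anything outside 1 to 8 with an error, rather than guessing, keeps a mistyped `--redo-stage` from silently rerunning the wrong stage.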
AutoR uses a fixed 8-stage pipeline:
- 01_literature_survey
- 02_hypothesis_generation
- 03_study_design
- 04_implementation
- 05_experimentation
- 06_analysis
- 07_writing
- 08_dissemination

flowchart TD
A[Start or resume run] --> S1[01 Literature Survey]
S1 --> H1{Human approval}
H1 -- Refine --> S1
H1 -- Approve --> S2[02 Hypothesis Generation]
H1 -- Abort --> X[Abort]
S2 --> H2{Human approval}
H2 -- Refine --> S2
H2 -- Approve --> S3[03 Study Design]
H2 -- Abort --> X
S3 --> H3{Human approval}
H3 -- Refine --> S3
H3 -- Approve --> S4[04 Implementation]
H3 -- Abort --> X
S4 --> H4{Human approval}
H4 -- Refine --> S4
H4 -- Approve --> S5[05 Experimentation]
H4 -- Abort --> X
S5 --> H5{Human approval}
H5 -- Refine --> S5
H5 -- Approve --> S6[06 Analysis]
H5 -- Abort --> X
S6 --> H6{Human approval}
H6 -- Refine --> S6
H6 -- Approve --> S7[07 Writing]
H6 -- Abort --> X
S7 --> H7{Human approval}
H7 -- Refine --> S7
H7 -- Approve --> S8[08 Dissemination]
H7 -- Abort --> X
S8 --> H8{Human approval}
H8 -- Refine --> S8
H8 -- Approve --> Z[Run complete]
H8 -- Abort --> X
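
The approve / refine / abort gate in the flowchart above can be sketched as a small loop. This is an illustration under stated assumptions, not AutoR's implementation: `run_pipeline`, the `decide` callback, and the string log are hypothetical, and the real system runs Claude Code where this sketch only records an attempt.

```python
from typing import Callable, List

STAGES = [
    "01_literature_survey", "02_hypothesis_generation", "03_study_design",
    "04_implementation", "05_experimentation", "06_analysis",
    "07_writing", "08_dissemination",
]


def run_pipeline(decide: Callable[[str, int], str]) -> List[str]:
    """Run stages in order; decide(stage, attempt) returns
    'approve', 'refine', or 'abort' (the human gate)."""
    log: List[str] = []
    for stage in STAGES:
        attempt = 1
        while True:
            log.append(f"{stage}#{attempt}")  # stand-in for executing the stage
            decision = decide(stage, attempt)
            if decision == "approve":
                break                          # move on to the next stage
            if decision == "abort":
                log.append("aborted")
                return log
            if decision != "refine":
                raise ValueError(f"unexpected decision: {decision!r}")
            attempt += 1                       # refine: rerun the same stage
    log.append("complete")
    return log


def demo_decide(stage: str, attempt: int) -> str:
    # Refine the literature survey once, then approve everything.
    if stage == "01_literature_survey" and attempt == 1:
        return "refine"
    return "approve"


history = run_pipeline(demo_decide)
```

No stage can be skipped in this shape: the only way to reach the next stage is an explicit approval of the current one.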
flowchart TD
A[Build prompt from template + goal + memory + optional feedback] --> B[Start or resume stage session]
B --> C[Claude writes draft stage summary]
C --> D[Validate markdown and required artifacts]
D --> E{Valid?}
E -- No --> F[Repair, normalize, or rerun current stage]
F --> A
E -- Yes --> G[Promote draft to final stage summary]
G --> H{Human choice}
H -- 1 or 2 or 3 --> I[Continue current stage conversation with AI refinement]
I --> A
H -- 4 --> J[Continue current stage conversation with custom feedback]
J --> A
H -- 5 --> K[Append approved summary to memory.md]
K --> L[Continue to next stage]
H -- 6 --> X[Abort]
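
The human-choice branch at the bottom of the flowchart above maps six menu entries onto three outcomes. A minimal sketch follows, assuming a hypothetical `StageState` container and `handle_choice` function; the in-memory `memory` list merely stands in for `memory.md`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class StageState:
    summary: str                  # validated, promoted stage summary
    suggestions: List[str]        # the AI's three refinement suggestions
    memory: List[str] = field(default_factory=list)  # stand-in for memory.md


def handle_choice(state: StageState, choice: str,
                  custom_feedback: str = "") -> Tuple[str, Optional[str]]:
    """Return (action, feedback) for a menu choice '1' through '6'."""
    if choice in {"1", "2", "3"}:
        # Continue the same stage using one of the AI's suggestions.
        return ("refine", state.suggestions[int(choice) - 1])
    if choice == "4":
        if not custom_feedback.strip():
            raise ValueError("choice 4 needs custom feedback")
        return ("refine", custom_feedback)
    if choice == "5":
        # Only approved summaries are appended to memory.
        state.memory.append(state.summary)
        return ("next_stage", None)
    if choice == "6":
        return ("abort", None)
    raise ValueError(f"unknown menu choice: {choice!r}")


state = StageState("Stage 03 summary", ["add baseline", "more seeds", "ablate"])
refined = handle_choice(state, "2")
memory_before_approval = list(state.memory)
approved = handle_choice(state, "5")
```

Both refinement paths return the same `refine` action, which matches the flowchart: choices 1 to 4 all feed back into prompt building for the current stage.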
- 1 / 2 / 3: continue the same stage conversation using one of the AI's refinement suggestions
- 4: continue the same stage conversation with custom user feedback
- 5: approve and contin