AI handles execution, humans own the direction, and every run becomes an inspectable research artifact on disk.
# Add to your Claude Code skills
git clone https://github.com/AutoX-AI-Labs/AutoR

AutoR is not a chat demo, not a generic agent framework, and not a markdown-only research toy.
It is a structured research harness over a coding agent execution layer: AI handles execution, humans own the direction, and every run becomes an inspectable research artifact on disk.
New users should start with the step-by-step guides: English Guide or 中文教程 (Chinese tutorial).
Most autoresearch systems optimize for autonomy.
AutoR takes a different position: research is too important to hand over as a blind end-to-end loop. The goal is not to remove humans from research. The goal is to give them a stronger execution system.
| Dimension | AutoR |
| --- | --- |
| Execution model | A coding agent as the execution layer, AutoR as the research control loop |
| Control model | Human approval by default, with an optional strict reviewer-agent gate for unattended runs |
| Research unit | A reproducible run under `runs/<run_id>/` |
| Workflow shape | 9-stage workflow: optional intake plus eight formal research stages |
| Quality bar | Artifact-backed outputs, not markdown-only summaries |
| Recovery | Resume, redo-stage, rollback-stage, stage-local continuation |
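
The control and recovery rows above can be illustrated with a small sketch. This is not AutoR's implementation: `Decision`, `RunLog`, `run_stages`, and the stage labels are hypothetical names chosen for illustration, and the reviewer is any callable (a human prompt or a reviewer agent) that returns a decision at each stage boundary.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Sequence


class Decision(Enum):
    """Hypothetical outcomes of a stage-boundary review."""
    APPROVE = "approve"    # accept the stage output and move on
    REDO = "redo"          # rerun the same stage
    ROLLBACK = "rollback"  # step back to the previous stage


@dataclass
class RunLog:
    """Records every (stage, decision) pair so the run stays inspectable."""
    history: List[str] = field(default_factory=list)


def run_stages(
    stages: Sequence[str],
    execute: Callable[[str], str],
    review: Callable[[str, str], Decision],
    max_steps: int = 100,
) -> RunLog:
    """Drive stages in order, pausing for a review after each one."""
    log = RunLog()
    i = 0
    steps = 0
    while i < len(stages) and steps < max_steps:
        steps += 1
        stage = stages[i]
        summary = execute(stage)          # the agent does the work
        decision = review(stage, summary)  # a human or reviewer agent decides
        log.history.append(f"{stage}:{decision.value}")
        if decision is Decision.APPROVE:
            i += 1
        elif decision is Decision.ROLLBACK:
            i = max(i - 1, 0)
        # Decision.REDO leaves i unchanged, so the stage runs again.
    return log
```

The point of the sketch is the separation of roles: `execute` never advances the run on its own, and every decision is appended to a log that can be inspected afterwards.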
| Layer | Highlight | What AutoR actually does |
| --- | --- | --- |
| Big idea | Human-centered research execution | AutoR is not an autonomous scientist. AI handles execution; humans retain approval and direction at every stage boundary. |
| Big idea | Research loop over agent loop | The system manages stage progression, validation, repair, recovery, and human checkpoints above the lower-level agent execution loop. |
| Big idea | Every run is a reproducible research artifact | Each run leaves behind prompts, logs, approved summaries, code, data, figures, writing sources, and packaged outputs under `runs/<run_id>/`. |
| Big idea | | The workflow is judged by inspectable artifacts and human approval, not by whether a generated document merely looks polished. |
| Useful feature | | Survey notes, bibliographies, related-work tables, and reading artifacts stay under `workspace/literature/` instead of disappearing into chat history. |
| Useful feature | | Machine-readable experiment and result files make runs inspectable, comparable, and reusable downstream. |
| Useful feature | | Writing expects citation verification, build logs, and self-review artifacts before Stage 07 is considered complete. |
| Useful feature | | `artifact_index.json` and related manifests help later stages find data, results, and figures without guessing from filenames. |
| Useful feature | | Long research runs can continue in place, retry a stage, or roll downstream state back without starting over. |
| Useful feature | | AutoR can package manuscript sources, PDFs, review materials, and release-ready artifacts instead of stopping at markdown summaries. |

In practice, that means AutoR is useful not only because of the high-level framing, but also because it handles real research chores: literature organization, experiment manifests, citation verification, artifact indexing, manuscript packaging, and recoverable long-running workflows.
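
As an illustration of artifact indexing, the sketch below builds a simple manifest by walking a workspace directory and grouping files by their top-level folder. The schema, the helper names `build_artifact_index` and `write_artifact_index`, and the choice to write the manifest into the scanned directory are all assumptions made for this example; AutoR's actual `artifact_index.json` format and location are not described here.

```python
import json
from pathlib import Path
from typing import Dict, List


def build_artifact_index(workspace: Path) -> Dict[str, List[str]]:
    """Group files under `workspace` by top-level folder (hypothetical schema)."""
    index: Dict[str, List[str]] = {}
    for path in sorted(workspace.rglob("*")):
        # Skip directories and the manifest itself.
        if not path.is_file() or path.name == "artifact_index.json":
            continue
        rel = path.relative_to(workspace)
        # Files directly in the workspace go under a "root" category.
        category = rel.parts[0] if len(rel.parts) > 1 else "root"
        index.setdefault(category, []).append(rel.as_posix())
    return index


def write_artifact_index(workspace: Path) -> Path:
    """Write the manifest as JSON next to the artifacts it describes."""
    out = workspace / "artifact_index.json"
    out.write_text(
        json.dumps(build_artifact_index(workspace), indent=2, sort_keys=True),
        encoding="utf-8",
    )
    return out
```

A later stage could then load the JSON and look up, say, the `literature` entries directly instead of guessing from filenames.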
Many systems aim to generate research outputs that look ready.
AutoR takes a harder path:
So the question is not:
Does it look ready?
It is:
Can you verify every part of it?
Latest mainline updates:
- `--sandbox workspace-write` execution flag instead of the deprecated Codex CLI `--full-auto` flag.
- `--full-auto` approval mode. The execution loop is unchanged, but the manual approval gate can now be replaced by a strict simulated reviewer agent backed by Claude or Codex, with reviewer settings persisted in `run_config.json`.
- `Decision Ledger` section and validates draft outputs against the correct `.tmp.md` path. Added stage recovery controls that let operators `/skip` the current stage, `/back <stage>` to an earlier stage, or choose skip / roll back directly after retry exhaustion.
- `--operator codex` support alongside Claude, persisted the selected execution backend in `run_config.json`, and improved terminal rendering for backend JSON streams.
- `--research-diagram` dependencies and tightened the README positioning around human-centered, artifact-backed research execution.

AutoR already has a full example run used throughout the repository: `runs/20260330_101222`.
| What the run produced | What it demonstrates |
| --- | --- |
| example_paper.pdf | A compiled manuscript artifact within a broader research package |
| Executable research code | The run is not just a writing pipeline |
| Machine-readable datasets and result files | Claims are backed by inspectable experiment outputs |
| Real figures used in the research package | The run produces publication-style visuals, not placeholders |
| Review and dissemination materials | The workflow continues past writing into release readiness |
Highlighted outcomes from that run:
- AGSNv2 reached 36.21 ± 1.08 on Actor.

AutoR is designed for terminal-first execution, but the interaction layer is not limited to raw logs and plain prompts. The current UI supports banner-style startup, colored stage panels, parsed backend event streams, display-width-aware markdown wrapping, keyboard-selectable menus, and a Stage 00 clarification flow suitable for demos and recordings.