agentic-engineering-handbook

Name: agentic-engineering-handbook
Author: keyuchen21

Pending

The definitive OpenAI, Claude, MCP, Harness, Evals, and Production Agent Systems learning roadmap.

114stars

4forks

Python

Installation

# Add to your Claude Code skills
git clone https://github.com/keyuchen21/agentic-engineering-handbook

Getting Started

Guides for using ai agents skills like agentic-engineering-handbook.

Caveman: Cut Claude Token Use by 65%
How agent-side prompt compression works, when to use it, and when not to.
What is an AI Skills Marketplace?
Definitions, how marketplaces work, and how to choose between them in 2026.
Getting Started with AI Skills

README.md

Frequently Asked Questions

What is agentic-engineering-handbook?

agentic-engineering-handbook is an open-source ai agents skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by keyuchen21. The definitive OpenAI, Claude, MCP, Harness, Evals, and Production Agent Systems learning roadmap. It has 114 GitHub stars.

Is agentic-engineering-handbook safe to use?

agentic-engineering-handbook's catalog security scan is still queued. You can run an instant dependency and prompt-injection check now with the "Scan for vulnerabilities" button above.

How do I install agentic-engineering-handbook?

Clone the repository with "git clone https://github.com/keyuchen21/agentic-engineering-handbook" and add it to your Claude Code skills directory (see the Installation section above).

What programming language is agentic-engineering-handbook written in?

agentic-engineering-handbook is primarily written in Python. It is open-source under keyuchen21 on GitHub, so you can review or fork the full source.

Are there alternatives to agentic-engineering-handbook?

Yes. SkillsLLM lists many other AI Agents skills you can browse and compare side by side. Open the AI Agents category from the badge at the top of this page, or use the Related Skills and comparison links further down to weigh agentic-engineering-handbook against similar tools.

Agentic AI for Beginners

Build your first AI agent from scratch - tool use, ReAct pattern, memory, deployment

41 minBeginner

Comments (0)

to leave a comment.

No comments yet. Be the first to share your thoughts!

Related Skills

superpowers

by obra

An agentic skills framework & software development methodology that works.

234,966

Graphite agentrules-architect

Agentic Engineering Handbook

The definitive OpenAI, Anthropic, Google, MCP, Harness, Evals, and Production Agent Systems learning roadmap.

If this repository helps you, consider giving it a ⭐

Why This Repository?

The AI industry has entered the Agentic Era. Building production-grade AI systems now requires mastering agents, tool use, MCP, memory, long-running workflows, coding agents, agent harnesses, evals, and safety — but the knowledge is scattered across OpenAI blogs, Anthropic engineering posts, SDK docs, cookbooks, and research papers.

This repository consolidates 161 curated resources into one structured learning roadmap.

The goal: Become a world-class Agentic Engineer.

How To Use This Handbook

Pick the path that matches your starting point:

New to agents: follow the Learning Roadmap from Phase 0 to Phase 6. Treat each Read First, Then Read, and Build Exercise as a checklist.
Already building LLM apps: start at Phase 2 or Phase 3, then fill gaps in agent loop, tool calling, evals, and production engineering.
Trying to build projects: use the phase-level Build Exercise prompts, then branch into Applied Practice Tracks for coding agents, security, code review, or SRE.
Looking for references: jump to the Full Reading Table. Read P0 first, use P1 for implementation detail, and keep P2 as optional background.

Learning Roadmap

Phase 0 — Agent Loop From Scratch

If you treat Claude Code as a coding CLI, many capabilities can feel like magic: it reads files, runs commands, edits code, delegates work, and stays oriented during complex tasks.

From an engineering perspective, the core is much simpler:

model + tools + one loop.

Understanding that loop makes the rest of the system easier to reason about:

When the agent should plan first, and when it should act immediately
Why an explicit todo list reduces drift in longer tasks
Why subagents improve exploration while protecting the main context
How skills, MCP, and hooks each add capability around the same core loop

These pages are based on the upstream English Markdown tutorials from shareAI-lab/mini-claude-code, with added Study Notes and inline source code for this handbook.

Step	Page	Code
v0	Bash is All You Need	v0_bash_agent.py
v1	Model as Agent	v1_basic_agent.py
v2	Structured Planning	v2_todo_agent.py
v3	Subagent Mechanism	v3_subagent.py
v4	Skills Mechanism	v4_skills_agent.py

Supporting files are included in the same folder: requirements.txt, .env.example, v0_bash_agent_mini.py, and skills/.

Phase 1 — Agent Foundations

Build shared vocabulary for workflow vs agent, tool loop, handoff, guardrails.

Key Mental Models

Should I build an agent? (4-question checklist from Barry Zhang's talk)

Question	If No → Workflow	If Yes → Agent
Is the task complex enough?	Decision tree is fully mappable	Ambiguous problem space
Is the task valuable enough?	<$0.10 per run	>$1 per run, cost doesn't matter
Are all core capabilities doable?	Weak links break the chain	Model handles every step well
Is error cost low & detectable?	High cost + hard to detect → human-in-the-loop	Errors caught by tests/CI

Think like the agent. Most failures come from designing with a human perspective. Put yourself inside the agent's context window: you only see ~10K–20K tokens (system prompt + tool descriptions + recent observations). Ask: does the agent have enough information to act correctly at each step?

→ Source: How We Build Effective Agents

Read First

#	Title	Vendor
1	System Prompts	Anthropic
2	Prompt guidance	OpenAI
3	Function Calling	OpenAI
4	Tool use overview	Anthropic
5	Function calling - Gemini API	Google
6	Building effective agents	Anthropic
7	New tools for building agents	OpenAI
8	Agents SDK overview	OpenAI

Then Read

Title	Vendor
How We Build Effective Agents: Barry Zhang, Anthropic	Anthropic
Phistory — Claude Code & Codex CLI System Prompt Diff History	Community
Coding Agents 101: The Art of Actually Getting Things Done	Cognition
OpenAI Agents SDK examples	OpenAI
Structured Outputs for Multi-Agent Systems	OpenAI

Build Exercise

Build a customer service/ticket triage agent: router → specialist → evaluator, with all outputs constrained by structured schemas.

Phase 2 — MCP & Tool Ecosystem

Understand MCP server/client, remote vs local, tool loading, approval, connector boundaries.

Read First

#	Title	Vendor
1	Introducing the Model Context Protocol	Anthropic
2	MCP and Connectors	OpenAI
3	Building MCP servers for ChatGPT Apps and API integrations	OpenAI

Then Read

Title	Vendor
Code execution with MCP: Building more efficient agents	Anthropic
Writing effective tools for AI agents - with AI agents	Anthropic
Model Context Protocol - Codex	OpenAI
Build a Remote MCP server	Cloudflare
Introducing the MCP Registry	MCP
OpenAI Docs MCP	OpenAI
Build your ChatGPT UI	OpenAI

Build Exercise

Build a read-only repo/docs MCP server, then create an eval to verify the agent correctly cites documentation.

Phase 3 — Context, Memory & Skills

Learn to control context window, short/long-term memory, skills/plugins, CLAUDE.md/AGENTS.md.

Read First

#	Title	Vendor
1	Agent Skills Specification	Agent Skills
2	Effective context engineering for AI agents	Anthropic
3	How Long Contexts Fail	Drew Breunig
4	Context Rot	Chroma
5	Progressive disclosure	Claude-Mem
6	Equipping agents for the real world with Agent Skills	Anthropic
7	Agent Skills	Anthropic
8	Skills	OpenAI
9	Building Reliable Agents with Memory and Compaction	OpenAI

Then Read

Title	Vendor
Custom instructions with AGENTS.md - Codex	OpenAI
Best practices for Claude Code	Anthropic
Agent Skills - Codex	OpenAI
Skills in OpenAI API	OpenAI

Build Exercise

Implement the same task as a Skill/Plugin, then measure accuracy and token cost across three variants: no skill, long prompt, and skill-based.

Phase 4 — Harness & Long-Running Agents

Master agent runtime: event stream, thread, tool execution, state, sandbox, approval, recovery.

Read First

#	Title	Vendor
1	Unrolling the Codex agent loop