by oxbshw
One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.
# Add to your Claude Code skills
git clone https://github.com/oxbshw/LLM-Agents-Ecosystem-Handbook
Last scanned: 5/15/2026
{
"issues": [],
"status": "PASSED",
"scannedAt": "2026-05-15T06:58:03.333Z",
"semgrepRan": false,
"npmAuditRan": true,
"pipAuditRan": false
}
A practical operating manual for building, evaluating, securing, and shipping modern LLM agent systems.
Modern agents are not "a prompt + a tool." They are systems — with identity, memory, skills, tools, MCP integrations, guardrails, observability, evals, and a provider strategy. This handbook teaches the whole stack and ships templates, blueprints, runnable adapters, and curated examples you can adopt today.
A curated, opinionated, production-oriented handbook in seven parts:
DESIGN.md machine-readable spec

| You are… | Start at |
|---|---|
| New to agents | docs/beginners_guide.md → agent_os/README.md |
| Building a production agent | blueprints/ → checklists/production_readiness_checklist.md |
| Picking / wiring providers | providers/README.md → providers/provider_matrix.md |
| Comparing frameworks | docs/framework_comparison.md |
| Adding memory / RAG | memory/ → tutorials/rag_tutorials |
| Adding MCP | mcp/ → mcp/mcp_security.md |
| Designing Skills | skills/ → skills/skill_design_guide.md |
| Working with coding agents | coding_agents/ → coding_agents/prompts/ |
| Writing better prompts | prompt_engineering/ |
| Designing & rolling out | design_docs/ |
| Hardening safety/evals | safety/ → evals/ |
| Coding agent reading this repo | llms.txt → llm_wiki/index.md |
| Layer | Purpose | Where in this repo |
|---|---|---|
| Model / Provider | LLM choice + abstraction + routing | providers/ |
| Orchestration | Agent loops, planning, handoffs | docs/framework_comparison.md, blueprints/ |
| Tool | Function calling and external actions | agent_os/mcp_layer.md |
| MCP | Standardized external context and tools | mcp/ |
| Memory | Durable user/project/semantic memory | memory/ |
| Skills | Reusable, progressive-loading workflows | skills/ |
| Identity | Personality, mission, refusal style | agent_os/agent_identity.md, templates/ |
| Prompt | System prompt design, instruction hierarchy, defenses | prompt_engineering/ |
| Safety | Guardrails, approvals, policy | safety/ |
| Observability | Tracing, spans, cost, latency, evals | observability/, evals/ |
| Deployment | Shipping agents to production | design_docs/rollout_plan.md |
| Coding-agent harness | Claude Code, Cursor, Codex, Aider, Cline | coding_agents/ |
📖 Deep dive: agent_os/README.md
The handbook ships an LLMProvider abstraction with 24+ providers across six families. Most providers go through a single OpenAI-compatible code path; specialty / local providers are first-class.
| Provider type | Examples | Best for |
|---|---|---|
| Frontier APIs | OpenAI, Anthropic, Google Gemini | Reasoning, tool use, production agents |
| Fast inference | Groq, Cerebras, SambaNova | Low-latency workloads |
| Marketplaces | OpenRouter, Together, Fireworks, DeepInfra | Model choice and routing |
| Enterprise clouds | Azure OpenAI, AWS Bedrock, Vertex AI | Compliance, governance |
| Specialty | xAI, Perplexity, Mistral, Cohere, DeepSeek, Hugging Face, Replicate, NVIDIA NIM, MiniMax | Domain-specific |
| Local runtimes | Ollama, LM Studio, vLLM, llama.cpp | Privacy, cost control, offline dev |
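To make the "single OpenAI-compatible code path" idea concrete, here is a minimal sketch of how one request builder could serve several providers that differ only in base URL and API key. The `OpenAICompatibleProvider` class, the `PROVIDERS` table, and the environment variable names are hypothetical illustrations, not the handbook's `LLMProvider` implementation, and the base URLs are assumptions to verify against each provider's documentation. The sketch only builds the request and does not send it.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenAICompatibleProvider:
    """Hypothetical descriptor; not the handbook's LLMProvider class."""

    name: str
    base_url: str
    api_key_env: str  # name of the environment variable that holds the key

    def build_request(self, messages, model):
        """Return (url, headers, body) for a chat-completions call without sending it."""
        api_key = os.environ.get(self.api_key_env, "")
        url = self.base_url.rstrip("/") + "/chat/completions"
        headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        }
        body = {"model": model, "messages": messages}
        return url, headers, body


# Assumed base URLs and env var names; check each provider's docs before use.
PROVIDERS = {
    "groq": OpenAICompatibleProvider(
        "groq", "https://api.groq.com/openai/v1", "GROQ_API_KEY"
    ),
    "ollama": OpenAICompatibleProvider(
        "ollama", "http://localhost:11434/v1", "OLLAMA_API_KEY"
    ),
}

url, headers, body = PROVIDERS["groq"].build_request(
    [{"role": "user", "content": "Summarize MCP."}],
    model="llama-3.1-8b-instant",
)
```

Under this design, adding another OpenAI-compatible provider is one more table entry, while providers with a different wire format would need their own adapter.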
Quick start:
```python
from utilities import get_provider
from utilities.provider_router import ProviderRouter

# Shared chat history used by both calls below
messages = [{"role": "user", "content": "Summarize MCP."}]

# Use any single provider
out = get_provider("groq").chat(
    messages,
    model="llama-3.1-8b-instant",
)

# Or route by task class with fallback
router = ProviderRouter()
out = router.chat(messages, task_class="cheap")  # Groq → DeepSeek → Together → OpenRouter
```
📖 providers/README.md • providers/provider_matrix.md • providers/router_patterns.md • providers/local_models.md
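The fallback order in the quick start (Groq → DeepSeek → Together → OpenRouter for the "cheap" task class) can be sketched as a loop that tries each provider in turn. This is an illustrative sketch only: `chat_with_fallback`, `FALLBACK_CHAINS`, `ProviderError`, and the stub providers are hypothetical names, and the real `ProviderRouter` may work differently.

```python
class ProviderError(Exception):
    """Raised by a provider callable when a request fails (hypothetical)."""


# Hypothetical chain table; the "cheap" order mirrors the quick-start comment.
FALLBACK_CHAINS = {
    "cheap": ["groq", "deepseek", "together", "openrouter"],
}


def chat_with_fallback(messages, task_class, providers, chains=FALLBACK_CHAINS):
    """Try each provider in the chain; return (name, reply) from the first success."""
    errors = []
    for name in chains[task_class]:
        call = providers.get(name)
        if call is None:
            # Provider is in the chain but not configured locally; skip it.
            errors.append((name, "not configured"))
            continue
        try:
            return name, call(messages)
        except ProviderError as exc:
            errors.append((name, str(exc)))
    raise ProviderError(f"all providers failed for {task_class!r}: {errors}")


# Stub providers standing in for real network calls.
def failing_groq(messages):
    raise ProviderError("rate limited")


def fake_deepseek(messages):
    return "stub reply to: " + messages[-1]["content"]


used, reply = chat_with_fallback(
    [{"role": "user", "content": "Summarize MCP."}],
    task_class="cheap",
    providers={"groq": failing_groq, "deepseek": fake_deepseek},
)
```

Here the first provider fails, so the call falls through to the second. Collecting the per-provider errors keeps the final exception useful when every provider in the chain is down.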
```
.
├── README.md • llms.txt • llms-full.txt
├── agent_os/            ← the Agent OS concept, layers, workspace examples
├── providers/           ← 24+ provider docs + adapters + router patterns
├── templates/           ← AGENTS.md / SOUL.md / MEMORY.md / SKILL.md / DESIGN_DOC / ADR / …
├── skills/              ← design guide + taxonomy + maturity model + curated catalog + 4 examples
├── memory/              ← memory taxonomy, distillation, security, examples
├── mcp/                 ← MCP basics, architecture, security, server catalog, examples
├── prompt_engineering/  ← agent prompt patterns, instruction hierarchy, defenses
├── coding_agents/       ← Claude Code, Cursor, Codex, workflows, prompts, review
├── design_docs/         ← agent + technical design docs, ADR guide, design.md spec
├── safety/              ← guardrails, approvals, prompt injection, secure checklist
├── observability/       ← tracing, spans, cost/latency, dashboards
├── evals/               ← eval design, regression / tool / memory / MCP / safety / prompt
├── blueprints/          ← production architectures by use case
├── examples/            ← end-to-end runnable agent workspaces
├── checklists/          ← agent design, prod readiness, MCP security, …
├── llm_wiki/            ← LLM-friendly index, glossary, matrices, wiki pattern
├── docs/                ← framework comparison, best practices, beginners' guide
├── tutorials/           ← RAG, memory, fine-tuning, chat-with-X
├── utilities/           ← LLMProvider + router + provider_config
├── agents/              ← 100+ curated agent skeletons (preserved)
├── complete_apps/, web_apps/, notebooks/, datasets/, design/, resources/, scripts/, tests/, ecosystem/
└── .github/             ← issue / PR templates
```
A curated, in-repo catalog plus a clear taxonomy and maturity model:
Curated skills shipped: research-summarizer, repo-auditor, mcp-security-reviewer, agent-memory-curator, api-design-reviewer, pr-summarizer, adr-writer, incident-postmortem, sprint-planner, dataset-profiler.
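The layer table describes skills as "progressive-loading workflows". One way to read that is a two-stage load: index only each skill's short metadata, then read the full instructions for the skill that is actually selected. The sketch below assumes a SKILL.md layout with `---`-delimited `key: value` front matter followed by the body. That layout, the sample file contents, and all function names are hypothetical and are not taken from the handbook's templates.

```python
# Hypothetical in-memory stand-in for skills/<name>/SKILL.md files.
SKILL_FILES = {
    "pr-summarizer": (
        "---\n"
        "name: pr-summarizer\n"
        "description: Summarize a pull request.\n"
        "---\n"
        "Step 1: read the diff.\n"
        "Step 2: write the summary.\n"
    ),
    "adr-writer": (
        "---\n"
        "name: adr-writer\n"
        "description: Draft an architecture decision record.\n"
        "---\n"
        "Step 1: state the context.\n"
    ),
}


def split_front_matter(text):
    """Split '---'-delimited 'key: value' front matter from the body."""
    _, header, body = text.split("---\n", 2)
    meta = {}
    for line in header.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body


def build_index(files):
    """Stage 1: expose only the short metadata for every skill."""
    return {name: split_front_matter(text)[0] for name, text in files.items()}


def load_skill_body(files, name):
    """Stage 2: load the full instructions for the one selected skill."""
    return split_front_matter(files[name])[1]


index = build_index(SKILL_FILES)
body = load_skill_body(SKILL_FILES, "pr-summarizer")
```

The point of the split is context budget: the agent sees one description line per skill up front and pays for a skill's full body only when it decides to use it.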
A dedicated section, agent-focused:
Templates: [S