by avidevelops
# Add to your Claude Code skills
```shell
git clone https://github.com/avidevelops/claude-architect-exam-prep
```

This guide is compiled from a detailed Q&A thread analyzing exam-style questions for the Claude Certified Architect – Foundations certification. It breaks down multi-agent architectures, context management, tool design, and batch processing principles, focusing heavily on architectural tradeoffs and best practices over raw prompt engineering.
The Claude Architect Foundations exam tests your ability to design resilient, production-ready AI systems. It consists of scenario-based multiple-choice questions focusing on Domain 1: Agentic Architecture, Domain 4: API & Orchestration, and Domain 5: Context Management. The exam prioritizes structural, deterministic solutions (like schema design and tool boundaries) over probabilistic approaches (like prompt instructions).
This guide covers core architectural scenarios you are likely to encounter, including:
## Scenario: Schema Design & Self-Correction
Question:
An automated invoice extraction pipeline occasionally outputs structured JSON where the extracted line items do not add up to the total amount extracted from the invoice. What is the best architectural approach to handle this semantic error?
Options:
A) Extract a calculated_total field alongside the stated_total field, compare them, and flag mismatches for human review.

✅ Correct Answer: A — Extract calculated_total and flag discrepancies
Why this is correct:
JSON schemas prevent syntax errors but not semantic errors (like bad math). The most robust self-correction pattern is extracting both what the document explicitly states (stated_total) and what the model calculates from the line items (calculated_total). If they don't match, the record is flagged for human review. This improves reliability without fabricating data.
Why the others are weaker:
💡 Exam Takeaway:
Design self-correction flows by extracting both a stated value and a calculated value, routing mismatches to human review rather than silently "fixing" source document errors.
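The stated-vs-calculated pattern can be sketched as a post-extraction validation step. This is a minimal illustration; the field names (`stated_total`, `calculated_total`, `line_items`, `needs_review`) are hypothetical, not part of any exam or API:

```python
# Validate extracted invoice JSON: recompute the total from line items,
# compare it to the total stated on the invoice, and flag mismatches
# for human review instead of silently "fixing" them.
def validate_invoice(extraction: dict, tolerance: float = 0.01) -> dict:
    calculated = round(sum(item["amount"] for item in extraction["line_items"]), 2)
    return {
        **extraction,
        "calculated_total": calculated,
        "needs_review": abs(calculated - extraction["stated_total"]) > tolerance,
    }

flagged = validate_invoice({
    "stated_total": 110.00,
    "line_items": [{"amount": 50.00}, {"amount": 55.00}],
})
print(flagged["needs_review"])  # True: line items sum to 105.00, not 110.00
```

The key design choice is that the record is enriched and routed, never mutated: both totals survive in the output so a reviewer can see exactly what disagreed.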
## Scenario: Batch Processing & Cost Optimization
Question:
You are preparing to process 50,000 legacy documents using the Batch API. An initial test on 500 documents reveals that 18% of them require 2-3 prompt refinements to extract data correctly. What is the most cost-efficient strategy for scaling this workload?
Options:
✅ Correct Answer: A — Refine interactively first, batch process later
Why this is correct:
The recommended practice is to refine prompts on a representative sample before batch-processing large volumes. Iterative resubmissions at scale destroy the cost savings of the Batch API. By fixing the failure modes on a small, representative sample, you ensure the prompt is robust, maximizing first-pass success when you finally trigger the 50,000-document Batch run.
Why the others are weaker:
💡 Exam Takeaway:
Refine prompts on a representative sample first to maximize first-pass success, then run the full volume through the Batch API to minimize expensive resubmissions.
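The intuition can be made concrete with back-of-the-envelope arithmetic. All numbers below except the 18% failure rate, the 500-document sample, and the 50,000-document volume (which come from the scenario) are invented for illustration, including the per-document cost and the average retry count:

```python
# Rough cost comparison: resubmitting failures at full scale versus
# burning retries on the 500-document sample first.
DOCS = 50_000
SAMPLE = 500
FAIL_RATE = 0.18       # from the scenario: 18% need refinement
RETRIES = 2.5          # assumed average of the "2-3 refinements"
COST_PER_DOC = 0.01    # hypothetical per-document cost, any unit

# Naive: run everything, then resubmit the 18% that fail, 2-3 times each.
naive = DOCS * COST_PER_DOC + DOCS * FAIL_RATE * RETRIES * COST_PER_DOC

# Recommended: pay for the retries only on the sample, then one clean run.
refined_first = SAMPLE * (1 + FAIL_RATE * RETRIES) * COST_PER_DOC + DOCS * COST_PER_DOC

print(f"resubmit at scale:   {naive:,.2f}")
print(f"refine sample first: {refined_first:,.2f}")
```

Under these assumptions the retry cost shrinks from 225 cost units (50,000 × 18% × 2.5) to about 2.25 (500 × 18% × 2.5), which is the entire argument for sampling before scaling.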
## Scenario: SLA Management & Async APIs
Question:
Your system processes asynchronous user requests with a strict Service Level Agreement (SLA) requiring results within 30 hours of submission. You plan to use the Message Batches API, which can take up to 24 hours to complete. Which batch submission schedule best meets the SLA while maximizing cost efficiency?
Options:
✅ Correct Answer: D — Submit batches every 4 hours
Why this is correct:
Worst-case total turnaround time is the time a document waits for the next batch window plus the maximum batch processing time (24 hours). If you submit every 4 hours, a document arriving right after a cutoff waits 4 hours + 24 hours processing = 28 hours. This leaves a 2-hour safety buffer under the 30-hour SLA.
Why the others are weaker:
💡 Exam Takeaway:
Calculate worst-case SLA by adding batch frequency interval to the 24-hour max processing time; always leave an operational buffer.
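The worst-case arithmetic from the answer above is simple enough to sketch directly; the function and constant names are illustrative:

```python
# Worst case = a request arrives just after a cutoff, waits one full
# submission interval, then the batch takes its maximum processing time.
MAX_BATCH_HOURS = 24   # Batch API worst-case completion (from the scenario)
SLA_HOURS = 30

def worst_case_hours(submit_interval: int) -> int:
    return submit_interval + MAX_BATCH_HOURS

for interval in (2, 4, 6, 8, 12):
    wc = worst_case_hours(interval)
    print(f"submit every {interval:>2}h -> worst case {wc}h, buffer {SLA_HOURS - wc}h")
```

A 4-hour interval yields a 28-hour worst case (2-hour buffer); a 6-hour interval hits the 30-hour SLA exactly with zero margin, which is why the answer with an explicit operational buffer is preferred.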
## Scenario: Schema Engineering & Provenance
Question:
An extraction pipeline processes technical manuals. A specific manual lists two conflicting battery capacities: one in the text and a different one in a detailed specs table. Historical data shows the specs table is correct 90% of the time. How should the extraction schema handle this?
Options:
✅ Correct Answer: D — Capture all values with source locations
Why this is correct:
Forcing premature collapse into a single value is an anti-pattern. If the specs table is only right 90% of the time, hard-coding a preference guarantees a 10% error rate. Modifying the schema to accept an array of values with explicit source locations preserves full provenance, allowing downstream business logic or human reviewers to make an informed reconciliation.
Why the others are weaker:
💡 Exam Takeaway:
Preserve conflicting source data in the structured output instead of forcing premature collapse into a single value.
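A provenance-preserving output shape might look like the following. The structure and field names (`values`, `source`) are hypothetical, shown only to make the pattern concrete:

```python
# Extraction output that keeps both conflicting battery capacities,
# each tagged with where it came from, instead of collapsing to one.
battery_capacity = {
    "field": "battery_capacity_mah",
    "values": [
        {"value": 5000, "source": "body text, p. 12"},
        {"value": 5200, "source": "specs table, p. 47"},
    ],
}

def needs_reconciliation(field: dict) -> bool:
    # More than one distinct value means downstream logic or a human
    # reviewer must reconcile; the 90%-reliable source preference can
    # be applied there, with the conflict still visible.
    return len({v["value"] for v in field["values"]}) > 1

print(needs_reconciliation(battery_capacity))  # True
```

The reconciliation rule (e.g. "prefer the specs table") lives in deterministic downstream code where it can be audited and changed, not baked irreversibly into the extraction step.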
## Scenario: Tool Interfaces & Identifiers
Question:
An agent uses a search_documents tool to find files, and subsequently uses share_document(document_id, email) and move_document(document_id, folder) to act on them. How should the search_documents tool format its output to ensure reliable chaining?
Options:
B) Return structured data that includes the document_id and metadata for each result.

✅ Correct Answer: B — Structured data with document IDs
Why this is correct:
Multi-step workflows require clear input/output contracts. Because the downstream tools (share, move) require a specific machine-usable identifier (document_id), the upstream search tool must return exactly that ID in a structured format alongside the human-readable metadata.
Why the others are weaker:
💡 Exam Takeaway:
Multi-step tool workflows require machine-usable identifiers; tools meant for chaining should always return structured data containing explicit IDs.
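A minimal sketch of the contract, with stubbed tools (the function bodies, IDs, and email are invented; only the tool names come from the scenario):

```python
# search_documents returns structured results carrying the same
# document_id that the downstream tools accept, so the agent can
# chain calls without parsing prose.
def search_documents(query: str) -> list[dict]:
    # Stubbed results; a real implementation would query a document store.
    return [
        {"document_id": "doc_8f3a", "title": "Q3 Budget", "modified": "2024-05-01"},
        {"document_id": "doc_91c2", "title": "Q3 Forecast", "modified": "2024-05-03"},
    ]

def share_document(document_id: str, email: str) -> str:
    return f"shared {document_id} with {email}"

# The ID flows through the chain unchanged -- no brittle text parsing.
results = search_documents("Q3")
print(share_document(results[0]["document_id"], "cfo@example.com"))
```

Had `search_documents` returned a prose summary instead, the agent would have to guess or re-derive identifiers, and any formatting drift would silently break `share_document` and `move_document`.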
## Scenario: Agentic Tool Integration

Question: