
AI Learning Ramp

Durable agent execution for long-running analytics work.

Course 9 is a one-hour systems session on turning agent loops into durable, auditable workflows: state machines, queues, checkpoints, human approval, replay, and cancellation for a BigQuery copilot.

Course 9 of 24 · Published June 15, 2026 · Focus: durable agent execution · Target: OpenAI / Anthropic interviews

System-Design Frame

Assume the BigQuery copilot now performs multi-step work that may outlive one request: clarify intent, inspect schemas, draft SQL, wait for human approval, run approved jobs, summarize results, and open follow-up tasks. The interview question is how you make that execution durable without hiding policy decisions inside model context. The building blocks are explicit state, queued work, checkpointed progress, replayable traces, interruptible approval steps, cancellation, and compensation for side effects.
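One way to make "explicit state" concrete is a transition table that the platform owns. The sketch below is illustrative only: the state names mirror the steps listed above, while `State`, `ALLOWED`, and `Task` are hypothetical names chosen for this example, not part of any SDK.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    CLARIFYING = "clarifying"
    INSPECTING_SCHEMA = "inspecting_schema"
    DRAFTING_SQL = "drafting_sql"
    AWAITING_APPROVAL = "awaiting_approval"
    RUNNING_JOB = "running_job"
    SUMMARIZING = "summarizing"
    DONE = "done"
    CANCELLED = "cancelled"


# Platform-owned transition table: the model may propose a next step,
# but only edges listed here are ever applied.
ALLOWED = {
    State.CLARIFYING: {State.INSPECTING_SCHEMA, State.CANCELLED},
    State.INSPECTING_SCHEMA: {State.DRAFTING_SQL, State.CANCELLED},
    State.DRAFTING_SQL: {State.AWAITING_APPROVAL, State.CANCELLED},
    State.AWAITING_APPROVAL: {State.RUNNING_JOB, State.DRAFTING_SQL, State.CANCELLED},
    State.RUNNING_JOB: {State.SUMMARIZING, State.CANCELLED},
    State.SUMMARIZING: {State.DONE, State.CANCELLED},
    State.DONE: set(),
    State.CANCELLED: set(),
}


@dataclass
class Task:
    task_id: str
    state: State = State.CLARIFYING
    history: list = field(default_factory=list)

    def transition(self, target: State, reason: str) -> None:
        """Apply a transition only if the table allows it; record it for audit."""
        if target not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {target.name}")
        self.history.append((self.state, target, reason))
        self.state = target
```

With this shape, a model suggestion to jump from DRAFTING_SQL straight to RUNNING_JOB raises an error instead of silently skipping approval, and `history` doubles as a minimal audit trail.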

Course 9: Durable Agent Execution

One-hour objective: design a durable execution layer for an AI analytics agent and explain how state machines, queues, checkpoints, human approval, replay, and cancellation fit together in production.

Write the failure contract.

List what users and operators should see after a crash, timeout, rate limit, human non-response, unsafe SQL plan, cancelled request, and partial BigQuery job.
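A failure contract can be kept as data so that gaps are easy to spot. The template below is a hypothetical starting point: the failure modes come from the list above, but every message and every `resumable` flag is a placeholder assumption to be replaced with your own decisions.

```python
from typing import NamedTuple


class Contract(NamedTuple):
    user_sees: str       # what the requesting user is shown
    operator_sees: str   # what on-call or audit tooling records
    resumable: bool      # whether the task may continue without a new request


FAILURE_MODES = [
    "crash", "timeout", "rate_limit", "human_non_response",
    "unsafe_sql_plan", "cancelled_request", "partial_bigquery_job",
]

# Placeholder entries; the wording is illustrative, not prescriptive.
FAILURE_CONTRACT = {
    "crash": Contract("Task paused, resuming", "worker restart, last checkpoint id", True),
    "timeout": Contract("Step took too long, retrying", "step name, elapsed time", True),
    "rate_limit": Contract("Waiting for capacity", "provider, retry schedule", True),
    "human_non_response": Contract("Approval request expired", "approver, deadline", False),
    "unsafe_sql_plan": Contract("Query blocked by policy", "rule that fired, SQL hash", False),
    "cancelled_request": Contract("Task cancelled", "who cancelled, cleanup status", False),
    "partial_bigquery_job": Contract("Job incomplete, results withheld", "job id, compensation status", False),
}


def missing_entries(contract: dict, modes: list) -> list:
    """Return failure modes that have no contract entry yet."""
    return [mode for mode in modes if mode not in contract]
```

Calling `missing_entries(FAILURE_CONTRACT, FAILURE_MODES)` returns an empty list once every mode has an entry, which makes the contract checkable in review.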

Study durable agent execution in Temporal.

Focus on the practical split between the workflow, model/tool activities, retries, event history, and how an OpenAI Agents SDK loop can survive process failure.
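The workflow/activity split and event history can be illustrated without a workflow engine. The following is a standard-library sketch of the replay idea only; it does not use Temporal's API, and `EventHistory` and `run_workflow` are hypothetical names. In a real system the history would live in a durable store rather than in memory.

```python
from typing import Callable


class EventHistory:
    """Append-only record of completed activities (stand-in for a durable store)."""

    def __init__(self) -> None:
        self.events: list[dict] = []


def run_workflow(history: EventHistory,
                 steps: list[tuple[str, Callable[[], object]]]) -> list:
    """Run steps in order, replaying recorded results instead of re-executing them."""
    results = []
    for index, (name, activity) in enumerate(steps):
        if index < len(history.events):
            event = history.events[index]
            if event["name"] != name:
                # The step list changed shape since the history was written.
                raise RuntimeError(
                    f"history mismatch at step {index}: {event['name']} != {name}"
                )
            results.append(event["result"])  # replayed, not re-run
            continue
        result = activity()  # may raise; nothing is recorded on failure
        history.events.append({"name": name, "result": result})
        results.append(result)
    return results
```

If an activity raises partway through, calling `run_workflow` again with the same history skips every step that already completed and resumes at the failed one. This works only because the step sequence is deterministic; the non-deterministic calls (model, tools) live inside the activities.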

Read LangGraph persistence as checkpointing vocabulary.

Track threads, checkpoints, state snapshots, long-term store, human-in-the-loop pauses, time travel, fault tolerance, and how to resume from known state.
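To anchor the vocabulary, here is a minimal in-memory sketch of threads, checkpoints, resume, and a fork-style "time travel". It is an assumption-laden stand-in written with the standard library, not LangGraph's API; `Checkpoint`, `InMemoryCheckpointer`, and `fork` are hypothetical names.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    thread_id: str
    seq: int      # position within the thread
    step: str     # name of the step that produced this snapshot
    state: dict   # snapshot of agent state after the step


class InMemoryCheckpointer:
    def __init__(self) -> None:
        self._threads: dict[str, list[Checkpoint]] = {}

    def save(self, thread_id: str, step: str, state: dict) -> Checkpoint:
        """Append a snapshot; deep-copy so later mutation cannot alter history."""
        checkpoints = self._threads.setdefault(thread_id, [])
        checkpoint = Checkpoint(thread_id, len(checkpoints), step, copy.deepcopy(state))
        checkpoints.append(checkpoint)
        return checkpoint

    def latest(self, thread_id: str):
        """Return the snapshot to resume from, or None for a new thread."""
        checkpoints = self._threads.get(thread_id, [])
        return checkpoints[-1] if checkpoints else None

    def history(self, thread_id: str) -> list:
        return list(self._threads.get(thread_id, []))

    def fork(self, thread_id: str, seq: int, new_thread_id: str) -> Checkpoint:
        """Time travel: start a new thread from an earlier snapshot of another."""
        source = self._threads[thread_id]
        if not 0 <= seq < len(source):
            raise IndexError(f"no checkpoint {seq} in thread {thread_id}")
        copies = [Checkpoint(new_thread_id, c.seq, c.step, copy.deepcopy(c.state))
                  for c in source[: seq + 1]]
        self._threads[new_thread_id] = copies
        return copies[-1]
```

Forking into a new thread rather than rewriting the old one keeps the original run intact for audit, which is one reasonable design choice among several.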

Ground approval gates in containment.

Use Anthropic's containment post to separate "ask the user" from enforceable boundaries: permissions, sandboxing, confirmation UI, and attacks that target human approval.
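The difference between asking the user and enforcing a boundary can be sketched as a gate that checks policy independently of approval and binds each approval to the exact SQL text. This is a toy illustration: the keyword regex is a deliberately crude placeholder for real permissions or sandboxing, and `ApprovalGate` and `sql_fingerprint` are hypothetical names.

```python
import hashlib
import re

# Toy policy: read-only statements only. A keyword filter is NOT a real
# SQL safety mechanism; it stands in for enforced permissions.
FORBIDDEN = re.compile(
    r"\b(DROP|DELETE|TRUNCATE|UPDATE|INSERT|MERGE|ALTER|CREATE)\b", re.IGNORECASE
)


def sql_fingerprint(sql: str) -> str:
    """Hash the whitespace-normalized SQL so approval covers one exact query."""
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class ApprovalGate:
    def __init__(self) -> None:
        self._approved: dict[str, str] = {}  # fingerprint -> approver

    def policy_allows(self, sql: str) -> bool:
        return FORBIDDEN.search(sql) is None

    def approve(self, sql: str, approver: str) -> str:
        """Record approval; policy is checked first and cannot be overridden."""
        if not self.policy_allows(sql):
            raise PermissionError("policy forbids this statement")
        fingerprint = sql_fingerprint(sql)
        self._approved[fingerprint] = approver
        return fingerprint

    def may_run(self, sql: str) -> bool:
        """Re-check policy at run time and require approval of this exact text."""
        return self.policy_allows(sql) and sql_fingerprint(sql) in self._approved
```

Because approval is keyed to a fingerprint, approving one query grants nothing for a different query drafted later, and a human who is persuaded to click "approve" still cannot authorize a statement the policy rejects.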

Draw the durable execution architecture.

Sketch the state store, job queue, workflow runner, tool activities, approval interrupt, cancellation path, receipts log, and replay/debug view for a BigQuery analysis task.
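The cancellation path and receipts log from this sketch can be prototyped in a few lines. The example below assumes cooperative cancellation checked between steps and a compensating `undo` for each step; `run_steps` and the receipt fields are hypothetical, and the receipts list stands in for a durable log.

```python
import threading
from typing import Callable

Step = tuple[str, Callable[[], None], Callable[[], None]]  # (name, do, undo)


def run_steps(steps: list[Step], cancel: threading.Event, receipts: list) -> str:
    """Run steps in order; on cancellation, undo completed steps in reverse."""
    completed: list[tuple[str, Callable[[], None]]] = []
    for name, do, undo in steps:
        if cancel.is_set():
            # Compensate side effects newest-first, recording each action.
            for done_name, done_undo in reversed(completed):
                done_undo()
                receipts.append({"step": done_name, "event": "compensated"})
            receipts.append({"step": name, "event": "skipped"})
            return "cancelled"
        do()
        receipts.append({"step": name, "event": "completed"})
        completed.append((name, undo))
    return "completed"
```

Cancellation here is only observed between steps, so a step already in flight runs to completion; interrupting a running BigQuery job would need its own cancel call inside that step, which this sketch does not model.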

Deliver the interview synthesis.

Explain why durable agents are stateful distributed systems: the model can choose steps, but the platform owns state transitions, policy gates, retries, and side effects.
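Platform-owned retries pair naturally with idempotency keys, so that a retried or replayed step does not repeat its side effect. The sketch below is a minimal illustration under stated assumptions: `TransientError`, `run_with_retries`, and the `completed` dictionary (a stand-in for a durable store) are all hypothetical.

```python
from typing import Callable


class TransientError(Exception):
    """Raised by an effect that is safe to retry (e.g. a rate limit)."""


def run_with_retries(key: str,
                     effect: Callable[[], object],
                     completed: dict,
                     max_attempts: int = 3,
                     sleep: Callable[[float], None] = lambda seconds: None) -> object:
    """Run an effect at most once per key, retrying transient failures with backoff."""
    if key in completed:
        return completed[key]  # already done: return the recorded result
    for attempt in range(1, max_attempts + 1):
        try:
            result = effect()
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(2 ** (attempt - 1))  # exponential backoff: 1, 2, 4, ...
            continue
        completed[key] = result
        return result
    raise RuntimeError("unreachable: loop returns or raises")
```

One gap worth naming in an interview: if the process dies after the effect succeeds but before the result is recorded, the key alone does not prevent a duplicate, so the downstream system also needs to deduplicate on the same key.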

Course 9 Reading List

Use three required sources: one durable-execution recipe for the OpenAI Agents SDK, one checkpointing reference, and one containment engineering post for approval-boundary realism. Use the optional refresher only for core durable-execution vocabulary.

Required

Temporal AI Cookbook: Durable Agent With Tools - OpenAI Agents SDK

A practical Temporal recipe for wrapping an OpenAI Agents SDK loop in durable workflow execution, with model calls and tools treated as activities that can be retried, recorded, and resumed.

Read for: how to translate an agent loop into workflow state, activity boundaries, retries, and event history.

Required

LangGraph: Persistence

Official LangGraph guidance on checkpointers, threads, checkpoints, state snapshots, long-term stores, human-in-the-loop flows, time travel, and fault tolerance.

Read for: the checkpoint and resume vocabulary needed to discuss production agent state without hand-waving.

Required

Anthropic: How We Contain Claude Across Products

A recent Anthropic engineering post on containment across products, including why human approval prompts can fail when the surrounding system allows coercion, confusion, or unsafe authority transfer.

Read for: how to make approval gates operationally meaningful instead of treating the human as a magic safety layer.

Optional Refresher

Temporal: Durable Execution

A concise refresher on the durable-execution model: event history, replay, recovery after worker failure, and long-running workflow semantics.

Skim for: core replay language if workflow engines are not already fresh in memory.

Readiness Checklist

You are ready for the interview version of this topic when you can defend durable execution as a distributed-systems design, not an SDK feature flag.

Interview Drill: Agentic AI System Design

Prompt: design durable execution for a BigQuery analytics agent that may spend minutes or hours clarifying a request, inspecting metadata, waiting for approval, running approved SQL, and recovering from model or worker failures.
