Editorial desk with abstract AI data visualizations and research notes

AI Learning Ramp

Text-to-SQL systems for governed analytics, repair loops, and query safety.

Course 6 is a one-hour systems session on turning a natural-language analytics question into permission-filtered schema context, validated SQL, and a defensible answer without treating query generation as a single model call.

Course 6 of 24 Published June 8, 2026 Focus: text-to-SQL systems Target: OpenAI / Anthropic interviews

System-Design Frame

Assume a BigQuery analytics copilot receives a question like "which enterprise accounts expanded after onboarding?" Your design must decide what schema, metric definitions, joins, table samples, permission rules, and prior tool evidence enter context before the model drafts SQL. The production system should generate a query plan, validate and dry-run the SQL, repair bounded failures, execute only under the user's effective permissions, and summarize the result with enough provenance for a data owner to trust or reject it.

Course 6: Text-To-SQL Systems

One-hour objective: defend a governed text-to-SQL architecture that connects user intent to schema grounding, semantic constraints, validation, dry-run repair, execution evaluation, and permission-safe BigQuery access.

Frame the failure modes.

Name the risks: wrong table, wrong join grain, stale metric definition, dialect mismatch, expensive scan, unauthorized column, and confident natural-language summary over bad SQL.

Read BigQuery's Gemini SQL workflow.

Focus on how natural-language prompts use table context, how generated SQL can be refined or compared, and why generated output still needs validation before use.

Study Spider 2.0's enterprise gap.

Extract what makes real systems hard: huge schemas, cloud warehouses, dialect differences, metadata search, project docs, multi-query workflows, and long context.

Review OpenAI's SQL-generation eval pattern.

Anchor on tests that parse and execute generated SQL, then translate that idea into a BigQuery eval harness with dry runs, fixtures, and expected result checks.

Refresh dry runs only if needed.

Use the optional source to reset on BigQuery dry-run validation, byte estimates, and why a dry run is a gate, not proof of business correctness.

Deliver the interview synthesis.

Walk through the request path from intent interpretation to grounded SQL, validation, bounded repair, execution, answer generation, telemetry, and release evals.

Course 6 Reading List

Keep this to three required sources. The optional refresher is for BigQuery dry-run mechanics if that API detail is not fresh.

Required

Google Cloud BigQuery: Write Queries with Gemini Assistance

Primary documentation for BigQuery's natural-language SQL generation workflow, including generated-query refinement, table-source context, comments-to-SQL, and the need to validate generated output.

Read for: product grounding, user workflow, table context selection, and where validation must sit outside the model.

Required

Spider 2.0: Real-World Enterprise Text-to-SQL Workflows

A benchmark paper showing why enterprise text-to-SQL needs metadata search, dialect docs, long-context reasoning, multi-query workflows, and complex warehouse environments such as BigQuery and Snowflake.

Read for: the system-design gap between demo SQL generation and production enterprise analytics agents.

Required

OpenAI Cookbook: How to Evaluate LLMs for SQL Generation

A practical evaluation example that combines natural-language-to-SQL generation with parseable outputs and executable tests, giving a template for regression gates beyond prompt inspection.

Read for: converting text-to-SQL quality into automated checks that can catch syntax and execution failures.

Optional refresher

Google Cloud BigQuery: Dry Runs

A short reset on using BigQuery dry runs to validate query structure and estimate processing before a generated query is allowed to execute.

Use only if: dry-run validation, byte estimates, and cost gates are not already automatic in your mental model.

Readiness Checklist

You are ready for the interview version of this topic when you can describe the system without saying "the model writes SQL and we run it."

Interview Drill: AI Infra System Design

Prompt: design a governed text-to-SQL service for a BigQuery analytics assistant used by enterprise customers with thousands of tables, strict data permissions, and high expectations for answer correctness.

Sources

  1. Google Cloud BigQuery: Write Queries with Gemini Assistance
  2. Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
  3. OpenAI Cookbook: How to Evaluate LLMs for SQL Generation
  4. Google Cloud BigQuery: Dry Runs