Planning and Self-Correction (Reflection)
The Limits of Reactive Execution
Simple tool-calling loops operate reactively: the model reads the prompt context and requests a tool call, and the runtime executes that call immediately with no intermediate check. This pattern is sufficient for basic lookups, but it breaks down on complex tasks that require architectural precision, security validation, or structured optimization.
In a reactive loop, if the model generates an incorrect SQL statement or an invalid API payload, the error is caught only after the tool executes, for example when the database returns a parsing error. This results in unnecessary database round trips and higher latency. A query that runs successfully but is scoped too broadly is never flagged at all, which creates a risk of exposing data.
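A minimal sketch of the reactive pattern, using Python's built-in sqlite3 module as a stand-in database. The `reactive_execute` helper, the `orders` schema, and the two sample statements are assumptions made for illustration. The malformed statement fails only after it reaches the database, and the unbounded statement runs without any check.

```python
import sqlite3

def reactive_execute(conn, sql):
    """Run model-generated SQL immediately, with no validation step."""
    try:
        rows = conn.execute(sql).fetchall()
        return {"ok": True, "rows": rows}
    except sqlite3.Error as exc:
        # The mistake is discovered only after the round trip to the database.
        return {"ok": False, "error": str(exc)}

# Hypothetical schema used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, cust_id INTEGER, total REAL)")

# A malformed draft: the database reports the problem after execution.
bad = reactive_execute(conn, "SELEC * FROM orders")

# A valid but unbounded draft: nothing stops it from running.
unbounded = reactive_execute(conn, "SELECT * FROM orders")

print(bad)
print(unbounded["ok"])
```

Nothing in this loop distinguishes a safe query from an unsafe one. The only feedback signal is a database error.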
To improve reliability, agent architectures introduce Planning and Reflection phases.
Reflection and Self-Correction Loops
Reflection is a design pattern where the model evaluates its own planned actions or outputs against a set of constraints before committing to execution. Rather than running a generated tool call immediately, the agent routes the draft instruction to a critique turn.
A structured reflection loop for relational database queries operates as follows:
- Drafting: The generator model compiles a database query based on the user's natural language request.
- Critic Turn: A reflection step, often using a separate prompt or critic agent, compares the draft SQL query against strict system rules (e.g., verifying that the query filters on indexed fields and enforces a row limit).
- Correction: If rules are violated, the critic outputs a structured critique. The generator reads this critique, revises the query, and routes it back for validation.
- Safe Execution: Once approved, the query is executed on the database connection.
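The four steps above can be sketched as a small control loop. This is an illustrative skeleton, not a reference implementation: `reflection_loop`, the `max_revisions` cap, and the stub generator, critic, and executor are all hypothetical names. In a real agent the `generate` and `critique` callables would wrap model calls.

```python
from typing import Callable, List

def reflection_loop(
    request: str,
    generate: Callable[[str, List[str]], str],
    critique: Callable[[str], List[str]],
    execute: Callable[[str], object],
    max_revisions: int = 3,
):
    """Draft, critique, correct, and execute only once the critic approves."""
    feedback: List[str] = []
    for _ in range(max_revisions + 1):
        draft = generate(request, feedback)  # drafting, or correction when feedback exists
        feedback = critique(draft)           # critic turn: a list of rule violations
        if not feedback:
            return execute(draft)            # safe execution
    raise RuntimeError(f"draft still rejected after {max_revisions} revisions: {feedback}")

# Stubs standing in for the generator model, the critic, and the database.
def stub_generate(request, feedback):
    sql = "SELECT id, total FROM orders"
    if "add a cust_id filter" in feedback:
        sql += " WHERE cust_id = 42"
    if "add a LIMIT" in feedback:
        sql += " LIMIT 50"
    return sql

def stub_critique(sql):
    problems = []
    if "cust_id" not in sql:
        problems.append("add a cust_id filter")
    if "LIMIT" not in sql.upper():
        problems.append("add a LIMIT")
    return problems

executed = []

def stub_execute(sql):
    executed.append(sql)
    return sql

result = reflection_loop("show my orders", stub_generate, stub_critique, stub_execute)
print(result)
```

Capping the number of revisions keeps a generator that never satisfies the critic from looping indefinitely. Raising an error in that case is one possible policy; an agent could instead hand the request to a human.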
Reflection and Self-Correction Architecture
The diagram below details the vertical flow of a self-correction trace:
Adding a critic step costs extra tokens and latency, typically at least one additional model call per critique round, but it catches rule violations before they reach the database. In production, this trade-off is usually favorable: catching a missing index filter or a query that would trigger a full-table scan prevents database lockups and reduces resource utilization.
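One way a critic could detect a full-table scan without running the query is to ask the database for its plan. The sketch below assumes SQLite and uses its `EXPLAIN QUERY PLAN` statement. The schema, index name, and helper functions are hypothetical, and the exact wording of the plan text varies between SQLite versions, so the check looks only at the leading keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, cust_id INTEGER, total REAL)")
conn.execute("CREATE INDEX idx_orders_cust ON orders (cust_id)")

def query_plan(sql):
    """Return SQLite's plan description for a query without executing the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

def would_full_scan(sql):
    """Heuristic: SQLite reports SCAN for a table scan and SEARCH for an index lookup."""
    return query_plan(sql).upper().startswith("SCAN")

unfiltered = "SELECT * FROM orders"
filtered = "SELECT * FROM orders WHERE cust_id = 42 LIMIT 50"

print(query_plan(unfiltered))
print(query_plan(filtered))
```

A plan check of this kind costs a cheap metadata call rather than a model call, so it can complement a model-based critic instead of replacing it.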
Interactive Playground: SQL Query Critique
The following playground implements a multi-turn reflection loop. The customer asks to see their orders. The agent drafts an initial query, and a Senior DBA critic model flags it for lacking a filter on the indexed cust_id column and a row limit. The agent reads the critique, corrects the query, and executes the safe version on the database.
Notice how the model's initial query draft (SELECT * FROM orders;) violates both database rules: it lacks a customer filter and has no row limit. Instead of the query running immediately, the reflection critic detects the violations. The model processes the critic's feedback, revises the query to filter by cust_id = 42 and add a limit, executes the safe query, and returns a verified final answer.
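The trace described above can be approximated offline. In this sketch the two drafts are canned strings standing in for the model's output, the Senior DBA critic is replaced by two regular-expression rules, and the database is an in-memory SQLite table. The sample rows, the `dba_critic` name, and the limit value of 10 are assumptions, since the text does not specify them.

```python
import re
import sqlite3

# Hypothetical data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, cust_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO orders (cust_id, item) VALUES (?, ?)",
    [(42, "keyboard"), (7, "monitor"), (42, "mouse")],
)

def dba_critic(sql):
    """Rule-based stand-in for the critic model: returns a list of violations."""
    violations = []
    if not re.search(r"\bWHERE\b[^;]*\bcust_id\s*=", sql, re.IGNORECASE):
        violations.append("no cust_id filter")
    if not re.search(r"\bLIMIT\s+\d+", sql, re.IGNORECASE):
        violations.append("no row limit")
    return violations

# Canned drafts standing in for the generator's first attempt and its revision.
drafts = [
    "SELECT * FROM orders;",
    "SELECT * FROM orders WHERE cust_id = 42 LIMIT 10;",
]

trace = []   # (draft, violations) pairs, one per critic turn
rows = None  # set only when a draft is approved and executed
for draft in drafts:
    violations = dba_critic(draft)
    trace.append((draft, violations))
    if not violations:
        rows = conn.execute(draft).fetchall()
        break

for draft, violations in trace:
    print(draft, "->", violations or "approved")
print(rows)
```

Only the second draft reaches `conn.execute`. The first is rejected at the critic turn, so the unfiltered query never touches the database.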