Unstructured Retrieval: Policy Documents (RAG)
Context Windows and Unstructured Data
While relational databases are optimized for structured tabular data, company documentation (such as policy manuals, product guides, and contract terms) exists as unstructured text. Inserting an entire policy manual into a language model's prompt on every request is impractical: context windows are finite, and processing tens of thousands of tokens per request increases both latency and API costs.
To resolve this constraint, agents use Retrieval-Augmented Generation (RAG). RAG is an architecture that searches an external document store for relevant text segments (chunks) matching the user's query, and dynamically inserts only those segments into the model's prompt context.
The Retrieval Pipeline
Retrieval-augmented workflows follow a multi-step pipeline that locates the relevant passages in unstructured text and exposes them to the model:
- Chunking: The source documents are split into smaller segments (e.g., 200–500 words each). Adjacent segments overlap so that a sentence falling on a boundary keeps its surrounding context in at least one chunk.
- Indexing: Each chunk is passed through an embedding model to generate a high-dimensional vector that represents the semantic meaning of the text. The vectors are then stored in a vector database.
- Querying: When a user submits a query, the application converts the query into a vector with the same embedding model and performs a similarity search (for example, by cosine similarity) against the database index.
- Generation: The top-K matching text chunks are extracted and inserted into the system instructions as context, enabling the model to generate an answer grounded in the retrieved text.
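The four steps above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: `embed` is a toy bag-of-words stand-in for a real embedding model, a plain list stands in for the vector database, and all function names and the sample text are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words; each shares `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Indexing: store each chunk next to its vector (a list stands in for a vector database)."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 2) -> list[str]:
    """Querying: embed the query and return the k most similar chunks."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(context_chunks: list[str], question: str) -> str:
    """Generation: place only the retrieved chunks in the prompt context."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


# Made-up sample text, used only to exercise the sketch.
document = (
    "Shipping is free on orders over fifty dollars. "
    "Gift cards cannot be redeemed for cash. "
    "Warranty claims require the original receipt."
)
index = build_index(chunk_words(document, size=8, overlap=2))
question = "Do warranty claims need a receipt?"
print(build_prompt(retrieve(index, question, k=1), question))
```

With a real embedding model, `embed` would return a dense vector and the sorted scan in `retrieve` would typically be replaced by the vector database's own nearest-neighbour search; the overall shape of the pipeline stays the same.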
Unstructured Retrieval Architecture
The diagram below illustrates the retrieval flow for a natural language policy check:
Autonomy allows the agent to decide when to execute retrieval. Instead of retrieving documents on every request, the model evaluates whether the inquiry requires an external policy lookup. If it does, the model structures the search query, executes the tool, and uses the retrieved details to construct its response.
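The control flow described here can be sketched as a small dispatch loop. Everything in this example is an assumption made for illustration: `SEARCH_TOOL` is a generic JSON-schema-style tool description rather than any specific vendor's format, and `stub_model_decide` is a keyword check standing in for the model, which in a real agent makes the tool-call decision itself.

```python
import re
from typing import Callable, Optional

# Hypothetical tool description the agent would expose to the model.
SEARCH_TOOL = {
    "name": "search_policy",
    "description": "Search company policy documents for passages relevant to a query.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

# Assumed trigger words; only used by the stub below.
POLICY_TERMS = {"return", "refund", "warranty", "exchange", "policy"}


def stub_model_decide(user_message: str) -> Optional[dict]:
    """Stand-in for the model: request the search tool only for policy-like questions."""
    words = set(re.findall(r"[a-z]+", user_message.lower()))
    if words & POLICY_TERMS:
        return {"tool": SEARCH_TOOL["name"], "arguments": {"query": user_message}}
    return None  # no retrieval needed for this message


def run_turn(user_message: str, search_fn: Callable[[str], list[str]]) -> dict:
    """Run one agent turn: retrieve only if the (stubbed) model asked for it."""
    call = stub_model_decide(user_message)
    if call is None:
        return {"retrieved": [], "prompt": user_message}
    chunks = search_fn(call["arguments"]["query"])
    context = "\n".join(chunks)
    prompt = f"Context:\n{context}\n\nQuestion: {user_message}"
    return {"retrieved": chunks, "prompt": prompt}


def demo_search(query: str) -> list[str]:
    """Placeholder search backend returning a fixed passage."""
    return ["Opened items are restricted to store credit or exchange."]


print(run_turn("Hello, are you open today?", demo_search)["retrieved"])
print(run_turn("Can I get a refund on opened headphones?", demo_search)["retrieved"])
```

The point of the design is that the search backend is called only on the second message; the greeting is answered without any retrieval cost.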
Interactive Playground: Semantic Policy Search
The following playground implements a local text search tool containing retail policy chunks. The agent handles a customer inquiry regarding an opened electronic item. The model decides to search the policy store, parses the matching chunks, and uses them to determine the correct refund or store credit terms.
In this run, the model recognizes that returning opened headphones involves both the opened returns policy and electronics guidelines. By executing the search, it extracts both policy details: that opened items are restricted to store credit or exchange (policy chunk 2) and that electronics are subject to a 15% restocking fee (policy chunk 3). The agent synthesizes this combined context to inform the user accurately.
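A local search tool of the kind described can be approximated with simple keyword overlap. The sketch below is an assumption-laden illustration, not the playground's actual code: the text of chunks 2 and 3 paraphrases the policies mentioned above, chunk 1 is a hypothetical placeholder, and `search_policy`, `tokenize`, and the query string are invented names and values.

```python
import re

POLICY_CHUNKS = {
    # Chunk 1 is a hypothetical placeholder; its content is not given in the text.
    1: "Unopened items may be returned with the original receipt.",
    2: "Opened items are restricted to store credit or exchange.",
    3: "Electronics are subject to a 15% restocking fee.",
}

STOPWORDS = {"a", "an", "the", "to", "or", "are", "be", "with", "may", "is", "of", "for"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and keep the distinct non-stopword tokens."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def search_policy(query: str, top_k: int = 2) -> list[dict]:
    """Return up to top_k chunks sharing at least one keyword with the query."""
    query_tokens = tokenize(query)
    scored = []
    for chunk_id, text in POLICY_CHUNKS.items():
        score = len(query_tokens & tokenize(text))
        if score > 0:
            scored.append((score, chunk_id, text))
    # Highest overlap first; ties broken by chunk id for a stable order.
    scored.sort(key=lambda row: (-row[0], row[1]))
    return [{"chunk_id": cid, "text": text} for _, cid, text in scored[:top_k]]


# A search query the model might structure for the opened-headphones inquiry.
for hit in search_policy("opened electronics return"):
    print(f"policy chunk {hit['chunk_id']}: {hit['text']}")
```

Exact keyword matching is brittle: here "return" does not match "returned" and "headphones" would not match "electronics", which is the gap that embedding-based similarity search is meant to close.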