Most RAG systems start with a familiar shape: embed the query, retrieve the top results, pass them into a model, and generate an answer.
That shape is useful, but it is not enough for agentic work. Multi-step questions, ambiguous follow-ups, source-specific constraints, and weak evidence all break the assumption that the nearest chunks are automatically the right context.
The boundary we care about
Cortagent's Agentic RAG keeps four stages separate:
| Stage | Purpose | Failure if collapsed |
|---|---|---|
| Retrieval | Find candidate context | Irrelevant context enters the answer path |
| Evidence selection | Decide what is usable | Weak chunks are treated as proof |
| Reasoning | Work over selected evidence | The model fills gaps from prior knowledge |
| Answer synthesis | Produce a grounded response | The answer cannot be traced |
Engineering position
Retrieval is not a pre-step. It is part of the reasoning loop and has to remain inspectable.
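To make the separation concrete, here is a minimal sketch of the four stages as distinct functions that write into a shared trace. Everything in it is an assumption made for illustration: the names `Chunk`, `Trace`, `retrieve`, `select_evidence`, `reason`, `synthesize`, and `run` are hypothetical, the word-overlap scoring stands in for a real retriever, and the reasoning step is a placeholder for a model call. It is not Cortagent's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    text: str
    score: float


@dataclass
class Trace:
    """Records what each stage produced so the run can be inspected later."""
    retrieved: list = field(default_factory=list)
    selected: list = field(default_factory=list)
    reasoning: str = ""
    answer: str = ""


def retrieve(query, corpus):
    # Stage 1: find candidates. Naive word overlap, for illustration only.
    words = set(query.lower().split())
    candidates = []
    for chunk_id, text in corpus.items():
        overlap = len(words & set(text.lower().split()))
        candidates.append(Chunk(chunk_id, text, overlap / max(len(words), 1)))
    return sorted(candidates, key=lambda c: c.score, reverse=True)


def select_evidence(candidates, min_score=0.5):
    # Stage 2: decide what is usable. The threshold is an assumed value.
    return [c for c in candidates if c.score >= min_score]


def reason(evidence):
    # Stage 3: placeholder for a model call that works only over selected evidence.
    if not evidence:
        return "no usable evidence"
    return "supported by: " + ", ".join(c.chunk_id for c in evidence)


def synthesize(evidence):
    # Stage 4: every statement in the answer carries the id of its source chunk.
    if not evidence:
        return "Not enough evidence to answer."
    return "; ".join(f"{c.text} [{c.chunk_id}]" for c in evidence)


def run(query, corpus):
    trace = Trace()
    trace.retrieved = retrieve(query, corpus)
    trace.selected = select_evidence(trace.retrieved)
    trace.reasoning = reason(trace.selected)
    trace.answer = synthesize(trace.selected)
    return trace


corpus = {
    "doc-1": "refunds are issued within 14 days",
    "doc-2": "shipping takes 3 to 5 days",
}
trace = run("refunds within days", corpus)
```

Because `trace.retrieved` and `trace.selected` are stored separately, a reviewer can see which candidates were found and which were dropped before reasoning, rather than only the final text.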
Why top-k is incomplete
Top-k retrieval answers one narrow question: which chunks are closest under the configured scoring method?
It does not answer:
- whether the query is simple or complex,
- whether a follow-up needs conversation state,
- whether lexical matching matters more than semantic similarity,
- whether the evidence is strong enough,
- or whether the system should stop and ask for clarification.
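A small sketch shows the gap. It assumes cosine similarity as the configured scoring method and uses hand-written two-dimensional vectors in place of real embeddings. The names `cosine`, `top_k`, and the chunk ids are hypothetical.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, k=2):
    """Return the k highest-scoring (score, chunk_id) pairs, however low the scores are."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunks.items()]
    return sorted(scored, reverse=True)[:k]


# Toy 2-D "embeddings"; real ones would come from an embedding model.
chunks = {
    "refund-policy": [0.9, 0.1],
    "shipping-times": [0.1, 0.9],
    "office-hours": [-0.7, -0.7],
}

on_topic = top_k([1.0, 0.0], chunks)    # query close to "refund-policy"
off_topic = top_k([-1.0, 1.0], chunks)  # query that matches nothing well
```

With these toy vectors, the off-topic query still gets two results back, and the second one has a similarity of 0.0. Top-k ranks what is there; it has no notion of "none of this is good enough".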
What Agentic RAG changes
Agentic RAG adds control points around retrieval. The system can inspect the query, choose retrieval paths, select evidence, and preserve traceability before synthesis.
That does not make every answer correct. It gives the system places to fail explicitly instead of hiding weak retrieval inside fluent text.
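As a rough illustration of two such control points, the sketch below inspects a query to pick a retrieval path, then gates synthesis on evidence strength. The heuristics are deliberately crude and are assumptions, not a description of how Cortagent routes queries. `Route`, `choose_route`, `gate`, the pronoun list, and the `min_score` default are all hypothetical.

```python
from enum import Enum


class Route(Enum):
    LEXICAL = "lexical"
    SEMANTIC = "semantic"
    CLARIFY = "clarify"


# Assumed heuristic: these words usually point back at earlier conversation turns.
PRONOUNS = {"it", "they", "them"}


def choose_route(query, has_history=False):
    tokens = query.split()
    # A follow-up that leans on a pronoun cannot be resolved without conversation state.
    if not has_history and any(t.lower().strip("?.,!") in PRONOUNS for t in tokens):
        return Route.CLARIFY
    # Identifiers, error codes, and quoted strings tend to favor exact matching.
    if any(t.startswith('"') or any(ch.isdigit() for ch in t) for t in tokens):
        return Route.LEXICAL
    return Route.SEMANTIC


def gate(route, evidence_scores, min_score=0.6):
    """Return an explicit status instead of always proceeding to synthesis."""
    if route is Route.CLARIFY:
        return {"status": "needs_clarification", "route": route.value}
    strong = [s for s in evidence_scores if s >= min_score]
    if not strong:
        return {"status": "insufficient_evidence", "route": route.value}
    return {
        "status": "ready_for_synthesis",
        "route": route.value,
        "evidence_count": len(strong),
    }


weak = gate(choose_route("How do refunds work"), [0.2, 0.3])
vague = gate(choose_route("Can you expand on it?"), [0.9])
```

Here `weak` ends with the status `insufficient_evidence` and `vague` with `needs_clarification`. Both are visible outcomes that a caller can log or act on, which is the point: the failure is named rather than papered over by a fluent answer.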



