Follow-up questions are where many RAG systems fail.
"What about that?" can be obvious to a person and ambiguous to a retrieval system. The answer depends on previous turns, referenced entities, source order, and sometimes a decision the system already made.
What state must be preserved
Agentic RAG needs to preserve:
- active entities,
- prior evidence,
- unresolved references,
- source context,
- and the current mode of the conversation.
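One way to picture the list above is as a single state object carried between turns. The following is a minimal sketch, not a prescribed schema: the class names, field names, and the three example modes are all hypothetical and chosen only to mirror the five items.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    # Hypothetical conversation modes; a real system would define its own.
    LOOKUP = "lookup"
    COMPARE = "compare"
    TROUBLESHOOT = "troubleshoot"


@dataclass
class Evidence:
    source_id: str  # identifier of the document the passage came from
    text: str       # the retrieved passage itself
    turn: int       # conversation turn on which it was retrieved


@dataclass
class ConversationState:
    # Entities the conversation is currently about.
    active_entities: list[str] = field(default_factory=list)
    # Passages retrieved on earlier turns.
    prior_evidence: list[Evidence] = field(default_factory=list)
    # Phrases such as "that" or "the second one" not yet tied to an entity.
    unresolved_references: list[str] = field(default_factory=list)
    # Source id mapped to a short note on where it sits (section, order).
    source_context: dict[str, str] = field(default_factory=dict)
    # What the user is currently trying to do.
    mode: Mode = Mode.LOOKUP


state = ConversationState()
state.active_entities.append("invoice export job")
state.unresolved_references.append("that")
```

Keeping unresolved references as their own field makes it explicit when the system does not yet know what the user means, rather than hiding that gap inside the entity list.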
Boundary
Conversation memory can help interpret the query. It should not silently become proof.
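The boundary can be sketched as two separate paths: memory is allowed to shape the query, but only passages retrieved or re-checked on the current turn are allowed to support the answer. The names below (`Passage`, `from_memory`, `rewrite_query`, `citable`) are assumptions made for illustration, and the query rewrite is deliberately naive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Passage:
    source_id: str
    text: str
    from_memory: bool  # True if recalled from an earlier turn and not re-checked


def rewrite_query(query: str, active_entity: Optional[str]) -> str:
    """Use conversation memory only to interpret the query."""
    if active_entity is None:
        return query
    return f"{query} (referring to: {active_entity})"


def citable(passages: list[Passage]) -> list[Passage]:
    """Keep only passages retrieved or re-verified on the current turn."""
    return [p for p in passages if not p.from_memory]


passages = [
    Passage("doc-1", "Exports run nightly.", from_memory=True),
    Passage("doc-2", "Exports can be triggered manually.", from_memory=False),
]
query = rewrite_query("What about that?", "invoice export job")
support = citable(passages)
```

Here the remembered passage still informs how the question is read, but only `doc-2` ends up in `support`.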
What the system should not do
It should not re-answer from scratch when context exists. It should not use old context blindly. It should not guess a referent when the conversation does not establish one.
Those three rules create the real trade-off: reuse state when grounded, ask when unclear, and avoid recomputation when the prior reasoning is still valid.
Why this is Agentic RAG
The retrieval step changes based on conversation state. A follow-up may need less retrieval, different retrieval, or a clarification question. Treating every turn as a brand-new prompt loses that control.
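A rough sketch of that control follows, assuming the system can already tell whether the turn contains a reference, which entities it could point to, and whether existing evidence covers the question. How those three inputs are computed is outside this sketch; `plan_turn` and `Action` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    REUSE = "reuse"        # answer from evidence already gathered
    RETRIEVE = "retrieve"  # run a new or narrower retrieval
    CLARIFY = "clarify"    # ask the user what they mean


def plan_turn(
    has_reference: bool,
    candidate_referents: list[str],
    evidence_covers_query: bool,
) -> Action:
    # A reference with zero or several candidates is not established
    # by the conversation, so ask instead of guessing.
    if has_reference and len(candidate_referents) != 1:
        return Action.CLARIFY
    # Prior evidence still answers the question: skip new retrieval.
    if evidence_covers_query:
        return Action.REUSE
    # Otherwise retrieve, using the resolved entity to focus the search.
    return Action.RETRIEVE
```

For example, `plan_turn(True, ["plan A", "plan B"], True)` returns `Action.CLARIFY`: the evidence may be sufficient, but the referent is still ambiguous, so the clarification check runs first.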


