We added a fingerprint cache path to reduce repeated work when inputs are equivalent. This is not an accuracy feature by itself; it is a latency control.
The constraint is correctness. Cache keys must preserve enough context to avoid reusing an answer or retrieval result where the surrounding conversation changed the meaning of the request.


