Why does an agent sometimes cite sources it never retrieved?

Question

Accepted Answer

Agents cite sources they never retrieved mainly because of hallucination (also called confabulation): the model generates plausible-looking references that are fabricated or that it never actually consulted. This stems from training. The model learns the format and style of citations from its training data, not a habit of verifying that a reference exists or was accessed for the current query. A generated citation can therefore be semantically plausible for the surrounding text even though no corresponding document was consulted. The problem is more likely when the system has no robust, real-time retrieval mechanism, or when retrieval exists but is not effectively integrated into the generation process. In either case the agent falls back on its internal learned representations and produces output that looks authoritative but is not backed by sources it actually accessed in that interaction.
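One way to catch this in practice is to compare the citations in the final answer against a log of what was actually retrieved for that query. The sketch below is only an illustration under assumptions: the `[doc:<id>]` marker format, the function name `find_unretrieved_citations`, and the sample IDs are all hypothetical and not taken from any particular agent framework.

```python
import re

# Hypothetical citation marker format: [doc:<id>]. Real systems will differ.
CITATION_PATTERN = re.compile(r"\[doc:([A-Za-z0-9_-]+)\]")


def find_unretrieved_citations(answer: str, retrieved_ids: set[str]) -> list[str]:
    """Return cited document IDs that are absent from this interaction's retrieval log."""
    cited = CITATION_PATTERN.findall(answer)
    # Preserve first-seen order while dropping duplicate citations.
    unique_cited = list(dict.fromkeys(cited))
    return [doc_id for doc_id in unique_cited if doc_id not in retrieved_ids]


if __name__ == "__main__":
    # Assumed example data: IDs the retriever actually returned for this query.
    retrieved = {"a12", "b07"}
    answer = "Latency dropped after the change [doc:a12], as also reported in [doc:z99]."
    print(find_unretrieved_citations(answer, retrieved))  # prints ['z99']
```

A check like this only detects citations with no matching retrieval. It does not confirm that a retrieved document actually supports the claim attached to it, so it is a first filter and not a full verification step.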