Hallucination Control for an LLM-Powered Citation Generator
The model was generating citations to sources that didn't exist. The legal team was not amused.
This case concerns a legal-adjacent platform that had built an LLM-powered tool for generating cited summaries of research material. The tool was useful when its citations were correct. It failed when the model invented citations to sources that didn't exist, and that happened often enough to be an existential problem for a product whose value proposition depended on accuracy.
I. Problem Statement
The leadership had two options: solve the hallucination problem at the engineering level or kill the product. Killing it would have wasted the development investment and given up a strategic position the team had spent significant time building. Solving it required a structural approach to grounding rather than incremental tuning of the existing implementation.
II. Methodology
The approach was a grounding architecture that made hallucinated citations structurally difficult, rather than depending on the model to avoid them.
The retrieval layer was rebuilt against an authoritative source corpus. The model was no longer asked to recall citations from training; it was asked to produce summaries grounded in retrieved sources, with the citation derived from the source identifier rather than generated by the model. A citation that didn't correspond to a retrieved source couldn't be produced because the citation field was structured rather than free-text.
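A minimal sketch of the structured citation field follows. It is an illustration under stated assumptions, not the platform's actual code: the names Source, Claim, and render_citations are hypothetical, and the citation string is assumed to be stored alongside each source in the corpus. The point it shows is that the model only ever emits a source ID, and the citation text is looked up rather than generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    source_id: str  # identifier assigned by the corpus, never by the model
    citation: str   # canonical citation string stored with the source (assumed)
    text: str


@dataclass(frozen=True)
class Claim:
    text: str
    source_id: str  # the only citation-related value the model emits


def render_citations(claims, retrieved):
    """Resolve each claim's source_id to the stored citation string.

    Raises ValueError if an ID is not in the retrieved set, so citation
    text can never originate from the model's own output.
    """
    by_id = {s.source_id: s for s in retrieved}
    rendered = []
    for claim in claims:
        if claim.source_id not in by_id:
            raise ValueError(f"unknown source_id: {claim.source_id!r}")
        rendered.append(f"{claim.text} [{by_id[claim.source_id].citation}]")
    return rendered
```

Because the citation string is read from the retrieved record, the worst the model can do is emit an ID that fails the lookup, which is detectable, instead of a plausible-looking citation that is not.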
The prompt was rewritten to require explicit attribution. Each claim in the generated summary had to be tied to a specific retrieved source by source ID. The model was instructed to omit claims without retrievable support rather than include them with weakened phrasing. The model's behavior shifted from "produce a complete summary" to "produce a summary of what the sources actually support."
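An attribution-first prompt of this kind might be assembled roughly as below. The wording of the rules and the build_prompt helper are assumptions made for illustration; the original prompt text is not given here.

```python
def build_prompt(question, sources):
    """Build an attribution-first prompt.

    sources: list of (source_id, text) pairs from the retrieval layer.
    The rule wording is hypothetical, not the production prompt.
    """
    source_block = "\n\n".join(f"[{sid}]\n{text}" for sid, text in sources)
    allowed = ", ".join(sid for sid, _ in sources)
    return (
        "Summarize what the sources below support about the question.\n"
        "Rules:\n"
        f"- Tag every claim with exactly one source ID from: {allowed}.\n"
        "- If no source supports a claim, omit the claim. "
        "Do not include it with softened wording.\n"
        "- Do not cite anything that is not listed below.\n\n"
        f"Question: {question}\n\n"
        f"Sources:\n{source_block}\n"
    )
```

Listing the allowed IDs explicitly gives the later validation step a closed set to check against.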
A validation layer ran after generation. Each citation in the output was checked against the actual retrieved sources on two points: whether the source ID existed, and whether the claim attributed to it actually appeared in the source content. Citations that didn't validate were stripped before the output reached the user. The validation was structural verification, not a quality assessment.
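The two checks can be sketched as follows. How "appears in the source content" was actually tested is not specified, so this sketch assumes each claim carries a supporting quote that must occur verbatim in the source after whitespace and case normalization. The claim dictionary shape and the validate function are hypothetical.

```python
import re


def _norm(s):
    # Collapse whitespace and lowercase so formatting differences don't fail the match.
    return re.sub(r"\s+", " ", s).strip().lower()


def validate(claims, retrieved):
    """Split claims into (kept, stripped).

    claims: list of dicts with 'text', 'source_id', 'quote' (assumed shape).
    retrieved: dict mapping source_id -> source text.
    A claim is kept only if its source ID was retrieved AND its
    supporting quote occurs in that source's text.
    """
    kept, stripped = [], []
    for claim in claims:
        source_text = retrieved.get(claim["source_id"])
        quote = _norm(claim.get("quote", ""))
        if source_text is not None and quote and quote in _norm(source_text):
            kept.append(claim)
        else:
            stripped.append(claim)
    return kept, stripped
```

A verbatim-quote check is deliberately strict: it says nothing about whether the claim is a good reading of the source, only whether the cited material is really there.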
An eval suite was built specifically around hallucination behavior. The test set combined queries with known correct answers and adversarial queries designed to induce hallucination. The eval ran against every change to the system, so hallucination regressions were measured directly instead of being discovered through user reports.
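One way to turn this into a number is a per-run hallucination rate compared against a stored baseline. The sketch below assumes the metric is the fraction of emitted citations whose ID was not in the retrieved set, measured before the validation layer strips anything. The function names, the case format, and the tolerance parameter are all hypothetical.

```python
def hallucination_rate(cases, generate):
    """Fraction of emitted citations that point outside the retrieved set.

    cases: list of dicts with 'query' and 'retrieved_ids' (a set of IDs).
    generate: callable mapping a query to the list of source IDs the
              system cited for it (pre-validation output, by assumption).
    """
    total = 0
    bad = 0
    for case in cases:
        cited = generate(case["query"])
        total += len(cited)
        bad += sum(1 for sid in cited if sid not in case["retrieved_ids"])
    return bad / total if total else 0.0


def passes_regression_gate(rate, baseline, tolerance=0.0):
    # A change passes only if it does not raise the rate beyond the tolerance.
    return rate <= baseline + tolerance
```

Running a gate like this on every change is what makes a regression visible at review time instead of in production.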
A quality review surface was built for the team to review outputs that the validation flagged as edge cases — claims where the citation was structurally valid but the semantic match between claim and source was uncertain. The team's review became a continuous improvement signal rather than a one-time quality assurance pass.
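Routing to the review surface could look something like the sketch below. The scoring method is an assumption: a crude token-overlap score stands in for whatever semantic matching the system used, and the low and high thresholds are placeholder values, not tuned figures.

```python
import re


def support_score(claim, source_text):
    """Fraction of the claim's distinct word tokens that also occur in the source.

    A stand-in for a real semantic-match score (assumption).
    """
    claim_tokens = set(re.findall(r"[a-z0-9]+", claim.lower()))
    source_tokens = set(re.findall(r"[a-z0-9]+", source_text.lower()))
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & source_tokens) / len(claim_tokens)


def route(claim, source_text, low=0.4, high=0.8):
    """Return 'accept', 'review', or 'strip' for a structurally valid citation."""
    score = support_score(claim, source_text)
    if score >= high:
        return "accept"
    if score < low:
        return "strip"
    return "review"  # uncertain band goes to the human review surface
```

Reviewer decisions on the middle band can then be fed back to adjust the thresholds or extend the eval set.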
III. Results & Discussion
Fabricated citations dropped to a rate the legal team could accept. The product reached a quality bar where the legal team's previous skepticism converted to active recommendation. The structural change — making fabrication architecturally difficult rather than asking the model not to do it — was the leverage point. Prompt engineering alone wouldn't have solved it.