- LLMs like ChatGPT & Llama are prone to hallucinate in longform generation
— Jason Weston (@jaseweston) September 21, 2023
- Our method generates short questions that check facts in the full generation. These are answered correctly more often & are used to generate an improved response.
Factored CoVe: make sure that in step (3), answering the verification questions, the LLM doesn't attend to the baseline response from step (1), so its hallucinations aren't copied forward
Factor+Revise: adds cross-checks between steps
Overall, CoVe gives large gains in multiple tasks.
Read the paper for more (hopefully non-hallucinated) facts!
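The pipeline the thread describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `llm` is a hypothetical placeholder for any text-completion call, and the prompts are invented for illustration. The key point is the "factored" step, where the verification questions are answered without the baseline response in context.

```python
def llm(prompt: str) -> str:
    """Hypothetical model call; a real system would use an actual LLM client."""
    return f"[model output for: {prompt[:40]}...]"

def cove(query: str) -> str:
    # (1) Draft a baseline longform response.
    baseline = llm(f"Answer in detail: {query}")

    # (2) Plan short verification questions that check
    #     individual facts stated in the baseline.
    plan = llm(f"List short fact-checking questions for:\n{baseline}")
    questions = [q for q in plan.splitlines() if q.strip()]

    # (3) Factored step: answer each question WITHOUT showing the model
    #     the baseline, so its hallucinations are not copied forward.
    answers = [llm(f"Answer concisely: {q}") for q in questions]

    # (4) Revise the baseline so it agrees with the checked answers.
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))
    return llm(
        "Rewrite the answer below so it agrees with the checked facts.\n"
        f"Answer:\n{baseline}\nChecked facts:\n{evidence}"
    )
```

The Factor+Revise variant mentioned above would add an explicit cross-checking step between (3) and (4), comparing each answer against the corresponding claim in the baseline before revising.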
But, you know, however ingenious and successful, it's a work-around: good for interim use, but not a long-term solution to the problem.