Here's the GitHub repository: https://t.co/z99otRustc
— Carlos E. Perez (@IntuitMachine) November 14, 2023
Abstract of the paper linked from the GitHub repository:
Yuling Gu, Bhavana Dalvi Mishra, and Peter Clark, "Do language models have coherent mental models of everyday things?", arXiv:2212.10029v3 [cs.CL].
When people think of everyday things like an egg, they typically have a mental image associated with it. This allows them to correctly judge, for example, that "the yolk surrounds the shell" is a false statement. Do language models similarly have a coherent picture of such everyday things? To investigate this, we propose a benchmark dataset consisting of 100 everyday things, their parts, and the relationships between these parts, expressed as 11,720 "X relation Y?" true/false questions. Using these questions as probes, we observe that state-of-the-art pretrained language models (LMs) like GPT-3 and Macaw have fragments of knowledge about these everyday things, but do not have fully coherent "parts mental models" (54-59% accurate, 19-43% conditional constraint violation). We propose an extension where we add a constraint satisfaction layer on top of the LM's raw predictions to apply commonsense constraints. As well as removing inconsistencies, we find that this also significantly improves accuracy (by 16-20%), suggesting how the incoherence of the LM's pictures of everyday things can be significantly reduced.
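To make the "constraint satisfaction layer" idea concrete, here is a minimal Python sketch of the general approach: pose true/false "X relation Y?" probes, then enforce a commonsense constraint (here, that "X surrounds Y" and "Y surrounds X" cannot both be true) by flipping the lower-confidence answer when two answers conflict. The scoring function, names, and numbers below are placeholders for illustration, not the paper's actual implementation, which lives in the linked repository.

```python
# Hypothetical sketch: probe an LM with "X relation Y?" true/false questions,
# then resolve pairwise inconsistencies with a simple constraint layer.
# mock_lm_predict is a stand-in for a real LM call; values are illustrative.

from dataclasses import dataclass


@dataclass
class Prediction:
    x: str
    relation: str
    y: str
    label: bool        # LM's raw True/False answer
    confidence: float  # LM's confidence in that answer, in [0, 1]


def mock_lm_predict(x: str, relation: str, y: str) -> Prediction:
    """Stand-in for asking an LM: 'Is it true that {x} {relation} {y}?'."""
    # Hard-coded raw answers for the egg example; a real probe would query an LM.
    raw = {
        ("shell", "surrounds", "yolk"): (True, 0.9),
        ("yolk", "surrounds", "shell"): (True, 0.6),  # inconsistent with the above
    }
    label, conf = raw.get((x, relation, y), (False, 0.5))
    return Prediction(x, relation, y, label, conf)


def apply_antisymmetry_constraint(p: Prediction, q: Prediction) -> None:
    """Constraint: 'X surrounds Y' and 'Y surrounds X' cannot both hold.
    Keep the higher-confidence prediction and flip the weaker one."""
    if p.label and q.label:
        weaker = p if p.confidence < q.confidence else q
        weaker.label = False


if __name__ == "__main__":
    a = mock_lm_predict("shell", "surrounds", "yolk")
    b = mock_lm_predict("yolk", "surrounds", "shell")
    apply_antisymmetry_constraint(a, b)
    for pred in (a, b):
        print(f"{pred.x} {pred.relation} {pred.y}? -> {pred.label}")
```

Run as-is, the raw answers claim both that the shell surrounds the yolk and that the yolk surrounds the shell; the constraint layer keeps the higher-confidence answer and flips the other, which is the same intuition (though far simpler than the paper's method) behind how enforcing consistency can also raise accuracy.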