Do language models have an internal world model? A sense of time? At multiple spatiotemporal scales?
In a new paper with @tegmark we provide evidence that they do by finding a literal map of the world inside the activations of Llama-2! pic.twitter.com/3kZmf3fa6q
— Wes Gurnee (@wesg52) October 4, 2023
For temporal representations, we run the models on the names of famous figures from the past 3000 years, the names of songs, movies and books from 1950 onward, and NYT headlines from the 2010s and train lin probes to predict the year of death, release date, and publication date. pic.twitter.com/KAajV03VKk
— Wes Gurnee (@wesg52) October 4, 2023
There are more tweets in the thread.
The paper, Language Models Represent Space and Time:
Abstract: The capabilities of large language models (LLMs) have sparked debate over whether such systems just learn an enormous collection of superficial statistics or a coherent model of the data generating process -- a world model. We find evidence for the latter by analyzing the learned representations of three spatial datasets (world, US, NYC places) and three temporal datasets (historical figures, artworks, news headlines) in the Llama-2 family of models. We discover that LLMs learn linear representations of space and time across multiple scales. These representations are robust to prompting variations and unified across different entity types (e.g. cities and landmarks). In addition, we identify individual "space neurons" and "time neurons" that reliably encode spatial and temporal coordinates. Our analysis demonstrates that modern LLMs acquire structured knowledge about fundamental dimensions such as space and time, supporting the view that they learn not merely superficial statistics, but literal world models.
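To make the "linear representations" claim a bit more concrete, here is a minimal sketch of what training a linear probe looks like. It assumes ridge regression as the probe and uses synthetic activations (random vectors with a year planted along one direction) in place of real Llama-2 hidden states. The function names `fit_ridge_probe` and `r2_score`, the dimensions, and the regularization strength are illustrative choices, not taken from the paper.

```python
import numpy as np

def fit_ridge_probe(X, y, alpha=1.0):
    """Closed-form ridge regression on centered data; returns (weights, bias)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    d = X.shape[1]
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the probe direction w.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(d), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual variance / total variance."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for model activations (an assumption, not real data):
# each "entity" has a year, and that year is written into the activation
# vector along one hidden direction, on top of unit Gaussian noise.
rng = np.random.default_rng(0)
n, d = 2000, 64
years = rng.uniform(1950, 2020, size=n)
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
z = (years - years.mean()) / years.std()
acts = rng.normal(size=(n, d)) + 3.0 * z[:, None] * direction[None, :]

# Fit the probe on a training split and score it on held-out entities.
train, test = slice(0, 1500), slice(1500, None)
w, b = fit_ridge_probe(acts[train], years[train], alpha=1.0)
r2 = r2_score(years[test], acts[test] @ w + b)
cosine = (w @ direction) / np.linalg.norm(w)

print(f"held-out R^2: {r2:.3f}")
print(f"cosine(probe weights, planted direction): {cosine:.3f}")
```

With real data, `acts` would be hidden states taken from some layer for each entity name, and `years` the death, release, or publication dates. A high held-out R² from a probe this simple shows the quantity is linearly decodable from the activations; whether that amounts to a world model is a separate question.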
Note, however:
Self organizing maps (SOM) have been doing that since like forever.
Researchers just found SOM-like representations in layers of a trained deep neural net & jumping to premature conclusions that there must therefore exist a world model in those systems.
No https://t.co/OfHA0sLnxO pic.twitter.com/CJyvWiXBe1
— Chomba Bupe (@ChombaBupe) October 4, 2023
This.