A new working paper. Title above, link, abstract, contents, and introduction below.
Abstract: Large language models are the result of a rich research tradition stretching back to the 1950s. This tradition involves a network of researchers doing work in the following: classical MT and computational linguistics → symbolic/statistical/network alternatives → associative and distributed memory models → statistical MT and vector-space methods → neural MT → Transformers → LLMs.
Contents
Introduction: The tangled web of ideas resulting in LLMs 1
Google Translate, a capsule history 2
Vector semantics 5
Firth and distributional semantics 7
Contra Cowen 9
Phase shift 11
Kuhnian paradigm shift 13
Associative memory 15
Sydney Lamb and associative memory for PCs 17
Principles and Development of Natural Intelligence 18
Introduction: The tangled web of ideas resulting in LLMs
As part of my ongoing investigation of Tyler Cowen’s recent monograph, The Marginal Revolution: Rise and Decline, and the Pending AI Revolution (2026), I’ve been thinking about large language models (LLMs), which he discusses in Chapter 4, “Why Marginalism Will Dwindle, and What Will Replace It?” For the most part Cowen presents his readers with the Silicon Valley view: Just as Athena emerged fully-formed from the head of Zeus, so large language models emerged fully-formed from Silicon Valley laboratories in November of 2022. And that IS how things appeared to the public at large. After seeing AIs in science fiction movies, and reading about them for years, all of a sudden, here they are, out of nothing, on the web in the form of ChatGPT.
For all practical purposes, Cowen is a member of the public. He may have been following developments for years. Given his interest in chess I’m sure he followed that story at least since IBM’s Deep Blue beat Kasparov in 1997. And AI plays an important role in his 2013 book, Average is Over. Beyond that, he has contacts in Silicon Valley going back I don’t know how long. But this is not his intellectual field, which is centered on economics. It’s one thing to read about it and to talk with researchers and entrepreneurs; it’s something else to conduct research and publish.
I am in a different position. While my Ph.D. is in English Literature, my dissertation – “Cognitive Science and Literary Theory” – is as much about knowledge representation and computational linguistics as it is about literature. I was trained in that area by the late David G. Hays, who was a first-generation researcher in computational linguistics with the RAND Corporation in the 1950s and 1960s. For the last three years I’ve been conducting research in the behavior of LLMs and have been collaborating with Ramesh Viswanathan, an expert in machine vision and cognitive science at Goethe University Frankfurt.
THAT, broadly speaking, is my field. While I wouldn’t expect Cowen’s views to be the same as mine, I would have been happier if he had at least acknowledged that the future of LLMs and their adequacy is a matter of controversy among the experts. He should have at least mentioned Gary Marcus, Yann LeCun, Fei-Fei Li, and Melanie Mitchell. He might even have mentioned that Ilya Sutskever, a student of Geoffrey Hinton who was on the OpenAI team that developed the GPT series, has abandoned the idea that pure scaling is the royal road to artificial general intelligence (AGI, whatever that is). Cowen has done none of this. To read him you’d think that the basic scientific and engineering issues have been settled and it’s full speed ahead – “To infinity and beyond,” to quote Buzz Lightyear.
I don’t know quite what I’m going to say about this in the piece I’m writing for my series on the marginalism monograph. I don’t want to recount the full history, for which this working paper can serve as an outline. At the very least I will point out that matters are by no means settled and that there is a statistical tradition with roots in 1950s linguistics and 1960s document retrieval that can serve as a tertium quid between GOFAI (good old fashioned AI) and computational linguistics on the one hand and neural-network based learning on the other.
A Kuhnian paradigm shift?
On page 14 I have a section entitled, “A Kuhnian paradigm shift, or only an invitation to one?” That’s not the original title, which was simply, “Kuhnian paradigm shift.” Why the change?
Simple. There certainly was a dramatic change in the wake of ChatGPT. But I think that change was mostly institutional, in the deployment of resources, the development of institutions, the proliferation of roles in institutions, and of training. It’s not at all clear to me that there was a Gestalt change in anyone’s conceptions about the nature of intelligence in machines, or in humans for that matter. For it is a Gestalt switch, a reconfiguration of understanding, that is the hallmark of a paradigm change in Kuhn’s conception of an intellectual revolution. It is not at all clear to me that there was a widespread change comparable to going from a mentality where one sees the Morning Star and Evening Star as two different entities to a mentality where one sees them as two manifestations of a single entity, the planet Venus.
Perhaps something like that has happened here and there, but I suspect that, for the most part, everyone from the most senior researchers through the general public sees the world as composed of the same kinds of entities and processes as they saw before encountering ChatGPT, or GPT-3. Those who think we’re well on the road to AGI (artificial general intelligence) still think of AGI the way they did in, say, 2015 or 2021, as the case may be. The same is true for those who doubt that we’re on that road. All that has changed is people’s awareness of the behavior displayed by the devices we have created. Their sense of what those devices are, what they deeply and essentially are, that hasn’t changed.
Though it may be under tension. The fact is, whatever any of us believes, we don’t really know what kinds of behaviors these devices will be exhibiting next year, two years after that, or in ten years. There are a lot of open questions hanging in the air. I suspect that once those questions are resolved, that is, if and when they are, then we will see genuine changes in mentality.
I regard the matter as open to discovery and investigation.
Beyond all that, well, you can read through the rest of this document, which records a dialog I had with ChatGPT (May 29, 2026) beyond the asterisks. I begin by asking ChatGPT to review the history of Google Translate. Why? Because language is the through line. The computational study of language began in the 1950s with the problem of machine translation, translating a text from one natural language to another. The technique used by LLMs for capturing single-word semantics has its origins in statistical methods for document retrieval that originated in the 1960s and 1970s. Google Translate switched to neural-net technology in 2016. A year later the transformer was invented in a Google lab. The transformer, as you know, is the engine used to create current large language models. Google Translate is a natural starting point.
Well, he's been sitting on his toadstool and assuming that the creatures crawling at his feet are the sum past and present of activity. Snark of a blindspot! Economics runs out of the world while the world hums along its abundance of vision.