This is a fragment from a longer post in my ongoing commentary on Tyler Cowen's recent monograph on the Marginal Revolution.
* * * * *
OpenAI released GPT-3 in 2020 to a limited audience of insiders, who recognized that it represented a breakthrough. This level of performance came as a surprise. No one predicted it. GPT-3 was scaled up from GPT-2, which was in turn scaled up from GPT-1, but no one was making explicit predictions about the level of performance to be achieved at each step. These were experiments: “Let’s try it and see what happens.” That’s fine. That’s a good way to make progress, to try things out and see what happens. But don’t mistake a lucky trial for genuine knowledge.
Cowen mentioned GPT-3 on Marginal Revolution on July 19, and then published a Bloomberg column on it on July 21, which he excerpted in Marginal Revolution the next day: “...think of GPT-3 as giving computers a facility with words that they have had with numbers for a long time, and with images since about 2012.” I published a working paper in August, GPT-3: Waterloo or Rubicon? Here be Dragons, which I’ll discuss a bit later.
Two and a half years later, in November of 2022, OpenAI released ChatGPT to the general public. It spread like wildfire. Now the proverbial everyone witnessed what only a small group had witnessed in the summer of 2020. The machine speaks. Sorta’. But more convincingly than any machine had spoken before and in a way that had unimaginable implications for the future.
A threshold HAS been crossed, but it is not, so far as I can see, a threshold in understanding. It is a threshold in performance along a continuous line of scientific understanding and engineering design and construction, something I have documented in some detail in a recent working paper, The Origins of LLMs. As far as I can tell, there has been no paradigm shift, in Thomas Kuhn’s sense, no rank shift, in terms of cognitive rank theory. There were no fundamentally new ideas in the world by, say, late July of 2020 as a consequence of consolidating GPT-3 and making it available in limited release.
“What about the scaling hypothesis,” you might ask. “Isn’t that new?” Ilya Sutskever first explored the idea in 2014. Rich Sutton’s famous 2019 essay, The Bitter Lesson, generated broad discussion, and OpenAI published a paper in 2020 that cemented matters, “Scaling Laws for Neural Language Models.” Given the nature of computing, scaling up is not trivial. Hundreds if not thousands of technical details need to be worked out as the size of the training corpus increases by factors of 10 or more, time after time, and as more and more GPUs are ganged together to assemble the computing power needed. The scaling hypothesis gave researchers reason to expect improved performance with scaling, but there has been no gain in fundamental understanding, not of machine learning, not of artificial neural nets, and certainly not of language and cognition. Consequently our sense of possibility has expanded enormously. But our knowledge and deep understanding have remained the same, and the scaling hypothesis made it easy to believe that that was just fine.
That mismatch is what has allowed the field to be captured by businessmen, executives and venture capitalists, who have little understanding of or interest in the underlying conceptual issues. Scaling is something they understand.
Hype the dramatically increased performance and collect the cash. Purchase and deploy more resources now, reap far greater profits in a decade. Everything else is noise and friction.
But what if [the Dread] Gary Marcus and other critics are right? What if scaling LLMs is not adequate? What happens to all those investments then?
A stalemate in development? Well, it is easy to see that communities are fighting hard to keep data centers from being built in their neighborhoods. That's something tangible the oligarchs have to deal with.
Given the amount of money that's being invested, if these companies don't turn a profit we could be facing a financial crisis comparable to the 2008 crisis.
Yes, I keep seeing the prediction of a crash. And this time, the government isn't capable of turning around from a crisis.
Tyler's marginalism?... "the marginal cost of serving the N+1‑st user is close to zero."...
AI Era Marginalism: "We are looking, instead, at a capital‑intensive, energy‑intensive, bandwidth‑intensive, human nursemaiding-intensive industry in which the marginal cost of the N+1‑st user is stubbornly positive."
Or "In a 2019 essay, Sutton proposed the "bitter lesson", ... "70 years of AI research [had shown] that general methods that leverage computation are ultimately the most effective, and by a large margin", beating efforts building on human knowledge about specific fields like computer vision, speech recognition, chess or Go."
https://en.wikipedia.org/wiki/Richard_S._Sutton
What does Tyler bot say about these three differing margin types?
"Inference Is Unlikely to Ever Be a Low Marginal Cost Operational Node, & the Other Reasons Why the Anthropic and OpenAI IPOs Ought to Fail
"Digital Gods, real costs: why a rational world would see the doom of the foundation‑model-builder IPO, because the AI labs are highly unlikely to ever get profits, let alone hyperprofits…
Brad DeLong
Jun 02, 2026
...
"Inference is not and will never be sufficiently cheap tells us that we are not looking at a familiar software story, one in which you do a big up‑front engineering push and then harvest enormous quasi‑rents because the marginal cost of serving the N+1‑st user is close to zero. We are looking, instead, at a capital‑intensive, energy‑intensive, bandwidth‑intensive, human nursemaiding-intensive industry in which the marginal cost of the N+1‑st user is stubbornly positive....
"Hence the importance of the religio-theological faith that a Digital God can be built and that the Oracular pronouncements of that Digital God can then be sold for real money as a motivator for what we have seen to date. ... But the action on the actual token-production frontier is where Paolo says it is: systems that are bounded, supervised, observably logged, and tightly leashed to trustworthy data via RAG‑like architectures, and that still need a lot of plumbing and babysitting from quite expensive software engineers. ...
"Enterprises discover that, at the margin, this is not a replacement for their staff but an add‑on that itself has to be managed, monitored, and audited. That is a useful product. It is not a license to mint hyperprofits. Not for them. Not for me with SubTuringBradBot
"At the same time, any price umbrella that might have preserved some margin is being kicked away from below. ..."
...
https://braddelong.substack.com/p/inference-is-unlikely-to-ever-be
And where does the margin go when magical training data is insisted upon in the "model"?
"LLMs are closer to religion than they appear. Watch out for those who like it that way
"Papal's 40k-word encyclical drops and lawyers already asking if Catholics can refuse workplace AI on religious grounds"
Rupert Goodwins Register columnist
Mon 1 Jun 2026
...
"The next step will be for this report to be used to create arguments to force religious training data on LLMs to ‘ensure fairness’ and ‘counterbalance liberal bias’. You may not want AI used as a proselytizing pipeline into home and office. Others do.
"In many ways, AI is a religion. Not because it requires belief in a utopian future through a dystopian present, or that it’s used by very powerful people to get more power, or that nobody can define what it actually is. All these things are overlaps on the Venn diagram, but the biggie is that LLMs rely on internal universes derived yet decoupled from reality. Religions that deify their interpretation of their scriptures instinctively know this model and how to use it.
...
https://www.theregister.com/ai-ml/2026/06/01/llms-are-closer-to-religion-than-they-appear-watch-out-for-those-who-like-it-that-way/5248189
Tyler & Brad
https://marginalrevolution.com/marginalrevolution/2023/02/my-conversation-with-brad-delong.html
SD
Cory Doctorow and marginalism, intended vs no intention, and acting like a god.
"The tedious power of storytelling
...
"As a species, we've been through this before. Think back to those sunsets. There was a time when we all thought of sunsets as being explicitly created by another being, who was in communication with us through the natural environment (some people still believe this).
"Looking at a sunset was an exercise in asking yourself, "If I were God, what would I be trying to say to me with this sunset?" just as looking at one of my photos of a sunset would be an exercise in asking yourself, "If I were Cory, what would I be trying to say to me with this photo of a sunset?"
"The rise of materialism and scientific rationalism is sometimes called a "disenchantment" and indeed, there's a sense in which a sunset that we know to have no intender is no longer "enchanted." The experience of a sunset becomes something like, "Those colors and their interplay with the physical world is very beautiful." It might even be, "How could I capture that beauty in a painting or a photo or a description so that I could communicate it to someone else?" But it's not, "I wonder what God wants me to feel when I look at this sunset?"
...
https://pluralistic.net/2026/06/02/must-we-pretend/#everything-you-dont-have-to-do
SD