NEW SAVANNA
“You won't get a wild heroic ride to heaven on pretty little sounds.”– George Ives
Friday, March 20, 2026
The Shock and the Narrowing: How ChatGPT's Success May Have Compromised AI's Future
This post was composed by Claude (Anthropic) after an interaction which I initiated with a prompt consisting of 1) a capsule summary of the history of OpenAI that included a number of questions, and 2) a request for the 10 most expensive scientific research projects paid for by the US Government. That exchange ran to a bit over 7100 words, after which I asked Claude to write a blog post. The following article is more creative than a mere summary of that discussion.
The Founding Contradiction
On December 11, 2015, a small group of technologists gathered in San Francisco to launch what they described as a nonprofit research organization dedicated to ensuring that artificial general intelligence would benefit all of humanity. The founders of OpenAI — Sam Altman, Greg Brockman, Ilya Sutskever, Wojciech Zaremba, Elon Musk, and others — began with a $1 billion endowment and a serious concern: that the most transformative technology in human history was being developed inside a handful of profit-maximizing corporations, with no institutional safeguard ensuring it would serve everyone. The nonprofit structure was the answer. No investors to satisfy, no quarterly earnings to hit. Just the mission.
The mission lasted four years in its pure form. By March 2019, faced with the staggering computational costs of training large language models, OpenAI created a for-profit subsidiary with a novel "capped profit" structure: investors could earn returns, but those returns were limited to one hundred times their investment, with excess profits flowing back to the nonprofit parent. This was the arrangement that attracted Microsoft's initial investment, and it was the arrangement in place when OpenAI released ChatGPT to the general public in late November 2022.
What happened next was, by any measure, one of the most consequential commercial surprises in the history of technology. Within two months, ChatGPT had a hundred million users. The scale and speed of public adoption had no precedent. And the shock of that success — the sheer unexpectedness of it — set in motion a chain of decisions that has reshaped not just one company, but the entire research landscape of artificial intelligence.
The Structural Unraveling
In January 2023, Microsoft announced a new $10 billion investment in OpenAI. The nonprofit's original rationale — that the most powerful AI should not be controlled by a for-profit corporation — was under increasing strain. By October 2025, the arrangement built around that rationale had formally dissolved. OpenAI restructured as a public benefit corporation, the nonprofit parent renamed itself the OpenAI Foundation and accepted a 26% equity stake in the new entity, and Microsoft received a 27% stake worth approximately $135 billion. The PBC structure requires the company to consider its mission alongside profit — but as a legal constraint, it is considerably weaker than the nonprofit board that had previously governed the organization.
The journey from nonprofit to PBC was not smooth. In November 2023, OpenAI's board — still operating under its nonprofit governance mandate — fired Sam Altman as CEO, citing concerns about his candor and, beneath the official language, a deeper unease about the pace of commercialization. The firing lasted five days. Nearly all of OpenAI's roughly 800 employees threatened to resign and follow Altman to Microsoft. Ilya Sutskever, who had orchestrated the firing, signed the letter calling for Altman's reinstatement and issued a public apology. Altman returned, the board was reconstituted with his allies, and the mission-protection mechanism that the nonprofit structure had been designed to provide was effectively neutralized. Sutskever left the company in May 2024.
Each structural change was framed as necessary to fulfill the mission. In practice, each change progressively subordinated the mission to capital requirements. The nonprofit board had existed to ensure that AGI benefited humanity. By 2025, it had become a foundation holding equity in the thing it was supposed to be watching — a watchdog with a financial stake in the object of its oversight.
Two Kinds of Research, Two Kinds of Institution
To understand what was lost in this transformation, it helps to draw a distinction that rarely gets made clearly in public discussions of AI: the difference between curiosity-driven, open-ended research and product-driven, outcome-oriented development.
Consider the Apollo program as an example of the second kind. It was, in the deepest sense, an engineering project rather than a scientific one. The underlying physics was known. Orbital mechanics, propulsion, life support — these were hard and dangerous problems, but they were problems whose solutions could be systematically approached. The goal was precisely defined. The timeline could be committed to. Success was probable given sufficient resources. When President Kennedy pledged to put a man on the moon by the end of the decade, he was making a political commitment backed by a technical assessment that success was achievable. The scientists who worked on Apollo — and I have met a number of them — may have been motivated by curiosity and wonder. But Congress funded the program to beat the Soviets in the Cold War. The institutional structure — massive, goal-directed, centrally coordinated — suited the nature of the problem.
Curiosity-driven research operates on entirely different premises. Its defining characteristic is that it does not know in advance what it will find. Claude Shannon was not trying to build the internet when he developed information theory at Bell Labs in the late 1940s. The researchers at the University of Montreal who developed attention mechanisms for neural networks were not trying to build ChatGPT. The work that seeded the current AI revolution — Rosenblatt's perceptron, Minsky's early investigations, the decades of foundational work in cognitive science and linguistics that LLMs now implicitly exploit — was almost entirely publicly funded, pursued at universities and a handful of exceptional industrial research labs, over decades when no commercial application was visible.
Bell Labs was the great institutional embodiment of this model in the corporate world. What made it possible was structural: AT&T's government-protected monopoly generated profits so vast that the company could fund a research laboratory with no requirement to produce commercial results. Shannon, Bardeen, Brattain, Shockley — these men were given time, resources, and colleagues, and told to think. The transistor, information theory, Unix, the laser, cellular telephony, and multiple Nobel Prizes resulted. Bell Labs was not run like a startup. It was run like a slightly more applied version of a university, with better equipment.
Xerox PARC, founded in 1970, operated on similar principles — explicitly unconstrained by Xerox's core product lines, given a unifying vision ("the architecture of information") but not a product roadmap. The personal computer, the graphical user interface, Ethernet, the mouse, laser printing — all emerged from a lab of about 350 people who were essentially allowed to play. The irony is that Xerox captured almost none of the commercial value, which accrued to Apple, Microsoft, and others. But the world got the technology.
Asked directly about modern equivalents to Bell Labs and PARC, Yann LeCun — who worked at Bell Labs, interned at Xerox PARC, and spent over a decade building Meta's fundamental AI research lab — pointed to Meta's FAIR, Google DeepMind, and Microsoft Research. He said this in October 2024. By November 2025, he had left Meta, driven out by exactly the forces this article is about.
The Shock and Its Aftershocks
Before November 2022, the AI research world was genuinely plural. Academic labs, industrial research divisions, and a range of well-funded startups were pursuing different approaches — reinforcement learning, symbolic AI hybrids, world models, neuromorphic architectures — with real diversity of vision. The field was competitive but intellectually heterogeneous.
ChatGPT's success collapsed that plurality. Within roughly eighteen months, capital, talent, and institutional attention all funneled toward a single paradigm: scale transformer-based large language models, build the infrastructure to run them, ship products. Google, which had invented the transformer architecture in 2017, was caught flat-footed and scrambled. Meta pivoted its AI strategy around LLMs. Microsoft integrated OpenAI's models into its core products. A hundred startups raised money to build on top of the new foundation models. The venture capital flowing into AI, measured as a share of total U.S. deal value, went from 23% in 2023 to nearly two-thirds in the first half of 2025.
The infrastructure investment that followed is staggering by any historical standard. The four largest hyperscalers — Amazon, Google, Microsoft, and Meta — are expected to spend more than $350 billion on capital expenditures in 2025 alone, most of it AI-related. UBS projects global AI capital expenditure reaching $1.3 trillion by 2030. The top five hyperscalers raised a record $108 billion in debt in 2025, more than three times the average of the previous nine years. OpenAI, which loses billions of dollars annually, has committed to spending $300 billion on computing infrastructure over five years while projecting only $13 billion in revenue for 2025.
The financial architecture has become genuinely strange. OpenAI holds a stake in AMD; Nvidia has invested $100 billion in OpenAI; Microsoft is a major shareholder in OpenAI and a major customer of CoreWeave, in which Nvidia also holds equity; Microsoft accounted for nearly 20% of Nvidia's revenue. These are not arm's-length market transactions. They are a daisy chain of mutually reinforcing valuations. A Yale analysis described OpenAI's web of relationships bluntly: "Is this like the Wild West, where anything goes to get the deal done?" The question of whether this constitutes a speculative bubble — tulip mania in a data center — is not academic. An MIT Media Lab report found that 95% of custom enterprise AI tools fail to produce measurable financial returns. The commercial success is real; the path from current AI to the transformative economic productivity being used to justify the valuations is not established.
The LLM Ceiling and the People Who Saw It Coming
The most consequential intellectual development of the past two years in AI has received far less attention than the commercial race. A growing number of the field's most distinguished researchers have concluded that large language models, however impressive, are not on the path to general intelligence — and that the current paradigm will hit a ceiling before it reaches the goals its proponents have claimed for it.
Thursday, March 19, 2026
Brave New World: Notes on the next 30 years in AI [Work in Progress]
You may or may not be wondering why so many tweets have been showing up on New Savanna recently. There’s a good reason: I’ve been thinking. These days, more often than not, that means interacting with either ChatGPT or Claude, and often both in one day. I copy these interactions to Word documents, which I save to my hard drive, and then promptly forget which topic is in which document and where on my hard drive it lives. FWIW, these discussions can ramble all over the place, which is fine. That’s how I think. But keeping track of it all is difficult.
Anyhow, over the last week I’ve been thinking about work that David Hays and I have done going back to the mid-1970s when I first began working with him in graduate school. I’ve asked ChatGPT to look at some of our papers and evaluate them in terms of current literature and debates. And so forth and so on. I’m also looking forward to next steps, for me, but also in general.
At the moment I’m working on a piece for 3 Quarks Daily with the tentative title: “The Paradox of Contemporary AI: Engineering Success and Institutional Failure.” Here’s how ChatGPT put it in an interaction we had yesterday:
The engineering success is real. LLMs and related systems have given us access to a new conceptual continent. They work, and at extraordinary scale. But the institutional failure lies in the monoculture: too much intellectual, financial, and training-path dependence on one family of architectures and one style of thought about intelligence. The result is that we are building out the utility before we have adequately explored the space of possible successor technologies or developed the conceptual tools needed to understand what these systems are revealing about language, cognition, and cultural structure.
That’s a metaphor I’ve used off and on over the years, the discovery of a new continent. My paradigm case is the European discovery of the Americas. Making it to the eastern shore is one thing. Then comes exploring and settling. Columbus set foot in the Caribbean in 1492. Lewis and Clark made it across North America to the Pacific Ocean in 1805. That’s over three centuries later. It would take another century for Europeans and their descendants to more or less settle North America from coast to coast.
Let’s say that ChatGPT in late 2022 is analogous to Columbus making it to shore in the New World. I don’t think it will take four centuries to settle this new world, nor even three centuries to measure its extent; intellectual discovery, engineering implementation, and commercial development move at a faster pace. But I do think there’s a lot of work, very exciting work, ahead of us.
Here’s a fragment of an interaction I had with ChatGPT 5.2 a couple of days ago. My prompt is first, then ChatGPT’s response.
* * * * *
Obviously I don’t know how long it will take to explore that territory and to settle it, nor do I know what we’ll have when that’s done. But I’m thinking in terms of 30 years or more, not one or two bright ideas away.
Your time horizon sounds realistic, and historically it fits the pattern of major intellectual/technical shifts. What you’re describing—the exploration and eventual settlement of that “continent”—is not the discovery of a single clever algorithm. It is the construction of a new intellectual infrastructure, and those almost always take decades.
Let me sketch why.
1. Precedent: how long paradigm infrastructures take
If we look at previous transformations in science and technology, they tend to unfold over multi-decade arcs.
Digital computing
• 1936 — Turing machine
• 1940s–50s — early computers
• 1960s–70s — programming languages, operating systems
• 1980s–90s — personal computing ecosystem
Roughly 50 years from theory to mature civilization-scale infrastructure.
Molecular biology
• 1953 — DNA structure
• 1960s — genetic code deciphered
• 1970s — recombinant DNA
• 1990s — genome sequencing
Again, roughly 40 years to transform biology.
Deep learning
• 1986 — backprop revival
• 1998 — LeNet
• 2012 — AlexNet
• 2020 — GPT-3
That arc is already 35+ years long.
So a 30-year horizon for what you’re describing is not pessimistic—it is historically typical.
2. Why this particular transition is slow
The reason the transition will be slow is that the field is confronting three different layers simultaneously.
1. Mathematics of distributed cognition
We still lack a clear mathematical theory of:
• representation in high-dimensional spaces
• compositionality in distributed systems
• interaction between symbolic and dynamical processes
Yevick’s work was an early attempt at that level.
But the theory is still incomplete.
2. Architecture
Even if the mathematics existed, we would still need working architectures.
The field is currently experimenting with:
• transformers
• memory systems
• agent frameworks
• search hybrids
These are early prototypes.
3. Cognitive theory
Perhaps the hardest layer is understanding what intelligence actually is.
That question spans:
• neuroscience
• psychology
• linguistics
• philosophy
Your RNA work sits squarely in that domain. [Relational Network over Attractors]
3. Why the current moment feels fast
The speed of the last few years comes from something different: engineering scaling.
Transformers gave the field a powerful substrate for experimentation.
So exploration is happening rapidly near the coastline.
But deeper theoretical consolidation will take longer.
Flatulating rhythm, Oh, those wacky Japanese!
In this video you shall become acquainted with ancient Japanese fart art.
— Klara (@klara_sjo) March 18, 2026
Please enjoy. pic.twitter.com/iiaTgbGg77
This is what happens when an open-ended research project is mis-perceived as commercial R&D
Nadella paid $650 million to acquihire Mustafa Suleyman and 70 Inflection employees in March 2024. The job: make Copilot the AI product that justifies Microsoft’s infrastructure bet. Two years later, Suleyman no longer runs Copilot.
— Aakash Gupta (@aakashgupta) March 19, 2026
The corporate framing is generous. “Freed up… https://t.co/nGo4DTa2Tl pic.twitter.com/W3I5PxgfFp
Wednesday, March 18, 2026
World models, some notes
World Models: The old, the new and the wishful #SundayHarangue
— Subbarao Kambhampati (కంభంపాటి సుబ్బారావు) (@rao2z) March 15, 2026
There is a lot of chatter about world models of late--even more than can be explained by Yann betting his entire new enterprise on it. I was going to comment on this clamor in my class this week, and thought I will… pic.twitter.com/22wWQDQdSw
Taking notes by hand is more effective than by laptop (?)
This is a 12-year-old study that has failed replication three times. And the underlying claim is still probably right.
— Aakash Gupta (@aakashgupta) March 18, 2026
The paper is Mueller and Oppenheimer, 2014. 67 students at Princeton. Longhand note-takers scored higher on conceptual questions. Became the most cited paper in… https://t.co/VXTNfQAuvt
Tuesday, March 17, 2026
Psychological Well-Being for Introverts (like me)
Dana G. Smith, Social Ties Help You Live Longer. What Does That Mean for Introverts? NYTimes, Oct. 9, 2025.
Considering all the research around socializing and longevity, some introverts can be forgiven for feeling doomed. People who have strong relationships generally live longer, and the unicorns known as “super-agers” — older adults who have the memory abilities of someone 20 years younger — tend to be especially outgoing. On the flip side, chronic loneliness raises the risk for cognitive decline and even early death.
But experts say it doesn’t take as much socializing to reap those longevity benefits as one might think, namely a few close ties and some everyday activities that facilitate contact with the wider world. It’s less about the sheer number of connections you have, and more about what those connections do for you.
In other words, introverts don’t need to be the life of the party to have a long and healthy life.
Our relationships contribute to health and longevity in a few critical ways: They provide emotional support, cognitive stimulation, care during times of crisis and motivation to have healthier habits. If your current relationships check those four boxes, you’re probably in pretty good shape. But if you’re missing one or two, it may be time to re-evaluate your social network.
Not everybody needs “the same amount of social activity,” said Dr. Ashwin Kotwal, an associate professor of medicine specializing in geriatrics at the University of California, San Francisco School of Medicine. “But getting some social activity is important.”
Meta-level Question: That article dates from October of 2025. So why did the Times serve it up to me in March of 2026? Is it serving that article up to everyone because it’s popular? Or am I getting it because I’ve got a social-media profile that says “introvert”? I can easily imagine that it’s the latter, but I don’t really know. Certainly anyone who actually reads my blog will figure out that I’m an introvert, though I can also imagine that it could be inferred more indirectly.
There’s more at the link.
Now you can run a 100B parameter LLM on your laptop
Holy shit... Microsoft open sourced an inference framework that runs a 100B parameter LLM on a single CPU.
— Nainsi Dwivedi (@NainsiDwiv50980) March 16, 2026
It's called BitNet. And it does what was supposed to be impossible.
No GPU. No cloud. No $10K hardware setup. Just your laptop running a 100-billion parameter model at… pic.twitter.com/hsEoNVw49V
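I can’t vouch for every claim in the tweet, but the basic trick is public: BitNet-style models quantize their weights to the three values -1, 0, and +1 (hence “1.58-bit”), so the big matrix products reduce to additions and subtractions and the weights take a fraction of the memory of ordinary models. Here’s a minimal numpy sketch of my own, loosely following the absmean quantization described in the BitNet b1.58 paper; it’s an illustration of the idea, not Microsoft’s inference code.

import numpy as np
# Toy sketch of why ternary ("1.58-bit") weights make CPU-only inference plausible:
# every weight is -1, 0, or +1, so the matrix product needs only additions and
# subtractions, and the weights take far less memory than fp16. My illustration,
# not Microsoft's BitNet code.
rng = np.random.default_rng(0)
d_in, d_out = 4096, 4096
W_fp16 = rng.standard_normal((d_out, d_in)).astype(np.float16)  # a dense layer
x = rng.standard_normal(d_in).astype(np.float32)                # an activation vector
# Absmean ternary quantization, roughly in the spirit of the BitNet b1.58 paper.
scale = float(np.abs(W_fp16).mean())
W_ternary = np.clip(np.round(W_fp16 / scale), -1, 1).astype(np.int8)
y_full = x @ W_fp16.astype(np.float32).T
y_ternary = scale * (x @ W_ternary.astype(np.float32).T)  # only +/- of activations, then one rescale
cos = np.dot(y_full, y_ternary) / (np.linalg.norm(y_full) * np.linalg.norm(y_ternary))
print("fp16 weight bytes:", W_fp16.nbytes)     # 33,554,432
print("int8 weight bytes:", W_ternary.nbytes)  # 16,777,216; real kernels pack ~1.58 bits per weight
print("output cosine similarity:", round(float(cos), 3))

The real framework adds custom CPU kernels and packs the ternary weights much more tightly than int8, but multiply-free arithmetic plus a small memory footprint is the core of why a laptop can even attempt a model that large.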
Monday, March 16, 2026
The brain's dopamine response to music peaks in the mid-teens
Your brain peaked musically somewhere around age 16. Everything since then has been a dopamine echo.
— Aakash Gupta (@aakashgupta) March 16, 2026
Between the ages of 12 and 22, the mesolimbic dopamine pathway, the same circuit that processes cocaine and sex, fires at levels in response to sound that it will never reach… https://t.co/QUxgPiRxps pic.twitter.com/HTsZKp7Ol3
Sunday, March 15, 2026
On the relevance of intellectual history for understanding present events (AI)
Jim Olds, The Chronology Problem, Mar. 12, 2026.
We are surprisingly bad at knowing when things began.
I’ve been thinking about this for a while, partly because I lived through several of the transitions we now misremember. In 1987, I used the Internet for early text-based email, file transfers, and reaching colleagues at other universities. In August of 1991, in the face of an impending direct hit of Hurricane Bob, I moved all of my image data from Woods Hole to NIH in Bethesda in a matter of minutes. This was entirely unremarkable at the time. And yet when I mention it today, people often look mildly startled, as if I’ve claimed to have owned a smartphone in 1987. In their minds, the Internet began sometime around 1994 or 1995, when the Web arrived and made it visible to everyone. Before that, apparently, there was nothing.
Olds then goes on to say more about the (deep) origins of the web, artificial intelligence, climate science, and economics. Here's what he had to say about AI:
The field of artificial intelligence may be the most dramatic case study in collective chronological confusion we have. Most people who interact with today’s language models and image generators believe they are witnessing something genuinely unprecedented — a technology that sprang into being sometime around 2017. What happened is more complicated and more interesting.
The mathematical foundations for neural networks were laid in 1943, when Warren McCulloch and Walter Pitts published a paper describing how neurons could, in principle, compute logical functions. Frank Rosenblatt simulated a working perceptron at the Cornell Aeronautical Laboratory in 1958 — a system that could learn from examples. The 1986 backpropagation paper by Rumelhart, Hinton, and Williams, which most practitioners treat as a founding document, was itself a rediscovery and refinement of ideas that had been circulating since the early 1970s. Yann LeCun was training convolutional neural networks to read handwritten digits for the U.S. Postal Service in 1989. The architecture underlying those systems is recognizably the ancestor of what powers modern computer vision.
None of this was secret. It was published, presented, and in some cases deployed in real systems. What happened instead was a kind of institutional forgetting, accelerated by two “AI winters” — periods when funding dried up, interest collapsed, and computer science turned its attention elsewhere. Researchers who had spent careers on neural approaches moved on or retired. Graduate students who might have built on their work were instead trained in other paradigms. When the hardware finally caught up with the ambitions of the 1980s, around 2012, the rediscovery felt like a revolution. In some ways, it was. But the conceptual foundations were not new, and the people who had laid them got less credit than they deserved, partly because so many of the field’s new practitioners didn’t know they existed.
The practical cost here is the same as elsewhere: repeated investment in problems that had already been partially solved, frameworks that were novel mainly to their authors, and a set of origin myths that flatter the present at the expense of the past. The deeper cost is that we don’t understand what was tried and discarded and why — which algorithms were abandoned for reasons of computational expense rather than theoretical inadequacy, and which might be worth revisiting now that the expense has fallen.
To Olds’s list I would add Miriam Yevick's 1975 paper, Holographic or Fourier logic, published in Pattern Recognition. Unfortunately that paper got lost, as it didn't fit into either cognitive science or artificial intelligence. What she proved was that for one class of visual objects, those with a complex geometry, neural networks provided the best computational regime, while for another class, those with simple geometry, symbolic computation was best. That has a direct bearing on the current debate over whether or not new architectures involving symbolic processing are necessary.
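Her distinction is easier to feel in code than in prose, so here is a toy contrast of my own devising; it illustrates the flavor of the result, not Yevick's 1975 formalism. A Fourier-domain matched filter recognizes an arbitrarily intricate pattern in one global operation, while a geometrically simple object like a square is captured more economically by a couple of symbolic predicates.

import numpy as np
# Toy contrast (my illustration, not Yevick's formalism): "holographic" recognition of a
# complex pattern via a Fourier-domain matched filter, versus a short symbolic description
# that suffices for a simple object.
rng = np.random.default_rng(1)
def matched_filter_score(scene, template):
    # Peak of the circular cross-correlation, computed in the Fourier domain.
    corr = np.fft.ifft2(np.fft.fft2(scene) * np.conj(np.fft.fft2(template))).real
    return corr.max() / np.sqrt((scene ** 2).sum() * (template ** 2).sum())
pattern = rng.standard_normal((64, 64))  # a dense texture with no compact verbal description
print("pattern vs. itself   :", round(float(matched_filter_score(pattern, pattern)), 3))  # ~1.0
print("unrelated vs. pattern:", round(float(matched_filter_score(rng.standard_normal((64, 64)), pattern)), 3))  # near 0
def is_square(side_lengths, angles_deg):
    # A geometrically simple object needs only a few predicates, no template.
    return len(set(side_lengths)) == 1 and all(a == 90 for a in angles_deg)
print("symbolic test, square:", is_square([10, 10, 10, 10], [90, 90, 90, 90]))

Which regime is more economical depends on the geometry of the object, which is why the result still bears on today's neuro-symbolic debates.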
Saturday, March 14, 2026
What electrochemical machine has 100 trillion connections in a volume the size of a cantaloupe?
That one neuron connects to about 7,000 others. Your brain has 86 billion of them. Do the math and you get somewhere around 100 trillion connections inside your head. More connections than stars in 1,500 galaxies.
— Anish Moonka (@AnishA_Moonka) March 14, 2026
And each connection point is way more complicated than anyone… https://t.co/sUkcS7T3rA
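FWIW, a back-of-envelope check on the tweet’s figures (my arithmetic, using the numbers in the tweet plus the commonly cited estimates of total synapse count):

neurons = 86e9               # the tweet's neuron count
synapses_per_neuron = 7_000  # the tweet's per-neuron figure
print(f"{neurons * synapses_per_neuron:.1e}")  # 6.0e+14, i.e. about 600 trillion connections
print(f"{1e14 / neurons:.0f}")                 # ~1163 synapses per neuron implied by a round 100 trillion

Either way the answer lands within the range usually quoted in the literature, roughly a hundred trillion to a quadrillion synapses, so if anything the headline number is conservative.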