NEW SAVANNA
“You won't get a wild heroic ride to heaven on pretty little sounds.”– George Ives
Thursday, June 4, 2026
Does waning interest in the World Cup signal a thinning of genuine nationalist sentiment?
David Wallace-Wells, Why Does No One Care About the World Cup This Year? NYTimes, June 3, 2026.
They used to call the World Cup, unequivocally, the planet’s biggest sporting event. But it is about to start, right here in North America, and no one much seems to care. Thousands of tickets remain unsold, and just weeks ago, others were being resold well below their official price. [...] And I actually do think this might be telling us something, beyond the world of sports, about the global landscape of politics and culture.
In the States, the indifference might not be surprising, even though the event is being played mostly on U.S. soil. The U.S. team is more talented than in the past but hasn’t looked impressive for years. Soccer is still a growth sport rather than a dominant one in this country, and many Americans aren’t exactly feeling the flush of simplistic patriotism these days. On top of which, the tickets have been priced punishingly high.
What is more striking to me is the muted interest of the rest of the world, which every four years for decades seemed almost to pause for a month to engage in a truly global but appealingly low-stakes performance of tribal nationalism. [...]
What makes this shift so striking is that it has happened alongside a rising tide of political nationalism around the world, which you might think would produce a great surge in soccer nationalism, too. Instead, the age of global populism has coincided with intense interest in the biggest club teams — for-hire rosters assembled largely from international talent by megacorporations boasting jersey sponsorships from foreign conglomerates. [...] But no one could even pretend to illustrate the age of global populism by talking about the intensity of popular feeling about national teams.
That’s from the beginning of the article. After this, that, and the other, Wallace-Wells concludes:
Namely, that what we identify as nationalism in global affairs might be better described as a form of parochialism, with populists making particular claims not about the nation per se so much as the ways it should be reformed — presumably toward some reactionary ideal, its contours often more local than genuinely national. In this reading, globalization hasn’t just generated a backlash among those who resent deindustrialization, capital flight and the stateless lives of the world’s billionaires. It has also made the nation itself seem like a somewhat untrustworthy unit of political and social organization to many people on the right. For them, what might once have served as a source of patriotism and pride now produces feelings of resentment and regret. Not that liberals aren’t queasy about nationalism these days, either. For all of us, rooting for Arsenal or P.S.G. might now be more appealing precisely because it’s essentially meaningless.
Ed Zitron says AI is a losing bet – "doing exit liquidity for venture capital"
To go out publicly and say what he is saying, @edzitron has balls the size of the Las Vegas sphere pic.twitter.com/oHjv3LCvXh
— JustDario (@DarioCpx) June 4, 2026
Wednesday, June 3, 2026
Correcting Cowen’s misleading presentation of large language models [MR #10]
Surprise! There’s been a change of plans. The last time I’d posted about Cowen’s monograph on marginalism I figured I had one more (longish) blog post, one about the fourth and final chapter, “Why Marginalism Will Dwindle, and What Will Replace It?” But the more I thought about it, the longer and more convoluted it got. So I’ve decided to simplify things by writing three posts, each substantial, but focused, instead of a long rambling affair like the one I did on biology. So, I’m writing one post about large language models (this post), which Cowen brings up at the end of the chapter. Then I’m writing one about high dimensional models in economics, which Cowen introduces early in the chapter. My final post will be a general response to Cowen’s ideas about where this is all headed.
In this post I want to do three things: 1) First I’ll talk about the surprise nature of the success achieved by GPT-3 and then ChatGPT. 2) Then I will present three passages from Cowen’s text and comment on them. 3) Finally, I want to give a brief rundown of the tradition of statistical work that stands behind LLMs.
Surprise!
OpenAI released GPT-3 in 2020 to a limited audience of insiders, who recognized that it represented a breakthrough. This level of performance came as a surprise. No one predicted it. GPT-3 was scaled up from GPT-2, which was in turn scaled up from GPT-1, but no one was making explicit predictions about the level of performance to be achieved at each step. These were experiments: “Let’s try it and see what happens.” That’s fine. That’s a good way to make progress, to try things out and see what happens. But don’t mistake a lucky trial for genuine knowledge.
Cowen mentioned GPT-3 on Marginal Revolution on July 19, and then published a Bloomberg column on it on July 21, which he excerpted in Marginal Revolution the next day: “...think of GPT-3 as giving computers a facility with words that they have had with numbers for a long time, and with images since about 2012.” I published a working paper in August, GPT-3: Waterloo or Rubicon? Here be Dragons, in which I both acknowledged the breakthrough and cautioned against becoming too satisfied with the technology that occasioned the breakthrough.
Two and a half years later, in November of 2022, OpenAI released ChatGPT to the general public. It spread like wildfire. Now the proverbial everyone witnessed what only a small group had witnessed in the summer of 2020. The machine speaks. Sorta’. But more convincingly than any machine had spoken before and in a way that had unimaginable implications for the future.
A threshold HAS been crossed, but it is not, so far as I can see, a threshold in our understanding, either of AI or anything else. It is a threshold in performance along a continuous line of scientific understanding and engineering design and construction, something I have documented in some detail in a recent working paper, The Origins of LLMs. As far as I can tell, there has been no paradigm shift, in Thomas Kuhn’s sense, no rank shift, in terms of cognitive rank theory. There were no fundamentally new ideas in the world by, say, late July of 2020 as a consequence of consolidating GPT-3 and making it available in limited release.
“What about the scaling hypothesis,” you might ask. “Isn’t that new?” Ilya Sutskever first explored the idea in 2014. Rich Sutton’s famous 2019 essay, The Bitter Lesson, generated broad discussion about the issue. Then OpenAI published a paper in 2020 that cemented matters, “Scaling Laws for Neural Language Models.”
Given the nature of computing, scaling up is not trivial. Hundreds if not thousands of technical details need to be worked out as the size of the training corpus increases by factors of 10 or more, time after time, and as more and more GPUs are ganged together to assemble the computing power needed. The scaling hypothesis gave researchers a reason to expect improved performance with scaling, but without having to make fundamental breakthroughs in understanding, not of machine learning, artificial neural nets, and certainly not about language and cognition. Consequently our sense of possibility has expanded enormously. Our knowledge and deep understanding have remained the same and the scaling hypothesis made it easy to believe that that was just fine.
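For readers who want the scaling claim in symbols: the OpenAI paper cited above reports that test loss falls roughly as a power law in model size, dataset size, and compute. Schematically, for a model with N parameters, and with N_c and α_N standing for fitted constants whose values are not reproduced here:

```latex
L(N) \approx \left( \frac{N_c}{N} \right)^{\alpha_N}
```

A relation of this kind predicts loss on next-token prediction. It does not say which capabilities show up at a given loss, which is consistent with the point that each new level of performance was found by trial rather than predicted in advance.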
Passages from Cowen’s Text
Unfortunately Cowen seems to have bought this story. Not only that, but he doesn’t even acknowledge that there is considerable current debate about whether or not LLMs will be sufficient to achieve AGI (artificial general intelligence) when they are scaled up enough. The most visible opponent of this idea is Gary Marcus, a student of Steven Pinker, who argues that we need to incorporate insights and technology from “old school” symbolic computing (sometimes known as GOFAI, good old-fashioned AI). Marcus is certainly not alone; there are many others. But I don’t want to reprise that debate. I just want to mention that it exists and that Cowen completely ignores it.
What I would like to do in this section is quote some passages from his text and comment on them.
The Marginal Revolution: Rise and Decline, and the Pending AI Revolution, pp. 106-107:
Suffice to say, LLM construction has for the most part ignored linguists and philosophers, and that also means ignoring their intuitions. LLM construction also ignored a lot of people in the AI field who insisted neural nets were a dead end. Instead, in a relatively short number of years humans invented new ways of modeling language and reasoning through language. That research program has proven wildly successful, as we have much better models of language and reasoning than almost anyone had been expecting.
That first sentence is true, sorta’. It is also misleading. As I have documented in that working paper, The Origins of LLMs, this technology is based on a continuous line of statistical thinking that extends back to the 1950s (I take a brief look at this in the next section). It is the syntacticians, semanticists, and cognitive scientists who have been ignored. The second sentence is a bit of an exaggeration. AlexNet put neural nets firmly back on the agenda in 2012.
The big problem is Cowen’s use of “model” in the last two sentences. Large language models are not causal models like those economists use. They don’t tell us anything about how language and thought work. They are algorithmic models. They are about turning input into output; just how that is done is a mystery. Until we understand the internal operations of LLMs they tell us almost nothing about language and reasoning. They give a boost to the idea that some kind of statistical process is involved, but that’s it.
This situation is deeply paradoxical. These algorithmic models perform much better than the computer models created during the “classical” era of cognitive science, the 1960s and 1970s, models that were based on linguistic theory. We knew how those models worked. We don’t know how these models work. We have purchased performance at the cost of ignorance – a formulation I have from the late Martin Kay.
AI won't unfold in society as fast as the Silicon Valley pundits think it will [Tyler Cowen]
From YouTube:
Economist and author Tyler Cowen delivers a provocative keynote on how AI will reshape growth, work, status, and geopolitics. Mixing clear‑eyed realism with long‑run optimism, he argues that AI is both our “plan A” for avoiding fiscal crisis and a technology that will leave many people disoriented—and some high‑status winners of the old world worse off.
What’s in this video:
—Why AI will radically change jobs and status without causing mass unemployment
—Two big new job categories: running experiments and gathering data for AI
—The “human bottlenecks” that limit AI’s impact to ~2% → ~2.5% growth
—How AI could be “plan A” for stabilizing public debt and avoiding fiscal crisis
—Who gains and who loses: global poor and initiative‑takers vs. elite professionals
Cowen’s message: AI’s benefits are enormous—higher growth, longer lives, more opportunity for the poor—but they come with psychological, political, and institutional friction. If you work in or care about AI, you’re not just building products; you’re helping write the only credible plan for a sustainable and prosperous future.
Recorded live at Sana AI Summit 2026, New York, May 21st, 2026.
Tuesday, June 2, 2026
Mathematicians are concerned that exploitation by the AI industry threatens the long-term intellectual interests of the field
Siobhan Roberts, As A.I. Makes Strides in Mathematics, Mathematicians Urge Caution, NYTimes, June 2, 2026.
Mathematicians issue a declaration:
On Tuesday, a group of 16 mathematicians, in consultation with colleagues and math organizations worldwide, published the Leiden Declaration on Artificial Intelligence and Mathematics. It aims to “frame the conversation about future directions,” said Dame Ursula Martin, one of the authors, and a mathematician and computer scientist at Oxford.
This effort comes as A.I. models have been making headlines with successful results in research-level mathematics. In late May, OpenAI, the maker of ChatGPT, announced that one of its models had disproved a notable 80-year-old mathematics conjecture in the field of combinatorial geometry.
The conjecture is one of some 1,200 problems posed by the Hungarian mathematician Paul Erdos. While some of these “Erdos problems” are considered throwaway questions of narrow interest, others have proved influential and field shaping. Along with a research paper describing the proof, OpenAI released a companion paper by several independent mathematicians. Jacob Tsimerman of the University of Toronto, an expert in the adjacent subfield of number theory, commented: “This is a really impressive piece of work, and I would accept it for any journal without hesitation.”
Potential problems:
Among the potential threats that the Leiden Declaration authors articulate are accuracy and reliability: Journal editors are already complaining about a flood of plausible seeming A.I.-generated papers and proofs that have turned out to be incorrect, and in ways that are difficult for mathematicians to discern.
Perhaps most pointedly, the authors raise the question of whether the many A.I. companies tackling mathematics — major players such as OpenAI, Google DeepMind and Anthropic, or start-ups such as Harmonic, Math, Inc. and Axiom Math — are keeping the field’s best interests in mind. “Technology companies’ involvement in research,” they write, “raises the risk that research questions are prioritized and incentivized because of their amenability to A.I. methods and models, rather than their deeper significance to understanding.” In turn, they point out, this disadvantages researchers who choose not to use the technology, and those who do not have access to it.
For Rodrigo Ochigame, a historian and anthropologist of computing and artificial intelligence at Leiden University in the Netherlands, and one of the statement’s authors, the latest OpenAI proof illustrates why this sort of collective reckoning in the discipline is necessary. “The story follows the same pattern as many other announcements by commercial A.I. developers,” Dr. Ochigame said. “The A.I. model is proprietary and unavailable to anyone outside the company. We get a flashy promotional video, while basic information needed to assess the scientific meaning of the result is kept secret. The company disclosed nothing about the methods, human-written prompts, training data, or computational resources consumed.”
Much of the article consists of a videoconference and email dialog with Dr. Ochigame, Dr. Martin and mathematician Michael Harris of Columbia University:
MARTIN: What OpenAI has done is throw a great deal of resources at Erdos problems, and got lucky with this one. That’s remarkable, and impressed the experts. We are not told about the model’s failures. [...]
To think of mathematics in terms of precise and neatly stated problems, like high school exams or the list of Erdos problems, is to misunderstand and diminish what makes mathematics so powerful and significant. Mathematics is not just about solving problems — it is also the cultivation of ideas, understanding, judgment, and human insight.
HARRIS: The purpose, from my perspective, is to recover control of the narrative about the values and the goals of mathematics from the A.I. industry. Mathematicians are concerned that the values of the profession are being misrepresented, not intentionally but due to the media campaign on the part of the industry, which seems to want to promote the belief that they are in a position to transform mathematics — “the A.I. revolution in math,” as one headline put it not long ago. [...]
We want to affirm certain values that have characterized the profession: openness, honesty, giving credit where credit is due, sharing, transparency about methodologies, and access for independent verification of results.
An aspect of mathematics that is cherished by mathematicians is that it is one of few successful examples of a gift economy — that is to say, its economy is somehow an island of idealism in our society.
OCHIGAME: Several A.I. companies are investing in dedicated teams focusing on mathematics, using problems as benchmarks and publications as training data. They are training their models to prove theorems not because they want to advance mathematical knowledge, but because they hope that such training will improve the models’ reasoning abilities more generally. [...]
MARTIN: It’s important not to lose sight of the fact that what the A.I. companies are doing, what you can achieve with this technology, is absolutely extraordinary. I don’t think we’re challenging that. We’re challenging the framing, we’re challenging the behaviors around it.
I share the concern that these mathematicians express, that the commercial exploitation of mathematics is inimical to long-term research interests.
There's more at the link.
Splash! [Media Notes 183]
I’m pretty sure that I saw Splash when it appeared in theaters in 1984 but I certainly didn’t imagine that it would popularize “Madison” as a name for girls. The Wikipedia entry notes:
According to the Social Security Administration, the name Madison was the 216th most popular name in the United States for girls in 1990, the 29th most popular name for girls in 1995, and the third most popular name for girls in 2000. In 2005, the name cracked the top 50 most popular girls' names in the United Kingdom, and articles in British newspapers credit the film for the popularization.
In the movie “Madison” is the name taken by a mermaid, played by Daryl Hannah, when she emerges on land to attach herself to a forlorn Allen Bauer, played by Tom Hanks.
The first 10, 15, 20 minutes or so of the movie are about how this situation comes about, but let’s just take that as a given. This is a story about how a human male and a female mermaid meet, fall in love, and, why not? I’ll give the ending away. They swim away to, presumably, live happily ever after, under the sea.
I’m interested in the elaborate contraption that’s constructed around them. As far as I can tell that contraption exists to conceal an interpersonal problem that’s been kicking around for a long time, one identified by Sigmund Freud (e.g. “A Special Type of Choice of Object made by Men,” 1910), played out on stage by William Shakespeare [1], and that’s been kicking around in stories and poems since forever: Men have trouble dealing with the fact that women can be both sexual and loving, passionate and beloved. Splash deals with this by presenting us with a creature that’s both human and not human (i.e. a mermaid).
Young Allen Bauer is despondent because his girlfriend’s just moved out without giving him any inkling that she was going to do that. He just wants a woman he can love and marry and be happy with. That’s all he wants. True Romance.
And then this woman shows up. We know she’s really a mermaid, but he doesn’t. He picks her up at the police station – I know, you want to know how that came about, but it doesn’t really matter, it’s just staging – takes her home and she goes to bed with him. Simple as that. No teasing or pleading, nothing resembling courtship. As for just what happened in there, we get to imagine whatever we wish. But we do see her waken in the middle of the night, fill a bathtub with salt water and then luxuriate in it, tail and all. Allen then wakes up and follows her to that bathroom. He knocks on the door, wants to see her, but she tells him to keep out. He breaks down the door and she has barely enough time to transition back to human form.
I mean, she’s really a mermaid! And she’s only got six more days, until the full moon, and then she’s got to return home. But she doesn’t tell Allen where home is or what she is. But she does take the name “Madison.”
Then things get complicated, and painful – in several senses. I was all but squirming as I watched how Madison was treated in the laboratory. Allen isn’t the only man involved. There’s a bizarre scientist, Walter Kornbluth, who gets wind of all this and realizes, “Ah hah! I’ll bet she’s really a mermaid.” He’s seen her before. Don’t ask. He manages to douse her with water, at a dinner for the President of the USA (don’t ask), and she’s taken by Kornbluth’s rival scientist, locked up in a lab, and subjected to tests. When Kornbluth learns that she’s going to be dissected the next day, he decides to spring her and return her to Allen.
By this time Allen knows that he’s slept with and is in love with a mermaid. Now what? Well, she’s got to return to the sea or she’ll die. If he’s willing to leave with her and never return to dry land, that can happen. He decides, no. She dives into the water and starts swimming away. He changes his mind, jumps in after her, and they swim away as the credits roll.
At this point you may be thinking: “That’s crazy.” I know, and it’s even crazier. Read the Wikipedia plot summary (linked above), you’ll see. My point is that this elaborate contraption is a way of dealing with that problem that Freud named and analyzed, that men have this split image of women as both mothers and whores (if you will). Splash transforms that duality into humans and mermaids and erects an elaborate fantastical contraption to deal with it.
Given that you accept all that, I’ve got one problem with the movie. Allen should have stayed on the pier and let the mermaid go. That wouldn’t have given the audience the feel-good ending for which a movie like this is concocted, but it would have been a minimal way of acknowledging the preposterous nature of it all.
* * * * *
ADDENDUM [a day later]: Some things were bugging me, so I watched it again.
In the prelude, Allen, his parents, and his brother are on a boat tour off Cape Cod. His older brother Freddie is dropping coins on the deck near women so he can look up their skirts as he retrieves the coins. Allen is entranced by the water. He jumps in. Pandemonium on the boat.
He meets a young girl. They hold hands. He’s retrieved. She’s sad to see him leave. We see that she’s a mermaid.
On the one hand it adds little or nothing to the plot. But it establishes that Allen has formed some kind of link with this mermaid during his preadolescent childhood, and the contrast between his behavior and that of his older brother reinforces our sense of the innocence of that link. Note that this kind of childhood link occurs in a number of anime series, though I can’t recall any title names off the top of my head.
Then the movie abruptly shifts ahead to the present, where Allen is being hassled by a customer at the family’s produce business. Freddie comes wheeling in in his sports car and is ecstatic because Penthouse magazine printed his letter.
This and that, Allen’s drunk over his girlfriend leaving, so he decides to go to Cape Cod, where he met that girl/mermaid when he was eight. Now we see the crazy scientist...this, that, the other, and Allen’s bonked in the head by a rogue motorboat, he sinks...and the next we see him he’s lying on the beach. An adult mermaid is watching him. He speaks to her, she’s naked. She walks up to him and kisses him, and then she disappears into the sea. We see her swimming under water and she no longer has legs; she’s got a fishtail. And she’s spotted by that crazy scientist.
She finds Allen’s wallet, looks at the papers, swims to a wreck, finds a chart....and we’re back in Manhattan. Allen’s head is bandaged....naked mermaid at the Statue of Liberty...she’s arrested. He picks her up at the police station. He’s transfixed when he sees her. She kisses him. She’s all over him and he doesn’t (quite) know how to deal with her. He tries to leave for work. But returns and grapples with her. He leaves for work. She learns English from watching TV. Decides to go shopping.
And so forth and so on. I could go on and on with this, but this will have to do....for now.
* * * * *
[1] I analyze this dynamic in some detail in my essay, At the Edge of the Modern, or Why is Prospero Shakespeare's Greatest Creation? Journal of Social and Evolutionary Systems 21(3): 259-279, 1998, https://www.academia.edu/235334/At_the_Edge_of_the_Modern_or_Why_is_Prospero_Shakespeares_Greatest_Creation
Monday, June 1, 2026
Hollis Robbins on AI and Education
Episode page:
This week on The Hope Axis, Hollis Robbins joins me to talk about her dense career and what it actually means to build a life dedicated to the humanities today.
Hollis, a dear friend of Interintellect and one of the first hosts, is stepping back from her role as Dean of Humanities at the University of Utah to focus on writing three upcoming books, including one with the incredible title, “Do Not Go to College Unless.”
We get into her journey, and finding room for hope and leisure in the middle of it all. Hope you enjoy!
00:00 Intro
00:38 Introduction
02:50 College, dropping out, intellectual ideals and AI
08:00 The state of universities and colleges in the time of information abundance
15:39 Intellect and technology
29:38 Anti-AI and pro-intellectual discourse is missing the point
34:17 America's higher (mid) education doesn't compete with AI
43:23 Reinventing tutoring
48:42 How is Hollis Robbins surviving in the humanities?
54:37 The process of "dumbing down" of the American University
01:04:41 If a specialist won't be teaching you at the university - don't go there.
01:08:02 Intellectual labour in scale economy.
01:10:40 Final thoughts
Surprise! Why it was so easy for executives and VCs to hijack the AI revolution
This is a fragment from a longer post in my ongoing commentary on Tyler Cowen's recent monograph on the Marginal Revolution.
* * * * *
OpenAI released GPT-3 in 2020 to a limited audience of insiders, who recognized that it represented a breakthrough. This level of performance came as a surprise. No one predicted it. GPT-3 was scaled up from GPT-2, which was in turn scaled up from GPT-1, but no one was making explicit predictions about the level of performance to be achieved at each step. These were experiments: “Let’s try it and see what happens.” That’s fine. That’s a good way to make progress, to try things out and see what happens. But don’t mistake a lucky trial for genuine knowledge.
Cowen mentioned GPT-3 on Marginal Revolution on July 19, and then published a Bloomberg column on it on July 21, which he excerpted in Marginal Revolution the next day: “...think of GPT-3 as giving computers a facility with words that they have had with numbers for a long time, and with images since about 2012.” I published a working paper in August, GPT-3: Waterloo or Rubicon? Here be Dragons, which I’ll discuss a bit later.
Two and a half years later, in November of 2022, OpenAI released ChatGPT to the general public. It spread like wildfire. Now the proverbial everyone witnessed what only a small group had witnessed in the summer of 2020. The machine speaks. Sorta’. But more convincingly than any machine had spoken before and in a way that had unimaginable implications for the future.
A threshold HAS been crossed, but it is not, so far as I can see, a threshold in understanding. It is a threshold in performance along a continuous line of scientific understanding and engineering design and construction, something I have documented in some detail in a recent working paper, The Origins of LLMs. As far as I can tell, there has been no paradigm shift, in Thomas Kuhn’s sense, no rank shift, in terms of cognitive rank theory. There were no fundamentally new ideas in the world by, say, late July of 2020 as a consequence of consolidating GPT-3 and making it available in limited release.
“What about the scaling hypothesis,” you might ask. “Isn’t that new?” Ilya Sutskever first explored the idea in 2014. And Rich Sutton’s famous 2019 essay, The Bitter Lesson, generated broad discussion, and OpenAI published a paper in 2020 that cemented matters, “Scaling Laws for Neural Language Models.” Given the nature of computing, scaling up is not trivial. Hundreds if not thousands of technical details need to be worked out as the size of the training corpus increases by factors of 10 or more, time after time, and as more and more GPUs are ganged together to assemble the computing power needed. The scaling hypothesis gave researchers reason to expect improved performance with scaling, but there has been no gain in fundamental understanding, not of machine learning, artificial neural nets, and certainly not about language and cognition. Consequently our sense of possibility has expanded enormously. But our knowledge and deep understanding have remained the same, and the scaling hypothesis made it easy to believe that that was just fine.
Consequently our sense of possibility has expanded enormously, while our knowledge and deep understanding have remained the same. And that is what has allowed the field to be captured by businessmen, executives and venture capitalists, who have little understanding of or interest in the underlying conceptual issues. Scaling is something they understand.
Hype the dramatically increased performance and collect the cash. Purchase and deploy more resources now, reap far greater profits in a decade. Everything else is noise and friction.
But what if [the Dread] Gary Marcus and other critics are right? What if scaling LLMs is not adequate? What happens to all those investments then?
Dædalus has a special issue dedicated to AI – AI & Science: What Is the Future of Discovery?
Here's the blurb:
Continued progress in artificial intelligence, its expanding usefulness in science, and its contributions to landmark advances suggest that we may have entered a new era of AI for science.
The breakthroughs so far—such as predicting the structure of practically every known protein, with profound implications for our understanding of biology, health, and the treatment of disease—are notable not only for what was achieved but also how it was achieved and what that suggests for scientific progress.
This special double issue of Dædalus poses the question: What is the future of scientific discovery in this new age of AI?
Thirty-three scientists responded. Bringing perspectives from life sciences and medicine, cognitive science and neuroscience, the physical and earth sciences, chemistry and materials science, computer science, mathematics and the social sciences—they draw on their work at the frontier of AI and science.
The authors write with an eye to the future, not just the present. They explore what is being achieved and what possibilities lie ahead; examine AI’s limitations and efforts to move forward; and investigate the larger implications of AI-assisted science—on how science is done, the role of the scientist, and the scientific method, as well as the challenges and complexities involved.
The authors together exemplify a long-standing bidirectional relationship: AI advancing science, while science advances AI. Where that relationship will take us—a golden age of discovery? New scientist-machine collaborations? Autonomous labs? Discoveries without human understanding?—is a future we are only beginning to imagine, and one we must also shape if the beneficial possibilities are to be realized.
Learn from your own latents and not from tokens
Daniel J. Korchinski, Alessandro Favero, Matthieu Wyart, Learn from your own latents and not from tokens: A sample-complexity theory, arXiv:2605.27734v1 [cs.LG] 26 May 2026
Abstract: Generative models, from diffusion models to large language models, achieve remarkable performance but at a cost in training data orders of magnitude larger than what biological learners require. An alternative paradigm has emerged in which networks are trained to predict their own latent representations of related views or masked regions, as in data2vec and JEPA – an idea related to predictive-coding accounts of the cortex. Despite strong empirical results, the theoretical understanding of these methods remains limited. Central questions include: by how much does latent prediction actually improve data efficiency? Is there a benefit to stacking such methods into multi-scale hierarchies? We answer both using as data a tractable probabilistic context-free grammar that captures the compositional structure of natural language and images. Such a grammar generates strings of visible tokens by recursively applying production rules along a tree of hidden symbols of depth L. For such data, supervised or token-level SSL require a number of samples exponential in L to recover the latent tree; we prove that latent prediction achieves this with a number of samples constant in L, up to logarithmic factors. We confirm this bound with (i) a hierarchical clustering algorithm, (ii) an end-to-end neural network whose predictor-clusterer modules predict their own latents at each level via gradient descent, and (iii) the first sample-complexity analysis of data2vec, which we show implicitly performs hierarchical latent prediction. This suggests that explicit stacking such as H-JEPA is largely redundant.
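To make the data model in the abstract a little more concrete, here is a minimal sketch of the kind of generator it describes: visible tokens produced by recursively applying production rules down a tree of hidden symbols of depth L. This is not the authors' code. The branching factor of 2, the four-symbol vocabulary, the two rules per symbol, the uniform choice among rules, and the names `make_rules` and `sample_string` are all assumptions made for illustration.

```python
import random

def make_rules(num_symbols, rules_per_symbol, branching, rng):
    """Hypothetical grammar level: every symbol gets a few random
    productions, each one a tuple of `branching` child symbols."""
    return {
        s: [tuple(rng.randrange(num_symbols) for _ in range(branching))
            for _ in range(rules_per_symbol)]
        for s in range(num_symbols)
    }

def sample_string(rules_by_level, root, rng):
    """Expand the root symbol one level at a time. Symbols at the last
    level are the visible tokens; everything above them is latent."""
    layer = [root]
    for rules in rules_by_level:
        next_layer = []
        for symbol in layer:
            # Pick one production for this hidden symbol uniformly at random.
            next_layer.extend(rng.choice(rules[symbol]))
        layer = next_layer
    return layer

rng = random.Random(0)
DEPTH, BRANCHING, NUM_SYMBOLS = 3, 2, 4  # illustrative values only
grammar = [make_rules(NUM_SYMBOLS, 2, BRANCHING, rng) for _ in range(DEPTH)]
tokens = sample_string(grammar, root=0, rng=rng)
print(len(tokens), tokens)
```

Each string has BRANCHING ** DEPTH visible tokens, so the hidden tree that a learner has to recover grows quickly with depth. That is the setting in which the abstract contrasts token-level prediction with latent prediction.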














