Thursday, August 14, 2025

Nate Silver on AI and AI as superforecasters

Nate Silver on Life’s Mixed Strategies, Conversations with Tyler, August 13, 2025.

In his third appearance on Conversations with Tyler, Nate Silver looks back at past predictions, weighs how academic ideas such as expected utility theory fare in practice, and examines the world of sports through the lens of risk and prediction.

Tyler and Nate dive into expected utility theory and random Nash equilibria in poker, whether Silver’s tell-reading abilities transfer to real-world situations like NBA games, why academic writing has disappointed him, his move from atheism to agnosticism, the meta-rationality of risk-taking, electoral systems and their flaws, 2028 presidential candidates, why he thinks superforecasters will continue to outperform AI for the next decade, why more athletes haven’t come out as gay, redesigning the NBA, what mentors he needs now, the cultural and psychological peculiarities of Bay Area intellectual communities, why Canada can’t win a Stanley Cup, the politics of immigration in Europe and America, what he’ll work on next, and more.

Here's most of the discussion of AI:

COWEN: Now, speaking of predictions, a year ago, we talked about how long will it take AIs to be as good as human superforecasters? You made a prediction, where you said at least 10 to 15 years. Now, a year later, do you want to revisit that and revise?

SILVER: I think it’s probably about right. I would say, relative to a year ago, AI is about at the 40th percentile of progress I would’ve expected. I’d be curious what you would think.

COWEN: A year ago, I said two to three years. Right now, I’m going to say one to two years, which is the same prediction. I think you’re way too pessimistic in your timetable.

SILVER: It depends on how competitive the exercise is. If it’s like a —

COWEN: Like a Math Olympiad tournament. They just did gold medal performance. I said this last year. I said, “In a year, they’re going to do gold medal.” A year ago, they weren’t sure how many r’s were in the word strawberry. You don’t think on superforecasting, they can —

SILVER: I think it’s very different when you’re dealing with a static problem as compared to a dynamic system where the inputs are changing all the time. Currently, the large language models are very, very bad at poker. They’re not trained on poker data. I’m sure if you did train them . . .

There are these things called solvers that are trained on poker data that do very well, but they cannot quite impute the general patterns just from mediocre text data or an amateurish kind of hand analysis. If you probe them on why they’re bad, they’re like, “Yes, maybe it’s tough for us when you have a complicated, evolving game theory dynamic, and you have to develop exploitative strategies very quickly.”

If you have a computer solve a poker hand to get to one where there’s enough loss minimization, a very strong computer can take minutes, whereas a poker player makes those calculations implicitly in a handful of seconds, for example. I don’t know. I worry, with the Math Olympiad stuff — there’s a little bit of teaching to the test where, because you set this as the goal that a large language model should have, that therefore there’s a lot of prestige when you meet that goal, potentially —

COWEN: Isn’t teaching to the test what we should do, even with humans, in a sense? The test is what you think is important, and that’s what you ought to teach.

SILVER: Well, but the poker example, or chess — I think AI models are very poor at chess, from everything that I’ve heard, for example.

COWEN: Other AI models play chess great.

SILVER: Correct. This gets me to the question of, what’s it mean to be generally intelligent? We’ll probably have scaffolding of model on top of model, and you’ll now patch different things. Right now, you can’t really very effectively make a plane reservation using ChatGPT, but I’m sure if you dedicate a resource to that, then you have these agentic models now, or agent models are just creeping into the system, a little bit.

COWEN: Those will work in less than a year, I think. There’s an agentic model now from OpenAI.

SILVER: This is where I get to rough AGI versus superintelligence. I am less convinced that we’re going to have some intelligence explosion than I would have been maybe . . . I don’t think I was ever convinced of it, but this emergent superintelligence, where you train it on relatively simple data and it extrapolates beyond the data set. I think they do reason. Sometimes I think they’re quite smart, and I no longer am bashful about saying, “Oh, ChatGPT thinks this.” I used to avoid that term, think.

But there’s a big gap between approximate general intelligence for desk jobs, and then superintelligence on the one hand, or AGI for physical labor on the other hand. I think people are much too quick to make that leap. I think the Math Olympiad, in part because maybe the answers are somewhere latent in the training data, but even if they’re not, if you try to solve, what’s a Lucas critique? Whatever else, right? There’s a version of that, I think, for AI models.

COWEN: My intuition is that if you took five superforecasters and just had them write a five-page prompt for GPT-5, which will be out this summer, that we’d be there already. I don’t think it would be superintelligence. You could say it’s not AGI. But the human superforecasters, they’re not that impressive. They’re not Einsteins. They just have good methods, and they’re disciplined.

There's much more in the discussion.
