Wednesday, January 19, 2022

It seems that AI has solved poker

Keith Romer, "How A.I. Conquered Poker," New York Times Magazine, January 18, 2022.

Von Neumann:

Using his own simplified version of the game, in which two players were randomly “dealt” secret numbers and then asked to make bets of a predetermined size on whose number was higher, von Neumann derived the basis for an optimal strategy. Players should bet large both with their very best hands and, as bluffs, with some definable percentage of their very worst hands. (The percentage changed depending on the size of the bet relative to the size of the pot.) Von Neumann was able to demonstrate that by bluffing and calling at mathematically precise frequencies, players would do no worse than break even in the long run, even if they provided their opponents with an exact description of their strategy. And, if their opponents deployed any strategy against them other than the perfect one von Neumann had described, those opponents were guaranteed to lose, given a large enough sample.
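To make the parenthetical about bet size concrete, here is a minimal sketch of the textbook indifference calculation for a one-bet toy game. It is not von Neumann's own derivation, which the article does not reproduce. The function names `bluff_fraction` and `call_frequency` are my own, and the setup is an assumption: the bettor holds either a sure winner or a sure loser, and the caller holds a hand that beats only bluffs.

```python
from fractions import Fraction


def bluff_fraction(pot, bet):
    """Share of the bettor's bets that should be bluffs (hypothetical helper).

    The caller risks `bet` to win `pot + bet`. Calling breaks even when
    b * (pot + bet) == (1 - b) * bet, which gives b = bet / (pot + 2 * bet).
    Arguments must be integers (chip counts).
    """
    return Fraction(bet, pot + 2 * bet)


def call_frequency(pot, bet):
    """How often the caller should call so that bluffing breaks even.

    A bluff wins `pot` when it gets a fold and loses `bet` when called, so
    (1 - c) * pot == c * bet, which gives c = pot / (pot + bet).
    """
    return Fraction(pot, pot + bet)


if __name__ == "__main__":
    for bet in (50, 100, 200):
        print(bet, bluff_fraction(100, bet), call_frequency(100, bet))
```

Under these assumptions, a pot-sized bet should be a bluff one time in three and should be called half the time. Larger bets allow a larger share of bluffs and call for fewer calls.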

The early days:

Unlike in chess or backgammon, in which both players’ moves are clearly legible on the board, in poker a computer has to interpret its opponents’ bets despite never being certain what cards they hold. Neil Burch, a computer scientist who spent nearly two decades working on poker as a graduate student and researcher at Alberta before joining an artificial intelligence company called DeepMind, characterizes the team’s early attempts as pretty unsuccessful. “What we found was if you put a knowledgeable poker player in front of the computer and let them poke at it,” he says, the program got “crushed, absolutely smashed.”

Partly this was just a function of the difficulty of modeling all the decisions involved in playing a hand of poker. Game theorists use a diagram of a branching tree to represent the different ways a game can play out. [...] For even a simplified version of Texas Hold ’em, played “heads up” (i.e., between just two players) and with bets fixed at a predetermined size, a full game tree contains 316,000,000,000,000,000 branches. The tree for no-limit hold ’em, in which players can bet any amount, has even more than that. “It really does get truly enormous,” Burch says. “Like, larger than the number of atoms in the universe.”

Then:

At first, the Alberta group’s approach was to try to shrink the game to a more manageable scale — crudely bucketing hands together that were more or less alike, treating a pair of nines and a pair of tens, say, as if they were identical. But as the field of artificial intelligence grew more robust, and as the team’s algorithms became better tuned to the intricacies of poker, its programs began to improve. Crucial to this development was an algorithm called counterfactual regret minimization. Computer scientists tasked their machines with identifying poker’s optimal strategy by having the programs play against themselves billions of times and take note of which decisions in the game tree had been least profitable (the “regrets,” which the A.I. would learn to minimize in future iterations by making other, better choices). In 2015, the Alberta team announced its success by publishing an article in Science titled “Heads-Up Limit Hold’em Poker Is Solved.” [...]
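For a feel of how counterfactual regret minimization works, here is a small sketch on Kuhn poker, a standard three-card toy game (each player antes 1, gets one card, and may pass or bet 1). This is not the Alberta team's code. It is a chance-sampled variant in the style of common tutorials, and the names `Node`, `cfr`, and `train` are my own.

```python
import random

ACTIONS = "pb"  # p = pass (check or fold), b = bet (bet or call)


class Node:
    """Regret and strategy accumulators for one information set."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]

    def current_strategy(self):
        # Regret matching: weight each action by its positive accumulated regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        return [p / total for p in positive] if total > 0 else [0.5, 0.5]

    def average_strategy(self):
        # The average over all iterations is what approaches equilibrium.
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]


def cfr(nodes, cards, history, reach):
    """Return the expected payoff for the player to act.

    reach[i] is the probability that player i's own choices lead here.
    """
    player = len(history) % 2
    if len(history) >= 2:
        higher = cards[player] > cards[1 - player]
        if history == "pp":          # both checked: showdown for the antes
            return 1.0 if higher else -1.0
        if history[-1] == "p":       # opponent folded to a bet
            return 1.0
        if history[-2:] == "bb":     # bet and call: showdown for 2
            return 2.0 if higher else -2.0
    node = nodes.setdefault(str(cards[player]) + history, Node())
    strategy = node.current_strategy()
    action_value = [0.0, 0.0]
    for a in range(2):
        next_reach = list(reach)
        next_reach[player] *= strategy[a]
        # The game is zero-sum, so the opponent's payoff is negated.
        action_value[a] = -cfr(nodes, cards, history + ACTIONS[a], next_reach)
        node.strategy_sum[a] += reach[player] * strategy[a]
    node_value = sum(s * v for s, v in zip(strategy, action_value))
    for a in range(2):
        # Regret: how much better action a would have done, weighted by
        # the opponent's probability of reaching this point.
        node.regret_sum[a] += reach[1 - player] * (action_value[a] - node_value)
    return node_value


def train(iterations, seed=0):
    """Self-play for `iterations` random deals; returns the info-set table."""
    rng = random.Random(seed)
    nodes = {}
    cards = [1, 2, 3]
    for _ in range(iterations):
        rng.shuffle(cards)  # cards[0] goes to player 0, cards[1] to player 1
        cfr(nodes, cards, "", [1.0, 1.0])
    return nodes


if __name__ == "__main__":
    for key, node in sorted(train(20000).items()):
        print(key, [round(p, 2) for p in node.average_strategy()])
```

Each key is the acting player's card followed by the betting so far, and each pair is the average probability of passing and betting. As a sanity check, the learned strategy should almost always call a bet when holding the 3 and fold to a bet when holding the 1. The mixed frequencies at the other information sets are where the bluffing and bluff-catching show up.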

It quickly became clear that academics were not the only ones interested in computers’ ability to discover optimal strategy. One former member of the Alberta team, who asked me not to name him, citing confidentiality agreements with the software company that currently employs him, told me that he had been paid hundreds of thousands of dollars to help poker players develop software that would identify perfect play and to consult with programmers building bots that would be capable of defeating humans in online games. Players unable to front that kind of money didn’t have to wait long before gaining more affordable access to A.I.-based strategies. The same year that Science published the limit hold ’em article, a Polish computer programmer and former online poker player named Piotrek Lopusiewicz began selling the first version of his application PioSOLVER. For $249, players could download a program that approximated the solutions for the far more complicated no-limit version of the game. As of 2015, a practical actualization of John von Neumann’s mathematical proof was available to anyone with a powerful enough personal computer.

Still:

Koon is quick to point out that even with access to the solvers’ perfect strategy, poker remains an incredibly difficult game to play well. The emotional swings that come from winning or losing giant pots and the fatigue of 12-hour sessions remain the same challenges as always, but now top players have to put in significant work away from the tables to succeed. Like most top pros, Koon spends a good part of each week studying different situations that might arise, trying to understand the logic behind the programs’ choices. “Solvers can’t tell you why they do what they do — they just do it,” he says. “So now it’s on the poker player to figure out why.”

The best players are able to reverse-engineer the A.I.’s strategy and create heuristics that apply to hands and situations similar to the one they’re studying. Even so, they are working with immense amounts of information. When I suggested to Koon that it was like endlessly rereading a 10,000-page book in order to keep as much of it in his head as possible, he immediately corrected me: “100,000-page book. The game is so damn hard.”

And so:

Not every player I spoke to is happy about the way A.I.-based approaches have changed the poker landscape. For one thing, while the tactics employed in most lower-stakes games today look pretty similar to those in use before the advent of solvers, higher-stakes competition has become much tougher. As optimal strategy has become more widely understood, the advantage in skill the very best players once held over the merely quite good players has narrowed considerably. But for Doug Polk, who largely retired from poker in 2017 after winning tens of millions of dollars, the change solvers have wrought is more existential. “I feel like it kind of killed the soul of the game,” Polk says, changing poker “from who can be the most creative problem-solver to who can memorize the most stuff and apply it.”

Piotrek Lopusiewicz, the programmer behind PioSOLVER, counters by arguing that the new generation of A.I. tools is merely a continuation of a longer pattern of technological innovation in poker. Before the advent of solvers, top online players like Polk used software to collect data about their opponents’ past play and analyze it for potential weaknesses. “So now someone brought a bigger firearm to the arms race,” Lopusiewicz says, “and suddenly those guys who weren’t in a position to profit were like: ‘Oh, yeah, but we don’t really mean that arms race. We just want our tools, not the better tools.’”
