Professional Poker Players vs AI
Competing effectively in poker has long proven a difficult task for AI. Recently, advances in reinforcement learning have allowed AI bots to compete effectively in multiplayer settings — and even teach the best players some new tricks.
The game of poker has been a challenging problem in the field of Artificial Intelligence (AI) for years. While AI systems have had previous success at beating humans in games that are non-random and follow predefined rules (such as chess and Go), winning at a game of poker has proven to be more challenging because it requires reasoning based on hidden information.
Over the past two decades, we have seen steady progress in the ability of AI systems to play and win various forms of poker. This includes ‘DeepStack’ and ‘Libratus’, which were developed at the University of Alberta in Edmonton, Canada and Carnegie Mellon University in Pittsburgh, USA, respectively. These systems were effective; however, they were limited to settings involving only two players. Developing AI for multiplayer poker was widely recognized as the major remaining milestone.
Pluribus is an AI bot developed by Carnegie Mellon University in collaboration with Facebook's AI lab. Pluribus plays no-limit Texas hold 'em poker and is widely known as ‘the first bot to beat humans in a complex multiplayer competition’, signifying a key milestone in AI.
As with many recent AI-game breakthroughs, Pluribus used the machine learning paradigm of reinforcement learning to model and master multiplayer poker. Pluribus developed its strategy through self-play, in which the bot played against copies of itself, without any data from prior human or AI play used as input. In essence, Pluribus started by playing randomly, and gradually improved as it determined which actions, and which probability distribution over those actions, led to better outcomes against earlier versions of its strategy.
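The self-play loop can be sketched in miniature. The toy below is an illustrative assumption, not Pluribus's actual code or game: two copies of a regret-matching learner play rock-paper-scissors against each other. Starting from random play, each copy shifts probability toward the actions that would have scored better against its opponent's choices, and the average strategy drifts toward the uniform Nash equilibrium — the same self-improvement loop, in miniature.

```python
import random

ACTIONS = 3  # rock, paper, scissors
# PAYOFF[a][b]: payoff to the player choosing a against an opponent choosing b
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def strategy(regrets):
    """Turn accumulated positive regrets into a probability distribution."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1.0 / ACTIONS] * ACTIONS

random.seed(0)
regrets = [[0.0] * ACTIONS for _ in range(2)]       # one learner per seat
strategy_sum = [[0.0] * ACTIONS for _ in range(2)]  # for the average strategy

for _ in range(20000):
    strats = [strategy(r) for r in regrets]
    moves = [random.choices(range(ACTIONS), weights=s)[0] for s in strats]
    for p in range(2):
        opp = moves[1 - p]
        actual = PAYOFF[moves[p]][opp]
        for a in range(ACTIONS):
            # Regret: how much better action a would have done than the move played
            regrets[p][a] += PAYOFF[a][opp] - actual
            strategy_sum[p][a] += strats[p][a]

# The average strategy approaches the uniform Nash equilibrium [1/3, 1/3, 1/3]
avg = [s / sum(strategy_sum[0]) for s in strategy_sum[0]]
print([round(x, 2) for x in avg])
```

Note that it is the *average* strategy over all iterations, not the final one, that converges — a standard property of regret matching.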
Pluribus uses a technique called ‘abstraction’ during its decision-making process.
Abstraction is defined as: ‘the process of removing physical, spatial, or temporal details or attributes in the study of objects or systems to focus attention on details of greater importance’.
Abstraction is important to Pluribus’s decision-making process because multiplayer Texas hold 'em poker has far too many decision points to reason about individually. This process groups similar actions and situations together, while other, lesser decisions are eliminated, which reduces the scope of each decision. Pluribus’s current version has two types of abstraction embedded into its decision-making process: action abstraction, which limits the bet sizes and actions the bot considers, and information abstraction, which groups strategically similar hands and board situations together.
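Action abstraction can be illustrated with a small sketch (hypothetical bucket sizes, not Pluribus's actual ones): rather than reasoning about every possible chip amount, the bot maps a raw bet onto a few representative sizes expressed as fractions of the pot.

```python
# Assumed illustrative buckets: half pot, full pot, and a 2x-pot overbet
BET_BUCKETS = [0.5, 1.0, 2.0]

def abstract_bet(bet, pot):
    """Map a raw bet to the nearest abstract bucket (fraction of the pot)."""
    fraction = bet / pot
    return min(BET_BUCKETS, key=lambda b: abs(b - fraction))

# A 120-chip bet into a 100-chip pot is treated as a pot-size bet
print(abstract_bet(120, 100))  # 1.0
```

Grouping bets this way collapses thousands of distinct actions into three, which is what makes the decision tree small enough to reason about.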
Pluribus also uses a version of iterative Monte Carlo counterfactual regret minimization. A simplified explanation of this decision-making technique: on each iteration, the algorithm simulates a hand, then looks back at each decision point and asks how much better or worse each alternative action would have fared than the action actually taken. This difference — the ‘regret’ for that alternative — is accumulated over iterations, and actions with higher accumulated regret are chosen more often in the future.
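A single iteration of this Monte Carlo idea can be sketched as follows. Everything here is an illustrative assumption (the action names, the stand-in value function, and the uniform current strategy), not Pluribus's implementation: each action's value is estimated from a noisy sampled playout, and regret accumulates for the alternatives relative to the action actually chosen.

```python
import random

random.seed(1)

def sample_value(action):
    # Stand-in for playing out the rest of the hand after `action`;
    # in a real solver this value comes from traversing the game tree.
    assumed_means = {"fold": 0.0, "call": 0.3, "raise": 0.5}
    return assumed_means[action] + random.uniform(-1, 1)  # noisy sample

actions = ["fold", "call", "raise"]
regret = {a: 0.0 for a in actions}

for _ in range(5000):
    values = {a: sample_value(a) for a in actions}
    chosen = random.choice(actions)  # current strategy (uniform, for the sketch)
    for a in actions:
        # Accumulate how much better a would have done than the chosen action
        regret[a] += values[a] - values[chosen]

# Actions with higher accumulated regret get more probability next iteration
print(max(regret, key=regret.get))  # "raise"
```

Despite the sampling noise, the accumulated regrets recover the true ordering of the actions, which is why the strategy improves over many iterations.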
Over time, this process allows Pluribus to adapt to the strategies employed by each opponent, which it uses to defeat them.
In conclusion, Pluribus represents a key AI breakthrough in recent times. Although developed and implemented specifically for poker, the general techniques it employs (reinforcement learning, abstraction, iterative Monte Carlo methods, etc.) can be applied in many other settings. Notably, professional poker players (including Darren Elias, who holds the most World Poker Tour titles) have used lessons learned from playing against, and being defeated by, Pluribus to improve their own strategies.