Crushing a top HUNL poker bot

At Ruse, we are proud to have developed the world’s strongest AI poker engine. It has beaten Slumbot, a superhuman poker bot and winner of the most recent Annual Computer Poker Competition, with the highest win rate ever recorded against it: a massive 19.4BB/100. To better understand the implications of this achievement, allow us to introduce AI poker agents and how they are evaluated.

Evaluating AI poker bots

Game theory optimal (GTO) solvers attempt to approximate a Nash equilibrium strategy. By following a Nash equilibrium strategy, you are guaranteed to not lose in expectation regardless of your opponent’s strategy. Nash distance, often referred to as exploitability, measures how close a given strategy is to the Nash equilibrium strategy. Without constraints, the game of Heads-Up No-Limit Hold’em consists of 10^160 decision points, which is more than the number of atoms in the universe (10^82). The immense size of the game makes it impossible to compute the Nash distance while starting from preflop and considering all bet sizes. Rather than attempting to simplify this calculation, poker bots can measure their performance head-to-head, competing against each other in a real poker match.
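To make exploitability concrete, here is a minimal sketch on a game small enough to solve by hand: rock-paper-scissors rather than poker. It measures how much a best-responding opponent wins per round against a fixed strategy. The payoff table and the function names are illustrative assumptions, not part of any poker solver.

```python
# Row player's payoff in rock-paper-scissors. The game is zero-sum,
# so the column player's payoff is the negative of each entry.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def best_response_value(strategy):
    """Highest expected payoff an opponent can reach against `strategy`.

    `strategy` is a list of three probabilities (rock, paper, scissors).
    The opponent tries each pure action and keeps the most profitable one.
    """
    return max(
        -sum(prob * PAYOFF[i][j] for i, prob in enumerate(strategy))
        for j in range(3)
    )


def exploitability(strategy):
    """Expected loss per round against a best-responding opponent.

    The value of rock-paper-scissors is 0, so the opponent's best-response
    value is exactly the amount by which `strategy` can be exploited.
    """
    return best_response_value(strategy)


uniform = [1 / 3, 1 / 3, 1 / 3]   # the Nash equilibrium of this game
rock_heavy = [0.5, 0.25, 0.25]    # a strategy that overplays rock

print(exploitability(uniform))
print(exploitability(rock_heavy))
```

The uniform strategy cannot be exploited, while the rock-heavy strategy loses 0.25 units per round to an opponent who always plays paper. In HUNL the same best-response computation would have to range over the full game tree, which is why it is intractable there.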

Slumbot

Developed by the independent researcher Eric Jackson, Slumbot is the most recent champion of the Annual Computer Poker Competition (ACPC). Originally founded by the University of Alberta and Carnegie Mellon and held annually from 2006 to 2018, the ACPC provided an open and international venue for benchmarking computer poker bots. Much like a solver, Slumbot attempts to play according to an approximate Nash equilibrium. It neither adapts its strategy nor attempts to exploit the errors of its opponents. At its core, the poker bot uses a variant of Counterfactual Regret Minimization (CFR), an algorithm for finding approximate equilibria that is also used in commercial solvers such as PioSOLVER. A specialist in the 200BB Heads-Up No-Limit Hold’em format, Slumbot treats similar combinations of cards as strategically equivalent and uses a betting abstraction, i.e. a restricted number of bet sizes when solving. Computing its strategy took about 250,000 core hours and 2 TB of RAM. During play, it follows this gigantic pre-computed solution and maps the observed action and bet size of its opponent to one or more nearby bet sizes within its abstraction.
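The idea of mapping an observed bet onto a betting abstraction can be sketched as follows. This is not Slumbot's actual mapping: the bet sizes, the function name `map_bet`, and the linear weighting between the two neighbouring sizes are all assumptions chosen for illustration.

```python
from bisect import bisect_left

# Hypothetical abstraction: bet sizes expressed as fractions of the pot.
ABSTRACTION = [0.25, 0.5, 1.0, 2.0]


def map_bet(observed, sizes):
    """Map an observed bet (fraction of the pot) onto abstraction sizes.

    Returns a dict {abstraction_size: weight}. Bets outside the covered
    range snap to the nearest endpoint. Bets between two sizes are split
    linearly between the two neighbours (an illustrative choice).
    """
    sizes = sorted(sizes)
    if observed <= sizes[0]:
        return {sizes[0]: 1.0}
    if observed >= sizes[-1]:
        return {sizes[-1]: 1.0}

    hi = bisect_left(sizes, observed)   # first size >= observed
    if sizes[hi] == observed:
        return {sizes[hi]: 1.0}         # exact match, nothing to interpolate
    lo = hi - 1
    weight_hi = (observed - sizes[lo]) / (sizes[hi] - sizes[lo])
    return {sizes[lo]: 1.0 - weight_hi, sizes[hi]: weight_hi}


print(map_bet(0.75, ABSTRACTION))  # a 75% pot bet falls between 0.5 and 1.0
print(map_bet(3.0, ABSTRACTION))   # an overbet beyond the largest size
```

A bot using such a mapping would then respond as if the opponent had made one of the returned bet sizes, chosen according to the weights. The coarser the abstraction, the more the response can deviate from what the true bet size calls for.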

Ruse

Using the latest advances in game theory and AI, we have developed Ruse, a general approach poker agent that converges to a Nash equilibrium, reaching superhuman-level performance. Contrary to Slumbot and traditional poker bots, Ruse does not compute and store a complete strategy prior to play. Instead, through deep reinforcement learning, it considers each particular situation as it arises during play and solves it in real time, in a matter of seconds. Much like any poker player, Ruse needed to train its poker knowledge and intuition, which it did by playing hundreds of millions of hands against successively better versions of itself without any human intervention. Starting from random play, Ruse gradually learned over the course of these matches which plays lead to the highest expected value. Ruse learned the optimal strategy for various game depths by encountering a wide range of scenarios, making it a general approach poker agent capable of solving games of any stack size. Thanks to its use of neural networks, Ruse is able to process all relevant information in just a few seconds, translating to a blazing fast acting time.

Results

To mitigate the effect of variance, Ruse played 150,000 hands against Slumbot, while adhering to the rules of the Annual Computer Poker Competition. These rules restrict Ruse’s average acting time to 7 seconds per hand and reset the stack size to 200BB after each hand. Despite being a general approach poker bot designed to solve games of any stack size, Ruse achieved the best win rate ever recorded against Slumbot in its format of expertise, an astounding 19.4BB/100, while respecting ACPC’s constraints. If the stakes of this match were $50/$100 with 200 hands played per hour (a relatively standard rate when playing online across multiple tables), Ruse would have won $19.40 per hand and about $3,880/hour.
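The dollar figures above follow from simple arithmetic on the win rate, sketched here. The function names are hypothetical, and the stakes and hands-per-hour figures are the ones quoted in the text.

```python
def winnings_per_hand(win_rate_bb_per_100, big_blind):
    """Convert a win rate in big blinds per 100 hands to currency per hand."""
    return win_rate_bb_per_100 / 100 * big_blind


def winnings_per_hour(win_rate_bb_per_100, big_blind, hands_per_hour):
    """Expected hourly winnings at a given playing speed."""
    return winnings_per_hand(win_rate_bb_per_100, big_blind) * hands_per_hour


WIN_RATE = 19.4        # BB/100, as reported for the match
BIG_BLIND = 100        # dollars, for $50/$100 stakes
HANDS_PER_HOUR = 200   # playing speed assumed in the text

per_hand = winnings_per_hand(WIN_RATE, BIG_BLIND)
per_hour = winnings_per_hour(WIN_RATE, BIG_BLIND, HANDS_PER_HOUR)

print(f"${per_hand:,.2f} per hand")
print(f"${per_hour:,.2f} per hour")
```

A win rate of 19.4BB/100 means 0.194 big blinds per hand on average, so at a $100 big blind each hand is worth $19.40, and 200 hands per hour gives $3,880.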

Figure: Ruse's chips won against Slumbot

Thanks to Slumbot’s public access and open API, other researchers were also able to benchmark their poker agents against it. That list includes:

  • A reimplementation of DeepStack, a 2017 AI poker bot developed by the University of Alberta, which claimed victory over elite human HUNL players
  • Supremus, a top AI poker bot co-developed by the former high-stakes poker player Bryan Pellegrino and used by the professional HUNL player Doug Polk in preparation for his challenge against Daniel Negreanu
  • ReBeL, a general approach poker bot developed by Noam Brown et al. in 2020, which achieved superhuman performance in HUNL, while using less domain knowledge than previous poker AIs

Here we report the head-to-head results of Ruse and other expert-level bots against Slumbot.

Head-to-head results against Slumbot. The ± shows one standard deviation.

A New Era in Training

By making the power of Ruse available through an intuitive web interface, we are bringing the next generation of poker study tools to the market and enabling poker professionals to gain a competitive edge over their opponents. Interested in competing against the best AI poker engine in the world, or in using the hand history of Ruse’s match against Slumbot for your own analysis or to create content? Reach out to us at info@ruse.ai.
