Based on Pluribus · Brown & Sandholm · Science 2019

Game-Theoretically Optimal Poker AI

A complete GTO poker solver built from the ground up, following the architecture of Pluribus, the AI that beat the world's best professionals. Open source. Browser-based. 75ms average decision time on CPU.

Avg decision: 75ms
Feature dims: 373
CFR stages: 5
Tests passing: 27
License: MIT
Architecture
Five stages. One solver.

Built progressively from foundational game theory through neural network approximation to real-time search — every algorithm readable and documented.

CFR Convergence — Live Demo
Nash equilibrium → EV = 0
King (strongest): 99%
Queen (mid): 82%
Jack (weakest): 34%
Nash EV: ≈ 0
Stage 1 + 2
🧮
Vanilla CFR + MCCFR
Counterfactual Regret Minimization on Leduc Hold'em: 216 information sets, converging to an approximate Nash equilibrium within 10,000 iterations (2.2 seconds). The external-sampling MCCFR variant runs 1.9× faster than full tree traversal.
Zinkevich et al. 2007
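The core update behind CFR is regret matching at each information set. The following is a minimal sketch, not the repository's code: the function names and the `cf_reach` parameter (the counterfactual reach probability of the node) are assumptions made for illustration.

```python
from typing import List


def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Play each action in proportion to its positive cumulative regret.

    Falls back to the uniform strategy when no action has positive regret.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


def update_regrets(
    cumulative_regrets: List[float],
    action_values: List[float],
    strategy: List[float],
    cf_reach: float = 1.0,  # hypothetical name: opponent and chance reach probability
) -> List[float]:
    """Add the counterfactual regret of each action to the running totals."""
    # Value of the node under the current strategy.
    node_value = sum(s * v for s, v in zip(strategy, action_values))
    # Regret of an action = how much better it does than the current mix.
    return [
        r + cf_reach * (v - node_value)
        for r, v in zip(cumulative_regrets, action_values)
    ]
```

Starting from zero regrets the strategy is uniform; after one update with action values `[1.0, -1.0]` all probability moves to the first action. In full CFR it is the average of these strategies over iterations, not the last one, that approaches equilibrium.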
Stage 3
🗂️
Card Abstraction via EMD
k-means clustering with Earth Mover's Distance over equity histograms, which captures the strategic difference between made hands and draws rather than just mean equity. Buckets: 8 preflop, 12 flop, 12 turn, 8 river.
EMD clustering
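To see why EMD separates hands that mean equity cannot, here is a small sketch of the one-dimensional case. It assumes both histograms use the same equally spaced equity bins and sum to the same total; the function name `emd_1d` is illustrative and not taken from the project.

```python
from itertools import accumulate
from typing import Sequence


def emd_1d(hist_a: Sequence[float], hist_b: Sequence[float]) -> float:
    """Earth Mover's Distance between two 1-D histograms, in units of bins.

    For equal-mass histograms on the same equally spaced bins, the EMD is
    the sum of absolute differences between the two cumulative distributions.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    cdf_a = accumulate(hist_a)
    cdf_b = accumulate(hist_b)
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


# A made hand: equity concentrated in the middle bin.
made_hand = [0.0, 0.0, 1.0, 0.0, 0.0]
# A draw: either misses or hits big. Same mean equity as the made hand.
draw = [0.5, 0.0, 0.0, 0.0, 0.5]

distance = emd_1d(made_hand, draw)  # 2.0 bins, although the means are equal
```

A clustering step built on this distance would place the two hands in different buckets, whereas clustering on mean equity alone would merge them.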
Stage 4
🧠
Deep CFR
Neural network approximation of counterfactual regret. Three-layer, 256-unit networks with LayerNorm; reservoir buffers that keep a uniform sample of all training data seen so far; and linear CFR weighting for 2× faster convergence.
Brown & Sandholm 2019
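The uniform sampling guarantee comes from reservoir sampling. Below is a minimal sketch of a fixed-capacity buffer using the classic Algorithm R; the class name and fields are assumptions for illustration and may not match the project's buffer.

```python
import random
from typing import Any, List, Optional


class ReservoirBuffer:
    """Fixed-size buffer holding a uniform random sample of everything added."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self.capacity = capacity
        self.items: List[Any] = []
        self.seen = 0  # total number of items ever offered to the buffer
        self.rng = random.Random(seed)

    def add(self, item: Any) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            # Buffer not full yet: always keep the item.
            self.items.append(item)
            return
        # Buffer full: keep the new item with probability capacity / seen,
        # overwriting a uniformly chosen slot.
        j = self.rng.randrange(self.seen)
        if j < self.capacity:
            self.items[j] = item


buffer = ReservoirBuffer(capacity=10, seed=0)
for sample_id in range(1000):
    buffer.add(sample_id)
```

After the loop every one of the 1,000 samples had the same 10/1000 chance of being in the buffer, so early iterations are not crowded out by later ones.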
Stage 5
Real-Time Subgame Search
Depth-limited MCCFR at decision time, using the blueprint as a leaf-node oracle. Blueprint bootstrapping stabilizes early search. 75ms average decision time on CPU — the Pluribus technique.
Pluribus technique
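The depth cutoff with a leaf oracle can be illustrated on a toy tree. This sketch only evaluates a fixed policy down to a depth limit and asks an oracle for the value of anything deeper; it does not run MCCFR, and `Node`, `depth_limited_value`, and `leaf_oracle` are hypothetical names rather than the project's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Node:
    payoff: Optional[float] = None  # set only at terminal nodes
    children: Dict[str, "Node"] = field(default_factory=dict)
    policy: Dict[str, float] = field(default_factory=dict)  # action -> probability


def depth_limited_value(
    node: Node, depth: int, leaf_oracle: Callable[[Node], float]
) -> float:
    """Expected value of `node`, looking at most `depth` actions ahead."""
    if node.payoff is not None:
        return node.payoff  # true terminal: exact payoff
    if depth == 0:
        return leaf_oracle(node)  # cutoff: defer to the blueprint-style estimate
    return sum(
        node.policy[action] * depth_limited_value(child, depth - 1, leaf_oracle)
        for action, child in node.children.items()
    )


# Toy tree: betting ends the hand, checking leads to one more decision.
after_check = Node(children={"call": Node(payoff=1.0)}, policy={"call": 1.0})
root = Node(
    children={"bet": Node(payoff=2.0), "check": after_check},
    policy={"bet": 0.5, "check": 0.5},
)

shallow = depth_limited_value(root, 1, lambda n: 0.5)  # oracle used at the cutoff
deep = depth_limited_value(root, 2, lambda n: 0.5)     # tree fully expanded
```

With depth 1 the unexplored branch is scored by the oracle; with depth 2 the search reaches the real payoffs. A real-time searcher trades off these two: deeper search is more accurate but slower, and the oracle keeps shallow search usable.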

Benchmarks
What the numbers show

Tournament evaluation across 300 duplicate hand pairs. Duplicate scoring controls for card luck — each deal played twice with agents swapping seats.

Matchup               mBB / hand   95% CI    Result
Blueprint vs Random   +28,403      ±5,789    ✓ Significant
Search vs Random      +28,134      ±5,686    ✓ Significant
Search vs Blueprint   +31,798      ±5,615    ✓ Significant

mBB = milli-big-blinds (thousandths of a big blind), reported per hand. The two Random matchups measure the margin over a random-action baseline. In the head-to-head matchup, real-time search outperforms blueprint-only play, the core result from Pluribus.
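For readers who want to reproduce this kind of table, here is one way duplicate scoring and a 95% interval could be computed. It is a sketch under stated assumptions: each deal's two plays are averaged into one observation, and the interval uses a normal approximation (1.96 standard errors). The evaluation script may compute its interval differently, and `duplicate_result` is an illustrative name.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def duplicate_result(pair_scores_bb: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (mean, 95% CI half-width) in mBB per hand.

    Each tuple holds one agent's winnings in big blinds for the same deal
    played from both seats. Averaging the two plays cancels most card luck.
    """
    # One observation per deal, converted from big blinds to milli-big-blinds.
    per_pair = [1000.0 * (first + second) / 2.0 for first, second in pair_scores_bb]
    average = mean(per_pair)
    # Normal-approximation interval on the mean of the paired observations.
    half_width = 1.96 * stdev(per_pair) / math.sqrt(len(per_pair))
    return average, half_width


def is_significant(average: float, half_width: float) -> bool:
    """A result is significant here if the interval excludes zero."""
    return abs(average) > half_width


# Made-up scores for two deals, purely to show the arithmetic.
avg, ci = duplicate_result([(1.0, 1.0), (3.0, 3.0)])
```

With these two made-up deals the mean is 2,000 mBB per hand with a half-width of about 1,960, so the interval just excludes zero. The numbers are illustrative only and are unrelated to the benchmark table.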


Quick Start
Up and running in minutes
terminal
# Clone and install
git clone https://github.com/griff-ui/poker-ai.git && cd poker-ai
pip install -r requirements.txt

# Stage 1+2: Leduc Hold'em CFR (~2 seconds)
python main.py --iterations 50000 --mode both

# Stage 4: Deep CFR training (~30 min on CPU)
python deep_cfr/run_convergence.py

# Stage 5: Tournament evaluation
python stage5/evaluate.py --hands 300
python — real-time search agent
from deep_cfr.game_engine import GameState, deal_hand
from deep_cfr.networks import DeepCFRPlayer, MAX_ACTIONS
from stage5.search import RealTimeAgent, SearchConfig, SearchMode

# One Deep CFR player per seat, restored from the final checkpoints
players = [DeepCFRPlayer(p, GameState.feature_dim(), MAX_ACTIONS) for p in range(2)]
for p in range(2):
    players[p].load(f'deep_cfr/checkpoints/player{p}_final')
    players[p].set_inference_mode()

# Wrap the blueprint players in a depth-limited real-time search agent
agent = RealTimeAgent(0, players, SearchConfig(mode=SearchMode.DEPTH_LIMITED))
action = agent.act(deal_hand())
print(f'GTO action: {action}')

Pricing
Two versions. One engine.

Choose the version that fits how you play or build.

PokerAI Pro — Commercial
$499 one-time
For developers, researchers, and teams building on top of the engine.

  • Everything in Individual
  • Pro developer interface with full diagnostics
  • Information set visualization
  • Feature vector and convergence charts
  • Configurable search depth and iterations
  • Commercial use license
  • API integration rights
  • Priority support
Get Commercial License →

Writing
From the build log
How I Built a Pluribus-Style Poker AI From Scratch
From vanilla CFR to real-time subgame search — a complete walkthrough of the architecture, the algorithms, the performance bottlenecks, and what actually made the difference. Every stage documented with working code.
Deep CFR · Card Abstraction · Real-Time Search · 2025