Blackjack ML Simulator — Strategy Engine in Python, Rust & ML
Overview
This project is a blackjack research engine built to settle betting-strategy arguments with data instead of folklore. It plays the game three ways — a readable Python object model, an embedded Electron GUI for watching hands deal in real time, and an ultra-fast Rust Monte-Carlo core that simulates 340 million hands in ~16 seconds (≈34M hands/sec on a laptop). Every hand’s full state is logged so two machine-learning models can be trained on top of the output.
The headline question: across 17 playing/betting strategies and 9 bankroll tiers from $500 to $25,000, which approach actually reaches a 5× goal before busting? The answer is counterintuitive, and the simulator proves it at a scale no amount of hand-play could.
Each implementation earns its place: the Python model is the spec — clear classes for Shoe, Hand, GameEngine, and pluggable strategies. The Rust port exists purely for throughput; generating hundreds of millions of labeled hands for ML is only practical at tens of millions of hands per second. The JavaScript demo below is a fourth, browser-native port of the same decision logic so you can feel the math yourself.
Try it — basic strategy + card-counting trainer
The trainer below is a faithful browser port of the engine’s BasicStrategy and DeviationStrategy (the Illustrious 18 count-based deviations), running against a 6-deck shoe with Hi-Lo counting — exactly the logic in strategies.py and core.py. Play a hand and watch the recommended action update; when the true count is extreme enough to flip a basic-strategy decision, the counter’s deviation lights up.
Leave the count running across several hands. When the true count climbs past +2 or +3, the deviation panel starts overriding basic strategy — and in the full simulator, that’s also when a real counter would be sizing their bets up. The whole edge of card counting lives in those moments.
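To make the running count and true count concrete, here is a minimal sketch of Hi-Lo bookkeeping. The tag values are the standard Hi-Lo ones; the names `HI_LO_TAGS` and `count_state` are illustrative and not taken from the repo, and representing seen cards as a list of rank strings is an assumption made for the example.

```python
# Standard Hi-Lo tags: low cards +1, 7-9 neutral, tens and aces -1.
HI_LO_TAGS = {
    **{rank: 1 for rank in ("2", "3", "4", "5", "6")},
    **{rank: 0 for rank in ("7", "8", "9")},
    **{rank: -1 for rank in ("10", "J", "Q", "K", "A")},
}


def count_state(seen_ranks, num_decks=6):
    """Return (running_count, true_count) after the given ranks have been dealt."""
    running = sum(HI_LO_TAGS[rank] for rank in seen_ranks)
    cards_left = num_decks * 52 - len(seen_ranks)
    # Floor the divisor at one deck so late-shoe values stay bounded.
    decks_left = max(1.0, cards_left / 52.0)
    return running, running / decks_left


running, true_count = count_state(["2", "5", "6", "K", "3", "4", "9", "5"])
print(running, round(true_count, 2))
```

Eight cards into a six-deck shoe, a running count of +5 still works out to a true count below +1, which is why deviations rarely fire early in the shoe.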
Architecture
| Layer | Tech | Role |
|---|---|---|
| Core model | Python (core.py, actors.py, engine.py) | Readable spec: Shoe, Hand, Dealer, GameEngine; H17 rules, splits, doubles, 3:2 blackjacks |
| Strategies | Python (strategies.py) | Pluggable ActionStrategy / BettingStrategy classes: basic, deviations, Hi-Lo & Ace-Five counting, Martingale, Kelly, Fibonacci, Oscar’s Grind, and more |
| Fast core | Rust (rust_sim/) | Multi-threaded Monte-Carlo via rayon; ~34M hands/sec; streams 26-field rows straight to CSV |
| GUI | Electron + Flask | Watch hands deal live; Chart.js bankroll curves for 5 archetype players |
| ML | scikit-learn | Action predictor (99.99% acc.) + bankroll-requirement predictor (gradient boosting) |
| This demo | Vanilla JS | Browser port of the decision engine (above) |
What the simulation found
Running 76,500 simulated players (17 strategies × 9 bankroll tiers × 500 each) over 340M hands, scored on whether each player 5×’d their bankroll before busting:
Best overall is COUNTER_MARTINGALE, which holds a remarkably flat success rate of roughly 15–18% (14.6% to 17.8%) across the bankroll tiers shown:
| Bankroll | 5× success | Bust |
|---|---|---|
| $500 | 17.6% | 82.4% |
| $1,000 | 14.6% | 85.4% |
| $5,000 | 17.0% | 83.0% |
| $10,000 | 17.8% | 82.2% |
| $25,000 | 15.8% | 84.2% |
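The scoring rule behind these tables (reach 5× the starting bankroll before busting) can be sketched with a toy Monte-Carlo loop. This is not the Rust simulator: it assumes flat even-money bets with a fixed win probability instead of real blackjack hands, and the $25 unit, `p_win=0.49`, and the function names are all assumptions chosen for illustration.

```python
import random


def reaches_goal(bankroll, unit, p_win, goal_multiple, rng):
    """Bet one unit per round until the goal is hit or a bet can no longer be covered."""
    target = bankroll * goal_multiple
    while unit <= bankroll < target:
        bankroll += unit if rng.random() < p_win else -unit
    return bankroll >= target


def success_rate(bankroll, players=500, unit=25.0, p_win=0.49, goal_multiple=5, seed=0):
    """Fraction of simulated players who reach the goal before busting."""
    rng = random.Random(seed)  # seeded so a run is repeatable
    wins = sum(
        reaches_goal(bankroll, unit, p_win, goal_multiple, rng)
        for _ in range(players)
    )
    return wins / players


for tier in (500, 1000):
    print(tier, success_rate(tier))
```

The real engine replaces the coin flip with full hands, splits, doubles, and 3:2 payouts, but the stop condition per player has the same shape.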
The counterintuitive part: more money ≠ better odds
For most strategies, a smaller bankroll reaches the 5× goal more often, because it can ride a lucky variance spike before the house edge grinds it down:
- BASIC_FLAT: 11.6% at $500 → 0% at $10,000+
- PAROLI: 17.6% at $500 → 0% at $15,000+
- OSCARS_GRIND: 15.2% at $500 → 0% at $25,000
The exception among these is KELLY_CRITERION, which holds nearly steady instead of collapsing as capital grows (10.8% at $25k vs 11.6% at $500): bet sizing that scales with the bankroll needs room to work.
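The bankroll effect has a textbook analogue in the gambler's-ruin formula. The sketch below assumes unit-sized even-money bets, a fixed win probability `p`, and the same table minimum at every bankroll tier; none of these is stated in the write-up, and real blackjack payouts are not strictly even-money, so treat it as intuition only. With loss-to-win ratio r = (1 - p) / p, the chance of reaching N units from i units before hitting zero is (1 - r^i) / (1 - r^N).

```python
def goal_probability(start_units, goal_units, p):
    """Gambler's-ruin chance of reaching goal_units before 0 with unit bets."""
    if p == 0.5:
        return start_units / goal_units
    r = (1 - p) / p
    if r < 1:
        return (1 - r ** start_units) / (1 - r ** goal_units)
    # For r > 1, rewrite with s = 1/r so large exponents shrink toward 0
    # instead of overflowing.
    s = 1 / r
    return s ** (goal_units - start_units) * (1 - s ** start_units) / (1 - s ** goal_units)


# Hypothetical $25 unit: a $500 bankroll is 20 units, a $10,000 bankroll is 400 units.
for bankroll in (500, 1_000, 10_000):
    units = bankroll // 25
    print(bankroll, goal_probability(units, 5 * units, p=0.49))
```

Under these assumptions, more starting units means more rounds for a negative edge to act before the 5× target is reached, so the probability falls toward zero as the bankroll grows.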
Strategies that never work (0% at every bankroll)
RANDOM_FLAT (random actions), DEALER_MIMIC (never doubles — leaves +EV on the table), and NEVER_BUST (stands on stiff hands vs high cards). The simulator quantifies exactly how much each bad habit costs.
This is a research and education project about probability, variance, and software design — not gambling advice. The clearest empirical lesson is that the overwhelming majority of outcomes, across every system tested, are busts. The house edge is real and patient.
The trained models
- Action predictor: learns basic strategy purely from logged hands. Inputs: hand value, soft flag, dealer upcard, true count. About 99.99% accuracy (15 misses in 100,000 test hands, i.e. 99.985%), and 12/12 on a hand-built basic-strategy spot check.
- Bankroll-requirement predictor — a gradient-boosting model estimating the probability of hitting 5× given a strategy and starting bankroll, used to rank strategies head-to-head.
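As a dependency-free illustration of what "learning basic strategy from logged hands" means, the sketch below fits a majority-vote lookup table instead of the scikit-learn model the project uses. It is a stand-in technique, not the author's estimator; the row layout `(hand_value, is_soft, dealer_upcard, action)` and the function names are assumptions, and the true-count input is left out for brevity.

```python
from collections import Counter, defaultdict


def fit_majority_table(rows):
    """Map each (hand_value, is_soft, dealer_upcard) state to its most common logged action."""
    votes = defaultdict(Counter)
    for hand_value, is_soft, dealer_upcard, action in rows:
        votes[(hand_value, is_soft, dealer_upcard)][action] += 1
    return {state: counts.most_common(1)[0][0] for state, counts in votes.items()}


def accuracy(table, rows, default="Hit"):
    """Share of rows whose logged action matches the table (unseen states use default)."""
    correct = sum(
        table.get((value, soft, upcard), default) == action
        for value, soft, upcard, action in rows
    )
    return correct / len(rows)


# Tiny hand-written log: hard 16 vs 10 is mostly hit, hard 12 vs 5 is stood.
train = [
    (16, False, 10, "Hit"), (16, False, 10, "Hit"), (16, False, 10, "Stand"),
    (12, False, 5, "Stand"), (12, False, 5, "Stand"),
]
table = fit_majority_table(train)
print(table[(16, False, 10)], accuracy(table, train))
```

A table like this cannot generalise to unseen states, which is one reason to prefer a fitted classifier once the count is added as a continuous input.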
Source
The full Python engine, Rust simulator, Electron GUI, and ML training scripts live in the repo. The two trained .pkl models and generated CSVs are gitignored (regenerate locally) to keep the repo lean.
strategies.py — playing strategies
import random
from enum import Enum


class Action(Enum):
    HIT = 'Hit'
    STAND = 'Stand'
    DOUBLE = 'Double'
    SPLIT = 'Split'
    SURRENDER = 'Surrender'


class ActionStrategy:
    @property
    def name(self) -> str:
        return self.__class__.__name__

    def get_action(self, player_hand, dealer_upcard_value: int, can_split: bool = False, can_double: bool = False) -> Action:
        raise NotImplementedError

class RandomStrategy(ActionStrategy):
    def get_action(self, player_hand, dealer_upcard_value: int, can_split: bool = False, can_double: bool = False) -> Action:
        # Choose uniformly among the actions that are legal right now
        actions = [Action.HIT, Action.STAND]
        if can_double:
            actions.append(Action.DOUBLE)
        if can_split:
            actions.append(Action.SPLIT)
        return random.choice(actions)

class BasicStrategy(ActionStrategy):
    def get_action(self, player_hand, dealer_upcard_value: int, can_split: bool = False, can_double: bool = False) -> Action:
        value = player_hand.value
        is_soft = player_hand.is_soft

        # Pairs: an ace counts as 11, so pair_value == 11 means A,A
        if can_split:
            pair_value = player_hand.cards[0].value
            if pair_value == 11 or pair_value == 8:
                return Action.SPLIT
            if pair_value in [2, 3, 7] and dealer_upcard_value <= 7:
                return Action.SPLIT
            if pair_value == 6 and dealer_upcard_value <= 6:
                return Action.SPLIT
            if pair_value == 4 and dealer_upcard_value in [5, 6]:
                return Action.SPLIT
            if pair_value == 9 and dealer_upcard_value not in [7, 10, 11]:
                return Action.SPLIT

        if is_soft:
            # Soft totals
            if value <= 17:
                if can_double and dealer_upcard_value in [5, 6]:
                    return Action.DOUBLE
                return Action.HIT
            elif value == 18:
                if can_double and dealer_upcard_value in [3, 4, 5, 6]:
                    return Action.DOUBLE
                elif dealer_upcard_value <= 8:
                    return Action.STAND
                else:
                    return Action.HIT
            else:
                return Action.STAND
        else:
            # Hard totals
            if value <= 8:
                return Action.HIT
            elif value == 9:
                return Action.DOUBLE if can_double and dealer_upcard_value in [3, 4, 5, 6] else Action.HIT
            elif value == 10:
                return Action.DOUBLE if can_double and dealer_upcard_value <= 9 else Action.HIT
            elif value == 11:
                return Action.DOUBLE if can_double and dealer_upcard_value <= 10 else Action.HIT
            elif value == 12:
                return Action.STAND if dealer_upcard_value in [4, 5, 6] else Action.HIT
            elif 13 <= value <= 16:
                return Action.STAND if dealer_upcard_value <= 6 else Action.HIT
            else:
                return Action.STAND

class DeviationStrategy(BasicStrategy):
    """
    Advanced card counting strategy with Illustrious 18 deviations.
    Adjusts basic strategy decisions based on the true count.
    """

    def __init__(self):
        self.true_count = 0  # Will be set before each decision

    def set_true_count(self, tc: float):
        self.true_count = tc

    def get_action(self, player_hand, dealer_upcard_value: int, can_split: bool = False, can_double: bool = False) -> Action:
        value = player_hand.value
        is_soft = player_hand.is_soft
        tc = self.true_count

        # === ILLUSTRIOUS 18 DEVIATIONS ===
        # Insurance: Take at TC >= +3 (handled elsewhere, but noted)
        # Note: these checks run before the pair-splitting logic, so a pair
        # such as 8,8 vs 10 is treated as a hard 16 here.

        # 16 vs 10: Stand at TC >= 0 (normally hit)
        if value == 16 and dealer_upcard_value == 10 and not is_soft:
            if tc >= 0:
                return Action.STAND
        # 15 vs 10: Stand at TC >= +4 (normally hit)
        if value == 15 and dealer_upcard_value == 10 and not is_soft:
            if tc >= 4:
                return Action.STAND
        # 10 vs 10: Double at TC >= +4 (normally hit)
        if value == 10 and dealer_upcard_value == 10 and can_double:
            if tc >= 4:
                return Action.DOUBLE
        # 12 vs 3: Stand at TC >= +2 (normally hit)
        if value == 12 and dealer_upcard_value == 3 and not is_soft:
            if tc >= 2:
                return Action.STAND
        # 12 vs 2: Stand at TC >= +3 (normally hit)
        if value == 12 and dealer_upcard_value == 2 and not is_soft:
            if tc >= 3:
                return Action.STAND
        # 11 vs Ace: Double at TC >= +1 (normally hit)
        if value == 11 and dealer_upcard_value == 11 and can_double:
            if tc >= 1:
                return Action.DOUBLE
        # 9 vs 2: Double at TC >= +1 (normally hit)
        if value == 9 and dealer_upcard_value == 2 and can_double:
            if tc >= 1:
                return Action.DOUBLE
        # 10 vs Ace: Double at TC >= +4 (normally hit)
        if value == 10 and dealer_upcard_value == 11 and can_double:
            if tc >= 4:
                return Action.DOUBLE
        # 9 vs 7: Double at TC >= +3 (normally hit)
        if value == 9 and dealer_upcard_value == 7 and can_double:
            if tc >= 3:
                return Action.DOUBLE
        # 16 vs 9: Stand at TC >= +5 (normally hit)
        if value == 16 and dealer_upcard_value == 9 and not is_soft:
            if tc >= 5:
                return Action.STAND
        # 13 vs 2: Hit at TC <= -1 (normally stand)
        if value == 13 and dealer_upcard_value == 2 and not is_soft:
            if tc <= -1:
                return Action.HIT
        # 12 vs 4: Hit at TC < 0 (normally stand)
        if value == 12 and dealer_upcard_value == 4 and not is_soft:
            if tc < 0:
                return Action.HIT
        # 12 vs 5: Hit at TC <= -2 (normally stand)
        if value == 12 and dealer_upcard_value == 5 and not is_soft:
            if tc <= -2:
                return Action.HIT
        # 12 vs 6: Hit at TC <= -1 (normally stand)
        if value == 12 and dealer_upcard_value == 6 and not is_soft:
            if tc <= -1:
                return Action.HIT
        # 13 vs 3: Hit at TC <= -2 (normally stand)
        if value == 13 and dealer_upcard_value == 3 and not is_soft:
            if tc <= -2:
                return Action.HIT

        # === FALL BACK TO BASIC STRATEGY ===
        # BasicStrategy.get_action holds the identical split / soft / hard rules
        return super().get_action(player_hand, dealer_upcard_value, can_split, can_double)

class DealerMimic(ActionStrategy):
    def get_action(self, player_hand, dealer_upcard_value: int, can_split: bool = False, can_double: bool = False) -> Action:
        # Play like an H17 dealer: hit below 17 and on soft 17, never double or split
        value = player_hand.value
        is_soft = player_hand.is_soft
        if value < 17 or (value == 17 and is_soft):
            return Action.HIT
        return Action.STAND


class NeverBust(ActionStrategy):
    def get_action(self, player_hand, dealer_upcard_value: int, can_split: bool = False, can_double: bool = False) -> Action:
        # Only hit when no single card can bust the hand
        if player_hand.value <= 11:
            return Action.HIT
        return Action.STAND


class ImperfectBasic(BasicStrategy):
    def get_action(self, player_hand, dealer_upcard_value: int, can_split: bool = False, can_double: bool = False) -> Action:
        # 10% of the time, make a random hit/stand decision instead of the correct play
        if random.random() < 0.1:
            return random.choice([Action.HIT, Action.STAND])
        return super().get_action(player_hand, dealer_upcard_value, can_split, can_double)

class BettingStrategy:
    @property
    def name(self) -> str:
        return self.__class__.__name__

    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        raise NotImplementedError


class FlatBetting(BettingStrategy):
    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        return min(min_bet, bankroll)

class CounterBetting(BettingStrategy):
    """
    Realistic card counting bet spread strategy.
    - Bets minimum when true count is at or below ramp_start (neutral/negative counts)
    - Aggressively ramps bets as true count increases
    - Uses configurable spread (default 1-12 units)
    """

    def __init__(self, unit_spread: int = 12, ramp_start: float = 1.0):
        self.unit_spread = unit_spread
        self.ramp_start = ramp_start

    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        unit = min_bet
        if true_count <= self.ramp_start:
            # Neutral or negative count: bet minimum
            multiplier = 1
        else:
            # Aggressive ramp: +2 units per whole true-count point above the threshold
            # With the defaults: TC 2 → 3x, TC 3 → 5x, TC 4 → 7x, TC 5 → 9x, TC 6 → 11x, TC 7+ → 12x
            excess = true_count - self.ramp_start
            multiplier = min(self.unit_spread, 1 + int(excess) * 2)
        return min(unit * multiplier, bankroll)

class MartingaleBetting(BettingStrategy):
    """
    Martingale betting system.
    - Start with minimum bet
    - Double bet after each loss
    - Return to minimum after a win
    - Classic "chase losses" strategy
    Warning: Can lead to very large bets after a losing streak.
    """

    def __init__(self):
        self.current_multiplier = 1
        self.last_outcome_was_loss = False

    def record_outcome(self, won: bool):
        """Called after each hand to track wins/losses."""
        if won:
            self.current_multiplier = 1  # Reset to minimum after win
        else:
            self.current_multiplier *= 2  # Double after loss

    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        bet = min_bet * self.current_multiplier
        return min(bet, bankroll)  # Can't bet more than bankroll

class ProCounterBetting(BettingStrategy):
    """
    Professional card counting betting strategy with Wonging.
    Features:
    - Wong out (sit out completely) when true count is below threshold
    - Gradual 1-12 unit spread following Kelly-inspired sizing
    - Bet increases only when edge justifies the risk
    - Returns 0 to signal sitting out the hand
    """

    def __init__(self, unit_spread: int = 12, wong_out_threshold: float = 1.0):
        self.unit_spread = unit_spread
        self.wong_out_threshold = wong_out_threshold  # Sit out when TC below this

    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        unit = min_bet
        # Wong out: return 0 to sit out this hand entirely
        # Only play at or above the threshold (TC >= 1 by default)
        if true_count < self.wong_out_threshold:
            return 0  # Signal to skip this hand

        # Gradual ramping based on true count
        # Edge is roughly (TC - 1) * 0.5%, so bet proportionally
        if true_count <= 2:
            multiplier = 2
        elif true_count <= 3:
            multiplier = 4
        elif true_count <= 4:
            multiplier = 6
        elif true_count <= 5:
            multiplier = 8
        elif true_count <= 6:
            multiplier = 10
        else:
            multiplier = self.unit_spread  # Max bet once TC exceeds 6
        return min(unit * multiplier, bankroll)

class ParoliBetting(BettingStrategy):
    def __init__(self):
        self.wins = 0

    def record_outcome(self, won: bool):
        # Double after each win; reset after three straight wins or any loss
        if won:
            self.wins += 1
            if self.wins >= 3:
                self.wins = 0
        else:
            self.wins = 0

    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        bet = min_bet * (2 ** self.wins)
        return min(bet, bankroll)


class System1326Betting(BettingStrategy):
    def __init__(self):
        self.wins = 0
        self.sequence = [1, 3, 2, 6]

    def record_outcome(self, won: bool):
        # Advance through the 1-3-2-6 sequence on wins; restart on a loss or after the last step
        if won:
            self.wins += 1
            if self.wins >= len(self.sequence):
                self.wins = 0
        else:
            self.wins = 0

    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        bet = min_bet * self.sequence[self.wins]
        return min(bet, bankroll)


class FibonacciBetting(BettingStrategy):
    def __init__(self):
        self.fib = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
        self.index = 0

    def record_outcome(self, won: bool):
        # Step back two places after a win, forward one after a loss (capped at the table's end)
        if won:
            self.index = max(0, self.index - 2)
        else:
            self.index = min(len(self.fib) - 1, self.index + 1)

    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        bet = min_bet * self.fib[self.index]
        return min(bet, bankroll)

class OscarsGrindBetting(BettingStrategy):
    def __init__(self):
        self.units = 1
        self.series_profit = 0.0
        self.last_bet = 0.0

    def record_outcome(self, won: bool):
        if won:
            self.series_profit += self.last_bet
            if self.series_profit >= 0:
                # series complete
                self.units = 1
                self.series_profit = 0.0
            else:
                self.units += 1
        else:
            self.series_profit -= self.last_bet
            # units stay the same on loss

    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        # Don't bet more than needed to recover
        if self.series_profit < 0:
            target_win_needed = abs(self.series_profit) + min_bet
            desired_bet = min(min_bet * self.units, target_win_needed)
        else:
            desired_bet = min_bet * self.units
        bet = min(desired_bet, bankroll)
        self.last_bet = bet  # remembered so record_outcome can update series_profit
        return bet

class KellyCriterionBetting(BettingStrategy):
    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        # edge = (true_count - 1) * 0.5%
        edge = max(0.0, (true_count - 1) * 0.005)
        # Assuming odds are 1:1, kelly fraction is edge / odds = edge
        # We'll use quarter-kelly for safety
        fraction = edge * 0.25
        bet = bankroll * fraction
        if bet < min_bet and true_count >= 1:
            bet = min_bet
        elif true_count < 1:
            bet = min_bet  # or 0 for wonging, but let's say min_bet
        return min(bet, bankroll)


class WongAggressiveBetting(BettingStrategy):
    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        # Sit out below TC -1, bet the minimum up to TC 1, then add 3 units per whole point (cap 15)
        if true_count < -1.0:
            return 0.0
        if true_count <= 1:
            return min(min_bet, bankroll)
        excess = true_count - 1.0
        multiplier = min(15, 1 + int(excess) * 3)
        return min(min_bet * multiplier, bankroll)


class AceFiveCounterBetting(BettingStrategy):
    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        # Uses the Ace-Five count passed via kwargs instead of the Hi-Lo true count
        ace_five = kwargs.get('ace_five_count', 0.0)
        if ace_five >= 2.0:
            multiplier = min(8, int(ace_five))
            return min(min_bet * multiplier, bankroll)
        return min(min_bet, bankroll)


class AggressiveCounterBetting(BettingStrategy):
    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        # Minimum bet at TC <= 0, then 4 extra units per whole point (cap 20)
        if true_count <= 0:
            return min(min_bet, bankroll)
        multiplier = min(20, 1 + int(true_count) * 4)
        return min(min_bet * multiplier, bankroll)
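The betting classes above share a small contract: `get_bet` is asked for a stake before each hand, and stateful systems also expose `record_outcome`. How the real engine drives them is not shown in this listing, so the loop below is a hypothetical sketch: `play_session` and its `hasattr` check are assumptions, the hands are reduced to even-money win/lose outcomes, and `MartingaleBetting` is a trimmed copy so the example runs on its own.

```python
class MartingaleBetting:
    """Trimmed copy of the listing's Martingale system."""

    def __init__(self):
        self.current_multiplier = 1

    def record_outcome(self, won: bool):
        # Reset after a win, double after a loss
        self.current_multiplier = 1 if won else self.current_multiplier * 2

    def get_bet(self, bankroll: float, min_bet: float, true_count: float, **kwargs) -> float:
        return min(min_bet * self.current_multiplier, bankroll)


def play_session(strategy, outcomes, bankroll=500.0, min_bet=25.0):
    """Hypothetical driver: settle a scripted list of win/lose outcomes at even money."""
    bets = []
    for won in outcomes:
        if bankroll <= 0:
            break  # busted
        bet = strategy.get_bet(bankroll, min_bet, true_count=0.0)
        bets.append(bet)
        bankroll += bet if won else -bet
        # Only progression systems track outcomes; flat and count-based ones do not
        if hasattr(strategy, "record_outcome"):
            strategy.record_outcome(won)
    return bets, bankroll


bets, final = play_session(MartingaleBetting(), [False, False, False, True, False])
print(bets, final)
```

Three losses followed by a win take the stake from 1 to 8 units before it resets, which is the "very large bets after a losing streak" behaviour the docstring warns about.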
core.py — shoe & counting
import random
from enum import Enum
from typing import List, Optional


class Suit(Enum):
    HEARTS = 'Hearts'
    DIAMONDS = 'Diamonds'
    CLUBS = 'Clubs'
    SPADES = 'Spades'


class Rank(Enum):
    TWO = '2'
    THREE = '3'
    FOUR = '4'
    FIVE = '5'
    SIX = '6'
    SEVEN = '7'
    EIGHT = '8'
    NINE = '9'
    TEN = '10'
    JACK = 'J'
    QUEEN = 'Q'
    KING = 'K'
    ACE = 'A'


class Card:
    def __init__(self, rank: Rank, suit: Suit):
        self.rank = rank
        self.suit = suit

    @property
    def value(self) -> int:
        if self.rank in (Rank.JACK, Rank.QUEEN, Rank.KING):
            return 10
        elif self.rank == Rank.ACE:
            return 11  # Default, hands handle soft/hard logic
        else:
            return int(self.rank.value)

    @property
    def hi_lo_value(self) -> int:
        # Hi-Lo tags: 2-6 count +1, 7-9 count 0, tens and aces count -1
        if self.rank in (Rank.TWO, Rank.THREE, Rank.FOUR, Rank.FIVE, Rank.SIX):
            return 1
        elif self.rank in (Rank.SEVEN, Rank.EIGHT, Rank.NINE):
            return 0
        else:
            return -1

    @property
    def ace_five_value(self) -> int:
        # Ace-Five tags: fives count +1, aces count -1, everything else 0
        if self.rank == Rank.ACE:
            return -1
        elif self.rank == Rank.FIVE:
            return 1
        return 0

    def __repr__(self):
        return f"{self.rank.value}{self.suit.value[0]}"

class Shoe:
    def __init__(self, num_decks: int = 6, penetration: float = 0.75):
        self.num_decks = num_decks
        self.penetration = penetration
        self.cards: List[Card] = []
        self.running_count = 0
        self.ace_five_count = 0
        self.build_shoe()
        self.shuffle()
        # Reshuffle once this many cards (or fewer) remain in the shoe
        self.cut_card_index = int(self.num_decks * 52 * (1 - self.penetration))

    def build_shoe(self):
        self.cards = []
        self.running_count = 0
        self.ace_five_count = 0
        for _ in range(self.num_decks):
            for suit in Suit:
                for rank in Rank:
                    self.cards.append(Card(rank, suit))

    def shuffle(self):
        random.shuffle(self.cards)
        self.running_count = 0
        self.ace_five_count = 0

    def deal(self) -> Optional[Card]:
        if not self.cards:
            return None
        card = self.cards.pop()
        # Update both counts as each card leaves the shoe
        self.running_count += card.hi_lo_value
        self.ace_five_count += card.ace_five_value
        return card

    def needs_reshuffle(self) -> bool:
        return len(self.cards) <= self.cut_card_index

    @property
    def decks_remaining(self) -> float:
        # Floored at one deck: avoids division by zero and caps late-shoe true counts
        return max(1.0, len(self.cards) / 52.0)

    @property
    def true_count(self) -> float:
        # Plain float division; no rounding to the nearest half deck
        return self.running_count / self.decks_remaining

    @property
    def ace_five_true_count(self) -> float:
        # decks_remaining is already floored at 1.0, so no extra small-divisor guard is needed
        return self.ace_five_count / self.decks_remaining
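A quick worked example of the shoe arithmetic may help. The two standalone functions below restate the `cut_card_index` and `true_count` formulas from the listing so they can be run without the classes; the function names and the sample running count of +6 are illustrative.

```python
def cut_card_index(num_decks=6, penetration=0.75):
    """Cards left in the shoe at the point where a reshuffle is triggered."""
    return int(num_decks * 52 * (1 - penetration))


def true_count(running_count, cards_left):
    """Running count per remaining deck, with the divisor floored at one deck."""
    decks_left = max(1.0, cards_left / 52.0)
    return running_count / decks_left


total_cards = 6 * 52
cut = cut_card_index()
print("cards dealt before the reshuffle check trips:", total_cards - cut)
print("true count at the cut card with running count +6:", true_count(6, cut))
print("same running count with half a deck left:", true_count(6, 26))
```

With the default six decks and 75% penetration, the one-deck floor only matters if a hand runs well past the cut card, so in practice the divisor stays at or above roughly 1.5 decks.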