go
9×9 · chinese scoring · komi 5.5
Go is the canonical success story for Monte Carlo Tree Search. Unlike chess, where random playouts yield almost no useful evaluation, random Go games still correlate with the final territory split, giving MCTS enough signal to build a competent strategy from pure simulation.
The AI uses UCB1 selection with random rollouts to completion. A "don't fill own eyes" heuristic keeps rollouts from self-destructing. With ~5,000 iterations per move, it plays at a weak-intermediate level on the 9×9 board.
UCB1(i) = w̄ᵢ + c · √(ln N / nᵢ)
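In code, the selection rule is small. A minimal Python sketch, where the `wins`/`visits` node fields and the dict-based child representation are assumptions about the engine, not its actual data structures:

```python
import math

def ucb1(wins: float, visits: int, parent_visits: int, c: float = 1.414) -> float:
    """UCB1 score for one child: mean win rate plus exploration bonus."""
    if visits == 0:
        return float("inf")  # unvisited children are always tried first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select(children):
    """Pick the child with the highest UCB1 score."""
    parent_visits = sum(ch["visits"] for ch in children) or 1
    return max(children, key=lambda ch: ucb1(ch["wins"], ch["visits"], parent_visits))
```

The exploration constant c ≈ √2 is the textbook default; engines often tune it empirically.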
Chinese area scoring: each player's score = own stones on the board + empty points surrounded entirely by own stones. White receives 5.5 points komi to compensate for black's first-move advantage.
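The scoring pass can be sketched as a flood fill over empty regions. This assumes a plain 2D list with 'B', 'W', or None at each point, which is a hypothetical representation, not necessarily the engine's own board type:

```python
def area_score(board, komi=5.5):
    """Chinese area scoring: stones + empty regions bordered by one color only.

    Returns (black_score, white_score); white's total includes komi.
    """
    n = len(board)
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(n):
        for c in range(n):
            p = board[r][c]
            if p in score:
                score[p] += 1  # every stone on the board counts as area
            elif (r, c) not in seen:
                # flood-fill this empty region, recording bordering colors
                region, borders, stack = [], set(), [(r, c)]
                while stack:
                    rr, cc = stack.pop()
                    if (rr, cc) in seen:
                        continue
                    seen.add((rr, cc))
                    region.append((rr, cc))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = rr + dr, cc + dc
                        if 0 <= nr < n and 0 <= nc < n:
                            q = board[nr][nc]
                            if q is None:
                                stack.append((nr, nc))
                            else:
                                borders.add(q)
                # an empty region scores only if it touches exactly one color
                if len(borders) == 1:
                    score[borders.pop()] += len(region)
    return score["B"], score["W"] + komi
```

Empty regions touching both colors (dame) score for neither player, matching area-scoring conventions.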
references
Silver et al. "Mastering the game of Go with deep neural networks and tree search." Nature, 2016.
Gelly & Silver. "Combining online and offline knowledge in UCT." ICML, 2007.
how the search tree works
MCTS builds a game tree incrementally. Each iteration follows four phases:
1. Selection — descend the tree, choosing the child with the highest UCB1 score at each level. UCB1 balances win rate (exploitation) with visit count (exploration).
UCB1 = w̄ + c · √(ln N / n)
2. Expansion — at a node with untried moves, add one as a new leaf.
3. Simulation — play random moves from the new leaf until the game ends. In Go, random rollouts naturally carry meaningful signal about territory.
4. Backpropagation — propagate the win/loss result back up to the root, updating each ancestor's statistics.
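The four phases above can be sketched end to end. To keep the example self-contained and runnable, it plays a toy subtraction game rather than Go; `legal_moves`, `play`, and `winner` are stand-ins for the engine's own rules, and a real Go rollout would additionally skip moves that fill the player's own eyes:

```python
import math
import random

# --- toy game standing in for the Go engine (hypothetical interface) ---
# state = (pile, player): take 1-3 stones per turn, taking the last one wins
def legal_moves(state):
    pile, _ = state
    return [m for m in (1, 2, 3) if m <= pile]

def play(state, move):
    pile, player = state
    return (pile - move, -player)

def winner(state):
    pile, player = state
    return -player if pile == 0 else None  # previous mover took the last stone

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.untried = [], legal_moves(state)
        self.wins, self.visits = 0.0, 0

def mcts(root_state, iters=2000, c=1.414):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # 1. selection: descend by UCB1 while the node is fully expanded
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ch.wins / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. expansion: add one untried move as a new leaf
        if node.untried:
            move = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(play(node.state, move), parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. simulation: random playout to the end
        #    (a Go rollout would also reject moves filling the mover's own eyes)
        state = node.state
        while winner(state) is None:
            state = play(state, random.choice(legal_moves(state)))
        won = winner(state)
        # 4. backpropagation: update statistics along the path to the root
        while node:
            node.visits += 1
            # a win counts from the perspective of the player who chose node.move
            if node.parent and won == node.parent.state[1]:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

The final move is picked by visit count rather than win rate, which is the usual robust choice since visits accumulate only on lines the search kept confirming.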
The tree above shows the top branches after the AI's search. More visits = more confidence in that line of play.