go
9×9 · chinese scoring · komi 5.5
Go is the canonical success story for Monte Carlo Tree Search. Unlike chess, where random playouts yield almost no useful evaluation, random Go games still correlate with the final territory split, giving MCTS enough signal to build a competent strategy from pure simulation.
The AI uses UCB1 selection with random rollouts to completion. A "don't fill own eyes" heuristic keeps rollouts from self-destructing. With ~5,000 iterations per move, it plays at a weak-intermediate level on the 9×9 board.
UCB1(i) = w̄ᵢ + c · √(ln N / nᵢ)
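In code, the selection rule is small. A minimal Python sketch, where the `wins`/`visits` node fields and the dict-based child representation are assumptions about the engine, not its actual data structures:

```python
import math

def ucb1(wins: float, visits: int, parent_visits: int, c: float = 1.414) -> float:
    """UCB1 score for one child: mean win rate plus exploration bonus."""
    if visits == 0:
        return float("inf")  # unvisited children are always tried first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select(children):
    """Pick the child with the highest UCB1 score."""
    parent_visits = sum(ch["visits"] for ch in children) or 1
    return max(children, key=lambda ch: ucb1(ch["wins"], ch["visits"], parent_visits))
```

The exploration constant c ≈ √2 is the textbook default; engines often tune it empirically.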
Chinese area scoring: each player's score = own stones on the board + empty points surrounded entirely by own stones. White receives 5.5 points komi to compensate for black's first-move advantage.
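The scoring pass can be sketched as a flood fill over empty regions. This assumes a plain 2D list with 'B', 'W', or None at each point, which is a hypothetical representation, not necessarily the engine's own board type:

```python
def area_score(board, komi=5.5):
    """Chinese area scoring: stones + empty regions bordered by one color only.

    Returns (black_score, white_score); white's total includes komi.
    """
    n = len(board)
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(n):
        for c in range(n):
            p = board[r][c]
            if p in score:
                score[p] += 1  # every stone on the board counts as area
            elif (r, c) not in seen:
                # flood-fill this empty region, recording bordering colors
                region, borders, stack = [], set(), [(r, c)]
                while stack:
                    rr, cc = stack.pop()
                    if (rr, cc) in seen:
                        continue
                    seen.add((rr, cc))
                    region.append((rr, cc))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = rr + dr, cc + dc
                        if 0 <= nr < n and 0 <= nc < n:
                            q = board[nr][nc]
                            if q is None:
                                stack.append((nr, nc))
                            else:
                                borders.add(q)
                # an empty region scores only if it touches exactly one color
                if len(borders) == 1:
                    score[borders.pop()] += len(region)
    return score["B"], score["W"] + komi
```

Empty regions touching both colors (dame) score for neither player, matching area-scoring conventions.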
references
Silver et al. "Mastering the game of Go with deep neural networks and tree search." Nature, 2016.
Gelly & Silver. "Combining online and offline knowledge in UCT." ICML, 2007.
how the search tree works
MCTS builds a game tree incrementally. Each iteration follows four phases:
1. Selection — descend the tree, choosing the child with the highest UCB1 score at each level. UCB1 balances win rate (exploitation) with visit count (exploration).
UCB1 = w̄ + c · √(ln N / n)
2. Expansion — at a node with untried moves, add one as a new leaf.
3. Simulation — play random moves from the new leaf until the game ends. In Go, random rollouts naturally carry meaningful signal about territory.
4. Backpropagation — propagate the win/loss result back up to the root, updating each ancestor's statistics.
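The four phases above can be sketched end to end. To keep the example self-contained and runnable, it plays a toy subtraction game rather than Go; `legal_moves`, `play`, and `winner` are stand-ins for the engine's own rules, and a real Go rollout would additionally skip moves that fill the player's own eyes:

```python
import math
import random

# --- toy game standing in for the Go engine (hypothetical interface) ---
# state = (pile, player): take 1-3 stones per turn, taking the last one wins
def legal_moves(state):
    pile, _ = state
    return [m for m in (1, 2, 3) if m <= pile]

def play(state, move):
    pile, player = state
    return (pile - move, -player)

def winner(state):
    pile, player = state
    return -player if pile == 0 else None  # previous mover took the last stone

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.untried = [], legal_moves(state)
        self.wins, self.visits = 0.0, 0

def mcts(root_state, iters=2000, c=1.414):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # 1. selection: descend by UCB1 while the node is fully expanded
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ch.wins / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. expansion: add one untried move as a new leaf
        if node.untried:
            move = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(play(node.state, move), parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. simulation: random playout to the end
        #    (a Go rollout would also reject moves filling the mover's own eyes)
        state = node.state
        while winner(state) is None:
            state = play(state, random.choice(legal_moves(state)))
        won = winner(state)
        # 4. backpropagation: update statistics along the path to the root
        while node:
            node.visits += 1
            # a win counts from the perspective of the player who chose node.move
            if node.parent and won == node.parent.state[1]:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

The final move is picked by visit count rather than win rate, which is the usual robust choice since visits accumulate only on lines the search kept confirming.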
The tree above shows the top branches after the AI's search. More visits = more confidence in that line of play.