Othello Bot

I've always enjoyed othello, so I figured it would be a great game to apply various ML algorithms to, from a simple greedy search algorithm to training a reinforcement learning (RL) agent.

Othello Rules

The game starts with four discs placed in the center of the board in a square pattern. Each player takes turns placing a disc of their color on the board. A move is valid if it captures at least one of the opponent's discs. Capturing is done by trapping one or more of the opponent's discs between the disc just placed and another disc of the current player's color. The captured discs are then flipped to the current player's color. The game ends when neither player can make a valid move. The player with the most discs of their color on the board at the end of the game wins.

Play the game

In the javascript game below you can play against the following five different algorithms of increasing complexity



Random Choice

As the name suggests this opponent uses random choices to select its move, don't expect any impressive play here.

Greedy

This greedy opponent chooses the move which flips the most tiles. Whilst tempting, this sort of short term thinking also won't lead to very impressive play.

Greedy Tuned

Like the previous greedy opponent this algorithm seeks to maximise the immediate board state. However unlike the previous greedy opponent the immediate board state is combined with additional bonuses for certain strategic tiles. This weights in favour of edges and corners, whilst down weighting tiles adjacent to edges.

Minimax

The minimax is a tree search algorithm that explores all possible sequences, up to some pre-defined depth. The name arises from the definition of the optimal move as the maximised value relative to the players choice and minimised relative to the opponent's actions. The depth here is set to 4, you will notice that even this relatively shallow search becomes very computationally demanding. More efficient searches exist with algorithms such as alpha-beta pruning and principal variation search.

Minimax Tuned

Like the previous minimax opponent this opponent uses the minimax algorithm to search for the optimal move. However unlike the previous minimax opponent the board value is not just the number of tiles owned, but also weighted by the tile values.

I am working to include more options to the online version, using agents trained with RL techniques such as Q-Learning.

Implementation

If you're curious to see how I implemented this game and the opponents you can take a look at the code here.