To generate game data for 40 games of 9×9 Go, storing board features in features.npy and move labels in labels.npy, run:
python generate_mcts_games.py -n 40 --board-out features.npy --move-out labels.npy
The convolutional network is built with max-pooling layers, rectified linear unit (ReLU) activations, dropout for regularization, a softmax activation on the final layer, and a cross-entropy loss. To train it, run:
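The architecture described above can be sketched in Keras as follows. The filter counts, dropout rate, and layer sizes here are illustrative assumptions, not necessarily the exact values used in mcts_go_cnn_simple.py:

```python
# Sketch of a small CNN for 9x9 Go move prediction.
# Hyperparameters (48 filters, 128 hidden units, 0.5 dropout) are assumptions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

board_size = 9
model = Sequential([
    # Convolution with ReLU activation over the one-plane board encoding.
    Conv2D(48, (3, 3), activation='relu', padding='same',
           input_shape=(board_size, board_size, 1)),
    MaxPooling2D((2, 2)),       # max pooling halves the spatial resolution
    Dropout(0.5),               # dropout for regularization
    Flatten(),
    Dense(128, activation='relu'),
    # Softmax over all 81 board points: one probability per candidate move.
    Dense(board_size * board_size, activation='softmax'),
])
# Cross-entropy loss pairs with the softmax output layer.
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
```

The softmax output assigns a probability to every board point, and categorical cross-entropy penalizes the network when it puts low probability on the move actually played.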
python ./dlgo/mcts_go_cnn_simple.py
This seems low, but note that the baseline for randomly guessing a move on a 9×9 board is only about 1.2% accuracy, so the network has learned something. Still, we can do much better than 8%. The accuracy is limited for this approach because the training data is generated by two bots playing essentially at random against each other: a move is only labeled as good when it leads to one bot winning a significantly higher share of games, which is a weak and noisy signal.
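The random-guess baseline quoted above follows directly from the board size, since a 9×9 board has 81 points and a uniform guess picks the right one with probability 1/81:

```python
# Random-guess baseline for move prediction on a 9x9 board:
# one correct point out of 81 candidates.
board_points = 9 * 9
baseline = 1 / board_points
print(f"{baseline:.1%}")  # prints 1.2%
```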
Since Go has many more degrees of freedom than simpler games, and generating data and making decisions takes a long time, it is important to use optimizations such as Zobrist hashing, along with pruning techniques such as alpha-beta pruning and position evaluation, to reduce both the number of positions searched and the time spent per game.
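Zobrist hashing makes position lookups cheap by assigning a random bitstring to every (point, player) pair and XOR-ing them together; placing or removing a stone updates the hash in O(1). A minimal sketch (the random table here is illustrative; dlgo ships its own precomputed table):

```python
import random

# Illustrative Zobrist table for a 9x9 board; the fixed seed is only
# for reproducibility of this sketch, not part of the real library.
random.seed(42)
BOARD_SIZE = 9
PLAYERS = ('black', 'white')
# One 64-bit random value per (row, col, player) combination.
TABLE = {
    (row, col, player): random.getrandbits(64)
    for row in range(BOARD_SIZE)
    for col in range(BOARD_SIZE)
    for player in PLAYERS
}
EMPTY_BOARD = 0

def apply_move(hash_value, row, col, player):
    """XOR in the stone's value; XOR-ing again removes it, so both
    placing and capturing a stone are O(1) hash updates."""
    return hash_value ^ TABLE[(row, col, player)]

# Placing and then removing the same stone restores the empty-board hash.
h = apply_move(EMPTY_BOARD, 2, 3, 'black')
assert apply_move(h, 2, 3, 'black') == EMPTY_BOARD
```

Because the hash changes incrementally with each move, previously evaluated positions can be cached and detected (e.g. for the ko rule) without rescanning the whole board.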
The general approach is inspired by the book "Deep Learning and the Game of Go" by Max Pumperla and Kevin Ferguson.