Sudoku is a popular number puzzle that requires you to fill blanks in a 9x9 grid with digits so that each column, each row, and each of the nine 3x3 subgrids contains all of the digits from 1 to 9. There have been various approaches to solving it, including computational ones. In this project, we show that simple convolutional neural networks have the potential to crack Sudoku without any rule-based postprocessing.
- NumPy >= 1.11.1
- TensorFlow == 1.1
- To see what Sudoku is, check the Wikipedia article.
- To investigate this task comprehensively, read through McGuire et al. 2013.
- 1M games were generated using `generate_sudoku.py` for training. I've uploaded them on the Kaggle dataset storage. They are available here. (A loading sketch follows this list.)
- 30 authentic games were collected from http://1sudoku.com.
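A minimal loading sketch in NumPy, assuming each game is stored as a pair of 81-character digit strings (quiz and solution, with 0 for blanks), one game per CSV row with a header; the project's actual loading logic lives in `data_load.py`:

```python
import csv

import numpy as np

def load_games(path):
    """Load (quiz, solution) pairs from a CSV of 81-character digit strings."""
    quizzes, solutions = [], []
    with open(path) as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for quiz, solution in reader:
            quizzes.append([int(c) for c in quiz])
            solutions.append([int(c) for c in solution])
    # reshape each flat 81-digit game into a 9x9 board
    return (np.array(quizzes).reshape(-1, 9, 9),
            np.array(solutions).reshape(-1, 9, 9))
```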
- 10 blocks of convolutional layers of kernel size 3 (a minimal sketch follows).
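A rough sketch of that architecture in TensorFlow 1.x; the filter count (512) is illustrative, and the real settings are in `hyperparams.py`. A final 1x1 convolution produces per-cell logits over the 9 digits:

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 9, 9, 1])  # the puzzle; blanks are 0
h = x
for i in range(10):  # 10 blocks of 3x3 convolutions, as described above
    h = tf.layers.conv2d(h, filters=512, kernel_size=3, padding="same",
                         activation=tf.nn.relu, name="conv%d" % i)
logits = tf.layers.conv2d(h, filters=9, kernel_size=1, padding="same")
preds = tf.argmax(logits, axis=-1) + 1  # [None, 9, 9]; digits are 1-9, not 0-8
```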
- `generate_sudoku.py` creates Sudoku games. You don't have to run this; instead, download the pre-generated games.
- `hyperparams.py` includes all adjustable hyperparameters.
- `data_load.py` loads data and puts it in queues so multiple mini-batches are generated in parallel.
- `modules.py` contains some wrapper functions.
- `train.py` is for training.
- `test.py` is for testing.
- STEP 1. Download and extract training data.
- STEP 2. Run `python train.py`, or download the pretrained file.
- Run `python test.py`.
Accuracy is defined as: Number of blanks where the prediction matched the solution / Number of blanks.
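In NumPy terms, that definition is simply (hypothetical helper, not part of the repo):

```python
import numpy as np

def blank_accuracy(puzzle, prediction, solution):
    """Accuracy over blanks only; all arrays are 9x9 ints, blanks are 0."""
    blanks = puzzle == 0
    return (prediction[blanks] == solution[blanks]).mean()
```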
// Improve CNN
- epoch nums
- filter size
- regularization
- dropout
- more training data (medium, hard, expert, evil)
// Search Problem
- Use the network's probability distributions in heuristic search (sketched below)
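One possible shape for that idea, as a sketch: always fill the blank the network is most confident about, and fall back to the next-most-probable digit when a choice violates the Sudoku constraints. `predict_probs` is a hypothetical wrapper around the trained model that returns a (9, 9, 9) array of digit probabilities for a board.

```python
import numpy as np

def is_valid(board, r, c, d):
    # check row, column, and 3x3 box for a conflicting digit
    box = board[r // 3 * 3:r // 3 * 3 + 3, c // 3 * 3:c // 3 * 3 + 3]
    return d not in board[r] and d not in board[:, c] and d not in box

def solve_greedy(board, predict_probs):
    board = board.copy()
    while (board == 0).any():
        probs = predict_probs(board)   # (9, 9, 9) digit probabilities
        probs[board != 0] = -1         # ignore cells that are already filled
        confidence = probs.max(axis=-1)
        r, c = np.unravel_index(confidence.argmax(), (9, 9))
        for digit in np.argsort(probs[r, c])[::-1]:  # most probable first
            if is_valid(board, r, c, digit + 1):     # digits are 1-9
                board[r, c] = digit + 1
                break
        else:
            return None  # no legal digit: a full search would backtrack here
    return board
```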
// RNN (Bidirectional with LSTM cells)
- list of tuples (blank_cell, num_hints) where num_hints = sum of hints in row,col,square
- sort list by num_hints
- feed in row, col, square of cell with highest num_hints to the RNN (loop sketched after this list)
- take joint prob for cell as solution
- update board with prediction
- recompute list with newly predicted cell
- choose new largest num_hints cell and continue
- (build structure)
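A sketch of the bookkeeping loop above; `predict_cell` is a hypothetical wrapper around the RNN that returns a digit (1-9) for one cell given its row, column, and square.

```python
import numpy as np

def num_hints(board, r, c):
    # sum of given/predicted digits in the cell's row, column, and 3x3 square
    box = board[r // 3 * 3:r // 3 * 3 + 3, c // 3 * 3:c // 3 * 3 + 3]
    return int((board[r] != 0).sum() + (board[:, c] != 0).sum()
               + (box != 0).sum())

def solve_rnn(board, predict_cell):
    board = board.copy()
    while True:
        blanks = [(r, c) for r in range(9) for c in range(9)
                  if board[r, c] == 0]
        if not blanks:
            return board
        # pick the blank whose row/col/square carry the most hints
        r, c = max(blanks, key=lambda rc: num_hints(board, *rc))
        box = board[r // 3 * 3:r // 3 * 3 + 3, c // 3 * 3:c // 3 * 3 + 3]
        # assumes predict_cell returns a digit 1-9 for the chosen cell
        board[r, c] = predict_cell(board[r], board[:, c], box)
        # blanks and hint counts are recomputed on the next pass
```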
// Deep RL
- very difficult
- we already have the structure for it, though
// Visualization
- expected board --> our board
// Input should now be probability distribution
- let's make it all one-hot
- back to 9 labels
- 0 (blank) has equal probability for all labels (encoding sketched below)
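A sketch of that encoding: each cell becomes a distribution over the 9 digit labels, and a blank (0) gets equal probability for every label.

```python
import numpy as np

def to_distributions(board):
    """Turn a 9x9 board (0 = blank) into per-cell digit distributions."""
    probs = np.full((9, 9, 9), 1.0 / 9)          # blanks: uniform over digits
    filled = board != 0
    probs[filled] = np.eye(9)[board[filled] - 1]  # one-hot for given digits
    return probs
```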
- bi-directional LSTMs go across the row, column, and square as individual inputs to the LSTM (shape: each pos in row x batch x ...)
- only do softmax at the end
- a feed-forward layer translates the outputs of row, col, square (features) into softmax probabilities
- combine the 3 outputs, then feed forward, then compute the loss; the outputs become probabilities only at the loss (rough sketch below)
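A rough sketch of that wiring in TensorFlow 1.x (hidden sizes are illustrative): three bidirectional LSTMs encode the target cell's row, column, and square, their outputs are concatenated, a feed-forward layer produces the logits, and softmax is applied only inside the loss at the end.

```python
import tensorflow as tf

def encode(seq, scope, units=64):
    """Run a bidirectional LSTM over one 9-step sequence of digit distributions."""
    with tf.variable_scope(scope):
        fw = tf.contrib.rnn.BasicLSTMCell(units)
        bw = tf.contrib.rnn.BasicLSTMCell(units)
        outputs, _ = tf.nn.bidirectional_dynamic_rnn(fw, bw, seq,
                                                     dtype=tf.float32)
        # concatenate forward/backward outputs; use the last step as features
        return tf.concat(outputs, axis=-1)[:, -1, :]

row = tf.placeholder(tf.float32, [None, 9, 9])  # batch x position x label probs
col = tf.placeholder(tf.float32, [None, 9, 9])
sqr = tf.placeholder(tf.float32, [None, 9, 9])
labels = tf.placeholder(tf.int32, [None])       # solution digit index (0-8)

# combine the 3 encodings, then feed forward
features = tf.concat([encode(row, "row"), encode(col, "col"),
                      encode(sqr, "sqr")], axis=-1)
hidden = tf.layers.dense(features, 128, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, 9)
# softmax happens only here, at the end, inside the loss
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits))
```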