
neurosat's Introduction

NeuroSAT

NeuroSAT is an experimental SAT solver that is learned using single-bit supervision only. We train it as a classifier to predict satisfiability of random SAT problems and it learns to search for satisfying assignments to explain that bit of supervision. When it guesses sat, we can almost always decode the satisfying assignment it has found from its activations. It can often find solutions to problems that are bigger, harder, and from entirely different domains than those it saw during training.

Specifically, we train it as a classifier to predict satisfiability on random problems that look like this:

When making a prediction about a new problem, it guesses unsat with low confidence (light blue) until it finds a satisfying assignment, at which point it guesses sat with very high confidence (red) and converges:

[Figure omitted; horizontal axis: Iteration →]

At convergence, the literal embeddings cluster according to the solution it finds:

We can almost always recover the solution by clustering the literal embeddings, thus making NeuroSAT an end-to-end SAT solver.
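The decoding idea can be sketched as follows. This is a minimal illustration, not the repository's code: the embeddings are made-up 2-D toy values, the helper names (`two_means`, `decode`, `satisfies`) are hypothetical, and plain 2-means with a farthest-point start stands in for whatever clustering is actually used. The sketch clusters all literal embeddings into two groups, assigns each variable by which center its positive and negative literals fall near, and tries both mappings of clusters to truth values.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def two_means(points, iters=20):
    """Plain 2-means (Lloyd's algorithm) with a farthest-point start."""
    c1 = points[0]
    c2 = max(points, key=lambda p: sq_dist(p, c1))
    for _ in range(iters):
        g1 = [p for p in points if sq_dist(p, c1) <= sq_dist(p, c2)]
        g2 = [p for p in points if sq_dist(p, c1) > sq_dist(p, c2)]
        if not g1 or not g2:
            break
        c1, c2 = mean(g1), mean(g2)
    return c1, c2

def satisfies(assignment, clauses):
    """Clauses use DIMACS-style ints: 3 means x3, -3 means NOT x3."""
    return all(any(assignment[abs(l) - 1] == (l > 0) for l in c) for c in clauses)

def decode(pos_emb, neg_emb, clauses):
    """pos_emb[i] / neg_emb[i] are the embeddings of x_(i+1) and NOT x_(i+1)."""
    c1, c2 = two_means(pos_emb + neg_emb)
    side = [sq_dist(p, c1) + sq_dist(n, c2) < sq_dist(p, c2) + sq_dist(n, c1)
            for p, n in zip(pos_emb, neg_emb)]
    # We do not know which cluster means "true", so try both mappings.
    for candidate in (side, [not s for s in side]):
        if satisfies(candidate, clauses):
            return candidate
    return None

# Toy problem (hypothetical data): (x1 or not x2) and (x2 or x3)
# and (not x1 or not x3) and (x1 or x2)
clauses = [[1, -2], [2, 3], [-1, -3], [1, 2]]
pos_emb = [(1.0, 0.9), (0.9, 1.1), (-1.0, -0.9)]
neg_emb = [(-1.1, -1.0), (-0.9, -1.0), (1.0, 1.0)]
solution = decode(pos_emb, neg_emb, clauses)
print(solution)  # [True, True, False]
```

Checking the candidate against the clauses is what makes the result trustworthy: a decoded assignment is only reported when it actually satisfies the problem.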

At test time it can often find solutions to

  • bigger random problems:

  • graph coloring problems:

  • clique detection problems:

  • dominating set problems:

  • and vertex cover problems:
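To make "SAT-encoded graph problem" concrete, here is a minimal sketch of one textbook encoding of graph k-coloring as CNF. It is not necessarily the encoding used for the experiments; the function names and the variable numbering are assumptions made for illustration, and the brute-force checker is only practical for tiny instances.

```python
from itertools import product

def coloring_to_cnf(n_vertices, edges, k):
    """Encode k-coloring as CNF with DIMACS-style integer literals.

    Variable var(v, c) is true when vertex v gets color c.
    """
    def var(v, c):
        return v * k + c + 1

    # Every vertex gets at least one color.
    clauses = [[var(v, c) for c in range(k)] for v in range(n_vertices)]
    # Adjacent vertices never share a color.
    for u, v in edges:
        for c in range(k):
            clauses.append([-var(u, c), -var(v, c)])
    return n_vertices * k, clauses

def brute_force_sat(n_vars, clauses):
    """Return a satisfying assignment as a tuple of bools, or None."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in cl) for cl in clauses):
            return bits
    return None

triangle = [(0, 1), (1, 2), (0, 2)]
print(brute_force_sat(*coloring_to_cnf(3, triangle, 2)) is None)      # True: no 2-coloring
print(brute_force_sat(*coloring_to_cnf(3, triangle, 3)) is not None)  # True: 3 colors suffice
```

Clique detection, dominating set, and vertex cover can be encoded along similar lines, with one Boolean variable per vertex choice plus clauses for the constraints.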

Caveats

  • The graph problems are derived from small random graphs (~10 nodes, ~17 edges on average).
  • NeuroSAT is a research prototype and is still vastly less reliable than traditional SAT solvers.

Reproducibility

As many readers know too well, facilitating exact reproducibility in machine learning can require a lot of work. NeuroSAT is no exception. We regret that we do not currently provide a push-button way to retrain our exact model on the exact same training data we used in our experiments, though we may provide such functionality in the future depending on the level of interest. For now, we settle for providing our model code, a generator for the distribution of problems we trained on, and enough scaffolding to easily train and test it on small datasets. More utilities will be added in the coming weeks. We hope users will adapt our code to their own infrastructures, improve upon our model, and train it on a greater variety of problems.

Playing with NeuroSAT

The scripts/ directory includes a few scripts to get started.

  1. setup.sh installs dependencies.
  2. toy_gen_data.sh generates toy train and test data.
  3. toy_train.sh trains a model for a few iterations on the toy training data.
  4. toy_test.sh evaluates the trained model on the toy test data.
  5. toy_solve.sh tries to solve the toy test problems.
  6. toy_pipeline.sh runs toy_gen_data.sh, toy_train.sh, toy_test.sh, and toy_solve.sh in sequence.

These scripts can be easily modified to train and test on larger datasets.
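The order in which toy_pipeline.sh is described as running the other scripts can be sketched in Python. This is only an illustration: it assumes the scripts are invoked with `bash` from the repository root, and the helper names are hypothetical. By default it prints the commands instead of running them.

```python
import subprocess

# Order taken from the description of toy_pipeline.sh above.
STEPS = ["toy_gen_data.sh", "toy_train.sh", "toy_test.sh", "toy_solve.sh"]

def pipeline_commands(scripts_dir="scripts"):
    """Build one command per step, in pipeline order."""
    return [["bash", f"{scripts_dir}/{step}"] for step in STEPS]

def run_pipeline(scripts_dir="scripts", dry_run=True):
    """Print the commands, or run them and stop at the first failure."""
    for cmd in pipeline_commands(scripts_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)

run_pipeline(dry_run=True)
```

Passing `dry_run=False` would execute each step and raise `subprocess.CalledProcessError` if one of them exits with a non-zero status.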

Resources

More information about NeuroSAT can be found in the paper https://arxiv.org/abs/1802.03685.

Acknowledgments

This work was supported by Future of Life Institute grant 2017-158712.

neurosat's People

Contributors

dselsam

neurosat's Issues

Provide data set

Hi
It'd be useful if you could provide an example dataset at least for the toy example. The training script for this one is referring to data/train/sr5.

Thanks
Ben.

`python/testate.py` does not exist

scripts/toy_test.sh references the file python/testate.py which does not exist -- is this supposed to be python/validate.py?

Thanks
Ben

Can't get normal result and I am confused.

Sorry to bother you while you are busy. I am trying to modify this network to run some experiments, but I cannot get sensible results. I have changed the problem size by modifying gen_data.py, but the results stay the same: the loss is always 0.6931 and the matrix always shows 50% accuracy. How can I get meaningful results?

Training
Loading data/train/sr5/data_dir=grp1_npb=60000_nb=8.pkl...
[0] 0.6932 (0.30, 0.20, 0.30, 0.20) [42s]
Start Trian
Loading data/train/sr5/data_dir=grp2_npb=60000_nb=10.pkl...
[1] 0.6932 (0.25, 0.25, 0.25, 0.25) [43s]
Start Trian
Loading data/train/sr5/data_dir=grp3_npb=60000_nb=6.pkl...
[2] 0.6932 (0.20, 0.30, 0.20, 0.30) [43s]
Start Trian
Loading data/train/sr5/data_dir=grp8_npb=60000_nb=7.pkl...
[3] 0.6932 (0.30, 0.20, 0.30, 0.20) [53s]
Start Trian
Loading data/train/sr5/data_dir=grp9_npb=60000_nb=7.pkl...
[4] 0.6932 (0.20, 0.30, 0.20, 0.30) [44s]

Test:
data/test/sr5/data_dir=grp8_npb=60000_nb=9.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
data/test/sr5/data_dir=grp2_npb=60000_nb=8.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
data/test/sr5/data_dir=grp10_npb=60000_nb=8.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
data/test/sr5/data_dir=grp5_npb=60000_nb=8.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
data/test/sr5/data_dir=grp1_npb=60000_nb=8.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
data/test/sr5/data_dir=grp9_npb=60000_nb=10.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
data/test/sr5/data_dir=grp3_npb=60000_nb=8.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
data/test/sr5/data_dir=grp6_npb=60000_nb=8.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
data/test/sr5/data_dir=grp4_npb=60000_nb=8.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
data/test/sr5/data_dir=grp7_npb=60000_nb=9.pkl 0.6932 (0.50, 0.00, 0.50, 0.00)
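
For reference, 0.6931 is ln 2, which is the cross-entropy of a binary classifier that outputs probability 0.5 for every input. Assuming the logged loss is a binary cross-entropy measured in nats (an assumption; the post does not say), a loss that stays near this value is what one would see from a model that predicts 0.5 throughout. The function name below is illustrative.

```python
import math

def binary_cross_entropy(p, label):
    """Cross-entropy in nats for predicted probability p and a 0/1 label."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

# A prediction of 0.5 costs ln 2 regardless of the true label.
loss_at_half = binary_cross_entropy(0.5, 1)
print(round(loss_at_half, 4))  # 0.6931
```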

Guidance on reproducing experiments in paper

Hi --

I see the disclaimer that this repo doesn't include the code to reproduce the experiments in the paper, but could you sketch out what I'd have to do to reproduce some of them? (I'm most interested in reproducing the results in Table 2, where you show that the learned solver can be applied to SAT-encoded versions of other NP problems, but also in the SR(U(40)) experiments.)

EDIT: More specifically, a couple of things that could help get me off the ground -- For the experiments described in Table 1, how many problem instances did you train on? How many epochs of training?

Thanks
Ben

Confusion about the max_nodes_per_batch parameter

In the toy examples this hyperparameter is set to 60000, but in the paper it is set to 12000, which is smaller.

My understanding is that more nodes mean a more expressive representation. Is that correct? And why is the hyperparameter set to this value in the toy examples?
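
Read literally, the parameter name suggests a cap on how many graph nodes go into one batch, which would make it a batch-size budget and not a model-capacity setting. The sketch below illustrates that reading only; it is not taken from the repository. It assumes one node per literal and one per clause (2 * n_vars + n_clauses), and `batch_by_node_budget` is a hypothetical name.

```python
def batch_by_node_budget(problems, max_nodes_per_batch):
    """Greedily group problems so each batch stays within the node budget.

    problems: list of (n_vars, n_clauses) pairs.
    Returns a list of batches, each a list of problem indices.
    """
    batches, current, used = [], [], 0
    for i, (n_vars, n_clauses) in enumerate(problems):
        n_nodes = 2 * n_vars + n_clauses  # assumed node count per problem
        if n_nodes > max_nodes_per_batch:
            continue  # cannot fit in any batch; skipped in this sketch
        if current and used + n_nodes > max_nodes_per_batch:
            batches.append(current)
            current, used = [], 0
        current.append(i)
        used += n_nodes
    if current:
        batches.append(current)
    return batches

toy = [(5, 20)] * 4  # four problems with 30 nodes each
print(batch_by_node_budget(toy, 60))   # [[0, 1], [2, 3]]
print(batch_by_node_budget(toy, 120))  # [[0, 1, 2, 3]]
```

Under this reading, a larger budget packs more problems into each batch without changing the model itself.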

Thanks!

About PCA

Hello, I noticed that you used PCA to visualize what happens during the iterations. Is it necessary to apply PCA to the data before decoding with k-means, or is k-means alone enough?

I am new to this field, so apologies if this is a naive question.

Procedure to generate `different problems` in NeuroSAT

Hi,

Is there a script that illustrates how to generate the different problems, such as the six random graph distributions and the graph coloring problems (3 ≤ k ≤ 5), dominating-set problems (2 ≤ k ≤ 4), clique-detection problems (3 ≤ k ≤ 5), and vertex cover problems (4 ≤ k ≤ 6)?

Thanks!
