Code Monkey home page Code Monkey logo

visual-interaction-networks_tensorflow's Introduction

Visual Interaction Networks

Tensorflow Implementation of Visual Interaction Networks from Google Deepmind.

Implementation is on Tensorflow r1.2.

https://arxiv.org/abs/1706.01433

"Another key part of relational reasoning involves predicting the future in a physical scene. From just a glance, humans can infer not only what objects are where, but also what will happen to them over the upcoming seconds, minutes and even longer in some cases. For example, if you kick a football against a wall, your brain predicts what will happen when the ball hits the wall and how their movements will be affected afterwards (the ball will ricochet at a speed proportional to the kick and - in most cases - the wall will remain where it is)." From an article of Deepmind

https://deepmind.com/blog/neural-approach-relational-reasoning/

alt tag

N-objects Gravity Simulations

For changing configure values, please check constants script.

cat constracts.py

For generating images and data,

python physical_engines.py

For modeling Visual Interaction Networks

python gravity_vin.py

Data

The Data are gathered from my own implemented physics engine, which is same of interaction network git repo. https://github.com/jaesik817/Interaction-networks_tensorflow

One different thing from IN physics engines is time difference between frames, which was 0.0001 in IN repo and is 0.001 in this. Because 0.0001 secs frame cannot be recognized in 32 x 32 images.

Settings

Settings are written in constants.py and gravity_vin.py. The number of objects, frames on each simulations and rollout frames are 3, 50 and 20. The training max epoches are 80000. In physical_engines code, every frames are saved as image and coded data, and those things are used in gravity_vin script. Each image has background ones from CIFAR 10 training data set as the paper.

Results

The loss decreased as followed, which is summarized value of losses on near future 8 frames and encoding-decoding losses on input images.

alt tag

The quilititive results are as followed.

True : alt tag

Modeling : alt tag

visual-interaction-networks_tensorflow's People

Contributors

jsikyoon avatar pypatel avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

visual-interaction-networks_tensorflow's Issues

Implementation Difference

Hello,
Thank you for implementation. I was checking your code and i saw some difference at implementation between paper and your code. for example: according to the supplementary material you should first concatenate and apply a linear layer, but you are for applying a linear layer and then doing the concatenation op why do you need this difference?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.