videogan's Introduction

Generating Videos with Scene Dynamics

This repository contains an implementation of Generating Videos with Scene Dynamics by Carl Vondrick, Hamed Pirsiavash, and Antonio Torralba, to appear at NIPS 2016. The model learns to generate tiny videos using adversarial networks.

Example Generations

Below are some selected videos that are generated by our model. These videos are not real; they are hallucinated by a generative video model. While they are not photo-realistic, the motions are fairly reasonable for the scene category they are trained on.

Beach


Golf


Train Station


Baby


Training

The code requires a Torch7 installation.

To train a generator for video, see main.lua. This file will construct the networks, start many threads to load data, and train the networks.

For the conditional version, see main_conditional.lua. This is similar to main.lua, except the input to the model is a static image.

To generate videos, see generate.lua. This file will also output intermediate layers, such as the mask and background image, which you can inspect manually.
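As a rough illustration of how a mask and a background can be combined into an output, the sketch below blends a foreground and a background per pixel as `mask * foreground + (1 - mask) * background`, which is our reading of the two-stream formulation in the paper. It is written in plain Python rather than Torch7, and the function name `composite` and the toy values are assumptions made for this example, not names from generate.lua.

```python
def composite(mask, foreground, background):
    """Blend per pixel: mask * foreground + (1 - mask) * background.

    All three arguments are flat lists of floats of equal length.
    This is an illustrative sketch, not the code in generate.lua.
    """
    if not (len(mask) == len(foreground) == len(background)):
        raise ValueError("all inputs must have the same number of pixels")
    return [m * f + (1.0 - m) * b
            for m, f, b in zip(mask, foreground, background)]


# Toy one-dimensional "frame" with three pixels (hypothetical values).
mask = [0.0, 0.5, 1.0]
foreground = [1.0, 1.0, 1.0]
background = [0.0, 0.0, 0.0]
blended = composite(mask, foreground, background)
```

A mask value near 1 selects the foreground pixel and a value near 0 selects the background, which is why inspecting the mask shows where the model places motion.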

Data

The data loading is designed assuming videos have been stabilized and flattened into JPEG images. We do this for efficiency. Stabilization is computationally slow and must be done offline, and reading one file per video is more efficient on NFS.

For our stabilization code, see the 'extra' directory. Essentially, this will convert each video into an image of vertically concatenated frames. After doing this, you create a text file listing all the frames, which you pass into the data loader.
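The flattened layout can be sketched as follows. This is a minimal illustration in plain Python, assuming each frame is a list of pixel rows and that the list file holds one path per line; the helper names `stack_frames`, `split_frames`, and `write_list_file` are hypothetical and are not taken from the 'extra' directory.

```python
def stack_frames(frames):
    """Concatenate frames vertically; each frame is a list of pixel rows."""
    height = len(frames[0])
    if any(len(frame) != height for frame in frames):
        raise ValueError("all frames must have the same height")
    return [row for frame in frames for row in frame]


def split_frames(stacked, frame_height):
    """Recover the individual frames from a vertically stacked image."""
    if len(stacked) % frame_height != 0:
        raise ValueError("image height is not a multiple of the frame height")
    return [stacked[i:i + frame_height]
            for i in range(0, len(stacked), frame_height)]


def write_list_file(paths, list_path):
    """Write one image path per line (assumed list-file format)."""
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")


# Two toy 2x2 frames (hypothetical pixel values).
frame_a = [[1, 2], [3, 4]]
frame_b = [[5, 6], [7, 8]]
stacked = stack_frames([frame_a, frame_b])
recovered = split_frames(stacked, frame_height=2)
```

Storing every frame of a clip in one image means the loader opens a single file per video and slices it by the known frame height.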

Models

You can download our pre-trained models here (1 GB ZIP file).

Notes

The code is based on DCGAN and our starter code in Torch7.

If you find this useful for your research, please consider citing our NIPS paper.

License

MIT

videogan's People

Contributors

cvondrick

videogan's Issues

Pre-trained models for future generation?

It seems that the models you released only contain the two-stream part and the discriminator. Can you also share the image encoder part for future generation? (And also the model for baby videos, if possible.)

Thanks!

PyTorch version of code

Is there a PyTorch version of the code, or do the authors of the paper plan to release one?

It might be helpful to use it.

Question of UCF101

Hi, Carl!
Wonderful job! I just want to know where I can download the file mentioned in your code: /data/vision/torralba/hallucination/UCF101/gan/train.txt

Best regards.

Publish datasets as torrent files

The article says there are 7 TB of files. They could be published as a set of torrent files that the community could seed from different machines.

It makes sense to split them into amounts that a typical machine can download individually (e.g. 7 torrents of 1 TB each).

Issues running main.lua

Hi

When I try to run main.lua I get this error message:

/data/vision/torralba/crossmodal/flickr_videos/scene_extract/lists-full/_b_beach.txt.train : No such file or directory

Do I have to manually download all the txt files of your project, or is there a Lua script in the project that downloads all the data?

Confusion about conditional model generation

Hi @cvondrick, I'm trying to generate a conditional model with main_conditional.lua. The README.md says that a conditional model is trained on a single static image. So I created a job_list.txt containing a single line referencing a single JPEG file, and ran stabilize_videos_many.py, which extracted a single scene. I then ran main_conditional.lua against the extracted scene. This resulted in 100 iterations through the dataset, and then the program stopped without saving a model. It looks like main_conditional.lua expects to run 1000 iterations through the dataset, but when I run it against a single scene, it only runs 100 iterations.

Is there something I should be doing differently? Should I run main_conditional.lua against several identical scenes to mimic a larger dataset? Should I simply set niter to a higher number, or saveIter to a lower one?

Thanks to you and your colleagues for sharing this great work!

Output size of generate.lua

I just ran generate.lua and got a GIF that is 384 pixels wide and 1408 pixels high, which means that 6 GIFs have been stacked horizontally and 22 vertically. I checked the code, but I cannot find where this stacking happens. What do I have to change so that the output is only a 64x64 image?
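The arithmetic in this question can be checked with a short sketch. It assumes 64x64 tiles laid out in a regular grid with no padding between them, and it does not show where generate.lua does the stacking. The helpers `grid_shape` and `tile_box` are hypothetical names introduced here.

```python
def grid_shape(width, height, tile=64):
    """Return (columns, rows) for an image made of square tiles."""
    if width % tile or height % tile:
        raise ValueError("image size is not a multiple of the tile size")
    return width // tile, height // tile


def tile_box(column, row, tile=64):
    """Return the (left, top, right, bottom) pixel box of one tile."""
    left, top = column * tile, row * tile
    return (left, top, left + tile, top + tile)


# Sizes reported in the question: 384 pixels wide, 1408 pixels high.
columns, rows = grid_shape(384, 1408)
first_tile = tile_box(0, 0)
```

Under these assumptions the grid is 6 columns by 22 rows, so cropping the box returned by `tile_box` from each frame of the GIF would isolate one 64x64 sample.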
