cvondrick / videogan

Generating Videos with Scene Dynamics. NIPS 2016.

Home Page: http://web.mit.edu/vondrick/tinyvideo/

Lua 85.63% Python 14.37%

computer-vision video generative-adversarial-network deep-learning

videogan's Introduction

Generating Videos with Scene Dynamics

This repository contains an implementation of Generating Videos with Scene Dynamics by Carl Vondrick, Hamed Pirsiavash, Antonio Torralba, to appear at NIPS 2016. The model learns to generate tiny videos using adversarial networks.

Example Generations

Below are some selected videos that are generated by our model. These videos are not real; they are hallucinated by a generative video model. While they are not photo-realistic, the motions are fairly reasonable for the scene category they are trained on.

[Example videos omitted. Scene categories shown: Beach, Golf, Train Station, Baby.]

Training

The code requires a Torch7 installation.

To train a generator for video, see main.lua. This file will construct the networks, start many threads to load data, and train the networks.

For the conditional version, see main_conditional.lua. This is similar to main.lua, except the input to the model is a static image.

To generate videos, see generate.lua. This file will also output intermediate layers, such as the mask and background image, which you can inspect manually.
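The mask and background outputs come from the two-stream design described in the accompanying paper, where a per-pixel mask blends a moving foreground with a static background. The snippet below is a minimal Python sketch of that blend, not the repository's Lua code. The function name `composite_video`, the single-channel nested-list layout, and the toy values are all assumptions made for illustration.

```python
def composite_video(mask, foreground, background):
    """Blend a foreground video with a static background image.

    mask, foreground: nested lists indexed [t][y][x]; mask values lie in [0, 1].
    background: nested list indexed [y][x], reused unchanged for every frame.
    (Single channel only, to keep the illustration short.)
    """
    video = []
    for m_t, f_t in zip(mask, foreground):
        frame = [
            # Per pixel: mask * foreground + (1 - mask) * background.
            [m * f + (1.0 - m) * b for m, f, b in zip(m_row, f_row, b_row)]
            for m_row, f_row, b_row in zip(m_t, f_t, background)
        ]
        video.append(frame)
    return video


# Toy input: 2 frames of 1x2 pixels (values invented for illustration).
mask = [[[1.0, 0.0]], [[0.5, 0.0]]]
foreground = [[[0.8, 0.8]], [[0.4, 0.4]]]
background = [[0.2, 0.6]]
video = composite_video(mask, foreground, background)
```

Where the mask is 1 the foreground pixel is used, and where it is 0 the static background shows through. This is why inspecting the mask and background separately can help when diagnosing a generated video.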

Data

The data loader assumes that videos have been stabilized and flattened into JPEG images. We do this for efficiency: stabilization is computationally slow and must be done offline, and reading one file per video is more efficient on NFS.

For our stabilization code, see the 'extra' directory. Essentially, this will convert each video into an image of vertically concatenated frames. After doing this, you create a text file listing all the frames, which you pass into the data loader.
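As a rough illustration of the layout described above, the following Python sketch shows how frames could be located inside a vertically concatenated image and how a list file could be written. It is not the code from the 'extra' directory. The helper names `frame_row_ranges` and `write_file_list` are hypothetical, and the one-relative-path-per-line format is an assumption rather than the format the data loader is confirmed to expect.

```python
import os


def frame_row_ranges(image_height, frame_height):
    """Return (start_row, end_row) for each frame in a vertically stacked image."""
    if frame_height <= 0 or image_height % frame_height != 0:
        raise ValueError("image height must be a positive multiple of frame height")
    return [(top, top + frame_height) for top in range(0, image_height, frame_height)]


def write_file_list(root_dir, list_path):
    """Write one relative path per line for every JPEG file under root_dir.

    The list format is an assumption for illustration only.
    """
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower().endswith((".jpg", ".jpeg")):
                full_path = os.path.join(dirpath, name)
                paths.append(os.path.relpath(full_path, root_dir))
    paths.sort()  # deterministic order across runs
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return paths
```

For example, a flattened image 2048 pixels tall that holds 64-pixel-tall frames would give 32 row ranges, the first being rows 0 to 64.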

Models

You can download our pre-trained models here (1 GB ZIP file).

Notes

The code is based on DCGAN and our starter code in Torch7.

If you find this useful for your research, please consider citing our NIPS paper.

License

MIT

videogan's Issues

UCF101 Pre-trained model for Future prediction

Can you share the pre-trained model on UCF101 for the conditional generation?

Pre-trained models for future generation?

It seems that the models you released only contain the two-stream part and the discriminator. Can you also share the image encoder part for future generation? (And also the model of baby videos, if possible.)

Thanks!

Pre-trained models for condition generation?

Hi Carl, can you also share the pretrained model for future generation? (This is similar to the requests made by @Yuliang-Zou and @17Skye17 before.) You said earlier that this could be done; maybe you forgot to update it? This would be very helpful to me. Thank you so much~

PyTorch version of code

Is there a PyTorch version of the code, or do the authors of the paper wish to release one?

It might be helpful to use it.

How to download the files from golf.txt?

Can you please give an example of a full URL for downloading files from the golf.txt mentioned on your website?

http://data.csail.mit.edu/videogan/golf.txt

Question of UCF101

Hi, Carl!
Wonderful job! I just want to know where I can download the file mentioned in your code: /data/vision/torralba/hallucination/UCF101/gan/train.txt

Best regards.

Publish datasets as torrent files

The article says there are 7 TB of files. They could be published as a set of torrent files that could be community-seeded from different machines.

It makes sense to split them into amounts that a typical machine can download individually (e.g. 7 torrents of 1 TB each).

can you provide pretrained model for conditional generation?

Hello, Carl. :)
Can you provide a pretrained model for conditional generation?
Thank you. @cvondrick

Issues running main.lua

When I try to run main.lua I get this error message:

/data/vision/torralba/crossmodal/flickr_videos/scene_extract/lists-full/_b_beach.txt.train : No such file or directory

Do I have to manually download all the txt files of your project, or is there a Lua script in the project that downloads all the data?

How can I make this work for smaller size inputs?

I want to run this on 32x32 frames. What changes will I have to make for this to work?
Is there a short way to do it, or will I have to hack through the entire code?

Confusion about conditional model generation

Hi @cvondrick, I'm trying to generate a conditional model with main_conditional.lua. The README.md says that a conditional model is trained on a single static image. So, I created a job_list.txt containing a single line referencing a single JPEG file, and ran stabilize_videos_many.py, which extracted a single scene. I then ran main_conditional.lua against the extracted scene. This resulted in 100 iterations through the dataset, and then the program stopped without saving a model. It looks like main_conditional.lua expects to run 1000 iterations through the dataset, but when I run main_conditional.lua against a single scene, it only runs 100 iterations.

Is there something I should be doing differently? Should I run main_conditional.lua against several identical scenes to mimic a larger dataset? Should I simply set niter to a higher number, or saveIter to a lower one?

Thanks to you and your colleagues for sharing this great work!

Output size of generate.lua

I just ran generate.lua and got a GIF file that is 384 pixels by 1408 pixels, which means that 6 GIFs have been stacked horizontally and 22 vertically. I checked the code, and nothing suggests to me where this stacking is happening. What do I have to change so that the output is only a single 64x64 image?
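The arithmetic in this question can be checked with a short Python sketch. It assumes, as the question implies, that 384 is the width and 1408 the height, and that the 64x64 tiles are packed with no padding between them. The helper names `tile_grid` and `tile_box` are hypothetical, and the sketch does not show where generate.lua performs the tiling.

```python
def tile_grid(image_width, image_height, tile_size=64):
    """Return (columns, rows) of square tiles that fit exactly in the image."""
    if image_width % tile_size or image_height % tile_size:
        raise ValueError("image dimensions are not multiples of the tile size")
    return image_width // tile_size, image_height // tile_size


def tile_box(column, row, tile_size=64):
    """Return the (left, top, right, bottom) pixel box of one tile."""
    left, top = column * tile_size, row * tile_size
    return left, top, left + tile_size, top + tile_size


columns, rows = tile_grid(384, 1408)
print(columns, rows)   # 6 22
print(tile_box(0, 0))  # (0, 0, 64, 64)
```

Under these assumptions the grid holds 6 x 22 = 132 tiles, and a box such as `tile_box(0, 0)` could be passed to an image library's crop routine to pull out one 64x64 video from each frame of the GIF.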