mono3d's Introduction

Mononizing Binocular Videos

ACM Transactions on Graphics (SIGGRAPH Asia 2020 issue), Vol. 39, No. 6, December 2020, pp. 228:1--228:16.

[ Project Webpage ] [ arXiv ] [ Video ]

Mono3D is the implementation of "Mononizing Binocular Videos". It converts a binocular video into a regular monocular video with the stereo information implicitly encoded, so that the original binocular video can be restored with high quality.

Online demo

[ Mononized view ] [ Restored left view ] [ Restored right view ]

Environment

Please refer to env.yaml.

  • Install the following two packages at exactly these versions. The pretrained model was built against them, and newer versions are incompatible.
    - mmcv==0.6.2
    - mmdet==2.2.1 (build from source)
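
Because the pins are strict, it can help to check what is actually installed before loading the pretrained model. The following is a minimal sketch, not part of the repository: the helper name find_version_mismatches is hypothetical, and it assumes the packages are installed under the distribution names mmcv and mmdet. A source build of mmdet may report a version string with an extra suffix, which this exact comparison would flag.

```python
from importlib import metadata

# Version pins copied from the list above.
REQUIRED = {"mmcv": "0.6.2", "mmdet": "2.2.1"}


def find_version_mismatches(required, get_version=metadata.version):
    """Return {package: (wanted, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    `get_version` is injectable so the check can be exercised without
    touching the real environment.
    """
    mismatches = {}
    for name, wanted in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_version_mismatches(REQUIRED)
    for name, (wanted, found) in problems.items():
        print(f"{name}: need {wanted}, found {found or 'not installed'}")
    if not problems:
        print("mmcv and mmdet match the pinned versions.")
```

Running the file directly prints one line per mismatched package, or a confirmation when both pins are met.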

Dataset

We cannot release the whole 3D movie dataset due to copyright issues. But the binocular image dataset and part of the binocular video dataset used in the paper are publicly available: [ Flickr1024 ] and [ Inria ].

Prepare Flickr1024 for training the image version model

  1. Download Flickr1024 from the website: https://yingqianwang.github.io/Flickr1024/
  2. Download the data lists from https://drive.google.com/drive/folders/14oeXizbqTCxbmkZblt7YbWjaU2IIqNJf?usp=sharing
  3. Organise the dataset as follows (${DATASET} is the root directory that holds the dataset):
${DATASET}
|-- Flickr1024
|   |-- Train
|   |   |-- 0001_L.png
|   |   |-- 0001_R.png
|   |   |-- 0002_L.png
|   |   |-- 0002_R.png
|   |   |-- ...
|   |-- Validation
|   |   |-- 0001_L.png
|   |   |-- 0001_R.png
|   |   |-- 0002_L.png
|   |   |-- 0002_R.png
|   |   |-- ...
|   |-- Test
|   |   |-- 0001_L.png
|   |   |-- 0001_R.png
|   |   |-- 0002_L.png
|   |   |-- 0002_R.png
|   |   |-- ...
|   |-- list
|   |   |-- train.txt
|   |   |-- val.txt
|   |   |-- test.txt
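
The layout above can be sanity-checked before training. This is a rough sketch under stated assumptions: the function name check_flickr1024_layout is hypothetical and not part of the repository, and it only checks what the tree shows (the three split folders, left/right pairing by the _L.png and _R.png suffixes, and the three list files). It does not look inside the list files, whose format is not described here.

```python
from pathlib import Path

SPLITS = ("Train", "Validation", "Test")
LISTS = ("train.txt", "val.txt", "test.txt")


def check_flickr1024_layout(dataset_root):
    """Return a list of human-readable problems found under <dataset_root>/Flickr1024."""
    root = Path(dataset_root) / "Flickr1024"
    problems = []

    for split in SPLITS:
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split_dir}")
            continue
        # Every left view NNNN_L.png should have a right view NNNN_R.png, and vice versa.
        lefts = {p.name[: -len("_L.png")] for p in split_dir.glob("*_L.png")}
        rights = {p.name[: -len("_R.png")] for p in split_dir.glob("*_R.png")}
        for stem in sorted(lefts ^ rights):
            problems.append(f"unpaired image in {split}: {stem}")

    for name in LISTS:
        list_file = root / "list" / name
        if not list_file.is_file():
            problems.append(f"missing list file: {list_file}")

    return problems
```

An empty return value means the folders, image pairs, and list files all match the tree; otherwise each entry names one thing to fix.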

Demo

$ PYTHONPATH=. python main/demo.py --left ./imgs/demo_L.png
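
The PYTHONPATH=. prefix puts the repository root on Python's import path, which is presumably how top-level packages in the repository get resolved when the script itself lives in main/. For cases where setting the environment variable is inconvenient, the same effect can be sketched in Python. The helper below is hypothetical and not part of the repository; it assumes the calling script sits one directory below the repository root.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_file, levels_up=1):
    """Put the directory `levels_up` levels above the script's folder first on sys.path.

    With script_file = <repo>/main/demo.py and levels_up=1 this is <repo>.
    Returns the directory that was added, as a string.
    """
    root = str(Path(script_file).resolve().parents[levels_up])
    if root in sys.path:
        sys.path.remove(root)  # avoid duplicates; re-insert at the front
    sys.path.insert(0, root)
    return root
```

It would be called as add_repo_root_to_path(__file__) at the top of a script, before any imports that depend on the repository root.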

Training

$ sh scripts/train.sh mono3d_img config/Flickr1024/mono3d_img.yaml

Evaluation

Evaluation on the testing set of Flickr1024

$ sh scripts/test.sh mono3d_img config/Flickr1024/mono3d_img.yaml

Copyright and License

You are granted the LICENSE for both academic and commercial use.

Acknowledgments

Thanks to Yingqian Wang for releasing the great dataset, Flickr1024.

Citation

@article{hu-2020-mononizing,
        author   = {Wenbo Hu and Menghan Xia and Chi-Wing Fu and Tien-Tsin Wong},
        title    = {Mononizing Binocular Videos},
        journal  = {ACM Transactions on Graphics (SIGGRAPH Asia 2020 issue)},
        month    = {December},
        year     = {2020},
        volume   = {39},
        number   = {6},
        pages    = {228:1--228:16}
}

mono3d's Issues

CNS module in image version model

Thanks for your valuable work!
I would like to know whether the CNS module is used in the image version model. If so, how is this module embedded in the code? Could you offer a demo for this? Thanks for your time!

CNS Module

Would it be possible for you to describe how the CNS is implemented, specifically the macroblock jitter?

Training for Video

You mention that you pass P_{t-1} to the reconstructor at time t, similar to an RNN. Do you use the same architecture as an RNN, that is, a learnable linear layer that multiplies P_{t-1}, with the result summed with P_t and then passed on to the reconstructor?

Mono3D Video Loss Function

You mentioned that you use an RNN version for stereo video input. However, how is the loss calculated? The formula provided is lambda_1 * MonoLoss + lambda_2 * InverseLoss + StereoLoss. Are the losses added per time-step, with loss.backward() then performed at the end of the batch?

Path Issue?

Whenever I try to run the demo
$ python main/demo.py --left ./imgs/demo_L.png

I get the error
No module named base.baseTrainer

It seems like the file cannot properly import the file inside base.

Also, I want to ask about the environment. In the README, it mentioned env.yaml, but I cannot seem to find the file.

Training code

In scripts/train.sh it seems to be running a train.py file, but I cannot find such a file. Do you have any plans to release this file in the future?
