
rpg_timelens's Introduction

TimeLens: Event-based Video Frame Interpolation


This repository is about the High Speed Event and RGB (HS-ERGB) dataset, used in the 2021 CVPR paper TimeLens: Event-based Video Frame Interpolation by Stepan Tulyakov*, Daniel Gehrig*, Stamatios Georgoulis, Julius Erbach, Mathias Gehrig, Yuanyou Li, and Davide Scaramuzza.

For more information, visit our project page.

Citation

A PDF of the paper is available here. If you use this dataset, please cite this publication as follows:

@Article{Tulyakov21CVPR,
  author        = {Stepan Tulyakov and Daniel Gehrig and Stamatios Georgoulis and Julius Erbach and Mathias Gehrig and Yuanyou Li and
                  Davide Scaramuzza},
  title         = {{TimeLens}: Event-based Video Frame Interpolation},
  journal       = "IEEE Conference on Computer Vision and Pattern Recognition",
  year          = 2021,
}

Google Colab

A Google Colab notebook is now available here. You can upsample your own video and events from your Google Drive.

Gallery

For more examples, visit our project page.

[example sequences: coke, paprika, pouring, water_bomb_floor]

Installation

Install the dependencies with

cuda_version=10.2
conda create -y -n timelens python=3.7
conda activate timelens
conda install -y pytorch torchvision cudatoolkit=$cuda_version -c pytorch
conda install -y -c conda-forge opencv scipy tqdm click
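
To verify the install, a quick sanity check (assuming the timelens environment is active) is to confirm that PyTorch sees the GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

If this prints True, CUDA is available and the model can run on the GPU.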

Test TimeLens

First, clone this repo into a new folder:

mkdir ~/timelens/
cd ~/timelens
git clone https://github.com/uzh-rpg/rpg_timelens

Then download the checkpoint and example data into the repo:

cd rpg_timelens
wget http://download.ifi.uzh.ch/rpg/web/data/timelens/data2/checkpoint.bin
wget http://download.ifi.uzh.ch/rpg/web/data/timelens/data2/example_github.zip
unzip example_github.zip 
rm -rf example_github.zip

Running TimeLens

To run TimeLens, simply call

skip=0
insert=7
python -m timelens.run_timelens checkpoint.bin example/events example/images example/output $skip $insert

This will generate the output in example/output. The first four arguments are the checkpoint file, the event folder, the image folder, and the output folder, respectively. The variables skip and insert determine the number of skipped vs. inserted frames: to generate a video with an 8× higher frame rate, 7 frames need to be inserted and 0 skipped.
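
The relation between the desired frame-rate multiplier and insert follows directly from this: with skip at 0, an N× frame rate requires N-1 inserted frames. As a minimal sketch (inserts_for_multiplier is a hypothetical helper, not part of this repository):

# Hypothetical helper, not part of the repo: how many frames TimeLens
# must insert between each pair of consecutive inputs (with skip = 0)
# to reach an N-times higher frame rate.
def inserts_for_multiplier(multiplier: int) -> int:
    return multiplier - 1

print(inserts_for_multiplier(8))  # 7, matching the example above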

The resulting images can be converted to a video with

ffmpeg -i example/output/%06d.png timelens.mp4

The resulting video is timelens.mp4.
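
If the result does not play in some players (see the "Corrupt file" issue below), it may help to set the input frame rate explicitly and force a widely supported pixel format. This variant is a suggestion, not part of the original instructions, and 30 is an arbitrary example frame rate:

ffmpeg -framerate 30 -i example/output/%06d.png -pix_fmt yuv420p timelens.mp4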

Dataset

hsergb

Download the dataset from our project page. The dataset structure is as follows

.
├── close
│   └── test
│       ├── baloon_popping
│       │   ├── events_aligned
│       │   └── images_corrected
│       ├── candle
│       │   ├── events_aligned
│       │   └── images_corrected
│       ...
│
└── far
    └── test
        ├── bridge_lake_01
        │   ├── events_aligned
        │   └── images_corrected
        ├── bridge_lake_03
        │   ├── events_aligned
        │   └── images_corrected
        ...

Each events_aligned folder contains event files with the filename template %06d.npz, and each images_corrected folder contains image files with the filename template %06d.png. In events_aligned, each event file with index n contains the events between images with index n-1 and n, i.e. event file 000001.npz contains the events between images 000000.png and 000001.png. Each event file contains keys for the x, y, t, and p event components. Note that x and y need to be divided by 32 before use; this is because they correspond to remapped events, which have floating-point coordinates.
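
As a minimal sketch of reading one of these files, assuming the arrays are stored under the keys x, y, t, and p as described above:

import numpy as np

# Events between images 000000.png and 000001.png (see the indexing rule above).
events = np.load("events_aligned/000001.npz")

# x and y are remapped coordinates stored with a fixed-point factor of 32;
# divide to recover the floating-point pixel positions.
x = events["x"] / 32.0
y = events["y"] / 32.0
t = events["t"]  # timestamps
p = events["p"]  # polarities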

Moreover, images_corrected also contains a timestamp.txt file where the image timestamps are stored. Note that in some folders there are more image files than event files. However, the image timestamps in timestamp.txt should match the event files, and the additional images can be ignored.
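
One way to drop the extra images, sketched here under the assumption that timestamp.txt holds one timestamp per line, is to keep only as many images as there are timestamps:

import glob

event_files = sorted(glob.glob("events_aligned/*.npz"))
image_files = sorted(glob.glob("images_corrected/*.png"))

# Count the timestamps (assuming one per non-empty line).
with open("images_corrected/timestamp.txt") as f:
    num_stamps = sum(1 for line in f if line.strip())

# Keep only the images that have a timestamp; the trailing extras are ignored.
image_files = image_files[:num_stamps]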

For a quick test, download the dataset to a folder using the link sent by email:

wget download_link.zip -O /tmp/dataset.zip
unzip /tmp/dataset.zip -d hsergb/

And run the test

python test_loader.py --dataset_root hsergb/ \
                      --dataset_type close \
                      --sequence spinning_umbrella \
                      --sample_index 400

This should open a window visualizing aligned events with a single image.

rpg_timelens's People

Contributors

danielgehrig18, etienne87, magehrig


rpg_timelens's Issues

Model is not downloading correctly

Description of Problem ❗️

I get a 403 message when I try to access the model:

>>> !wget http://rpg.ifi.uzh.ch/timelens/data2/checkpoint.bin
HTTP request sent, awaiting response... 403 Forbidden

Expected Behavior ✅

The model downloads from the web.


Environment 🛠

Google Colab


Additional Comments 🌳

I was using the Google Colab notebook included in the repository, but I was not able to download the model. I'm apparently forbidden from downloading or accessing it.

Training split of HS-ERGB

Dear authors,
Thanks for publishing this great work! I would love to use the HS-ERGB and BS-ERGB datasets for my research; however, I noticed that whereas BS-ERGB is more complete, the download link for HS-ERGB only contains the test split (for both the close and far sequences). The table in the paper, however, says that most of the videos are in the train split. Would it be possible, by any chance, to share these training videos as well?
Thank you very much in advance!

How to directly evaluate the code with the provided metrics?

Hello! Thank you for publishing your work, it's super interesting!

I'm trying to evaluate it using the script provided with the HS-ERGB dataset. As stated in the paper, the evaluation is done by dropping some frames from the ground-truth sequence and evaluating the PSNR only on these (dropped) frames.

I was wondering whether that is directly doable using this code (with some configuration of skip and insert), or whether I should modify it by generating several ground-truth sequences (dropping some frames) and just use insert to replicate the results in the paper. Also, if I drop these frames, should I drop the events as well?

Thank you!

Mismatch between timestamp.txt and number of images in HSERGB dataset?

Hello @danielgehrig18!

While I was trying to evaluate your code (#35), I realized that the timestamp.txt for some sequences (see baloon_popping) contains fewer timestamps (284) than the number of frames in the images_corrected folder (334). Can you confirm whether this is the correct behavior?

Thank you!

Edit: I'm seeing now that it's due to the number of event files in the events_aligned folder. Shouldn't these be equal to the number of frames minus 1 anyway?

The event visualization of the hsergb dataset

Thanks for open-sourcing this amazing work! I have two questions about the hsergb dataset.

python test_loader.py --dataset_root data/hsergb --dataset_type close --sequence spinning_umbrella --sample_index 400

[event visualization screenshot]

python test_loader.py --dataset_root data/hsergb --dataset_type close --sequence fountain_schaffhauserplatz_02 --sample_index 420

[event visualization screenshot]

From the above two examples, I found that some events are not aligned with the image, and some grid locations are missing events. Is this normal?
Also, why are the coordinates (x, y) in the event data decimal values rather than integers? Is this the raw data collected by the event camera, or is it due to the interpolation during dual-camera calibration?

Corrupt file

The resulting video can't be played because the file is corrupt or is missing appropriate codecs.

How to find the network input tensor format of images and events files?

I am getting some artifacts in the output interpolated frames when using your evaluation code on some other data, so I am trying to fine-tune the network on that data. I am finding it difficult to interpret the code and determine the format of the input tensors to the network corresponding to the image and event files, which I need in order to design the training code for fine-tuning. Can you please help me by pointing to the place in the code where I can find the format of the input tensors?

I have input images in PNG format with shape (h, w, 3), and the corresponding event data in the middle of every two image frames is in .npz format, which can be converted into event frames of shape (h, w, 1). How can I proceed to fine-tune the network? Should I fine-tune each block of the architecture in the order mentioned, or would fine-tuning only the final attention_averaging block be better?

video without events

Hello, can you upsample videos that don't have events associated with them?

About simulation events

Hi ~

It's really nice work, but I have several questions. I found that in the paper you used the Middlebury and Vimeo90K datasets. However, unlike the Adobe240 or GoPro datasets, the Middlebury and Vimeo90K datasets are captured at 30 fps. So how did you simulate the events for the Middlebury and Vimeo90K datasets? Did you directly use the 30 fps original videos?

Thanks a lot!

Qualitative comparison with synthetic events

Hi,

On the Vimeo90K dataset, with synthetic events, this method has better quantitative results than DAIN, but can it also produce better visual results with synthetic events? The Figure 7 results are based on "ground truth" events, so they cannot answer my question.

Thanks!

Difficulty in interpreting timelens video output

So I used your uploaded notebook TimeLens.ipynb to interpolate video frames. I gave 595 input images in the specified format and expected interpolated image frames as output. I didn't understand how you calculated inserts using the formula mentioned, since in my case np.diff(timestamps).mean() comes out to 10^7 (why does the numerator for calculating fps have a value of 10^6?). Since I was expecting 10 inserts, I specified that value manually. I also ran two sets of experiments with target_fps = 600 and target_fps = 1000.

After running the timelens cell, I got around 6545 output image frames and two videos, namely input.mp4 (duration = 200 sec) and interpolation.mp4 (220 sec). After running the final cell with the ffmpeg command, I also got a video as output: timelens.mp4 (260 sec). Another doubt: for both values of target_fps I get the same videos as output, and the number of interpolated frames is also the same. So in our case, is it correct to say that the input fps = 2.7 (595/220) and the output fps = 25 (6545/260)?

I am finding it difficult to interpret the output of the above-mentioned 3 videos and the 4th video that I can form by combining the output images at my custom fps. Say, if I am looking to generate a 600 fps video, I will get a video of duration 10 sec from the output video.
Please help me interpret the above-mentioned 4 videos of the output too.

Testing with custom RGB data

Hello, is it possible to use custom frames as input?

How are the .npz files generated? Or are they not required for inference?

Index Error: List index out of range while reading input images and event frames.

I am trying to run the TimeLens.ipynb file, where the input directories are as follows:

  1. FRAME_INPUT_DIR: folder with input images (.png) indexed from 0 to 595 and a file called timestamp.txt containing the timestamp of each image.

  2. EVENTS_INPUT_DIR: folder with event frames (in .npy format) indexed from 0 to 594. The events between images 0.png and 1.png are combined in the frame 0.npy, and so on.

But when I run timelens from the given TimeLens.ipynb file, I get the error IndexError: list index out of range, as shown below:

[error screenshot]

Perhaps I have made a mistake while building the event frame directory and giving it as input. Can you please help me figure out how to prepare the event frame and image frame directories, or what modifications are needed, to fix this error?

Different results between paper and code test

Hey,
It's a nice piece of work. I have also conducted some comparison experiments in video frame interpolation, but I have several questions about some details.
I've been trying to run this code and to output results in TimeLens.
However, in terms of PSNR and SSIM, the metric values from the TimeLens code are different from those in your paper 'Time Lens'. I don't know what's wrong.

Some details are provided as follows:
Dataset: HSERGB, BS-ERGB
Test code: uzh-rpg/rpg_timelens
Evaluation code: rpg_event_based_frame_interpolation_evaluation

As in your paper 'Time Lens', I report PSNR and SSIM for all sequences by skipping 1, 5, and 7 frames respectively, and reconstructing the missing frames.

My results (mean ± std):

  1. Skip 5 frames:

     PSNR/SSIM         HSERGB (far)               HSERGB (close)
     code (TimeLens)   31.33±2.55 / 0.883±0.069   31.81±4.20 / 0.822±0.108
     paper (TimeLens)  33.13±2.10 / 0.877±0.092   32.19±4.19 / 0.839±0.090

  2. Skip 7 frames:

     PSNR/SSIM         HSERGB (far)               HSERGB (close)
     code (TimeLens)   30.05±2.24 / 0.864±0.065   31.54±6.05 / 0.844±0.120
     paper (TimeLens)  32.31±2.27 / 0.869±0.110   31.68±4.18 / 0.835±0.091

  3. Skip 1 frame:

     PSNR/SSIM         BS-ERGB
     code (TimeLens)   24.03±4.30 / 0.741±0.153
     paper (TimeLens)  28.56 / -

In addition, I noticed and resolved this issue for the HSERGB dataset, as you mentioned before:
"""
In events_aligned each event file with index n contains events between images with index n-1 and n, i.e. event file 000001.npz contains events between images 000000.png and 000001.png.
"""
So, I deleted the event file 000000.npz of each sequence to make sure that all steps are correct for the HSERGB dataset. The BS-ERGB dataset is unchanged. Nothing else was changed. If I have made any mistakes, please correct me.
What confuses me is that the results obtained using the code differ from the results in the paper.
Looking forward to your reply.

'torch.jit' has no attribute 'unused'

I followed all the steps written in the README. But I get the following error when I try to run the code on the example sequence: AttributeError: 'torch.jit' has no attribute 'unused'

Issue

Returns an mp4 video that does not play.

question about training data

The paper mentions "We finetune our network on real data module-by-module in the order of training" in Section 3, so could the real data used for fine-tuning be released so that we can train the model and reproduce the results?

suggested train/val/test split

Sorry if this is mentioned in the paper,

Is there a suggested train/val/test split between the "far" and "close" sections for training on this dataset?

Help With Evaluation

Sorry for the beginner question.

I've been trying to evaluate this code. It works perfectly with the provided video and events, but I am facing issues with my own videos.

  1. I extracted the video to frames.
  2. I created a timestamp.txt with the correct time stamps.
  3. I converted the pngs to npz using a script I found online.

But the evaluation still fails to go through. It actually runs without errors, but it does not generate any new content. Am I missing something? Or is there a particular way to generate the events?

ValueError: zero-dimensional arrays cannot be concatenated

(timelens) C:\Users\kurs\Documents\rpg_timelens>python -m timelens.run_timelens checkpoint.bin C:\Users\kurs\Desktop\rpg_events C:\Users\kurs\Desktop\rpg_upsampled_1 C:\Users\kurs\Desktop\output
Processing .
100%|█████████████████████████████████████████████████████████████████████████████| 1392/1392 [00:02<00:00, 492.17it/s]
Traceback (most recent call last):
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\run_timelens.py", line 176, in <module>
    main()
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\run_timelens.py", line 170, in main
    number_of_frames_to_insert,
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\run_timelens.py", line 120, in run_recursively
    leaf_event_folder, leaf_image_folder, "*.npz", "*.png"
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\common\hybrid_storage.py", line 81, in from_folders
    event_file_template=event_file_template
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\common\event.py", line 424, in from_folder
    return cls.from_npz_files(filenames, image_height, image_width)
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\common\event.py", line 440, in from_npz_files
    features = np.concatenate(features_list)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: zero-dimensional arrays cannot be concatenated 

How can I fix this error?

pytorch version

Thank you for sharing this project. I'd like to know which PyTorch version you used.

random access logic

Hi, thanks for this great dataset!
I would like to write a random-access function __getitem__ that, given an index i and a number of skipped frames s, returns the data needed for training.
Can I assume that:

  • the input is image_png[i], image_png[i+s], event_npz[i], timestamps[i], timestamps[i+s]
  • the output is, for a chosen intermediate time tau in [i, i+s]: image_png[tau]?

When I try this, I sometimes get events that begin before timestamps[i].

IndexError: list index out of range

Hey,

I'm having an issue using python -m timelens.run_timelens. The error I get is:

Traceback (most recent call last):
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\run_timelens.py", line 176, in <module>
    main()
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\run_timelens.py", line 170, in main
    number_of_frames_to_insert,
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\run_timelens.py", line 120, in run_recursively
    leaf_event_folder, leaf_image_folder, "*.npz", "*.png"
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\common\hybrid_storage.py", line 81, in from_folders
    event_file_template=event_file_template
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\common\event.py", line 424, in from_folder
    return cls.from_npz_files(filenames, image_height, image_width)
  File "C:\Users\kurs\Documents\rpg_timelens\timelens\common\event.py", line 442, in from_npz_files
    features = load_events(list_of_filenames[0])
IndexError: list index out of range

The upsampling script from vid2e generated 384 .png files from 00000000.png to 00000384.png, while the event generator gave me 383 files named 00000000.npz - 00000383.npz, plus the timestamp.txt file.
Can you please help solve this issue? I tried removing the last .png file so that the number of files matches, but it didn't help.
How am I supposed to add the missing .npz file, as stated in Issue #19?

Events files

You wrote:
A Google Colab notebook is now available here. You can upsample your own video and events from your Google Drive.
How do I get the event files and timestamp.txt for my own video?

Request to add a license file

Great project, thanks!

If possible, please add an open-source license file to the project to indicate your intended terms of use, e.g., MIT, Berkeley, Apache 2, etc.
