dachunkai / EvTexture
[ICML 2024] EvTexture: Event-driven Texture Enhancement for Video Super-Resolution
Home Page: https://dachunkai.github.io/evtexture.github.io/
License: Apache License 2.0
Requirement already satisfied: setuptools in c:\users\miniconda3\envs\evtexture\lib\site-packages (from -r requirements.txt (line 17)) (65.6.3)
ERROR: Ignored the following versions that require a different python version: 0.20.0 Requires-Python >=3.8; 0.21.0 Requires-Python >=3.8; 0.22.0 Requires-Python >=3.9; ... (long list of per-version Requires-Python constraints trimmed)
ERROR: Could not find a version that satisfies the requirement tb-nightly (from versions: none)
ERROR: No matching distribution found for tb-nightly
Thank you for sharing the data preparation details. I have created events for an image dataset, and the events.h5 file looks as follows:
events/ps/ : (7013772,) bool
events/ts/ : (7013772,) float64
events/xs/ : (7013772,) int16
events/ys/ : (7013772,) int16
Could you please share the code snippet that can convert these events into voxels in the format specified below?
voxels_b/000000/ : (5, 180, 320) float64
voxels_b/000001/ : (5, 180, 320) float64
voxels_b/000002/ : (5, 180, 320) float64
voxels_b/000003/ : (5, 180, 320) float64
voxels_b/000004/ : (5, 180, 320) float64
I tried using events_contrast_maximization but ended up generating voxels in a different, incorrect format. My dataset contains 26 images, but the voxel file only has 5 entries, and the tensor shape is (H, W) instead of (bins, H, W).
voxels_f/000000/ : (720, 1280) float64
voxels_f/000001/ : (720, 1280) float64
voxels_f/000002/ : (720, 1280) float64
voxels_f/000003/ : (720, 1280) float64
voxels_f/000004/ : (720, 1280) float64
The generated voxel file does not include all 26 images. Please assist. Thank you.
import os
import random

import esim_py
import h5py
import numpy as np
import torch

from utils.event_utils import events_to_voxel_torch

image_folder = os.path.join(os.path.dirname(__file__), "data/frames/")
timestamps_file = os.path.join(os.path.dirname(__file__), "data/timestamps.txt")

config = {
    'refractory_period': 1e-4,
    'CT_range': [0.05, 0.5],
    'max_CT': 0.5,
    'min_CT': 0.02,
    'mu': 1,
    'sigma': 0.1,
    'H': 720,
    'W': 1280,
    'log_eps': 1e-3,
    'use_log': True,
}

# Sample positive/negative contrast thresholds, then clamp to [min_CT, max_CT]
Cp = random.uniform(config['CT_range'][0], config['CT_range'][1])
Cn = random.gauss(config['mu'], config['sigma']) * Cp
Cp = min(max(Cp, config['min_CT']), config['max_CT'])
Cn = min(max(Cn, config['min_CT']), config['max_CT'])

esim = esim_py.EventSimulator(Cp, Cn, config['refractory_period'],
                              config['log_eps'], config['use_log'])
events = esim.generateFromFolder(image_folder, timestamps_file)  # Generate events with shape [N, 4]

xs = torch.tensor(events[:, 0], dtype=torch.int16)
ys = torch.tensor(events[:, 1], dtype=torch.int16)
ts = torch.tensor(events[:, 2], dtype=torch.float32)  # Use float32 for consistency
ps = torch.tensor(events[:, 3], dtype=torch.bool)

voxel_f = events_to_voxel_torch(xs, ys, ts, ps, 5, device=None,
                                sensor_size=(720, 1280), temporal_bilinear=True)

output_h5_file = "voxel_fx.h5"  # Replace with desired output path
B = 5     # Number of bins in voxel grids
n = 1000  # Number of events per voxel (not sure what to set)
temporal_bilinear = True

with h5py.File(output_h5_file, 'w') as output_file:
    # Save each voxel grid to the HDF5 file in the specified format
    for i, voxel in enumerate(voxel_f):
        print(list(voxel.shape))
        voxel_name = f'voxels_f/{i:06d}'
        output_file.create_dataset(voxel_name, data=voxel.numpy().astype(np.float64))  # Convert to float64
    # Add metadata as attributes
    output_file.attrs['num_voxels'] = len(voxel_f)
    output_file.attrs['B'] = B
    output_file.attrs['n'] = n

print(f"Voxel data saved to {output_h5_file}")
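Note that calling events_to_voxel_torch once on the whole stream returns a single (B, H, W) grid, so iterating over it yields the B individual (H, W) bins, which matches the wrong (720, 1280) x 5 output shown earlier. To get one (B, H, W) grid per frame interval, slice the event stream at the frame timestamps and voxelize each slice separately (26 images give 25 intervals). A self-contained numpy sketch of that idea, using simple temporal-bilinear weighting as a stand-in for the library call, not the authors' exact preprocessing:

```python
import numpy as np

def events_to_voxel(xs, ys, ts, ps, B, H, W):
    """Accumulate one (B, H, W) voxel grid from an event slice, spreading
    each event over the two nearest temporal bins (bilinear in time).
    Polarities ps are expected in {-1, +1}."""
    voxel = np.zeros((B, H, W), dtype=np.float64)
    if len(ts) == 0:
        return voxel
    # Map timestamps of this slice onto the bin axis [0, B-1]
    t = (ts - ts[0]) / max(ts[-1] - ts[0], 1e-9) * (B - 1)
    t0 = np.floor(t).astype(int)
    w1 = t - t0  # weight for the upper of the two adjacent bins
    for b0, w, x, y, p in zip(t0, w1, xs, ys, ps):
        voxel[b0, y, x] += p * (1.0 - w)
        if b0 + 1 < B:
            voxel[b0 + 1, y, x] += p * w
    return voxel

def voxels_per_frame(xs, ys, ts, ps, frame_ts, B, H, W):
    """One (B, H, W) grid per consecutive frame pair: slice the stream
    at the frame timestamps, then voxelize each slice."""
    idx = np.searchsorted(ts, frame_ts)
    return [events_to_voxel(xs[a:b], ys[a:b], ts[a:b], ps[a:b], B, H, W)
            for a, b in zip(idx[:-1], idx[1:])]
```

Reversing each slice (and negating polarity) before voxelizing would give the backward `voxels_b` grids under the same assumption.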
Description
The model code assumes that event data is included as part of the inputs. However, it's unclear how this was managed for the demonstrations, especially since test datasets like Vid4 and REDS appear to have been captured with traditional frame-based cameras.
Could you please provide details on the following:
Generation of Event Data: How was event data generated or simulated for the demonstration? Was there a specific method or tool used to convert traditional frame-based footage into event-based data?
Implementation Details: Any specific scripts or code examples used to achieve this conversion would be highly appreciated. Understanding the methodology would help in replicating the demo setup accurately.
Test Data Adaptation: If the event data was simulated, what adjustments or preprocessing steps were necessary to align this data with the model's requirements?
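For context, frame-to-event simulators such as ESIM (the esim_py package used in the snippet earlier in this thread) emit an event whenever the log intensity at a pixel changes by more than a contrast threshold. A heavily simplified, frame-pair-only numpy illustration of that principle follows; the actual simulator also interpolates in time and tracks per-pixel reference levels, so this is only a sketch:

```python
import numpy as np

def frame_pair_events(prev, curr, c_pos=0.2, c_neg=0.2, eps=1e-3):
    """Emit at most one event per pixel whose log-intensity change between
    two frames exceeds the contrast threshold. prev/curr: float arrays in
    [0, 1] with shape (H, W). Returns (ys, xs, ps); ps is +1/-1 polarity."""
    diff = np.log(curr + eps) - np.log(prev + eps)
    fired = (diff >= c_pos) | (diff <= -c_neg)
    ys, xs = np.nonzero(fired)
    ps = np.where(diff[ys, xs] > 0, 1, -1).astype(np.int8)
    return ys, xs, ps
```

High frame rates (often obtained by first interpolating the source video) keep per-pair intensity changes small, which is why simulation pipelines usually interpolate before generating events.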
Hi, how does this compare to video2x? And what are the VRAM/GPU requirements for running the models? Thanks!
Thanks for your wonderful work!
I have some questions about the event data:
I appreciate your guidance on these queries as I am looking to better understand the processes involved in working with event data.
Thanks again!
PS: Is there any WeChat group or Discord channel where I can discuss these topics further?
I am trying to get this working under Windows without needing Docker. That should be possible, right? Docker just complicates the setup; it should just be: create an environment, pip install the requirements, download the models, and go.
At this point I have a batch file that does all the setup and downloads the required models onto a Windows PC. Save it as install.bat in an empty directory and run it. For the downloads you also need wget.exe, 7z.exe and 7z.dll in the same directory.
@echo off
cd
echo *** Deleting EvTexture directory if it exists
if exist EvTexture\. rd /S /Q EvTexture
echo *** Cloning EvTexture repository
git clone https://github.com/DachunKai/EvTexture
cd EvTexture
echo *** Creating venv
python -m venv venv
call venv\scripts\activate.bat
echo *** Installing EvTexture requirements
python -m pip install --upgrade pip
pip install -r requirements.txt
echo *** Patching xformers
pip uninstall -y xformers
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts xformers==0.0.25 --index-url https://download.pytorch.org/whl/cu118
echo *** Installing GPU torch
pip uninstall -y torch
pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==2.3.1+cu118 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
echo *** python setup.py develop
python setup.py develop
echo *** Downloading models
md experiments
md experiments\pretrained_models
md experiments\pretrained_models\EvTexture
..\wget "https://github.com/DachunKai/EvTexture/releases/download/v0.0/EvTexture_REDS_BIx4.pth" -O "experiments\pretrained_models\EvTexture\EvTexture_REDS_BIx4.pth" -nc
..\wget "https://github.com/DachunKai/EvTexture/releases/download/v0.0/EvTexture_Vimeo90K_BIx4.pth" -O "experiments\pretrained_models\EvTexture\EvTexture_Vimeo90K_BIx4.pth" -nc
echo *** Downloading test sets
if not exist datasets md datasets
..\wget "https://github.com/DachunKai/EvTexture/releases/download/v0.0/test_sets_REDS4_h5.zip" -O "datasets\test_sets_REDS4_h5.zip" -nc
..\wget "https://github.com/DachunKai/EvTexture/releases/download/v0.0/test_sets_Vid4_h5.zip" -O "datasets\test_sets_Vid4_h5.zip" -nc
echo *** Extracting test sets
cd datasets
..\..\7z x test_sets_REDS4_h5.zip
del test_sets_REDS4_h5.zip
..\..\7z x test_sets_Vid4_h5.zip
del test_sets_Vid4_h5.zip
cd..
call venv\scripts\deactivate.bat
cd..
echo *** Finished EvTexture install
echo.
echo *** Scroll up and check for errors. Do not assume it worked.
pause
Then to run the examples, once the venv is activated I run either
python basicsr\test.py -opt options/test/EvTexture/test_EvTexture_Vid4_BIx4.yml --launcher pytorch
or
python basicsr\test.py -opt options/test/EvTexture/test_EvTexture_REDS4_BIx4.yml --launcher pytorch
At that point I get the following error.
File "D:\Tests\EvTexture\EvTexture\basicsr\test.py", line 48, in <module>
test_pipeline(root_path)
File "D:\Tests\EvTexture\EvTexture\basicsr\test.py", line 16, in test_pipeline
opt, _ = parse_options(root_path, is_train=False)
File "d:\tests\evtexture\evtexture\basicsr\utils\options.py", line 122, in parse_options
init_dist(args.launcher)
File "d:\tests\evtexture\evtexture\basicsr\utils\dist_util.py", line 14, in init_dist
_init_dist_pytorch(backend, **kwargs)
File "d:\tests\evtexture\evtexture\basicsr\utils\dist_util.py", line 22, in _init_dist_pytorch
rank = int(os.environ['RANK'])
File "D:\Python\lib\os.py", line 680, in __getitem__
raise KeyError(key) from None
KeyError: 'RANK'
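For reference, the KeyError: 'RANK' arises because `--launcher pytorch` makes basicsr read the torch.distributed environment variables that torchrun would normally set. Two workarounds to try: run without the flag (the default launcher is `none`), or set single-process stand-ins for those variables before the script runs. A hedged sketch of the latter, assuming the variables are only read from the environment:

```python
import os

# Single-process stand-ins for the variables torchrun would normally set.
# (Hypothetical workaround; basicsr's _init_dist_pytorch reads RANK from
# os.environ before initializing torch.distributed.)
os.environ.setdefault("RANK", "0")
os.environ.setdefault("WORLD_SIZE", "1")
os.environ.setdefault("LOCAL_RANK", "0")
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
```

On Windows the same effect can be had with `set RANK=0` etc. in the shell before invoking test.py.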
Is it possible to get a simpler example with a script like
python upscale_video.py --source myvideo.mp4 --output upscaled.mp4 --other --options --here
so it is easier to get working without needing Docker?
Thanks for any help.
Before spending my time getting this installed and running, can you tell me: what resolutions are supported?
For example, can we upscale a 512x512 video to 1024x1024 or 2048x2048?
Thank you
With all due respect to the author, dataset preparation is overly complicated, even for someone like myself who has been using computers for decades. Is it really that complex? Could someone please create a detailed tutorial? I would really love to try this seemingly phenomenal open-source project, but not everything is explained properly.
Thank you to everyone who can help!
When running basicsr/test.py for EvTexture on a macOS system without GPU support, the script fails due to CUDA-related operations despite setting use_gpu: False.
Error Message:
AssertionError: Torch not compiled with CUDA enabled
Details:
Steps to Reproduce:
Configure basicsr/test.py to use CPU (use_gpu: False).
Execute the script on macOS.
Expected Behavior: The script should run without attempting to initialize CUDA.
Actual Behavior: The script fails with a CUDA-related error.
Potential Solution: Modify the dist_validation method to ensure all operations are explicitly set to run on the CPU.
Code Snippet Causing Issue:
self.metric_results[folder] = torch.zeros(size, dtype=torch.float32, device=torch.device('cpu'))
Request for Assistance
Please provide guidance on ensuring the dist_validation method and related functions do not attempt to use CUDA when use_gpu: False.
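Pending an upstream fix, one common pattern is to centralize device selection so that nothing touches CUDA unless it is both requested and available. A hypothetical helper (`pick_device` is an illustrative name, not part of basicsr):

```python
def pick_device(use_gpu: bool) -> str:
    """Return 'cuda' only when the user asked for it AND torch reports CUDA
    support; otherwise fall back to 'cpu'. Importing torch lazily also keeps
    this safe on machines without torch installed."""
    try:
        import torch
        cuda_ok = torch.cuda.is_available()
    except ImportError:
        cuda_ok = False
    return "cuda" if (use_gpu and cuda_ok) else "cpu"
```

Replacing hard-coded `torch.device('cuda')` calls (and `.cuda()` tensor moves) in dist_validation with `torch.device(pick_device(opt['use_gpu']))` would make the `use_gpu: False` path honest on CPU-only macOS builds.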
Hi, first off, congratulations on your work. The use of event data for this is fascinating.
I'm trying to reproduce your work using simulated data and I have a couple of questions:
What FPS do you recommend for frame interpolation? As you use B=5, would it be safe to assume 5 extra frames between every 2 frames?
Am I understanding this correct? for calendar.h5, data looks like this:
dict_keys(['images/000000' ... 'images/000040', 'voxels_b/000000' ... 'voxels_b/000039', 'voxels_f/000000' ... 'voxels_f/000039'])  # 41 images, 40 backward voxel grids, 40 forward voxel grids
where the shape of each individual voxel tensor is [B, H, W].
So in order to replicate this, should I use events_to_voxel_torch on the event data of each real frame individually? (By "real frame" I mean the original frame plus all the interpolated frames between t and t+1.)
Thank you
Dear team,
Great super-resolution work! It would be awesome if the model were hosted on Hugging Face; it would increase the impact of your work too.
There are several advantages of hosting the models on HF compared to Google Drive/Baidu disk.
We also provide free A100 GPU for setting up demos via ZeroGPU: https://www.theverge.com/2024/5/16/24156755/hugging-face-celement-delangue-free-shared-gpus-ai
Let me know what you think; happy to invite you to our Slack channel for further discussion / assistance!
Thanks,
Tiezhen
Can you please share the inference script that takes the h5 data generated for a low-resolution video (180 x 320) as input and generates the SR output video (720 x 1280)?
Hope you guys set up a Colab inference notebook where users can upload an .mp4 and check the results.