
cain's Introduction

cain's People

Contributors

myungsub


cain's Issues

Question about Figure 5 (CA visualization) in the paper

Hi, I'm interested in Figure 5, "Visualization of internal feature maps with their channel attentions", in your paper.

Could you please tell me more details about how you calculate the activation maps for the channels? Or publish the relevant code for the activation-map calculation.

Thanks

Add License

This is a great interpolation method.
Can you please add a License to the project?

Question about InOutPaddings

Hi, thanks for your nice work!

I have a question about def InOutPaddings(x): why should the width and height be padded to a multiple of 128?
I changed this number to 32, and the PSNR on Vimeo90K improves to 34.76 dB (since 256 and 448 are already multiples of 32, no padding is actually done):

import torch.nn as nn

def InOutPaddings(x):
    w, h = x.size(3), x.size(2)
    padding_width, padding_height = 0, 0
    # (w >> 5) << 5 floors w to a multiple of 32; if not already a multiple,
    # pad up to the next one
    if w != ((w >> 5) << 5):
        padding_width = (((w >> 5) + 1) << 5) - w
    if h != ((h >> 5) << 5):
        padding_height = (((h >> 5) + 1) << 5) - h
    # reflection-pad the input roughly symmetrically...
    paddingInput = nn.ReflectionPad2d(padding=[padding_width // 2, padding_width - padding_width // 2,
                                               padding_height // 2, padding_height - padding_height // 2])
    # ...and crop the output back with the matching negative padding
    paddingOutput = nn.ReflectionPad2d(padding=[0 - padding_width // 2, padding_width // 2 - padding_width,
                                                0 - padding_height // 2, padding_height // 2 - padding_height])
    return paddingInput, paddingOutput
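The bit-shift form in the snippet rounds a size up to the next multiple of 32. For intuition, the same arithmetic in plain form (the helper name is mine, not the repo's):

```python
def pad_amount(size, multiple=32):
    # padding needed to round `size` up to a multiple of `multiple`;
    # equivalent to the `(((w >> 5) + 1) << 5) - w` bit-shift form for 32
    rem = size % multiple
    return 0 if rem == 0 else multiple - rem

# Vimeo90K frames are 448x256, already multiples of 32 -> no padding at all,
# whereas padding to a multiple of 128 adds a border to every frame
assert pad_amount(448) == 0 and pad_amount(256) == 0
assert pad_amount(448, multiple=128) == 64   # 448 -> 512
assert pad_amount(720) == 16                 # e.g. 720p height -> 736
```

This is presumably why the smaller multiple helps on Vimeo90K: the network never sees reflected borders at test time.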

Interpolation error

When I tried to test the program with test_custom.sh, I got this error:

Namespace(batch_size=64, beta1=0.9, beta2=0.99, cuda=True, data_dir='data', data_root='data/frame_seq', dataset='custom', depth=3, exp_name='CAIN_fin', fix_loaded=False, img_fmt='jpg', log_dir='logs', log_iter=20, loss='1*L1', lpips=False, lr=0.0001, max_epoch=200, mode='test', model='cain', n_resblocks=12, num_frames=3, num_gpu=1, num_workers=5, random_seed=12345, resume=True, resume_exp=None, start_epoch=0, test_batch_size=8, test_mode='hard', up_mode='shuffle', use_tensorboard=False, val_batch_size=4, viz=False)
Building model: CAIN

# of parameters: 42780432

Evaluating for epoch = 175
[0] images ready to be loaded
0it [00:00, ?it/s]
Traceback (most recent call last):
File "generate.py", line 124, in
main(args)
File "generate.py", line 120, in main
test(args, args.start_epoch)
File "generate.py", line 108, in test
print('im_processed: {:d}/{:d} {:.3f}s \r'.format(i + 1, len(test_loader), time.time() - t))
UnboundLocalError: local variable 'i' referenced before assignment

I have all the packages installed with their respective versions.
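For what it's worth, the `0it [00:00, ?it/s]` line suggests the loader found no image pairs (worth checking `data_root` and `img_fmt` against your files), so the loop body in `test()` never runs and `i` is never bound when the final print executes. A minimal reproduction and a defensive fix (hypothetical patch, not the repo's code):

```python
def count_batches(loader):
    # reproduces the bug: with an empty loader the body never executes,
    # so `i` is unbound at the statement after the loop
    for i, batch in enumerate(loader):
        pass
    return i + 1  # UnboundLocalError if loader was empty

def count_batches_fixed(loader):
    i = -1  # bind before the loop so an empty loader yields 0
    for i, batch in enumerate(loader):
        pass
    return i + 1
```

The real fix in this case is making sure the custom frames are actually discovered, but guarding the loop variable would at least turn the crash into a readable "0 images processed".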

question about Middlebury dataset

Hi, how did you get the results on the Middlebury dataset? I reproduced the results of your pretrained model on the Vimeo90K and UCF101 datasets, but got a poor result on Middlebury. Could you please provide some details?

About Pixel Shuffle

It is very interesting that you use Pixel Shuffle and Channel Attention for motion estimation without estimating optical flow.

In the paper you say that Pixel Shuffle is used to maintain a large receptive field, so I want to ask how PS can do that.

One more question: in VFI, I usually see people reuse the input images to reconstruct the color of the middle frame. So how can you synthesize the middle frame just by applying Up Shuffle?

Thank you.
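Not the authors speaking, but my reading of the PS trick, with a check that the shuffle pair is lossless: pixel-unshuffle halves H and W while quadrupling channels without discarding anything, so each 3x3 convolution applied afterwards spans twice the area of the original frame (hence the larger effective receptive field), and the final up-shuffle merely rearranges predicted channels back into pixel positions:

```python
import torch
import torch.nn.functional as F

x = torch.arange(2 * 3 * 8 * 8, dtype=torch.float32).reshape(2, 3, 8, 8)

# Down-shuffle: halve H and W, quadruple C -- no information is discarded
down = F.pixel_unshuffle(x, downscale_factor=2)   # shape (2, 12, 4, 4)
assert down.shape == (2, 12, 4, 4)

# Up-shuffle inverts it exactly: both ops are pure rearrangements
up = F.pixel_shuffle(down, upscale_factor=2)
assert torch.equal(up, x)
```

On the second question, as I understand the design: the middle frame is synthesized in the shuffled feature space; the up-shuffle itself adds no new information, it only reorders the predicted channels into spatial positions.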

Feature map attention score

Hi, thanks for your work, it is pretty interesting. I have a question about the feature maps in Figure 5. I know that to obtain the feature maps you'd use hooks, but how do you manage to obtain the attention scores for those feature maps?
Thanks again
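I'm not the author, but one way to get the attention scores without modifying the model is a forward hook on the attention block that re-applies its squeeze-excite branch to the hooked input. A toy sketch (the module and names are hypothetical stand-ins, not CAIN's actual classes):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # toy squeeze-and-excitation style block, a stand-in for the real CA module
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x)                       # per-channel scores in [0, 1]
        return x * w.unsqueeze(-1).unsqueeze(-1)

captured = {}

def hook(module, inputs, output):
    # recompute the per-channel scores from the hooked input feature map
    captured['scores'] = module.fc(inputs[0]).detach()
    captured['features'] = inputs[0].detach()   # the maps a figure would show

ca = ChannelAttention(8)
ca.register_forward_hook(hook)
_ = ca(torch.randn(1, 8, 16, 16))
```

With the real model you would register the hook on the actual CA submodule and pair each captured feature map with its score.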

Why is the picture still blurry after training for 20 epochs?

I want to replicate your algorithm on the Vimeo90K dataset, but after training for 20 epochs the PSNR is only about 15, almost the same as after the first epoch. I see that the maximum epoch setting in the source code is 200, and I have not modified the rest of the parameters. Can I expect results similar to yours after training for 200 epochs? I would appreciate it if you could answer.

how to interpolate a frame at an arbitrary time

Hey, buddy, I like your model so much after trying some video samples.
This is a state-of-the-art model and amazing work! You are a genius, buddy.
Recently, I was wondering how to interpolate a frame at an arbitrary time like t=0.2. Unlike optical-flow methods, kernel-based methods can only interpolate a frame at t=0.5 (and recursively at t=0.25, t=0.75, ...).
Do you think it is possible to feed a temporal variable t into the model and train it? I am looking forward to your answer.
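For reference, a midpoint-only model can reach exactly the dyadic times t = k/2^d by recursive bisection, and never t = 0.2. A tiny sketch (helper name is mine) of which times recursion can hit:

```python
def reachable_times(depth):
    # times reachable by repeatedly interpolating midpoints between
    # already-available frames, starting from t=0 and t=1
    times = [0.0, 1.0]
    for _ in range(depth):
        mids = [(times[i] + times[i + 1]) / 2 for i in range(len(times) - 1)]
        times = sorted(set(times + mids))
    return times

assert 0.25 in reachable_times(2) and 0.75 in reachable_times(2)
assert 0.2 not in reachable_times(5)   # non-dyadic t is never reached exactly
```

So supporting arbitrary t would indeed require conditioning the model on t (as flow-based methods do by scaling the flow), not just deeper recursion.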

how to convert the model to onnx?

When I want to convert the model to ONNX, I can't find the input and output names of the model needed by torch.onnx.export(). Can you help me? How do I find the input and output names? Thanks!

Runtime error on test_custom.sh

I downloaded the model, uploaded 2 images to data/frame_seq (I use Colab), and ran test_custom.sh, but got the following error. What am I doing wrong? Could you help?
RuntimeError: Given groups=1, weight of size [192, 384, 3, 3], expected input[1, 128, 144, 240] to have 384 channels, but got 128 channels instead

SNU-FILM Datasets download?

The download from the provided data link does not succeed: the compressed package is damaged, and after repair the dataset is incomplete. Can you provide a new download link? Thanks

loss exploded

Hey, buddy, this is amazing work! I was training the model on the Vimeo90k dataset by running ./run.sh.
The loss gradually declined as the epochs went by, but after about 10 epochs it suddenly exploded without warning.
It printed like this:
[image]

triggered by this check in the training code:

if loss.data.item() > 10.0 * LOSS_0:
    print(max(p.grad.data.abs().max() for p in model.parameters()))
    continue

And the generated test image:
[image]

Why did the loss explode suddenly, and how can I avoid it?

Thx!
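One common mitigation for this kind of spike (a generic recipe, not necessarily what the CAIN training script does) is clipping the global gradient norm before every optimizer step, so a single bad batch cannot blow up the weights:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

x, y = torch.randn(8, 4), torch.randn(8, 4)
loss = nn.functional.l1_loss(model(x), y) * 1e6   # simulate a spiking loss
opt.zero_grad()
loss.backward()

# rescale all gradients so their global L2 norm is at most 1.0
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
post = torch.sqrt(sum(p.grad.pow(2).sum() for p in model.parameters()))
opt.step()
```

Lowering the learning rate or resuming from a checkpoint saved before the spike are the other usual remedies.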

About exploding gradient

Hello, the code runs into an exploding-gradient problem when trained to 10 epochs on the Vimeo90k dataset. If I restart training from model_best.pth, it results in CUDA out of memory.
Q1. I have set both batch_size and test_batch_size to 8. Will this affect the final result?
Q2. Is there any other solution?
I would be grateful if you could reply!

Training issue on Vimeo Dataset

Hi, thanks for sharing the code. I tried to train on the Vimeo dataset from scratch using just your script, but it seems stuck at a PSNR around 15 and SSIM around 0.5. Is there any special recipe I am missing in the training process?

I don't know where the output is generated.

Hi, @myungsub

I tried frame interpolation with your code. It runs fine using the GPU, but I don't know where the output is generated. Is the result a single video file, or is it an image sequence? I'm curious.

The command I used is test_custom.sh.

Best of luck...
@bemoregt.

Evaluation on UCF-101

Hi,

Can you explain how you evaluated on the available test split of UCF-101? Which frames are the inputs and outputs?
