minghchen / carl_code
PyTorch code for "Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning", CVPR 2022.
License: MIT License
Congratulations, you've done a fantastic job. I tried to reproduce your work, but found some problems.
First, I noticed that the parameter DATA.SAMPLING_REGION in *_config.yml corresponds to the α variable in your paper, and that block_size in the dataset class's sample_frames function denotes the sampling window size for datasets like Pouring. Should it be the product of expand_ratio and num_frames rather than expand_ratio and seq_len, like:
```python
def sample_frames(self, seq_len, num_frames, pre_steps=None):
    ...
    elif sampling_strategy == 'time_augment':
        num_valid = min(seq_len, num_frames)
        expand_ratio = np.random.uniform(low=1.0, high=self.cfg.DATA.SAMPLING_REGION) if self.cfg.DATA.SAMPLING_REGION > 1 else 1.0
        # block_size = math.ceil(expand_ratio*seq_len)
        block_size = math.ceil(expand_ratio*num_frames)
```
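To illustrate the difference numerically, here is a small sketch with made-up numbers (seq_len, num_frames, and the α value are assumptions for illustration, not taken from the repo's configs):

```python
import math
import numpy as np

# Hypothetical numbers: the full video has seq_len frames and we
# sample num_frames of them.
seq_len, num_frames = 200, 80
sampling_region = 1.5  # DATA.SAMPLING_REGION (alpha), assumed > 1

rng = np.random.default_rng(0)
expand_ratio = rng.uniform(low=1.0, high=sampling_region)

# As currently written: the block grows with the full video length.
block_from_seq_len = math.ceil(expand_ratio * seq_len)

# Suggested fix: the block grows with the number of sampled frames.
block_from_num_frames = math.ceil(expand_ratio * num_frames)

print(block_from_seq_len, block_from_num_frames)
```

With seq_len much larger than num_frames, the two formulas give very different window sizes, which is why the choice matters.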
Second, I ran the training process on the Pouring dataset but only got 90.3% classification accuracy, which is less than the 93.73% reported in your paper. Can you give some advice? I used batch_size = 4 and embedde_model.num_layers = 3; the other parameters are the same as scl_transformer_config in your repo.
Hello, thank you for sharing your work.
I wanted to use the model trained with the CARL method to obtain an embedding for single images (not part of a video).
Would it be fine if I input an image of size (batch_size, T, 3, 224, 224) with T = 1 to the model?
For a video input of size (batch_size, T, 3, 224, 224) with T > 1, I observed a difference between the embedding obtained from inputting the video all at once and from inputting every frame of the video one by one.
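The difference is expected whenever the embedder attends across time: a frame's embedding then depends on the other frames in the clip. Here is a minimal stand-in sketch using a generic transformer encoder (this is an assumption-level illustration, not the repo's actual architecture):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Minimal stand-in for a temporal embedder: self-attention over the
# T (frame) dimension.
encoder_layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
temporal_encoder = nn.TransformerEncoder(encoder_layer, num_layers=1).eval()

frames = torch.randn(1, 4, 32)  # (batch, T, feature)

with torch.no_grad():
    # Embed the whole clip at once (T = 4)...
    joint = temporal_encoder(frames)
    # ...versus embedding the first frame alone (T = 1).
    single = temporal_encoder(frames[:, :1])

# Self-attention mixes information across frames, so the two
# embeddings of frame 0 differ.
print(torch.allclose(joint[:, 0], single[:, 0]))  # expect False
```

So feeding a single image with T = 1 is mechanically valid, but the resulting embedding comes from a context the model may not have seen much during training.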
Hello, if possible, could the authors provide recommended hardware (CPU/GPU counts) and expected runtimes for the jobs described in the "Training" section? I am attempting to run these jobs and data loading seems to be a major bottleneck for my runtime. For example, training one epoch on Penn Action takes about 30 minutes. Is this normal?
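When data loading dominates, the generic PyTorch DataLoader knobs are usually the first thing to try. A sketch with a dummy dataset (the exact config keys this repo exposes for these options may differ):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy dataset standing in for the video dataset.
dataset = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))

loader = DataLoader(
    dataset,
    batch_size=4,
    num_workers=2,           # parallel decoding/augmentation workers
    pin_memory=True,         # faster host-to-GPU copies
    persistent_workers=True, # avoid worker respawn cost each epoch
    # prefetch_factor=4,     # batches prefetched per worker (default 2)
)

n_batches = sum(1 for _ in loader)
print(n_batches)  # 64 / 4 = 16
```

Pre-decoding videos into frame folders (or a faster decoder) often helps more than any of these flags, since per-epoch video decoding is typically the real cost.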
Hi, I've been experimenting with your code and I've noticed that some of the metrics (specifically event_completion and retrieval) sometimes peak in earlier epochs and then fall in later epochs. I am training for 300 epochs and evaluating every 50 epochs, as is the default in the configs.
I just wanted to check: for your results in "Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning", do you report the metrics for the "best" intermediate epoch per metric, or do you always report the results for the final epoch (300)? I searched the paper and could not find an explicit answer to this, so I wanted to ask here. Thanks!
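For comparing the two reporting conventions locally, tracking the best value per metric across the periodic evaluations is straightforward. A sketch with a made-up evaluation history (the epochs and values below are invented for illustration):

```python
# Hypothetical eval results, one entry per evaluation epoch.
history = {
    50:  {"event_completion": 0.88, "retrieval": 0.90},
    100: {"event_completion": 0.91, "retrieval": 0.93},
    200: {"event_completion": 0.90, "retrieval": 0.92},
    300: {"event_completion": 0.89, "retrieval": 0.91},
}

# Best (epoch, value) per metric.
best = {}
for epoch, metrics in sorted(history.items()):
    for name, value in metrics.items():
        if name not in best or value > best[name][1]:
            best[name] = (epoch, value)

print(best)  # each metric peaks at epoch 100 in this toy history
```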
torch.fx was only released in PyTorch versions later than 1.6.0, which is the version this project uses.
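Concretely, torch.fx first shipped with PyTorch 1.8, so importing it under 1.6.0 raises an ImportError. A defensive version guard (a sketch; pinning torch>=1.8 in the requirements is the cleaner fix):

```python
import torch

# Parse "major.minor" from torch.__version__ (e.g. "2.1.0+cu118").
major, minor = (int(x) for x in torch.__version__.split("+")[0].split(".")[:2])

if (major, minor) >= (1, 8):
    import torch.fx  # noqa: F401  # available from PyTorch 1.8 onward
    has_fx = True
else:
    has_fx = False  # fall back to a non-fx code path

print(has_fx)
```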