daveishan / tclr

Official code repo for TCLR: Temporal Contrastive Learning for Video Representation [CVIU-2022]

License: MIT License

Python 100.00%

self-supervised-learning action-recognition learning-with-limited-labeled-data pytorch

tclr's Issues

Some questions from a beginner

Hi, I'm Jimmy. I'm currently learning about video in computer vision. Would it be possible to ask you some questions here? I would really appreciate your help, because I have tried searching for answers online but could not work them out by myself. :(

At this line, I'm quite confused: are we supposed to input video data? Why do we need this? I understand that 5 is the batch size, 3 is the number of input channels, 16 is the number of frames fed to the network, and the width and height are 112. Am I right?
For lines 15 and 17, why do we need to expand the layer? Why not just use the original ResNet-18 as the backbone?
Why do we need to define sparse_clip, dense_clip0, dense_clip1, dense_clip2, dense_clip3, ...? I'm not sure what they are for, and the paper does not define sparse clips or dense clips.
What is the difference between input plane and plane? Does inplane mean channel size? What about plane?
Also, if I want to evaluate this model on a different dataset, are there any major modifications I should be aware of?

Sorry to bother you. Thank you so much.
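
For reference, the shape described in the first question matches the usual layout for 3D-convolutional video networks in PyTorch: (batch, channels, frames, height, width). The sketch below only labels those five dimensions and counts the scalar values in one such batch. The helper name `describe_clip_batch` is hypothetical and is not part of the TCLR code.

```python
from math import prod

# Dimension order used by PyTorch 3D convolutions: (N, C, D, H, W).
DIM_NAMES = ("batch", "channels", "frames", "height", "width")


def describe_clip_batch(shape):
    """Map each entry of a 5-D video batch shape to its meaning (hypothetical helper)."""
    if len(shape) != len(DIM_NAMES):
        raise ValueError(f"expected 5 dimensions, got {len(shape)}")
    return dict(zip(DIM_NAMES, shape))


shape = (5, 3, 16, 112, 112)  # the shape quoted in the question
info = describe_clip_batch(shape)
num_values = prod(shape)  # total number of scalar values in the batch
print(info)
print(num_values)
```

If the line in question builds a random tensor of this shape, it is most likely a smoke test of the forward pass that needs no real video, though that is a guess from the shape alone.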

RuntimeError: stack expects a non-empty TensorList

After configuring the code, I ran it with this command:

python3 complete_retrieval.py --saved_model="../models/model_best_e247_loss_9.7173.pth"

I get the following error:

Clip ../data/UCF-101/ShavingBeard/v_ShavingBeard_g05_c06.avi Failed
Traceback (most recent call last):
  File "complete_retrieval.py", line 253, in <module>
    train_classifier(str(run_id), arch, str(saved_model), modes)
  File "complete_retrieval.py", line 132, in train_classifier
    pred_dict, label_dict = val_epoch(len(modes), run_id, epoch, modes[val_iter], skip[val_iter],
  File "complete_retrieval.py", line 20, in val_epoch
    for i, (inputs, label, vid_path, _) in enumerate(data_loader):
  File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/multi-sy-15/PycharmProjects/TCLR/nn_retrieval/dl_ret.py", line 283, in collate_fn2
    f_clip = torch.stack(f_clip, dim=0)
RuntimeError: stack expects a non-empty TensorList
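
The `Clip ... Failed` line just before the traceback suggests that the video could not be loaded, and `torch.stack` raises this error when it is given an empty list. One plausible cause (an assumption, not confirmed by the log) is that the file is missing or empty under `../data/UCF-101/`. A minimal sketch for checking this before running retrieval follows; `find_unreadable_videos` is a hypothetical helper, not part of the TCLR code, and the demo file names are made up.

```python
import os
import tempfile


def find_unreadable_videos(paths):
    """Return the paths that are missing or zero bytes long.

    This catches only the simplest failures; a file that exists but cannot be
    decoded will still pass this check.
    """
    bad = []
    for path in paths:
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            bad.append(path)
    return bad


# Demonstration on temporary files (hypothetical names, not the real dataset).
with tempfile.TemporaryDirectory() as root:
    good = os.path.join(root, "good.avi")
    empty = os.path.join(root, "empty.avi")
    missing = os.path.join(root, "missing.avi")
    with open(good, "wb") as f:
        f.write(b"\x00" * 16)  # non-empty placeholder file
    open(empty, "wb").close()  # zero-byte file
    bad_paths = find_unreadable_videos([good, empty, missing])
    bad_names = [os.path.basename(p) for p in bad_paths]

print(bad_names)
```

On the real dataset, the list of paths would come from the split files used by the data loader. If every file passes this check, the failure is more likely in decoding or in how clips are sampled.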

When will the code be released

Dear @DAVEISHAN,
Your research is very meaningful and I am very interested in it.
I would sincerely like to know when the code will be released. I am eager to learn from it.
Best wishes to you!

issue

Embedding Size

Hello Sir

Thank you very much for sharing your code.
In your NN retrieval code, you state that the train features have size 9537 x 4096
and the val features have size 3792 x 4096 for UCF-101.

If the r3d encoder outputs 512 x 4, which is 2048 after flattening,
how did you get an embedding of size 4096?

I tried the code and it outputs an embedding of size 2048. Did I miss something here?
What is the embedding size for the NN retrieval experiments reported in the paper?

In addition, I used the provided UCF pre-trained model to run the retrieval experiment on UCF and
got lower accuracies than those in the paper:

torch.Size([9537, 2048]) - train features
torch.Size([3783, 2048]) - val feature

Top-1 correct is 50.44%
Top-5 correct is 69.89%
Top-10 correct is 77.27%
Top-20 correct is 84.46%
50.44, 69.89, 77.27, 84.46

Best Regards,
Hussein
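
For readers comparing numbers: Top-k figures like those above are commonly computed by ranking the train features by similarity to each val feature and counting a hit when any of the k nearest train clips has the same label. The sketch below assumes cosine similarity and that hit criterion. Both are assumptions about the protocol, not taken from `complete_retrieval.py`, and `topk_retrieval_accuracy` is a hypothetical name. The toy features are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def topk_retrieval_accuracy(train_feats, train_labels, val_feats, val_labels, ks=(1, 5)):
    """Percentage of val items whose k nearest train items contain the same label."""
    hits = {k: 0 for k in ks}
    for feat, label in zip(val_feats, val_labels):
        # Rank train indices from most to least similar to this val feature.
        order = sorted(
            range(len(train_feats)),
            key=lambda i: cosine(feat, train_feats[i]),
            reverse=True,
        )
        for k in ks:
            if any(train_labels[i] == label for i in order[:k]):
                hits[k] += 1
    return {k: 100.0 * hits[k] / len(val_feats) for k in ks}


# Toy data: three train clips and two val clips with 2-D features.
train_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
train_labels = ["a", "b", "b"]
val_feats = [[1.0, 0.05], [0.1, 1.0]]
val_labels = ["b", "b"]

acc = topk_retrieval_accuracy(train_feats, train_labels, val_feats, val_labels, ks=(1, 2))
print(acc)
```

Under this protocol the embedding size changes only the similarity values, so a 2048-dimensional and a 4096-dimensional feature can give different Top-k numbers even with the same checkpoint.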
