daveishan / tclr
Official code repo for TCLR: Temporal Contrastive Learning for Video Representation [CVIU-2022]
License: MIT License
Hi, I'm Jimmy. I'm currently learning about the video area of computer vision. Would it be possible to ask you some questions here? I would really appreciate your help, because I have searched for answers online but could not work them out by myself. :(
What is the difference between input plane and plane? Does inplane mean the channel size? What about plane?
Also, if I want to test this model's performance on a different dataset, are there any major modifications I should be aware of?
Sorry to bother you. Thank you so much.
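On the inplane/plane question: assuming these refer to the `inplanes` and `planes` arguments used in torchvision-style ResNet blocks (this thread does not confirm that the TCLR model follows the same convention), `inplanes` is the number of channels entering a block and `planes` is the block's base width, with the block emitting `planes * expansion` channels. The helper below is a hypothetical, framework-free sketch of that bookkeeping, not code from the repo:

```python
def stage_channels(inplanes, stage_planes, expansion=1):
    """Return (input_channels, output_channels) for each stage of a ResNet-style net.

    Assumption: each stage outputs planes * expansion channels, as in
    torchvision-style ResNets (expansion=1 for basic blocks, 4 for bottlenecks).
    """
    shapes = []
    for planes in stage_planes:
        out_channels = planes * expansion
        shapes.append((inplanes, out_channels))
        inplanes = out_channels  # the next stage consumes this stage's output
    return shapes


# Basic-block style network (expansion=1)
print(stage_channels(64, [64, 128, 256, 512]))
# Bottleneck style network (expansion=4)
print(stage_channels(64, [64, 128, 256, 512], expansion=4))
```

Under that convention, both values are channel counts: one describes the block's input and the other sets its output.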
After configuring the code, I ran it with the following command:
python3 complete_retrieval.py --saved_model="../models/model_best_e247_loss_9.7173.pth"
and I get the following error:
Clip ../data/UCF-101/ShavingBeard/v_ShavingBeard_g05_c06.avi Failed
Traceback (most recent call last):
File "complete_retrieval.py", line 253, in <module>
train_classifier(str(run_id), arch, str(saved_model), modes)
File "complete_retrieval.py", line 132, in train_classifier
pred_dict, label_dict = val_epoch(len(modes), run_id, epoch, modes[val_iter], skip[val_iter],
File "complete_retrieval.py", line 20, in val_epoch
for i, (inputs, label, vid_path, _) in enumerate(data_loader):
File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/_utils.py", line 425, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/home/multi-sy-15/PycharmProjects/TCLR/venv_tclr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/home/multi-sy-15/PycharmProjects/TCLR/nn_retrieval/dl_ret.py", line 283, in collate_fn2
f_clip = torch.stack(f_clip, dim=0)
RuntimeError: stack expects a non-empty TensorList
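The "Clip ... Failed" message followed by "stack expects a non-empty TensorList" suggests that every clip in one batch failed to load, so the list handed to `torch.stack` was empty. One possible workaround, sketched below, is a collate function that drops failed samples and returns `None` when nothing is left. The real `collate_fn2` in `dl_ret.py` is not shown in this thread, so the sample layout `(clip, label, path)`, the use of `None` for a failed clip, and the names `filter_failed` and `safe_collate` are all assumptions:

```python
def filter_failed(batch):
    """Keep only samples whose clip loaded (a failed clip is assumed to be None)."""
    return [s for s in batch if s is not None and s[0] is not None]


def safe_collate(batch, stack=list):
    """Collate (clip, label, path) samples, skipping failed clips.

    `stack` combines the surviving clips. With PyTorch you would pass
    something like `lambda clips: torch.stack(clips, dim=0)`; the default
    `list` keeps this sketch free of dependencies.
    """
    kept = filter_failed(batch)
    if not kept:
        return None  # nothing usable in this batch; the caller should skip it
    clips, labels, paths = zip(*kept)
    return stack(clips), list(labels), list(paths)


# Usage: one failed clip and one good clip
batch = [(None, 0, "a.avi"), ([1, 2], 1, "b.avi")]
print(safe_collate(batch))
# Usage: a batch in which every clip failed
print(safe_collate([(None, 0, "a.avi")]))
```

The evaluation loop would then need a guard such as `if batch is None: continue`. It is also worth checking why the clip failed in the first place, for example a wrong dataset path or a video file that cannot be decoded.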
Dear @DAVEISHAN
Your research work is very meaningful and I am very interested in it.
I would sincerely like to know when the code will be released. I'm very eager to learn from it.
Best wishes to you!
Hello Sir
Thank you very much for sharing your code.
In your NN retrieval code, you state that the train features have a size of 9537 x 4096
and the val features have a size of 3792 x 4096 for UCF-101.
If the r3d encoder outputs 512 x 4, which is 2048 after flattening,
how did you get an embedding of size 4096?
I ran the code and it outputs an embedding of size 2048. Did I miss something here?
What is the size of the embedding for the NN-Retrieval experiments reported in the paper?
In addition, I used the provided UCF pre-trained model to run the retrieval experiment on UCF and
got lower accuracies than those reported in the paper:
torch.Size([9537, 2048]) - train features
torch.Size([3783, 2048]) - val feature
Top-1 correct is 50.44%
Top-5 correct is 69.89%
Top-10 correct is 77.27%
Top-20 correct is 84.46%
50.44, 69.89, 77.27, 84.46
Best Regards,
Hussein
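For readers comparing numbers like these, the sketch below shows one common way top-k retrieval accuracy is computed: a val clip counts as correct at k if any of its k nearest train features shares its label. It is a dependency-free illustration only. The use of cosine similarity and the function names are assumptions, and the repo's own script may use a different distance or averaging scheme, which could itself shift the reported numbers:

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topk_accuracy(train_feats, train_labels, val_feats, val_labels, ks=(1, 5, 10, 20)):
    """Percentage of val samples with a same-label train sample among the top k."""
    hits = {k: 0 for k in ks}
    for feat, label in zip(val_feats, val_labels):
        # Rank all train samples from most to least similar to this val sample
        ranked = sorted(
            range(len(train_feats)),
            key=lambda i: cosine(feat, train_feats[i]),
            reverse=True,
        )
        for k in ks:
            if any(train_labels[i] == label for i in ranked[:k]):
                hits[k] += 1
    return {k: 100.0 * hits[k] / len(val_feats) for k in ks}


# Toy usage: the nearest neighbour has the wrong label, the second one is right
train_feats = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
train_labels = [0, 1, 1]
print(topk_accuracy(train_feats, train_labels, [[1.0, 0.05]], [1], ks=(1, 2)))
```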