houzhijian / groundnlq

15.0 15.0 1.0 2.65 MB

The champion solution for Ego4D Natural Language Queries Challenge in CVPR 2023

License: MIT License

Python 97.26% C++ 2.24% Shell 0.50%

egocentric-vision video-language-understanding

groundnlq's Issues

Can I train without the upcoming Ego4D-NLQ data, including files, features, and pretrained weights?

I am running into the following error. After looking into the cause, I suspect that I need the upcoming data.
Please let me know whether it is possible to train without the upcoming data.

Traceback (most recent call last):
  File "train_ft.py", line 240, in <module>
    main(args)
  File "train_ft.py", line 66, in main
    train_dataset = make_dataset(
  File "/home/GroundNLQ/GroundNLQ/libs/datasets/datasets.py", line 19, in make_dataset
    dataset = datasets[name](is_training, split, val_jsonl_file, **kwargs)
  File "/home/GroundNLQ/GroundNLQ/libs/datasets/ego4d_loader.py", line 36, in __init__
    assert os.path.exists(video_feat_dir)
AssertionError
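
The assertion at the end of the traceback only says that video_feat_dir does not exist on disk; it does not say which path was checked. A minimal pre-flight sketch along these lines can print the missing path before training starts. The helper name missing_paths and the example path are hypothetical and not part of GroundNLQ; substitute the values from your own config file.

```python
import os


def missing_paths(paths):
    """Return the names whose configured path does not exist on disk."""
    return [name for name, path in paths.items() if not os.path.exists(path)]


# Hypothetical placeholder: replace with the directory set in your config.
paths = {
    "video_feat_dir": "/path/to/video_features",
}

for name in missing_paths(paths):
    print(f"{name} does not exist: {paths[name]}")
```

Running a check like this before train_ft.py turns the bare AssertionError into a message that names the directory to fix.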

Where is process_train_split.py file?

Where is the process_train_split.py file?

Fail to reproduce the "Training-From-Scratch" performance

Thanks for your impressive work.

I tried to reproduce the "Training-From-Scratch" performance using: bash tools/train_ego4d_twogpu.sh configs/ego4d_nlq_v2_internvideo_1e-4.yaml scratch_2gpu 0,1. The evaluation result is about 14-15 R@1 at IoU=0.3 on the validation set, significantly lower than the reported number.
Could you please give me some advice on reproducing the "Training-From-Scratch" performance?

First, there's a typo in the README. I think the "Training-From-Scratch" command should be bash tools/train_ego4d_twogpu.sh configs/ego4d_nlq_v2_internvideo_1e-4.yaml scratch_2gpu 0,1.

Moreover, trunc_thresh and crop_ratio in ego4d_nlq_v2_internvideo_1e-4.yaml have to be deleted.

Further, the learning_rate in ego4d_nlq_v2_internvideo_1e-4.yaml is 5e-5, which contradicts the config name.

Could you please check these potential mistakes and provide a reproducible config?
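
The learning-rate mismatch described above can be caught mechanically. The following is only a sketch: the helpers lr_from_name and lr_matches_name are hypothetical, and it assumes the learning rate is written in the file name in scientific notation such as 1e-4. It compares that value with the one read from the config.

```python
import math
import re


def lr_from_name(config_name):
    """Extract a learning rate such as '1e-4' from a config file name, or None."""
    match = re.search(r"(\d+(?:\.\d+)?e-?\d+)", config_name)
    return float(match.group(1)) if match else None


def lr_matches_name(config_name, learning_rate):
    """Return True when the value agrees with the name (or the name has no rate)."""
    expected = lr_from_name(config_name)
    if expected is None:
        return True
    return math.isclose(expected, learning_rate, rel_tol=1e-9)


name = "ego4d_nlq_v2_internvideo_1e-4.yaml"
value_in_file = 5e-5  # the value reported in this issue
if not lr_matches_name(name, value_in_file):
    print(f"{name}: name implies {lr_from_name(name)}, file sets {value_in_file}")
```

Which of the two values the reported results were obtained with is not stated in this thread, so the check only flags the inconsistency; it does not decide which one is right.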

RuntimeError: Given groups=1, weight of size [384, 2304, 3], expected input[2, 256, 2560] to have 2304 channels, but got 256 channels instead

Hi,

I followed the instructions to download the video features and convert them to LMDB.
However, when I ran the pretrain script, this RuntimeError occurred:

RuntimeError: Given groups=1, weight of size [384, 2304, 3], expected input[2, 256, 2560] to have 2304 channels, but got 256 channels instead

Could you please help me resolve this problem?
Thank you very much.
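
For reading the error message: a 1D convolution in PyTorch stores its weight as (out_channels, in_channels / groups, kernel_size) and expects input as (batch, channels, length). So the weight [384, 2304, 3] wants 2304 input channels, while the input [2, 256, 2560] carries 256 channels over a length of 2560. The pure-Python sketch below mirrors only that channel check; check_conv1d_input is a hypothetical helper, not a PyTorch or GroundNLQ function.

```python
def check_conv1d_input(weight_shape, input_shape, groups=1):
    """Mirror the channel check of a 1D convolution.

    weight_shape: (out_channels, in_channels // groups, kernel_size)
    input_shape:  (batch, channels, length)
    Returns None when the shapes agree, otherwise a short message.
    """
    _out_channels, in_per_group, _kernel = weight_shape
    _batch, channels, _length = input_shape
    expected = in_per_group * groups
    if channels != expected:
        return f"expected {expected} channels, but got {channels}"
    return None


# Shapes copied from the error message in this issue.
print(check_conv1d_input((384, 2304, 3), (2, 256, 2560)))
```

One possible reading, stated here as an assumption rather than a diagnosis, is that the loaded features have a different dimensionality than the 2304 the config expects, so comparing the feature dimension of the downloaded files with the input dimension in the config would be a reasonable first step.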

Directory em_narration_clip_token_features and nlq_v2_clip_token_features not found

Hi!
Thanks for sharing your code. I ran into an error when using bash tools/pretrain_ego4d_narration.sh CONFIG_FILE OUTPUT_PATH. Could you please tell me how to get the directories em_narration_clip_token_features and nlq_v2_clip_token_features?
Thank you very much. ^_^