Comments (14)
It seems your feature size is 512, which means you also need to extract SlowFast features.
from qd-detr.
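For reference, the feature concatenation the model expects can be sketched like this. This is a minimal illustration, not repository code: `build_video_features` is a hypothetical helper, and the official pipeline loads pre-extracted feature files instead of dummy arrays.

```python
import numpy as np

def build_video_features(slowfast_feats, clip_feats):
    """Concatenate per-clip SlowFast (2304-d) and CLIP (512-d) features.

    Hypothetical helper for illustration only; the official pipeline
    loads these arrays from pre-extracted feature files.
    """
    assert slowfast_feats.shape[0] == clip_feats.shape[0], "clip counts must match"
    return np.concatenate([slowfast_feats, clip_feats], axis=1)

# 75 dummy clips, matching the sequence length seen in the errors below
sf = np.zeros((75, 2304), dtype=np.float32)  # SlowFast features
cl = np.zeros((75, 512), dtype=np.float32)   # CLIP features
vid = build_video_features(sf, cl)
print(vid.shape)  # (75, 2816)
```

The two extra dimensions in the model's expected 2818-d input come from the temporal endpoint features appended at runtime, as in Moment-DETR.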
Sorry for the inconvenience.
I think that the pretrained weight is from Moment-DETR not from our GitHub repository.
Can you try again with the weights provided in our repository?
Video only weights : https://www.dropbox.com/s/yygwyljw8514d9r/videoonly.ckpt?dl=0
V + A weights : https://www.dropbox.com/s/hsc7jk21ppqasjt/videoaudio.ckpt?dl=0
Thanks for your reply. I used videoaudio.ckpt and got this error:
File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for QDDETR:
size mismatch for input_vid_proj.0.LayerNorm.weight: copying a param with shape torch.Size([4868]) from checkpoint, the shape in current model is torch.Size([2818]).
size mismatch for input_vid_proj.0.LayerNorm.bias: copying a param with shape torch.Size([4868]) from checkpoint, the shape in current model is torch.Size([2818]).
size mismatch for input_vid_proj.0.net.1.weight: copying a param with shape torch.Size([256, 4868]) from checkpoint, the shape in current model is torch.Size([256, 2818]).
Can you try with the checkpoint trained only with video?
To use the video+audio checkpoint, you may have to change some code and your dataset to have extracted audio features.
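For illustration, the size mismatch above (4868 vs. 2818) implies the V+A checkpoint expects 4868 - 2818 = 2050 extra input dimensions for audio. A minimal sketch, assuming audio features are simply concatenated onto the video features; the audio extractor and its exact dimension are assumptions inferred from the error, not confirmed by the thread.

```python
import numpy as np

# 4868 (V+A checkpoint) - 2818 (video-only input) = 2050 assumed audio dims
video_feats = np.zeros((75, 2818), dtype=np.float32)  # video (+tef) features
audio_feats = np.zeros((75, 2050), dtype=np.float32)  # assumed audio feature size

src_vid = np.concatenate([video_feats, audio_feats], axis=1)
print(src_vid.shape)  # (75, 4868)
```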
I have tried the checkpoint trained only with video (videoonly.ckpt), but the error still happens. The shapes of the model and the weights do not match.
File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/normalization.py", line 190, in forward
input, self.normalized_shape, self.weight, self.bias, self.eps)
File "/usr/local/lib64/python3.6/site-packages/torch/nn/functional.py", line 2347, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[2818], expected input with shape [*, 2818], but got input of size [1, 75, 514]
If you look at the provided training script, the feature dimension should be 2304 (SlowFast) + 512 (CLIP).
It looks like you only have CLIP features.
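One way to check which features a checkpoint was trained with is to read the input projection's LayerNorm size from its state dict (the key name is taken from the size-mismatch error above). In practice the state dict would come from `torch.load("videoonly.ckpt", map_location="cpu")`; a stand-in array keeps this sketch self-contained.

```python
import numpy as np

def expected_vid_dim(state_dict):
    """Read the video input dimension a checkpoint expects from its first
    projection layer (key name taken from the size-mismatch error above)."""
    return state_dict["input_vid_proj.0.LayerNorm.weight"].shape[0]

# Stand-in for torch.load("videoonly.ckpt", map_location="cpu")["model"]
state = {"input_vid_proj.0.LayerNorm.weight": np.zeros(2818)}
print(expected_vid_dim(state))  # 2818 = 2304 (SlowFast) + 512 (CLIP) + 2 (tef)
```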
I also have an error when running run_on_video/run.py. I have used both videoonly.ckpt (https://www.dropbox.com/s/yygwyljw8514d9r/videoonly.ckpt?dl=0) and video_model_best.ckpt (run_on_video/qd_detr_ckpt/)
Error logs are below:
File "run_on_video/run.py", line 126, in <module>
run_example()
File "run_on_video/run.py", line 109, in run_example
predictions = qd_detr_predictor.localize_moment(
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "run_on_video/run.py", line 57, in localize_moment
outputs = self.model(**model_inputs)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/projects/moment-retrieval/QD-DETR/qd_detr/model.py", line 110, in forward
src_vid = self.input_vid_proj(src_vid)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
input = module(input)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/projects/moment-retrieval/QD-DETR/qd_detr/model.py", line 505, in forward
x = self.LayerNorm(x)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 189, in forward
return F.layer_norm(
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/functional.py", line 2503, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[2818], expected input with shape [*, 2818], but got input of size [1, 75, 514]
I have the same issue. I believe the script in the repo should not produce this error when used as is.
Hello.
For all of you in this thread, thank you for your interest, and sorry for the inconvenience.
I'll let you know through this thread when the model checkpoint trained only with CLIP features is ready.
Thanks.
We've uploaded a pretrained model trained only with CLIP features to support run_on_video.
You may try the example with it!
Thank you.
Which one is it?
model_best.ckpt is the model trained with only CLIP features.
It works now, thanks. I suggest changing the default model used on master.
Thank you for the suggestion.
Do you mean to change the default loaded model in run_on_video/run.py?
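One hedged way to sketch the suggested change: pick the checkpoint whose input projection matches the feature dimension available at runtime. The paths follow the checkpoints mentioned in this thread, but the dimension-to-checkpoint mapping itself is an assumption for illustration, not the repository's code.

```python
# Paths follow the checkpoints mentioned in this thread; the mapping below
# is an illustrative assumption, not the repository's actual code.
CKPTS = {
    2818: "run_on_video/qd_detr_ckpt/video_model_best.ckpt",  # SlowFast+CLIP (+tef)
    514: "run_on_video/qd_detr_ckpt/model_best.ckpt",         # CLIP only (+tef)
}

def pick_ckpt(vid_feat_dim):
    """Choose the checkpoint whose input projection matches the features."""
    if vid_feat_dim not in CKPTS:
        raise ValueError(f"no checkpoint for feature dim {vid_feat_dim}")
    return CKPTS[vid_feat_dim]

print(pick_ckpt(514))  # run_on_video/qd_detr_ckpt/model_best.ckpt
```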