
Comments (14)

wjun0830 commented on June 12, 2024

It seems that your feature size is 512, so you also need to extract SlowFast features.

wjun0830 commented on June 12, 2024

Sorry for the inconvenience.
I think the pretrained weight you are using is from Moment-DETR, not from our GitHub repository.

Can you try again with the weights provided in our repository?

Video-only weights: https://www.dropbox.com/s/yygwyljw8514d9r/videoonly.ckpt?dl=0
Video + audio (V+A) weights: https://www.dropbox.com/s/hsc7jk21ppqasjt/videoaudio.ckpt?dl=0
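
For reference, a minimal loading sketch, assuming the checkpoint stores its weights under a "model" key (as Moment-DETR-style checkpoints do); build_model here is a hypothetical stand-in for however QDDETR is constructed in the repo:

    import torch

    # Load on CPU first; move to GPU after the weights are in place.
    ckpt = torch.load("videoonly.ckpt", map_location="cpu")
    # Assumption: Moment-DETR-style checkpoints keep weights under a "model" key;
    # fall back to the raw dict if this checkpoint is stored flat.
    state_dict = ckpt.get("model", ckpt)

    model = build_model(args)  # hypothetical: build QDDETR with the same feature dims used in training
    model.load_state_dict(state_dict)
    model.eval()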


wangzhilong commented on June 12, 2024

Thank you for your reply. I used videoaudio.ckpt and got this error:

  File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for QDDETR:
        size mismatch for input_vid_proj.0.LayerNorm.weight: copying a param with shape torch.Size([4868]) from checkpoint, the shape in current model is torch.Size([2818]).
        size mismatch for input_vid_proj.0.LayerNorm.bias: copying a param with shape torch.Size([4868]) from checkpoint, the shape in current model is torch.Size([2818]).
        size mismatch for input_vid_proj.0.net.1.weight: copying a param with shape torch.Size([256, 4868]) from checkpoint, the shape in current model is torch.Size([256, 2818]).
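
A size mismatch like this means the checkpoint and the freshly built model disagree on the input feature dimension. One quick way to see what the checkpoint expects (key names taken from the error above; the "model" key is an assumption carried over from Moment-DETR-style checkpoints):

    import torch

    ckpt = torch.load("videoaudio.ckpt", map_location="cpu")
    state_dict = ckpt.get("model", ckpt)  # assumption: weights may sit under a "model" key

    # Print the shapes the checkpoint was trained with for the video projection.
    for name, tensor in state_dict.items():
        if name.startswith("input_vid_proj"):
            print(name, tuple(tensor.shape))
    # Here this prints e.g. input_vid_proj.0.LayerNorm.weight (4868,),
    # i.e. the V+A checkpoint expects 4868-dim inputs, while the current
    # model is built for 2818-dim inputs.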


wjun0830 commented on June 12, 2024

Can you try with the checkpoint trained only with video?
To use the video+audio checkpoint, you may have to change some code and extend your dataset with extracted audio features, as sketched below.
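
For illustration, the dataset-side change would amount to concatenating audio features onto the video features before they reach input_vid_proj. A minimal sketch with dummy tensors (the 2050-dim audio feature is a hypothetical number, chosen only so that 2818 + 2050 matches the 4868 in the error above):

    import torch

    video_feat = torch.randn(1, 75, 2818)  # video features, per the error message
    audio_feat = torch.randn(1, 75, 2050)  # hypothetical audio dim: 4868 - 2818
    src_vid = torch.cat([video_feat, audio_feat], dim=-1)
    print(src_vid.shape)  # torch.Size([1, 75, 4868]), matching the V+A checkpoint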


wangzhilong commented on June 12, 2024

I have tried the checkpoint trained only with video (videoonly.ckpt), but the error still happens. The shapes of the model and the weights do not match.

  File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/normalization.py", line 190, in forward
    input, self.normalized_shape, self.weight, self.bias, self.eps)
  File "/usr/local/lib64/python3.6/site-packages/torch/nn/functional.py", line 2347, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[2818], expected input with shape [*, 2818], but got input of size[1, 75, 514]


wjun0830 commented on June 12, 2024

If you look at the provided training script, the feature dimension should be 2304 (SlowFast) + 512 (CLIP).
It looks like you only have CLIP features.
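
Concretely, each clip's feature should be the SlowFast and CLIP features concatenated along the channel axis; the remaining 2 dims of the expected 2818 presumably come from the temporal endpoint features (tef) appended in the data pipeline, as in Moment-DETR. A sketch with dummy tensors:

    import torch

    n_clips = 75
    slowfast_feat = torch.randn(1, n_clips, 2304)  # SlowFast features
    clip_feat = torch.randn(1, n_clips, 512)       # CLIP features (all the reporter currently has)
    vid_feat = torch.cat([slowfast_feat, clip_feat], dim=-1)  # (1, 75, 2816)

    # Temporal endpoint features: normalized [start, end] per clip (Moment-DETR convention).
    tef_st = torch.arange(n_clips, dtype=torch.float32) / n_clips
    tef_ed = tef_st + 1.0 / n_clips
    tef = torch.stack([tef_st, tef_ed], dim=-1).unsqueeze(0)  # (1, 75, 2)

    src_vid = torch.cat([vid_feat, tef], dim=-1)
    print(src_vid.shape)  # torch.Size([1, 75, 2818]), the shape the model expects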


nguyenquyem99dt commented on June 12, 2024

I also have an error when running run_on_video/run.py. I have used both videoonly.ckpt (https://www.dropbox.com/s/yygwyljw8514d9r/videoonly.ckpt?dl=0) and video_model_best.ckpt (run_on_video/qd_detr_ckpt/)

Error logs are below:

File "run_on_video/run.py", line 126, in
run_example()
File "run_on_video/run.py", line 109, in run_example
predictions = qd_detr_predictor.localize_moment(
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "run_on_video/run.py", line 57, in localize_moment
outputs = self.model(**model_inputs)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/projects/moment-retrieval/QD-DETR/qd_detr/model.py", line 110, in forward
src_vid = self.input_vid_proj(src_vid)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
input = module(input)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/projects/moment-retrieval/QD-DETR/qd_detr/model.py", line 505, in forward
x = self.LayerNorm(x)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(input, **kwargs)
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 189, in forward
return F.layer_norm(
File "/home/ubuntu/projects/moment-retrieval/envs/moment-detr/lib/python3.8/site-packages/torch/nn/functional.py", line 2503, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[2818], expected input with shape [
, 2818], but got input of size[1, 75, 514]


hyperfraise commented on June 12, 2024

I have the same issue. I believe the script in the repo should not produce this error when used as-is.


wjun0830 commented on June 12, 2024

Hello.
For all of you in this thread, thank you for your interest, and sorry for the inconvenience.
I'll let you know through this thread when the model checkpoint trained only with CLIP features is ready.

Thanks.


wjun0830 commented on June 12, 2024

We've uploaded a pretrained model trained only with CLIP features to support run_on_video.
You may try the example with it!
Thank you.


hyperfraise commented on June 12, 2024

Which one is it?


wjun0830 commented on June 12, 2024

model_best.ckpt is the model trained with only CLIP features.


hyperfraise commented on June 12, 2024

It works now, thanks. I suggest changing the default model used on master.


wjun0830 commented on June 12, 2024

Thank you for the suggestion.
Do you mean to change the default loaded model in run_on_video/run.py?
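
For what it's worth, that swap would presumably be a one-line change where the predictor is constructed in run_on_video/run.py. A sketch; the class name is assumed from the traceback's qd_detr_predictor, and the constructor arguments mirror Moment-DETR's run_on_video predictor, so treat both as assumptions:

    # In run_on_video/run.py, point the predictor at the CLIP-only checkpoint:
    qd_detr_predictor = QDDETRPredictor(
        ckpt_path="run_on_video/qd_detr_ckpt/model_best.ckpt",  # CLIP-only weights from this thread
        clip_model_name_or_path="ViT-B/32",
        device="cuda",
    )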

