Code Error about openlongtailrecognition-oltr HOT 15 CLOSED

zhmiao commented on June 9, 2024

Code Error

from openlongtailrecognition-oltr.

Comments (15)

AmingWu commented on June 9, 2024

Ok, I have solved this problem.

xavieryxie commented on June 9, 2024

Hello,
When I run python main.py --config ./config/Places_LT/stage_2_meta_embedding.py, there is an error.

File "./models/MetaEmbeddingClassifier.py", line 33, in forward
dist_cur = torch.norm(x_expand - centroids_expand, 2, 2)
RuntimeError: The size of tensor a (365) must match the size of tensor b (122) at non-singleton dimension 1

Here, I print the shape of x_expand and centroids_expand.

torch.Size([86, 365, 512])
torch.Size([86, 122, 512])

Could you give some advice to solve this problem?

I ran into this problem too, and oddly I get different errors when I run python main.py... multiple times. Could you tell me how you solved it? Thanks.
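For readers puzzling over the message itself: the subtraction fails because the two shapes disagree at dimension 1 (365 vs. 122) and neither size is 1, so broadcasting cannot reconcile them. The sketch below is a plain-Python illustration of that rule, not code from the repository; the helper name first_broadcast_conflict is made up for this example.

```python
def first_broadcast_conflict(shape_a, shape_b):
    """Return (dim, size_a, size_b) for the first incompatible dimension, or None.

    Shapes are right-aligned; two sizes are compatible when they are equal
    or when either of them is 1 (the usual NumPy/PyTorch broadcasting rule).
    """
    rank = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (rank - len(shape_a)) + tuple(shape_a)
    b = (1,) * (rank - len(shape_b)) + tuple(shape_b)
    # Scan dimensions left to right and report the first clash.
    for dim, (size_a, size_b) in enumerate(zip(a, b)):
        if size_a != size_b and size_a != 1 and size_b != 1:
            return dim, size_a, size_b
    return None


# Shapes copied from the report above.
x_expand_shape = (86, 365, 512)
centroids_expand_shape = (86, 122, 512)
print(first_broadcast_conflict(x_expand_shape, centroids_expand_shape))
```

The reported pair clashes at dimension 1, which is the "non-singleton dimension 1" named in the RuntimeError.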

AmingWu commented on June 9, 2024

Use a single GPU. For example, CUDA_VISIBLE_DEVICES=0 python main.py --config ./config/Places_LT/stage_2_meta_embedding.py
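If you launch the script from Python rather than a shell, the same restriction can be applied through the environment of the child process. This is only a sketch: single_gpu_env is a hypothetical helper, and the subprocess call is left commented out because it needs to run inside a checkout of the repository. CUDA_VISIBLE_DEVICES has to be set before CUDA is initialized in the process that uses the GPU.

```python
import os


def single_gpu_env(gpu_index=0, base_env=None):
    """Copy an environment and expose only one GPU to CUDA (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


env = single_gpu_env(0)
command = ["python", "main.py", "--config",
           "./config/Places_LT/stage_2_meta_embedding.py"]
print(env["CUDA_VISIBLE_DEVICES"], " ".join(command))

# To actually launch training from inside the repository checkout:
# import subprocess
# subprocess.run(command, env=env, check=True)
```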

xavieryxie commented on June 9, 2024

Which Python version do you use, 2.7 or 3.5? I still get the same error, which is a little weird.

xavieryxie commented on June 9, 2024

The problem has been solved. Thanks for your advice.

AmingWu commented on June 9, 2024

OK. Once you have trained the model on the Places365 dataset, could you share your results with me?

xavieryxie commented on June 9, 2024

OK, no problem.

zhmiao commented on June 9, 2024

@AmingWu @onexxp Thank you very much for asking. The problem you encountered is caused by multi-GPU use; we have run into it as well. PyTorch splits each batch across the available GPUs, which breaks calculations in the code that assume a fixed batch size. For example, with 2 GPUs and batchsize=256, each GPU most likely receives only 128 samples, while all the other calculations expect 256 input samples. We did not prepare the code to be compatible with multi-GPU training/testing, and we are sorry about this. It might take some extra effort to make it work.
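The splitting arithmetic can be sketched in plain Python. This assumes the split behaves like torch.chunk along dimension 0 (each chunk holds ceil(total / num_chunks) items and the last one takes the remainder); chunk_sizes is a hypothetical helper, not part of the repository. One reading that is consistent with the shapes reported earlier in the thread, though not confirmed there, is that three GPUs were visible and that both the 256-sample batch and the 365-row centroids tensor were split.

```python
import math


def chunk_sizes(total, num_chunks):
    """Sizes obtained when `total` items are split torch.chunk-style.

    Every chunk holds ceil(total / num_chunks) items except possibly the
    last one, which takes whatever remains.
    """
    if total == 0:
        return []
    step = math.ceil(total / num_chunks)
    sizes = []
    remaining = total
    while remaining > 0:
        take = min(step, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


print(chunk_sizes(256, 2))  # [128, 128]      the 2-GPU example above
print(chunk_sizes(256, 3))  # [86, 86, 84]    matches the leading 86
print(chunk_sizes(365, 3))  # [122, 122, 121] matches the 122 in the error
```

Under that assumption, any tensor handed to the forward pass is divided along its first dimension, so a tensor that is meant to stay whole on every device (such as per-class centroids) ends up truncated.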

xavieryxie commented on June 9, 2024

Thanks for your answer and the awesome work. I hit another problem when running the code with Python 3.5. When the models are initialized, the feat/classifier parameters lose their order because they are defined in a plain dict (which does not preserve insertion order in Python 3.5), and the code does not work. Changing it to OrderedDict() fixed it. I don't know whether I am the only one who ran into this. Just a small question.
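A minimal sketch of that workaround, for anyone on an older interpreter. The keys feat_model and classifier and the build_in_order helper are illustrative placeholders, not the repository's actual configuration. A plain dict only guarantees insertion order from Python 3.7 onward, whereas OrderedDict guarantees it on every version.

```python
from collections import OrderedDict


def build_in_order(definitions):
    """Instantiate components in exactly the order they were declared."""
    built = OrderedDict()
    for name, factory in definitions.items():
        built[name] = factory()
    return built


# Placeholder factories standing in for real model constructors.
definitions = OrderedDict([
    ("feat_model", lambda: "feature extractor"),
    ("classifier", lambda: "classifier head"),
])

networks = build_in_order(definitions)
print(list(networks))  # ['feat_model', 'classifier']
```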

xavieryxie commented on June 9, 2024

@AmingWu Have you trained the model? I trained with the default parameters, but the results seem a little lower than those reported in the paper. On the closed set I get: Many_shot_accuracy_top1: 0.412 Median_shot_accuracy_top1: 0.369 Low_shot_accuracy_top1: 0.218.

AmingWu commented on June 9, 2024

My result is lower than yours.

AmingWu commented on June 9, 2024

Hello, for Places_LT the size of the open set should be 6600, but when I run the open-set test I find that the number is 43100. Why?

AmingWu commented on June 9, 2024

Hello, I understand your setting now.

zhmiao commented on June 9, 2024

@AmingWu As mentioned here: #17 (comment), we think we have found the reason the inference results are slightly lower than reported. We will fix this ASAP. Thank you very much.

zhmiao commented on June 9, 2024

@AmingWu #17 (comment)
