Comments (15)

AmingWu commented on June 9, 2024

Ok, I have solved this problem.

from openlongtailrecognition-oltr.

xavieryxie commented on June 9, 2024

Hello,
When I run python main.py --config ./config/Places_LT/stage_2_meta_embedding.py, I get the following error:

File "./models/MetaEmbeddingClassifier.py", line 33, in forward
dist_cur = torch.norm(x_expand - centroids_expand, 2, 2)
RuntimeError: The size of tensor a (365) must match the size of tensor b (122) at non-singleton dimension 1

Printing the shapes of x_expand and centroids_expand, respectively, gives:

torch.Size([86, 365, 512])
torch.Size([86, 122, 512])

Could you give some advice on how to solve this problem?

I also ran into this problem, and oddly I get different errors each time I run python main.py... Could you tell me how you solved it? Thanks.

AmingWu commented on June 9, 2024

Use a single GPU. For example, CUDA_VISIBLE_DEVICES=0 python main.py --config ./config/Places_LT/stage_2_meta_embedding.py
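
If changing the launch command is inconvenient, the same restriction can be sketched inside a script. This is a minimal illustration, not part of the repository: the helper name restrict_to_single_gpu is hypothetical, and it assumes the variable is set before any CUDA-using library (such as torch) initializes CUDA in the process.

```python
import os


def restrict_to_single_gpu(env, device_index=0):
    """Expose only one GPU to CUDA by setting CUDA_VISIBLE_DEVICES.

    `env` is a mapping such as os.environ. Hypothetical helper: it has an
    effect only if it runs before CUDA is initialized in this process.
    """
    env["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return env["CUDA_VISIBLE_DEVICES"]


# Assumed placement: the very top of the entry script, before `import torch`.
restrict_to_single_gpu(os.environ, device_index=0)
```

Setting the variable on the command line, as in the comment above, avoids any dependence on import order and is probably the simpler choice.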

xavieryxie commented on June 9, 2024

Which Python version do you use, 2.7 or 3.5? I still get the same error, which is a little weird.

xavieryxie commented on June 9, 2024

The problem has been solved. Thanks for your advice.

AmingWu commented on June 9, 2024

OK. Once you have trained the model on the Places365 dataset, could you share your results with me?

xavieryxie commented on June 9, 2024

OK, no problem.

zhmiao commented on June 9, 2024

@AmingWu @onexxp Thank you very much for asking. The problem you encountered is caused by using multiple GPUs. We have had the same problem as well. PyTorch splits the batch across the available GPUs, which breaks calculations in the code that assume a fixed batch size (e.g., with 2 GPUs and batchsize=256, each GPU would most likely receive only 128 samples, while all the other calculations expect 256 input samples). We did not prepare the code to be compatible with multi-GPU training/testing, and we are sorry about this. Making it work might need some extra effort.
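
The shapes in the error message fit this explanation. The sketch below is an illustration under stated assumptions, not code from the repository: it assumes 3 visible GPUs, a batch size of 256, and that the 365 centroids are passed to the model as a tensor input, so that they are split along the first dimension the same way the batch is. The helper chunk_sizes is hypothetical and only mimics how torch.chunk sizes its pieces (each of size ceil(length / num_chunks), the last possibly smaller).

```python
import math


def chunk_sizes(length, num_chunks):
    """Sizes produced when `length` items are split into pieces of
    ceil(length / num_chunks) items, with a possibly smaller last piece."""
    step = math.ceil(length / num_chunks)
    sizes = []
    remaining = length
    while remaining > 0:
        sizes.append(min(step, remaining))
        remaining -= step
    return sizes


# Assumption: 3 GPUs, batch size 256, 365 centroids (one per class).
print(chunk_sizes(256, 3))  # [86, 86, 84]
print(chunk_sizes(365, 3))  # [122, 122, 121]
# The 2-GPU example from the comment above.
print(chunk_sizes(256, 2))  # [128, 128]
```

Under these assumptions the first GPU would see 86 samples and only 122 of the 365 centroids, which matches the reported sizes torch.Size([86, 365, 512]) and torch.Size([86, 122, 512]).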

xavieryxie commented on June 9, 2024

Thanks for your answer and the awesome work. I ran into another problem when running the code with Python 3.5. When the models are initialized, the feat/classifier parameters lose their order because they are defined in a plain dict, and the code doesn't work. I changed it to OrderedDict() and it works. I don't know if I'm the only one hitting this problem. Just a small question.
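
A minimal sketch of that change, with hypothetical network names and settings (the repository's actual config keys and values may differ). Before Python 3.7 the language did not guarantee that a plain dict preserves insertion order, so any code that relies on the feature model being handled before the classifier needs an OrderedDict there.

```python
from collections import OrderedDict

# Hypothetical names and values, for illustration only.
networks = OrderedDict([
    ("feat_model", {"optim_params": {"lr": 0.01}}),
    ("classifier", {"optim_params": {"lr": 0.1}}),
])


def init_order(network_defs):
    """Return network names in the order they would be initialized."""
    return [name for name in network_defs]


print(init_order(networks))  # ['feat_model', 'classifier']
```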

xavieryxie commented on June 9, 2024

Replying to AmingWu: "OK. Once you have trained the model on the Places365 dataset, could you share your results with me?"

Have you trained the model? I used the default parameters to train the model, but the results seem a little lower than those reported in the paper. On the closed set I get: Many_shot_accuracy_top1: 0.412, Median_shot_accuracy_top1: 0.369, Low_shot_accuracy_top1: 0.218.

AmingWu commented on June 9, 2024

My results are lower than yours.

AmingWu commented on June 9, 2024

Hello, for Places_LT, the open set is supposed to contain 6600 samples. But when I run the open-set test, I find the number is 43100. Why?

AmingWu commented on June 9, 2024

Hello, I understand your setting now.

zhmiao commented on June 9, 2024

@AmingWu As mentioned here: #17 (comment), we think we have found the reason why the inference results are a little lower than reported. We will fix this ASAP. Thank you very much.

zhmiao commented on June 9, 2024

@AmingWu #17 (comment)
