Code Monkey home page Code Monkey logo

Comments (5)

alexlopezcifuentes avatar alexlopezcifuentes commented on June 7, 2024 1

Because 0 might be a semantic label. Which in the case of the ADE20K dataset its related to "Wall" class.

from semantic-aware-scene-recognition.

alexlopezcifuentes avatar alexlopezcifuentes commented on June 7, 2024

Hi!

Thanks for your question!

If your semantic segmentation model outputs a score tensor Y with size L x W x H (with normalized probabilities, i.e. between 0 and 1) you need to sort Y and select the 3 labels with a higher probability for each pixel. By that, you will have the Top@3 labels with their respective Top@3 scores.

You can do that on the fly in the training and evaluation loop, however, in order to save time, we decided to save that labels and scores to images so the prediction is only done once. To save 3 semantic labels per pixel we decided to encode them into the three channels of an RGB image.

from semantic-aware-scene-recognition.

neoyang0620 avatar neoyang0620 commented on June 7, 2024

Thank for your explanation. I still have one further question. According to you paper, the semantic segmentation score is set to "0", if it is not one of the top3 predictions. However, there are many zero regions in the segmentation-precomputed SUN397 data. How is that possible?

Best,
Neo

from semantic-aware-scene-recognition.

neoyang0620 avatar neoyang0620 commented on June 7, 2024

For example, sem_score file './Data/Datasets/SUN397/noisy_scores_RGB/val/airplane_cabin/sun_akzqlgepekqslhbn.png' contains many zero values. I don't think there should be zero values.

from semantic-aware-scene-recognition.

neoyang0620 avatar neoyang0620 commented on June 7, 2024

if i understand correctly, the original output from semantic segmentation network is a Scores (150xWxH) matrix.

  1. we normalize the distribution at each pixel ==> torch.sum(Score [i,:,:]) == 1. the output of this stage is Scores_norm (150xWxH)
  2. we generate the sem_scores (3xWxH) by extracting the top3 scores at each pixel. sem_scores[0,i,j] represents the top-1 scores at pixel(i,j).
  3. we retrieve the index of top3 scores as sem_labels (3xWxH). The matrix is divided by 255.

Following such procedure, the value of sem_score cannot be zero. Do i miss any steps?

from semantic-aware-scene-recognition.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.