
Comments (6)

jeya-maria-jose avatar jeya-maria-jose commented on May 18, 2024

You should remove those lines that threshold the mask if you are converting it to a multi-class problem. Your ground truth should just contain pixel values 0, 1, 2, 3 if you are working on a 3-class classification problem.
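A quick sanity check for this (a hypothetical snippet, not part of the repo) that a ground-truth mask contains only integer class labels; the array here stands in for a loaded mask image:

```python
import numpy as np

# Hypothetical array standing in for a loaded ground-truth mask image.
mask = np.array([[0, 1, 1],
                 [2, 3, 0],
                 [0, 0, 2]], dtype=np.uint8)

# The mask should contain only the class indices, nothing else.
labels = np.unique(mask)
print(labels)  # -> [0 1 2 3]
assert set(labels.tolist()) <= {0, 1, 2, 3}, "mask contains non-class values"
```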

from medical-transformer.

shanpriya3 avatar shanpriya3 commented on May 18, 2024

Hi, thanks for your response. I did remove those two lines from my code. I have 3 classes/labels in total (including background), with values 0, 1, 2 in my ground truth respectively. I also changed num_classes=3 in axialnet.py, but when I run the code I get this error. Does it have to do with the loss function? Do I need to change anything else? Could you please help me with this error?
[screenshot of the error]


shanpriya3 avatar shanpriya3 commented on May 18, 2024

Could you please explain what these lines (189-192) in train.py do?

tmp[tmp>=0.5] = 1
tmp[tmp<0.5] = 0
tmp2[tmp2>0] = 1
tmp2[tmp2<=0] = 0

and also why you do this (205-206)?
yHaT[yHaT==1] =255
yval[yval==1] =255

I have to remove these lines for my case, right?


Qiang19990514 avatar Qiang19990514 commented on May 18, 2024

Which dataset did you use for this?


rw404 avatar rw404 commented on May 18, 2024

@shanpriya3, the code in lines 189-192 applies an aggressive threshold, i.e. it maps every prediction to a binary value (either 0 or 1), so that the mask can be stored in the format described in the repository's README (255 corresponds to the object, 0 to the background).
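A minimal numpy sketch of what those lines do, using a hypothetical 2x2 probability map in place of the real model output:

```python
import numpy as np

# Hypothetical probability map standing in for one channel of the model output.
tmp = np.array([[0.1, 0.7],
                [0.5, 0.3]])

# Threshold at 0.5, as train.py lines 189-192 do for the prediction...
pred = tmp.copy()
pred[pred >= 0.5] = 1
pred[pred < 0.5] = 0

# ...then rescale 1 -> 255 (lines 205-206) so the saved image shows
# white objects on a black background.
pred[pred == 1] = 255
print(pred)  # -> [[  0. 255.] [255.   0.]]
```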

Lines 205-206 are needed for the mask saving format described in the repository:

  1. Based on the image, the model builds a response map: y_out = model(X_batch) on line 184;
  2. The output is converted to numpy format, and it is assumed to contain a probability map of pixel class membership, i.e. a tensor of shape [batch_size, num_classes, width, height] (here num_classes = 3) in which every position holds a number from 0 to 1, such that summing over the class dimension (dim=1) yields a map of ones ([batch_size, num_classes, width, height].sum(dim=1) == 1 everywhere; the description is formal, just to add interpretability), BUT:
    • criterion = LogNLLLoss() is used as the criterion (line 111); however, this criterion is defined in metrics.py on line 9 and implements not LogNLLLoss but CrossEntropy, which applies softmax to the model's prediction internally, so no softmax is applied inside the model itself (in the _forward_impl and forward methods).
    • Therefore, in the validation part, before calling tmp[tmp>=0.5] = 1 on line 189, you need to apply a softmax transformation so that the raw model prediction can be interpreted as probabilities, i.e. replace y_out = model(X_batch) on line 184 with, for example, y_out = model.soft(model(X_batch)) or y_out = torch.nn.functional.softmax(model(X_batch), dim=1).
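A small numpy sketch of that softmax step (standing in for torch.nn.functional.softmax(..., dim=1)), with a hypothetical random logit tensor, just to show that the class dimension sums to 1 afterwards:

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical raw logits of shape [batch_size, num_classes, height, width].
y_out = np.random.randn(2, 3, 4, 4)

probs = softmax(y_out, axis=1)
# After softmax, the class dimension sums to 1 at every pixel,
# so each value can be read as a class probability.
print(np.allclose(probs.sum(axis=1), 1.0))  # -> True
```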

Then, as @jeya-maria-jose mentioned, instead of modifying the mask you need to remove these lines and assume that the ground truth (gt) contains integer class values (0, 1 or 2 in this case). Also, for a simpler interpretation, it is easier to save the validation predictions for more than just the first channel, e.g. change line 214 to cv2.imwrite(fulldir+image_filename, yHaT[0,1:,:,:].transpose(1, 2, 0)) (with optional zero-padding, or keeping the background layer, to avoid errors when saving two-channel images). The resulting mask will have num_classes-1 layers (background excluded), and each layer contains 255 only where the model detected the corresponding object: a 255 at position (i, j) in the first layer means an object of class 1 at (i, j), and a 255 at (i, j) in the second layer means an object of class 2 at that position.
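A hedged sketch of that saving scheme with hypothetical shapes (argmax over the class dimension, then one 0/255 layer per foreground class); the normalized random tensor stands in for the softmaxed model output:

```python
import numpy as np

# Hypothetical softmax output: [batch_size, num_classes, H, W], num_classes = 3.
probs = np.random.rand(1, 3, 4, 4)
probs /= probs.sum(axis=1, keepdims=True)

# Predicted class index per pixel (0 = background, 1 and 2 = objects).
pred = probs.argmax(axis=1)  # shape [batch_size, H, W]

# One 0/255 layer per foreground class, as described above:
# layer 0 marks class-1 pixels with 255, layer 1 marks class-2 pixels.
masks = np.stack([(pred == c).astype(np.uint8) * 255 for c in (1, 2)], axis=1)
print(masks.shape)  # -> (1, 2, 4, 4)
```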


twofeetcat avatar twofeetcat commented on May 18, 2024

Hello, I have a question: in training, output = model(X_batch) contains a probability map of whether a pixel belongs to objects. Does that mean the values in the tensor are numbers from 0 to 1?
Secondly, in the training phase, do I need to process y_batch (in this case num_classes = 20, with pixel values 0, 1, 2, ... 19), or can I directly calculate the loss on it with loss = criterion(output, y_batch)?

