
Comments (6)

jeya-maria-jose avatar jeya-maria-jose commented on May 18, 2024

You should remove those lines that threshold the mask if you are converting it to a multi-class problem. Your ground truth should just contain pixel values 0, 1, 2, 3 if you are working on a 3-class classification problem.
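A quick sanity check for this (a hypothetical snippet, not part of the repo) that a ground-truth mask contains only integer class labels; the array here stands in for a loaded mask image:

```python
import numpy as np

# Hypothetical array standing in for a loaded ground-truth mask image.
mask = np.array([[0, 1, 1],
                 [2, 3, 0],
                 [0, 0, 2]], dtype=np.uint8)

# The mask should contain only the class indices, nothing else.
labels = np.unique(mask)
print(labels)  # -> [0 1 2 3]
assert set(labels.tolist()) <= {0, 1, 2, 3}, "mask contains non-class values"
```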

from medical-transformer.

shanpriya3 avatar shanpriya3 commented on May 18, 2024

Hi, thanks for your response. I did remove those two lines from my code. I have 3 classes/labels in total (including background), with values 0, 1, 2 in my ground truth respectively. I also changed num_classes=3 in axialnet.py, but when I run the code I get this error. Does it have to do with the loss function? Do I need to change anything else? Could you please help me with this error?
[screenshot of the error]


shanpriya3 avatar shanpriya3 commented on May 18, 2024

Could you please explain what these lines (189-192) in train.py do?

tmp[tmp>=0.5] = 1
tmp[tmp<0.5] = 0
tmp2[tmp2>0] = 1
tmp2[tmp2<=0] = 0

and also why you do this (205-206)?
yHaT[yHaT==1] =255
yval[yval==1] =255

I have to remove these lines for my case, right?


Qiang19990514 avatar Qiang19990514 commented on May 18, 2024

Which dataset did you use for this?


rw404 avatar rw404 commented on May 18, 2024

@shanpriya3, the code in lines 189-192 applies an aggressive threshold, i.e. it maps every prediction to a binary value (either 0 or 1), so that the mask can be stored in the format described in the repository's README (255 corresponds to the object, 0 to the background).
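A minimal numpy sketch of what those lines do, using a hypothetical 2x2 probability map in place of the real model output:

```python
import numpy as np

# Hypothetical probability map standing in for one channel of the model output.
tmp = np.array([[0.1, 0.7],
                [0.5, 0.3]])

# Threshold at 0.5, as train.py lines 189-192 do for the prediction...
pred = tmp.copy()
pred[pred >= 0.5] = 1
pred[pred < 0.5] = 0

# ...then rescale 1 -> 255 (lines 205-206) so the saved image shows
# white objects on a black background.
pred[pred == 1] = 255
print(pred)  # -> [[  0. 255.] [255.   0.]]
```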

Lines 205-206 are needed for the mask saving format described in the repository:

  1. Based on the image, the model builds a response map: y_out = model(X_batch) on line 184;
  2. The output is converted to numpy format, and it is assumed to contain a probability map of pixel class membership, i.e. a tensor of shape [batch_size, num_classes, width, height] (here num_classes = 3) in which every position holds a number from 0 to 1, such that summing over the class dimension (dim=1) yields a map of ones ([batch_size, num_classes, width, height].sum(dim=1) == 1 everywhere; the description is formal, just to add interpretability), BUT:
    • criterion = LogNLLLoss() is used as the criterion (line 111); however, this criterion is defined in metrics.py on line 9 and implements not LogNLLLoss but CrossEntropy, which applies softmax to the model's prediction internally, so no softmax is applied inside the model itself (in the _forward_impl and forward methods).
    • Therefore, in the validation part, before calling tmp[tmp>=0.5] = 1 on line 189, you need to apply a softmax transformation so that the raw model prediction can be interpreted as probabilities, i.e. replace y_out = model(X_batch) on line 184 with, for example, y_out = model.soft(model(X_batch)) or y_out = torch.nn.functional.softmax(model(X_batch), dim=1).
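A small numpy sketch of that softmax step (standing in for torch.nn.functional.softmax(..., dim=1)), with a hypothetical random logit tensor, just to show that the class dimension sums to 1 afterwards:

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical raw logits of shape [batch_size, num_classes, height, width].
y_out = np.random.randn(2, 3, 4, 4)

probs = softmax(y_out, axis=1)
# After softmax, the class dimension sums to 1 at every pixel,
# so each value can be read as a class probability.
print(np.allclose(probs.sum(axis=1), 1.0))  # -> True
```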

Then, as @jeya-maria-jose mentioned, instead of modifying the mask you need to remove these lines and assume that the ground truth (gt) contains integer class values (0, 1 or 2 in this case). Also, for a simpler interpretation, it is easier to save the validation predictions for more than just the first channel, e.g. change line 214 to cv2.imwrite(fulldir+image_filename, yHaT[0,1:,:,:].transpose(1, 2, 0)) (with optional zero-padding, or keeping the background layer, to avoid errors when saving two-channel images). The resulting mask will have num_classes-1 layers (background excluded), and each layer contains 255 only where the model detected the corresponding object: a 255 at position (i, j) in the first layer means an object of class 1 at (i, j), and a 255 at (i, j) in the second layer means an object of class 2 at that position.
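A hedged sketch of that saving scheme with hypothetical shapes (argmax over the class dimension, then one 0/255 layer per foreground class); the normalized random tensor stands in for the softmaxed model output:

```python
import numpy as np

# Hypothetical softmax output: [batch_size, num_classes, H, W], num_classes = 3.
probs = np.random.rand(1, 3, 4, 4)
probs /= probs.sum(axis=1, keepdims=True)

# Predicted class index per pixel (0 = background, 1 and 2 = objects).
pred = probs.argmax(axis=1)  # shape [batch_size, H, W]

# One 0/255 layer per foreground class, as described above:
# layer 0 marks class-1 pixels with 255, layer 1 marks class-2 pixels.
masks = np.stack([(pred == c).astype(np.uint8) * 255 for c in (1, 2)], axis=1)
print(masks.shape)  # -> (1, 2, 4, 4)
```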


twofeetcat avatar twofeetcat commented on May 18, 2024

Hello, I have a question: in training, output = model(X_batch) contains a probability map of whether a pixel belongs to objects. Does that mean the values in the tensor are numbers from 0 to 1?
Secondly, in the training phase, do I need to process y_batch (in this case num_classes = 20, with pixel values 0, 1, 2, ... 19), or can I directly calculate the loss on it with loss = criterion(output, y_batch)?

