Focal Loss for Dense Object Detection
Method | training set | val set | mAP |
---|---|---|---|
Cross Entropy Loss | VOC2007 | VOC2007 | 63.36 |
Focal Loss | VOC2007 | VOC2007 | 65.26 |
A PyTorch Implementation of Focal Loss.
License: MIT License
focal_loss_pytorch/focalloss.py
Lines 17 to 19 in e11e75b
Hi, thank you for your code. I have a question: why can't you reshape the tensor from `N,C,H,W` to `N*H*W,C` directly?
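For context, a minimal sketch (my own, not from the repo) of why a direct reshape scrambles the data: `view` preserves memory order, so the channel axis has to be moved last before flattening.

```python
import torch

N, C, H, W = 2, 3, 2, 2
x = torch.arange(N * C * H * W).view(N, C, H, W)

# A direct view keeps N,C,H,W memory order, so each row mixes pixels of one
# channel rather than collecting the C channel values of one pixel:
wrong = x.view(-1, C)

# Moving the channel axis last first gives the intended N*H*W x C layout:
right = x.permute(0, 2, 3, 1).contiguous().view(-1, C)
```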
In the paper, the loss seems to be `loss = -alpha * (1-y')^gamma * log(y')` when y=1.
But in the code it is:
`loss = -(1-alpha) * (1-y')^gamma * log(y')` when y=1.
So should alpha be set the opposite way?
In the code:

    if isinstance(alpha, (float, int)): self.alpha = torch.Tensor([alpha, 1-alpha])
    ...
    at = self.alpha.gather(0, target.data.view(-1))
    logpt = logpt * Variable(at)

Is this right?
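For what it's worth, a small sketch of what that gather does (the alpha value is hypothetical): with the table `[alpha, 1-alpha]`, positives (target == 1) pick up `1 - alpha`, so to reproduce the paper's `alpha_t = alpha` for y = 1 you would either pass `1 - alpha` or build the table the other way around.

```python
import torch

alpha = 0.25
table = torch.tensor([alpha, 1 - alpha])   # index 0 -> alpha, index 1 -> 1 - alpha
target = torch.tensor([0, 1, 1])

at = table.gather(0, target)               # tensor([0.25, 0.75, 0.75])
# Positives receive 1 - alpha here; swapping to torch.tensor([1 - alpha, alpha])
# would match the paper's convention of weighting y == 1 by alpha.
```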
Hello! Thanks for your focal loss implementation, but I have a question. I think we also have to consider the negative cases in the loss, i.e., the cases where pt = 1 - p. I see only the pt = p (positive) cases, but no pt = 1 - p cases. Could you comment on this?
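If the implementation gathers log-probabilities by target index after a softmax (as the snippet quoted above suggests), the negative case may already be covered: gathering class 0 yields `log(1 - p)`. A small sketch with hypothetical logits:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, -1.0]])     # hypothetical two-class logits
logp = F.log_softmax(logits, dim=1)

# For a negative sample (target 0) the gather picks log p(class 0),
# which equals log(1 - p(class 1)), i.e. the pt = 1 - p case:
pt_neg = logp[:, 0].exp()
```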
    ~/FullyConnected/focalloss.py in forward(self, input, target)
         32
         33     loss = -1 * (1-pt)**self.gamma * logpt
    ---> 34     if self.size_average: return loss.mean()
         35     else: return loss.sum()

    RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generated/../THCReduceAll.cuh:339
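In my experience (not confirmed for this exact trace), device-side asserts around gather/NLL-style losses usually mean a target index outside `[0, num_classes - 1]`; re-running on CPU typically gives a readable error. A hypothetical sanity check:

```python
import torch

def check_targets(input: torch.Tensor, target: torch.Tensor) -> None:
    """Fail early with a readable message instead of a device-side assert."""
    assert target.min().item() >= 0 and target.max().item() < input.size(1), \
        "target indices must lie in [0, num_classes - 1]"
```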
I notice that focal_loss.py seems to implement focal loss only for the binary classification setting. Are there any implementations of focal loss for a multi-class classification setting?
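For what it's worth, the log_softmax + gather formulation extends to C > 2 classes unchanged; a minimal multi-class sketch (my own, with hypothetical defaults):

```python
import torch
import torch.nn.functional as F

def multiclass_focal_loss(logits, target, gamma=2.0):
    """Softmax focal loss for C classes; logits N x C, target N (class ids)."""
    logpt = F.log_softmax(logits, dim=1)
    logpt = logpt.gather(1, target.unsqueeze(1)).squeeze(1)  # log p of true class
    pt = logpt.exp()
    return (-((1 - pt) ** gamma) * logpt).mean()

loss = multiclass_focal_loss(torch.randn(4, 10), torch.randint(0, 10, (4,)))
```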
When I run this program, an error occurs:

    NameError: name 'long' is not defined

It points to line 11 of focalloss.py:

    if isinstance(alpha,(float,int,long)): self.alpha = torch.Tensor([alpha,1-alpha])
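`long` is a Python 2 builtin that was removed in Python 3, where `int` is unbounded and covers it; dropping it should be safe:

```python
# Python 3: `int` already covers what Python 2 called `long`.
if isinstance(alpha, (float, int)):
    self.alpha = torch.Tensor([alpha, 1 - alpha])
```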
gamma, alpha and size_average
What are your recommendations when initializing the loss function? How do gamma and alpha affect the result?
Thank you very much @clcarwin
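Not the author, but for reference: the RetinaNet paper reports gamma = 2 with alpha = 0.25 as the best-performing combination; larger gamma down-weights easy examples more strongly, and alpha rebalances the two classes. A hypothetical init following those defaults (keeping in mind the alpha-indexing question above):

```python
from focalloss import FocalLoss  # this repo's module

criterion = FocalLoss(gamma=2, alpha=0.25, size_average=True)
```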
Hi! I've accidentally stumbled upon your code. It looks nice, but I have a question: why do you use softmax instead of the sigmoid function?
focal_loss_pytorch/focalloss.py
Line 22 in e11e75b
In the paper, the authors clearly state the usage of the sigmoid function, which makes sense; softmax, on the contrary, doesn't (at least it seems so to me). As a reference, the implementation in fvcore uses the sigmoid function. I would be glad if someone could explain whether this is a bug or whether there is some intuition behind it. Thanks in advance!
I see that pt is calculated from `.data`, which means your implementation does not allow backpropagation through pt. Should this allow backprop?
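For illustration, a sketch of the two variants (names hypothetical): building pt from `.data`/`detach()` treats the modulating factor as a constant weight in the backward pass, whereas keeping it attached also differentiates through `(1 - pt)**gamma`:

```python
import torch
import torch.nn.functional as F

gamma = 2.0
logits = torch.randn(4, 3, requires_grad=True)
target = torch.randint(0, 3, (4,))
logpt = F.log_softmax(logits, dim=1).gather(1, target.unsqueeze(1)).squeeze(1)

# As in the question: pt is detached, so gradients flow only through logpt.
pt = logpt.detach().exp()
loss_detached = -(((1 - pt) ** gamma) * logpt).mean()

# Alternative: pt stays in the graph, so the factor is differentiated too.
pt = logpt.exp()
loss_attached = -(((1 - pt) ** gamma) * logpt).mean()
```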
The paper mentions that the loss layer is combined with the sigmoid computation, not softmax. More specifically, this passage:

> Finally, we note that the implementation of the loss layer combines the sigmoid operation for computing p with the loss computation, resulting in greater numerical stability.

So aren't the authors saying that we should use a sigmoid activation over the last layer? Using softmax might lead to lower accuracy.
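For comparison, a sigmoid-based sketch in the spirit of fvcore's `sigmoid_focal_loss`, fusing the sigmoid into `binary_cross_entropy_with_logits` for numerical stability (this is not this repo's code):

```python
import torch
import torch.nn.functional as F

def sigmoid_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Per-element focal loss; targets are floats in {0, 1}, same shape as logits."""
    p = torch.sigmoid(logits)
    # The fused log-sigmoid keeps the loss stable for large |logits|.
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    pt = p * targets + (1 - p) * (1 - targets)          # p_t from the paper
    loss = ce * (1 - pt) ** gamma
    if alpha >= 0:
        alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
        loss = alpha_t * loss
    return loss
```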
Should the loss take a mean over all the classes of an anchor? For example, suppose data comes in as `N x (A*num_classes) x H x W`, where A is the number of anchors per cell, H is the height of the feature map, and W is its width; our targets are `N x A x H x W`.
We then need to take a mean over the num_classes entries of each anchor in the loss computed over the `N x (A*num_classes) x H x W` input, to reduce it to `N x A x H x W`.
This `N x A x H x W` tensor should be summed and then normalized only by the number of anchors assigned to a ground-truth box, since the paper states that the remaining values are effectively zero.
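As I read the paper, the per-class sigmoid losses are summed rather than averaged, and the total is normalized by the number of anchors assigned to ground-truth boxes. A shape-only sketch with hypothetical values:

```python
import torch

N, A, C, H, W = 2, 9, 20, 4, 4
per_class_loss = torch.rand(N, A * C, H, W)           # focal loss per anchor-class

per_anchor = per_class_loss.view(N, A, C, H, W).sum(dim=2)  # sum classes -> N,A,H,W
num_assigned = 37                                     # anchors matched to GT (hypothetical)
total = per_anchor.sum() / max(num_assigned, 1)       # the paper's normalization
```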
While tuning gamma, I find that values between 0 and 1 make the loss suffer a gradient explosion, while values above 1 do not. May I ask what causes this?
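A plausible cause (my reading, not confirmed here): for 0 < gamma < 1 the derivative of the modulating factor diverges as pt approaches 1, and once pt rounds to exactly 1.0 in floating point, `(1 - pt)**(gamma - 1)` evaluates to `inf`, so the backward pass produces inf/NaN gradients; for gamma >= 1 the exponent is non-negative and the derivative stays finite:

```latex
\frac{d}{dp_t}\,(1 - p_t)^{\gamma} = -\gamma\,(1 - p_t)^{\gamma - 1}
\;\longrightarrow\;
\begin{cases}
-\infty & 0 < \gamma < 1,\\
-\gamma & \gamma = 1,\\
0 & \gamma > 1,
\end{cases}
\qquad\text{as } p_t \to 1.
```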
In my experiments, the loss of FocalLoss with gamma=0 is much lower than the loss of CrossEntropyLoss. What causes this?
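With gamma = 0 the modulating factor is 1, so the only remaining difference from cross entropy is the alpha weight; if alpha is set (e.g. 0.25), it simply scales the loss down. A quick check with hypothetical values:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(8, 5)
target = torch.randint(0, 5, (8,))

logpt = F.log_softmax(logits, dim=1).gather(1, target.unsqueeze(1)).squeeze(1)
gamma, alpha = 0.0, 0.25
focal = -(alpha * (1 - logpt.exp()) ** gamma * logpt).mean()
ce = F.cross_entropy(logits, target)

print(focal.item(), 0.25 * ce.item())   # equal: with gamma=0, focal = alpha * CE
```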