
focal-loss-implement-on-tensorflow's People

Contributors

ailias


focal-loss-implement-on-tensorflow's Issues

∵0**0=1 ∴ gamma != 0

Adding a discussion of the gamma = 0 case:

import tensorflow as tf
from tensorflow.python.ops import array_ops

def focal_loss_sigmoid(prediction_tensor, target_tensor, weights=None, alpha=0.5, gamma=0):
    target_tensor = tf.cast(target_tensor, tf.float32)
    sigmoid_p = tf.nn.sigmoid(prediction_tensor)
    zeros = array_ops.zeros_like(sigmoid_p, dtype=sigmoid_p.dtype)

    # Positive samples: 1 - p; negative samples: 0.
    pos_p_sub = array_ops.where(target_tensor > zeros, target_tensor - sigmoid_p, zeros)
    # Negative samples: p; positive samples: 0.
    neg_p_sub = array_ops.where(target_tensor > zeros, zeros, sigmoid_p)

    if gamma != 0:
        per_entry_cross_ent = - alpha * (pos_p_sub ** gamma) * tf.log(tf.clip_by_value(sigmoid_p, 1e-8, 1.0)) \
                              - (1 - alpha) * (neg_p_sub ** gamma) * tf.log(tf.clip_by_value(1.0 - sigmoid_p, 1e-8, 1.0))
    else:
        # Since 0**0 = 1, the gamma = 0 case must not use the pos/neg masks;
        # fall back to plain alpha-weighted cross-entropy.
        per_entry_cross_ent = - alpha * target_tensor * tf.log(tf.clip_by_value(sigmoid_p, 1e-8, 1.0)) \
                              - (1 - alpha) * (1 - target_tensor) * tf.log(tf.clip_by_value(1.0 - sigmoid_p, 1e-8, 1.0))

    return per_entry_cross_ent
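
The special case matters because exponentiation by zero yields one even at zero: with gamma = 0 the factors (pos_p_sub ** gamma) and (neg_p_sub ** gamma) become 1 everywhere, so the zeros that are supposed to mask out the other term stop masking. A quick NumPy check (not part of the original snippet):

import numpy as np

print(0.0 ** 0.0)                      # 1.0 in Python/NumPy (and TensorFlow)
pos_p_sub = np.array([0.7, 0.0, 0.0])  # zeros mark the negative entries
print(pos_p_sub ** 0)                  # [1. 1. 1.] -- the mask is gone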

The issue about @slim.add_arg_scope

I'm sorry, I am a newcomer to TensorFlow and not familiar with the @slim.add_arg_scope at about line 45. Does it mean you want to share the default params with the function (_inverted_residual_bottleneck)? But if that is what it does, why do you not set the same @slim.add_arg_scope on the function (mobilenet_v2_base)? And why should we set @slim.add_arg_scope at all? I thought slim.conv2d and slim.separable_conv2d could get the default params automatically. Maybe the question is foolish, please help me!
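
For anyone else wondering, this is roughly how @slim.add_arg_scope behaves. A minimal sketch, assuming TF 1.x with tf.contrib.slim (my_block and its arguments are hypothetical, not code from this repo):

import tensorflow as tf

slim = tf.contrib.slim

# add_arg_scope registers a *custom* function so that slim.arg_scope defaults
# apply to it. Built-ins like slim.conv2d are already registered; a plain
# Python helper is not, which is why the decorator is needed.
@slim.add_arg_scope
def my_block(inputs, normalizer_fn=None):
    net = slim.conv2d(inputs, 32, [3, 3])
    if normalizer_fn is not None:
        net = normalizer_fn(net)
    return net

def model(inputs):
    # Without @slim.add_arg_scope on my_block, this arg_scope would only
    # affect slim.conv2d; my_block's normalizer_fn would stay at its default.
    with slim.arg_scope([slim.conv2d, my_block], normalizer_fn=slim.batch_norm):
        return my_block(inputs)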

Question on MobileNet SSD architecture with focal loss.

Hello! First of all thanks for your implementation. You're really awesome.
My question is how to merge focal loss with the SSD architecture, as I'm now working on SSD for my project.

  1. Is it correct that we just replace the original softmax loss with focal loss? Or is it necessary to apply it to the location loss as well? (See the sketch after this question.)
  2. As the strength of focal loss is handling class imbalance, should I remove the hard negative mining described in the SSD paper? What was your approach in your implementation?

Thanks a lot for your brilliant work and your patience in reading these questions. Looking forward to your reply.
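
A minimal sketch of the swap from point 1, assuming an SSD-style head that outputs per-anchor class logits. focal_loss_sigmoid is the function from this repo; the shapes and names here are illustrative. Since focal loss already down-weights easy negatives, hard negative mining is usually dropped and the loss is summed over all anchors:

import tensorflow as tf

# cls_logits:  [batch, num_anchors, num_classes] raw scores from the SSD head
# cls_targets: [batch, num_anchors, num_classes] one-hot ground-truth labels
# num_pos:     float scalar, number of positive (matched) anchors in the batch
def ssd_classification_loss(cls_logits, cls_targets, num_pos):
    # Replace softmax cross-entropy + hard negative mining with focal loss
    # over all anchors; focal loss handles the pos/neg imbalance itself.
    per_entry = focal_loss_sigmoid(cls_logits, cls_targets, alpha=0.25, gamma=2.0)
    # Normalize by the number of positive anchors, as in the RetinaNet paper.
    return tf.reduce_sum(per_entry) / tf.maximum(num_pos, 1.0)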

Focal loss value is very low

Hi,

I have implemented your code in PyTorch and it runs properly, but I have the following concerns.

My pseudo code works like this:
cls_targets = [batch_size, anchor_boxes]           # labels range from -1 to 20, e.g. [16, 67995]
cls_preds   = [batch_size, anchor_boxes, classes]  # classes is 21 (voc_labels + background), e.g. [16, 67995, 21]

Now I remove all the anchor boxes labeled -1 (ignored boxes):
cls_targets = [batch_size * valid_anchor_boxes]            # e.g. [54933]
cls_preds   = [batch_size * valid_anchor_boxes, classes]   # e.g. [54933, 21]; the targets become one-hot inside focal_loss

Now, I followed your code and implemented focal loss as it is, but my loss values come out very low: random weights give a loss of about 0.12, and it quickly drops to 0.0012 and smaller.

Is there something I am missing?

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

class FocalLoss_tensorflow(nn.Module):
    def __init__(self, num_classes=20,
                 focusing_param=2.0,
                 balance_param=0.25):
        super(FocalLoss_tensorflow, self).__init__()
        self.num_classes = num_classes
        self.focusing_param = focusing_param
        self.balance_param = balance_param

    def focal_loss(self, x, y):
        """ https://github.com/ailias/Focal-Loss-implement-on-Tensorflow/blob/master/focal_loss.py
        Everywhere people are just talking about num_classes, so let's remove
        the background class from the focal loss calculation.
        """
        x = x[:, 1:]  # drop the background column from the logits
        sigmoid_p = F.sigmoid(x)
        anchors, classes = x.shape

        # Build one-hot targets, then drop the background column as well.
        t = torch.FloatTensor(anchors, classes + 1)
        t.zero_()
        t.scatter_(1, y.data.cpu().view(-1, 1), 1)
        t = Variable(t[:, 1:]).cuda()

        zeros = Variable(torch.zeros(sigmoid_p.size())).cuda()
        pos_p_sub = ((t >= sigmoid_p).float() * (t - sigmoid_p)) + ((t < sigmoid_p).float() * zeros)
        neg_p_sub = ((t >= zeros).float() * zeros) + ((t <= zeros).float() * sigmoid_p)

        per_entry_cross_ent = \
            - self.balance_param * (pos_p_sub ** self.focusing_param) \
              * torch.log(torch.clamp(sigmoid_p, 1e-8, 1.0)) \
            - (1 - self.balance_param) * (neg_p_sub ** self.focusing_param) \
              * torch.log(torch.clamp(1.0 - sigmoid_p, 1e-8, 1.0))
        return per_entry_cross_ent.mean()

    def forward(self, loc_preds, loc_targets, cls_preds, cls_targets):
        batch_size, num_boxes = cls_targets.size()
        pos = cls_targets > 0
        num_pos = pos.data.long().sum()

        # Smooth L1 location loss over positive anchors only.
        mask = pos.unsqueeze(2).expand_as(loc_preds)
        masked_loc_preds = loc_preds[mask].view(-1, 4)
        masked_loc_targets = loc_targets[mask].view(-1, 4)
        loc_loss = F.smooth_l1_loss(masked_loc_preds, masked_loc_targets, size_average=False)
        loc_loss = loc_loss / num_pos

        # Focal classification loss over all non-ignored anchors.
        pos_neg = cls_targets > -1
        mask = pos_neg.unsqueeze(2).expand_as(cls_preds)
        masked_cls_preds = cls_preds[mask].view(-1, self.num_classes)
        cls_loss = self.focal_loss(masked_cls_preds, cls_targets[pos_neg])
        return loc_loss, cls_loss

Question 1:
I still don't quite get whether I should use 0 as my background class, and how the normalization is done when focal loss is applied.

Question 2:
I see you have taken mean(), but the paper says we need to sum and normalize by the number of positive anchors. Does "positive anchors" mean only the positive anchor boxes, or all valid anchor boxes? (A sketch of this normalization follows below.)

Question 3:
The graphs you presented show the loss starting from 6.45... and decreasing, but mine starts from 0.12 and quickly drops to small decimals.
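
Regarding Questions 2 and 3: mean() in focal_loss above averages over all valid_anchors x classes entries (about 54933 x 20, roughly 1.1M), which is one reason the values can come out so small; the RetinaNet paper instead sums the per-entry loss and divides by the number of positive anchors only. A minimal sketch of that change, reusing the names from the snippet above (and assuming num_pos is a plain Python int, as produced there):

# In focal_loss, return the unreduced per-entry loss instead of its mean:
#     return per_entry_cross_ent        # shape [valid_anchors, classes]

# Then, in forward(), normalize by the positive-anchor count:
per_entry = self.focal_loss(masked_cls_preds, cls_targets[pos_neg])
cls_loss = per_entry.sum() / max(num_pos, 1)  # sum / number of positive anchors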

What does array_ops.where mean?

Thanks for your sharing. I want to know what array_ops.where means. Is it similar to tf.where?
Besides, I use it in SSD-Tensorflow and the loss is very high, so I made some changes.

sigmoid_p = tf.nn.sigmoid(prediction_tensor)
zeros = array_ops.zeros_like(sigmoid_p, dtype=sigmoid_p.dtype)

t = tf.one_hot(indices=target_tensor, depth=4)
t_tensor = tf.cast(t, sigmoid_p.dtype)

pos_p_sub = tf.where(t_tensor > zeros, t_tensor - sigmoid_p, zeros)

Could you tell me why?
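
For what it's worth, in TF 1.x tf.where is the public name for the function defined in tensorflow.python.ops.array_ops, so the two behave identically. A quick check (assuming TF 1.x):

import tensorflow as tf
from tensorflow.python.ops import array_ops

# Element-wise selection: pick from x where cond is True, from y elsewhere.
cond = tf.constant([True, False, True])
x = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([10.0, 20.0, 30.0])

with tf.Session() as sess:
    a, b = sess.run([array_ops.where(cond, x, y), tf.where(cond, x, y)])
    print(a, b)  # [ 1. 20.  3.] [ 1. 20.  3.] -- identical results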

What is the meaning of the parameter α in the multi-class case?

In the original paper, the role of α is to increase the weight of the under-represented class: positive samples are weighted by α = 0.25 and negative samples by 1 - α = 0.75, thereby strengthening learning on the positive samples.
Here the model output goes through a sigmoid and then follows the paper's focal loss formula, so I am not sure what α means in this setting.
Since this α is identical for every class, it cannot treat the individual classes differently.
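
For reference, the paper's definition is FL(p_t) = -α_t * (1 - p_t)^γ * log(p_t), where p_t = p if y = 1 and p_t = 1 - p otherwise, and α_t = α if y = 1 and 1 - α otherwise. So α only re-weights positives against negatives; it does not distinguish one foreground class from another.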

Why add "neg_p_sub" into "per_entry_cross_ent"

For multi-label cases, I think there is no need to consider neg_p_sub when computing per_entry_cross_ent. We shouldn't treat every class as a binary classification and sum up their binary cross-entropy losses; instead we should take all classes as a whole and compute the multi-class cross-entropy loss.

The original code:

per_entry_cross_ent = - alpha * (pos_p_sub ** gamma) * tf.log(tf.clip_by_value(sigmoid_p, 1e-8, 1.0)) \
                          - (1 - alpha) * (neg_p_sub ** gamma) * tf.log(tf.clip_by_value(1.0 - sigmoid_p, 1e-8, 1.0))

I think it should be:

per_entry_cross_ent = - alpha * (pos_p_sub ** gamma) * tf.log(tf.clip_by_value(sigmoid_p, 1e-8, 1.0))
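
A minimal sketch of that multi-class alternative, assuming a one-hot target_tensor and TF 1.x (this follows the commenter's proposal, not the repo's implementation):

import tensorflow as tf

def focal_loss_softmax(prediction_tensor, target_tensor, alpha=0.25, gamma=2.0):
    # Treat all classes as a whole: softmax probabilities sum to 1 per row.
    softmax_p = tf.nn.softmax(prediction_tensor)
    # Keep only the probability of the true class (target_tensor is one-hot),
    # so only the positive term of the cross-entropy survives.
    p_t = tf.reduce_sum(target_tensor * softmax_p, axis=-1)
    return -alpha * ((1.0 - p_t) ** gamma) * tf.log(tf.clip_by_value(p_t, 1e-8, 1.0))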

Should tf.nn.sigmoid be replaced with tf.nn.softmax?

Thanks for your code!
I am confused by some of the source code.

https://github.com/ailias/Focal-Loss-implement-on-Tensorflow/blob/master/focal_loss.py

focal_loss.py line 22

sigmoid_p = tf.nn.sigmoid(prediction_tensor)
zeros = array_ops.zeros_like(sigmoid_p, dtype=sigmoid_p.dtype)

I think using tf.nn.sigmoid here to calculate the probability is not suitable.

This is the sigmoid function:

sigmoid(x) = 1 / (1 + exp(-x))

And this is the softmax function:

softmax(x)_i = exp(x_i) / sum_j exp(x_j)

Usually we use a softmax layer as the last layer to get the probabilities, so they sum to 1 and the margin between the outputs is increased.
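
A quick numeric illustration of the difference (NumPy; the logits are arbitrary):

import numpy as np

logits = np.array([2.0, 1.0, 0.1])

# Sigmoid scores each class independently; the results need not sum to 1.
sigmoid = 1.0 / (1.0 + np.exp(-logits))
print(sigmoid, sigmoid.sum())   # [0.881 0.731 0.525]  sum ~ 2.137

# Softmax couples the classes into one distribution that sums to 1.
softmax = np.exp(logits) / np.exp(logits).sum()
print(softmax, softmax.sum())   # [0.659 0.242 0.099]  sum = 1.0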


How to implement it in Faster R-CNN?

@ailias
Hi, thanks for your cool code, but I am a little confused about how to use it. I want to implement it in Faster R-CNN: should I just change the softmax loss into focal loss, or something else? Can you tell me how to use it?
Thanks so much.
