
symmetric_cross_entropy_for_noisy_labels's Introduction

Symmetric Learning (SL) via Symmetric Cross Entropy (SCE) loss

Code for the ICCV 2019 paper "Symmetric Cross Entropy for Robust Learning with Noisy Labels": https://arxiv.org/abs/1908.06112

Requirements

  • Python 3.5.2
  • Tensorflow 1.10.1
  • Keras 2.2.2

Usage

Simply run the code with python3 train_models.py

It can be configured with the dataset, model, number of epochs, batch size, noise rate, and symmetric or asymmetric noise type.

The PyTorch reimplementation

The PyTorch version was implemented by Hanxun Huang. The code can be found here: https://github.com/HanxunHuangLemonBear/SCELoss-Reproduce

Citing this work

If you use this code in your work, please cite the accompanying paper:

@inproceedings{wang2019symmetric,
  title={Symmetric cross entropy for robust learning with noisy labels},
  author={Wang, Yisen and Ma, Xingjun and Chen, Zaiyi and Luo, Yuan and Yi, Jinfeng and Bailey, James},
  booktitle={IEEE International Conference on Computer Vision},
  year={2019}
}

symmetric_cross_entropy_for_noisy_labels's People

Contributors

xingjunm, yisenwang


symmetric_cross_entropy_for_noisy_labels's Issues

Is the accuracy reported in tables 1, 2, and 3 from the best epoch or the last epoch?

Dear authors,
Thanks for your interesting work.
Since we usually train a neural network for many epochs, and after each epoch we may test the network and obtain a test accuracy, the highest test accuracy often occurs at some intermediate epoch rather than at the last one.

In your paper, is the accuracy reported in tables 1, 2, and 3 from the best epoch or the last epoch?
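For concreteness, here is a small Keras callback sketch (hypothetical code, not from this repo; it assumes the model is compiled with metrics=['accuracy']) that records both of the numbers the question distinguishes:

from keras.callbacks import Callback

class BestAndLastAccuracy(Callback):
    """Track both the best and the most recent test accuracy."""

    def __init__(self, x_test, y_test):
        super(BestAndLastAccuracy, self).__init__()
        self.x_test, self.y_test = x_test, y_test
        self.best_acc = 0.0   # the "best epoch" number
        self.last_acc = 0.0   # the "last epoch" number

    def on_epoch_end(self, epoch, logs=None):
        # Evaluate on the test set after every training epoch.
        _, acc = self.model.evaluate(self.x_test, self.y_test, verbose=0)
        self.last_acc = acc
        self.best_acc = max(self.best_acc, acc)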

How to tune the alpha and beta of the SCE loss?

Hi, I want to apply the SCE loss to other visual tasks, such as semantic segmentation. How should I go about tuning the alpha and beta of the SCE loss? Also, I can't find the parameter A from the paper in the code.
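For reference, the SCE loss has the form alpha * CE + beta * RCE. Below is a minimal Keras-backend sketch (my own reading, with hypothetical names, not the repo's exact code). In implementations of this kind, the paper's A, the finite value substituted for log 0 in the RCE term, typically shows up implicitly as the clipping floor applied to the one-hot labels, i.e. log(1e-4) ≈ -9.2, rather than as a named variable:

import keras.backend as K

def sce_loss(alpha, beta):
    """Sketch of SCE: alpha * cross entropy + beta * reverse cross entropy."""
    def loss(y_true, y_pred):
        # Cross entropy -sum(q * log(p)); predictions clipped for stability.
        ce = -K.sum(y_true * K.log(K.clip(y_pred, 1e-7, 1.0)), axis=-1)
        # Reverse cross entropy -sum(p * log(q)); flooring the one-hot
        # labels at 1e-4 plays the role of the paper's A = log(1e-4).
        rce = -K.sum(y_pred * K.log(K.clip(y_true, 1e-4, 1.0)), axis=-1)
        return alpha * ce + beta * rce
    return loss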

Tabular data / noisy instances

Hi,
Thanks for sharing your implementation. I have two questions about it:

  1. Does it also work on tabular data?
  2. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

Why are the results reported in SL and D2L different?

Hi @YisenWang @xingjunm
I happened to find that you are authors of both SL and D2L.
It is common for results to differ when they come from different authors, because of implementation and training details.
However, since you are the authors of both SL and D2L, I am wondering why the reported results on CIFAR-100 with the same network (ResNet-44) are quite different.

CIFAR-100 (results from D2L, ICML 2018):

Noise rate | cross-entropy | forward   | backward  | boot-hard | boot-soft | D2L
0%         | 68.20±0.2     | 68.54±0.3 | 68.48±0.3 | 68.31±0.2 | 67.89±0.2 | 68.60±0.3
20%        | 52.88±0.2     | 60.25±0.2 | 58.74±0.3 | 58.49±0.4 | 57.32±0.3 | 62.20±0.4
40%        | 42.85±0.2     | 51.27±0.3 | 45.42±0.2 | 44.41±0.1 | 41.87±0.1 | 52.01±0.3
60%        | 30.09±0.2     | 41.22±0.3 | 34.49±0.2 | 36.65±0.3 | 32.29±0.1 | 42.27±0.2

CIFAR-100 (results from SL, ICCV 2019; columns are noise rates):

Method    | 0.0          | 0.2          | 0.4          | 0.6
CE        | 64.34 ± 0.37 | 59.26 ± 0.39 | 50.82 ± 0.19 | 25.39 ± 0.09
LSR       | 63.68 ± 0.54 | 58.83 ± 0.40 | 50.05 ± 0.31 | 24.68 ± 0.43
Bootstrap | 63.26 ± 0.39 | 57.91 ± 0.42 | 48.17 ± 0.18 | 12.27 ± 0.11
Forward   | 63.99 ± 0.52 | 59.75 ± 0.34 | 53.13 ± 0.28 | 24.70 ± 0.26
D2L       | 64.60 ± 0.31 | 59.20 ± 0.43 | 52.01 ± 0.37 | 35.27 ± 0.28
GCE       | 64.43 ± 0.20 | 59.06 ± 0.27 | 53.25 ± 0.65 | 36.16 ± 0.74
SL        | 66.75 ± 0.04 | 60.01 ± 0.19 | 53.69 ± 0.07 | 41.47 ± 0.04

The results reported in SL (ICCV 2019) are generally lower than those reported in D2L (ICML 2018).

I would like to compare against your reported results, and I am wondering which set to use.
Thanks.

Maybe there is something wrong with your code

[screenshot of the loss implementation]
To my understanding, you only want to clip "y_pred_1" and "y_true_2" for calculating \ell_rce, and expect "y_pred_2" and "y_true_1" to keep the original values for calculating \ell_ce. However, Python handles variables as references: when you clip "y_pred_1" and "y_true_2" in place, the variables "y_pred_2" and "y_true_1" are changed as well.
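To make the aliasing concern concrete, here is a small NumPy illustration (hypothetical names mirroring the issue): plain assignment copies the reference, not the array, so an in-place operation through one name is visible through the other, while a functional clip returns a fresh array.

import numpy as np

y_pred_1 = np.array([0.0, 0.3, 0.7])
y_pred_2 = y_pred_1                          # two names, same array

np.clip(y_pred_1, 1e-4, 1.0, out=y_pred_1)   # in-place clip
print(y_pred_2)                              # shows the clipped values too

y_pred_2 = np.clip(y_pred_1, 1e-4, 1.0)      # functional clip: new array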

About alpha and beta

Why are the settings of alpha/beta for CIFAR-10 and CIFAR-100 so vastly different?
For CIFAR-10, the setting is α = 0.1, β = 1.0.
For CIFAR-100, the setting is α = 6.0, β = 0.1.

How do you create asymmetric label noise on CIFAR-100?

import numpy as np

np.random.seed(123)

NUM_CLASSES = {'mnist': 10, 'svhn': 10, 'cifar-10': 10, 'cifar-100': 100}

noise_ratio = 40
n = noise_ratio / 100.0

# Presumably the 5 subclasses per CIFAR-100 superclass.
nb_subclasses = 5


def build_for_cifar100(size, noise):
    """Random flip between two random classes."""
    assert (noise >= 0.) and (noise <= 1.)

    # Start from the identity matrix (labels unchanged), then pick two
    # distinct classes and let each flip to the other with probability
    # `noise`.
    P = np.eye(size)
    cls1, cls2 = np.random.choice(range(size), size=2, replace=False)
    P[cls1, cls2] = noise
    P[cls2, cls1] = noise
    P[cls1, cls1] = 1.0 - noise
    P[cls2, cls2] = 1.0 - noise

    # Every row sums to 1, so P is a valid transition matrix.
    return P


P = build_for_cifar100(nb_subclasses, n)
print(np.matrix(P))

[[1.  0.  0.  0.  0. ]
 [0.  0.6 0.  0.4 0. ]
 [0.  0.  1.  0.  0. ]
 [0.  0.4 0.  0.6 0. ]
 [0.  0.  0.  0.  1. ]]

I do not understand this noise-transition matrix.

Thanks.
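As a point of reference for the question above (a sketch under my own reading, not necessarily how this repo applies P): row i of a noise-transition matrix gives the distribution of the observed label when the true label is i, so with the matrix printed above, classes 1 and 3 are flipped into each other with probability 0.4 and the remaining classes are left untouched. Continuing from the snippet above:

# Resample each clean label y according to row y of P (hypothetical usage).
rng = np.random.RandomState(0)
y_clean = rng.randint(0, nb_subclasses, size=10)
y_noisy = np.array([rng.choice(nb_subclasses, p=P[c]) for c in y_clean])
print(y_clean)
print(y_noisy)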

Asymmetric noise

Hi, what are the optimal values for alpha and beta in case my noise rate is around 0.2 and the noise type is asymmetric?

Multi-label scenario

By replacing the categorical cross entropy with binary cross entropy and a sigmoid activation, can this be extended to the multi-label case?

On a first attempt with this, even with a very small beta value it doesn't seem to converge quickly.
Is this expected behavior?
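One plausible sketch of this extension (my own construction, not from the paper or this repo): keep the alpha * CE + beta * RCE structure but apply it per label with binary cross entropy, clipping the hard targets so the reverse term stays finite.

import keras.backend as K

def symmetric_binary_cross_entropy(alpha, beta):
    """Hypothetical multi-label SCE variant: per-label BCE plus reverse BCE."""
    def loss(y_true, y_pred):
        y_pred = K.clip(y_pred, 1e-7, 1.0 - 1e-7)
        # Forward BCE per label: -[q log p + (1 - q) log(1 - p)].
        ce = -(y_true * K.log(y_pred)
               + (1.0 - y_true) * K.log(1.0 - y_pred))
        # Reverse BCE: swap prediction and target; clip the hard targets
        # so log(0) becomes log(1e-4), mirroring the RCE clipping.
        y_true_c = K.clip(y_true, 1e-4, 1.0 - 1e-4)
        rce = -(y_pred * K.log(y_true_c)
                + (1.0 - y_pred) * K.log(1.0 - y_true_c))
        return K.sum(alpha * ce + beta * rce, axis=-1)
    return loss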
