
symmetric_cross_entropy_for_noisy_labels's Introduction

Symmetric Learning (SL) via Symmetric Cross Entropy (SCE) loss

Code for the ICCV 2019 paper "Symmetric Cross Entropy for Robust Learning with Noisy Labels": https://arxiv.org/abs/1908.06112

Requirements

  • Python 3.5.2
  • Tensorflow 1.10.1
  • Keras 2.2.2

Usage

Simply run the code with python3 train_models.py

It can be configured with the dataset, model, number of epochs, batch size, noise rate, and symmetric or asymmetric noise type.

The PyTorch reimplementation

The PyTorch version was implemented by Hanxun Huang. The code can be found here: https://github.com/HanxunHuangLemonBear/SCELoss-Reproduce

Citing this work

If you use this code in your work, please cite the accompanying paper:

@inproceedings{wang2019symmetric,
  title={Symmetric cross entropy for robust learning with noisy labels},
  author={Wang, Yisen and Ma, Xingjun and Chen, Zaiyi and Luo, Yuan and Yi, Jinfeng and Bailey, James},
  booktitle={IEEE International Conference on Computer Vision},
  year={2019}
}

symmetric_cross_entropy_for_noisy_labels's People

Contributors

xingjunm, yisenwang


symmetric_cross_entropy_for_noisy_labels's Issues

Is the accuracy reported in tables 1, 2, and 3 from the best epoch or the last epoch?

Dear authors,
Thanks for your interesting work.
Since we usually train a neural network for many epochs, and after each epoch we may test the network and obtain a test accuracy, the highest test accuracy often occurs at some intermediate epoch rather than at the last one.

In your paper, is the accuracy reported in tables 1, 2, and 3 from the best epoch or the last epoch?
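For concreteness, here is a small Keras callback sketch (hypothetical code, not from this repo; it assumes the model is compiled with metrics=['accuracy']) that records both of the numbers the question distinguishes:

from keras.callbacks import Callback

class BestAndLastAccuracy(Callback):
    """Track both the best and the most recent test accuracy."""

    def __init__(self, x_test, y_test):
        super(BestAndLastAccuracy, self).__init__()
        self.x_test, self.y_test = x_test, y_test
        self.best_acc = 0.0   # the "best epoch" number
        self.last_acc = 0.0   # the "last epoch" number

    def on_epoch_end(self, epoch, logs=None):
        # Evaluate on the test set after every training epoch.
        _, acc = self.model.evaluate(self.x_test, self.y_test, verbose=0)
        self.last_acc = acc
        self.best_acc = max(self.best_acc, acc)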

How to tune the alpha and beta of the SCE loss?

Hi, I want to apply the SCE loss to other visual tasks, such as semantic segmentation. How should I go about tuning the alpha and beta of the SCE loss? Also, I can't find the parameter A from the paper in the code.
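For reference, the SCE loss has the form alpha * CE + beta * RCE. Below is a minimal Keras-backend sketch (my own reading, with hypothetical names, not the repo's exact code). In implementations of this kind, the paper's A, the finite value substituted for log 0 in the RCE term, typically shows up implicitly as the clipping floor applied to the one-hot labels, i.e. log(1e-4) ≈ -9.2, rather than as a named variable:

import keras.backend as K

def sce_loss(alpha, beta):
    """Sketch of SCE: alpha * cross entropy + beta * reverse cross entropy."""
    def loss(y_true, y_pred):
        # Cross entropy -sum(q * log(p)); predictions clipped for stability.
        ce = -K.sum(y_true * K.log(K.clip(y_pred, 1e-7, 1.0)), axis=-1)
        # Reverse cross entropy -sum(p * log(q)); flooring the one-hot
        # labels at 1e-4 plays the role of the paper's A = log(1e-4).
        rce = -K.sum(y_pred * K.log(K.clip(y_true, 1e-4, 1.0)), axis=-1)
        return alpha * ce + beta * rce
    return loss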

Tabular data / noisy instances

Hi,
Thanks for sharing your implementation. I have two questions about it:

  1. Does it also work on tabular data?
  2. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

Why are the results reported in SL and D2L different?

Hi @YisenWang @xingjunm
I happened to find that you are authors of both SL and D2L.
It is common for results to differ when they come from different authors, because of implementation and training details.
However, since you are the authors of both SL and D2L, I am wondering why the reported results on CIFAR-100 with the same network (ResNet-44) are quite different.

CIFAR-100 (results from D2L, ICML 2018):

Noise rate | cross-entropy | forward   | backward  | boot-hard | boot-soft | D2L
0%         | 68.20±0.2     | 68.54±0.3 | 68.48±0.3 | 68.31±0.2 | 67.89±0.2 | 68.60±0.3
20%        | 52.88±0.2     | 60.25±0.2 | 58.74±0.3 | 58.49±0.4 | 57.32±0.3 | 62.20±0.4
40%        | 42.85±0.2     | 51.27±0.3 | 45.42±0.2 | 44.41±0.1 | 41.87±0.1 | 52.01±0.3
60%        | 30.09±0.2     | 41.22±0.3 | 34.49±0.2 | 36.65±0.3 | 32.29±0.1 | 42.27±0.2

CIFAR-100 (results from SL, ICCV 2019; columns are noise rates):

Method    | 0.0          | 0.2          | 0.4          | 0.6
CE        | 64.34 ± 0.37 | 59.26 ± 0.39 | 50.82 ± 0.19 | 25.39 ± 0.09
LSR       | 63.68 ± 0.54 | 58.83 ± 0.40 | 50.05 ± 0.31 | 24.68 ± 0.43
Bootstrap | 63.26 ± 0.39 | 57.91 ± 0.42 | 48.17 ± 0.18 | 12.27 ± 0.11
Forward   | 63.99 ± 0.52 | 59.75 ± 0.34 | 53.13 ± 0.28 | 24.70 ± 0.26
D2L       | 64.60 ± 0.31 | 59.20 ± 0.43 | 52.01 ± 0.37 | 35.27 ± 0.28
GCE       | 64.43 ± 0.20 | 59.06 ± 0.27 | 53.25 ± 0.65 | 36.16 ± 0.74
SL        | 66.75 ± 0.04 | 60.01 ± 0.19 | 53.69 ± 0.07 | 41.47 ± 0.04

The results reported in SL (ICCV 2019) are generally lower than those reported in D2L (ICML 2018).

I would like to compare against your reported results, and I am wondering which set to use.
Thanks.

Maybe there is something wrong with your code

[screenshot of the loss implementation]
To my understanding, you only want to clip "y_pred_1" and "y_true_2" for calculating \ell_rce, and expect "y_pred_2" and "y_true_1" to keep the original values for calculating \ell_ce. However, Python handles variables as references: when you clip "y_pred_1" and "y_true_2" in place, the variables "y_pred_2" and "y_true_1" are changed as well.
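To make the aliasing concern concrete, here is a small NumPy illustration (hypothetical names mirroring the issue): plain assignment copies the reference, not the array, so an in-place operation through one name is visible through the other, while a functional clip returns a fresh array.

import numpy as np

y_pred_1 = np.array([0.0, 0.3, 0.7])
y_pred_2 = y_pred_1                          # two names, same array

np.clip(y_pred_1, 1e-4, 1.0, out=y_pred_1)   # in-place clip
print(y_pred_2)                              # shows the clipped values too

y_pred_2 = np.clip(y_pred_1, 1e-4, 1.0)      # functional clip: new array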

About alpha and beta

Why are the settings of alpha/beta for CIFAR-10 and CIFAR-100 so vastly different?
For CIFAR-10, the setting is α = 0.1, β = 1.0.
For CIFAR-100, the setting is α = 6.0, β = 0.1.

How do you create asymmetric label noise on CIFAR-100?

import numpy as np

np.random.seed(123)

NUM_CLASSES = {'mnist': 10, 'svhn': 10, 'cifar-10': 10, 'cifar-100': 100}

noise_ratio = 40
n = noise_ratio / 100.0

# Presumably the 5 subclasses per CIFAR-100 superclass.
nb_subclasses = 5


def build_for_cifar100(size, noise):
    """Random flip between two random classes."""
    assert (noise >= 0.) and (noise <= 1.)

    # Start from the identity matrix (labels unchanged), then pick two
    # distinct classes and let each flip to the other with probability
    # `noise`.
    P = np.eye(size)
    cls1, cls2 = np.random.choice(range(size), size=2, replace=False)
    P[cls1, cls2] = noise
    P[cls2, cls1] = noise
    P[cls1, cls1] = 1.0 - noise
    P[cls2, cls2] = 1.0 - noise

    # Every row sums to 1, so P is a valid transition matrix.
    return P


P = build_for_cifar100(nb_subclasses, n)
print(np.matrix(P))

[[1.  0.  0.  0.  0. ]
 [0.  0.6 0.  0.4 0. ]
 [0.  0.  1.  0.  0. ]
 [0.  0.4 0.  0.6 0. ]
 [0.  0.  0.  0.  1. ]]

I do not understand this noise-transition matrix.

Thanks.
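As a point of reference for the question above (a sketch under my own reading, not necessarily how this repo applies P): row i of a noise-transition matrix gives the distribution of the observed label when the true label is i, so with the matrix printed above, classes 1 and 3 are flipped into each other with probability 0.4 and the remaining classes are left untouched. Continuing from the snippet above:

# Resample each clean label y according to row y of P (hypothetical usage).
rng = np.random.RandomState(0)
y_clean = rng.randint(0, nb_subclasses, size=10)
y_noisy = np.array([rng.choice(nb_subclasses, p=P[c]) for c in y_clean])
print(y_clean)
print(y_noisy)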

Asymmetric noise

Hi, what are the optimal values for alpha and beta in case my noise rate is around 0.2 and the noise type is asymmetric?

Multi-label scenario

By replacing the categorical cross entropy with binary cross entropy and a sigmoid activation, can this be extended to the multi-label case?

On a first attempt with this, even with a very small beta value it doesn't seem to converge quickly.
Is this expected behavior?
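One plausible sketch of this extension (my own construction, not from the paper or this repo): keep the alpha * CE + beta * RCE structure but apply it per label with binary cross entropy, clipping the hard targets so the reverse term stays finite.

import keras.backend as K

def symmetric_binary_cross_entropy(alpha, beta):
    """Hypothetical multi-label SCE variant: per-label BCE plus reverse BCE."""
    def loss(y_true, y_pred):
        y_pred = K.clip(y_pred, 1e-7, 1.0 - 1e-7)
        # Forward BCE per label: -[q log p + (1 - q) log(1 - p)].
        ce = -(y_true * K.log(y_pred)
               + (1.0 - y_true) * K.log(1.0 - y_pred))
        # Reverse BCE: swap prediction and target; clip the hard targets
        # so log(0) becomes log(1e-4), mirroring the RCE clipping.
        y_true_c = K.clip(y_true, 1e-4, 1.0 - 1e-4)
        rce = -(y_pred * K.log(y_true_c)
                + (1.0 - y_pred) * K.log(1.0 - y_true_c))
        return K.sum(alpha * ce + beta * rce, axis=-1)
    return loss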
