
Comments (5)

**hellbell** commented on July 28, 2024

@rederxz
Thank you for your constructive feedback.
As pointed out in our paper, we tried two alphas (0.5 and 1.0) for mixup training and chose alpha=1.0 because it showed better performance than 0.5. So we didn't try alphas below 0.5, but as you said, it would be worth finding the optimal alpha for mixup.
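For context, mixup draws its mixing weight `lam` from Beta(alpha, alpha), so alpha directly controls how aggressively image pairs are blended. Below is a minimal sketch of the standard batch-level recipe; the name `mixup_batch` matches the naming used in the tables later in this thread and is illustrative, not this repo's API:

```python
import numpy as np
import torch

def mixup_batch(x, y, alpha=1.0):
    """Blend each image with a randomly paired one from the same batch.

    lam ~ Beta(alpha, alpha): alpha=1.0 makes lam uniform on [0, 1],
    while alpha=0.2 pushes lam toward 0 or 1, i.e. much milder mixing.
    """
    lam = np.random.beta(alpha, alpha) if alpha > 0 else 1.0
    index = torch.randperm(x.size(0), device=x.device)
    mixed_x = lam * x + (1.0 - lam) * x[index]
    return mixed_x, y, y[index], lam

# The loss is interpolated with the same lam:
#   pred = model(mixed_x)
#   loss = lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)
```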

> In fact, after doing some experiments, we found that the performance of mixup and CutMix can be close on ImageNet with their respective preferred alpha settings (0.2 and 1.0).

Could you give more detail about this, such as the accuracy, training settings, and so on?

> Have you tried any related experiments, and what do you think about it?

As I remember, in our training settings CutMix was always better than mixup for ResNet variants, regardless of the alpha value.
However, for lightweight architectures like the EfficientNet variants, mixup and CutMix show similar performance gains.
So I think there should be a better strategy. Some recent works (e.g., https://arxiv.org/pdf/2012.12877.pdf) use mixup and CutMix at the same time for a performance boost.
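A rough sketch of that per-batch switching strategy, reusing `mixup_batch` from the sketch above plus a minimal CutMix op following the paper's formulation; `switch_prob` and the alpha defaults here are illustrative, not tuned values from this thread:

```python
import numpy as np
import torch

def rand_bbox(size, lam):
    """Sample a random box covering a (1 - lam) fraction of the image."""
    W, H = size[3], size[2]
    cut_rat = np.sqrt(1.0 - lam)
    cut_w, cut_h = int(W * cut_rat), int(H * cut_rat)
    cx, cy = np.random.randint(W), np.random.randint(H)
    x1, x2 = np.clip(cx - cut_w // 2, 0, W), np.clip(cx + cut_w // 2, 0, W)
    y1, y2 = np.clip(cy - cut_h // 2, 0, H), np.clip(cy + cut_h // 2, 0, H)
    return x1, y1, x2, y2

def cutmix_batch(x, y, alpha=1.0):
    """Paste a random crop from a shuffled copy of the batch."""
    lam = np.random.beta(alpha, alpha)
    index = torch.randperm(x.size(0), device=x.device)
    x1, y1, x2, y2 = rand_bbox(x.size(), lam)
    x[:, :, y1:y2, x1:x2] = x[index, :, y1:y2, x1:x2]
    # Recompute lam from the actual pasted area (clipping can shrink the box).
    lam = 1.0 - (x2 - x1) * (y2 - y1) / (x.size(-1) * x.size(-2))
    return x, y, y[index], lam

def mixup_or_cutmix(x, y, mixup_alpha=0.2, cutmix_alpha=1.0, switch_prob=0.5):
    """Apply exactly one of the two ops per batch, each with its preferred alpha."""
    if np.random.rand() < switch_prob:
        return mixup_batch(x, y, alpha=mixup_alpha)
    return cutmix_batch(x, y, alpha=cutmix_alpha)
```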


**rederyang** commented on July 28, 2024

Thanks for your reply! Some of our experiments are still in progress. I will upload detailed results in a few days.


**rederyang** commented on July 28, 2024

Here are the results.

Our experiments:

| model (resolution) | augmentation | regularization | batch size | optimizer | lr | epochs | lr schedule | wd | top-1 acc (%) | reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet_vd-50 (160) | ResizedCrop | label smooth 0.1, mixup_batch alpha=0.2 | 256 * 4 | SGD | 0.1 * 4 | 200 | Cosine | 0.0001 | 78.58 | / |
| ResNet_vd-50 (160) | ResizedCrop | label smooth 0.1, mixup_batch alpha=1.0 | 256 * 4 | SGD | 0.1 * 4 | 200 | Cosine | 0.0001 | 77.55 | / |
| ResNet_vd-50 (160) | ResizedCrop | label smooth 0.1, cutmix_batch alpha=1.0 | 256 * 4 | SGD | 0.1 * 4 | 200 | Cosine | 0.0001 | 78.43 | / |

| model (resolution) | augmentation | regularization | batch size | optimizer | lr | epochs | lr schedule | wd | top-1 acc (%) | reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet_vd-50 avd (160) | ResizedCrop | label smooth 0.1, cutmix_batch alpha=0.2 | 256 * 4 | SGD | 0.1 * 4 | 200 | Cosine | 0.0001 | 79.13 | / |
| ResNet_vd-50 avd (160) | ResizedCrop | label smooth 0.1, cutmix_batch alpha=1.0 | 256 * 4 | SGD | 0.1 * 4 | 200 | Cosine | 0.0001 | 78.68 | / |

| model (resolution) | augmentation | regularization | batch size | optimizer | lr | epochs | lr schedule | wd | top-1 acc (%) | reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet_vd-50 (224) | ResizedCrop | mixup_batch alpha=0.2 | 256 * 4 | SGD | 0.1 * 4 | 300 | Cosine | 0.0001 | 79.00 | / |
| ResNet_vd-50 (224) | ResizedCrop | mixup_batch alpha=1.0 | 256 * 4 | SGD | 0.1 * 4 | 300 | Cosine | 0.0001 | 78.44 | / |
| ResNet_vd-50 (224) | ResizedCrop | cutmix_batch alpha=0.2 | 256 * 4 | SGD | 0.1 * 4 | 300 | Cosine | 0.0001 | 79.15 | / |
| ResNet_vd-50 (224) | ResizedCrop | cutmix_batch alpha=1.0 | 256 * 4 | SGD | 0.1 * 4 | 300 | Cosine | 0.0001 | 79.17 | / |

Results from PaddleClas:

| model (resolution) | augmentation | regularization | batch size | optimizer | lr | epochs | lr schedule | wd | top-1 acc (%) | reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 (224) | ResizedCrop | mixup_batch alpha=0.2 | 256 | SGD | 0.1 | 300 | Cosine | 0.0001 | 78.28 | page |
| ResNet-50 (224) | ResizedCrop | cutmix_batch alpha=0.2 | 256 | SGD | 0.1 | 300 | Cosine | 0.0001 | 78.39 | page |

Experiments from the CutMix paper:

| model (resolution) | augmentation | regularization | batch size | optimizer | lr | epochs | lr schedule | wd | top-1 acc (%) | reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 (224) | ResizedCrop | mixup_batch alpha=1.0 | 256 | SGD | 0.1 | 300 | Step | 0.0001 | 77.42 | CutMix paper |
| ResNet-50 (224) | ResizedCrop | cutmix_batch alpha=1.0 | 256 | SGD | 0.1 | 300 | Step | 0.0001 | 78.60 | CutMix paper |

We can see that mixup performs better with alpha=0.2 than with alpha=1.0. Also, the gap between mixup and CutMix becomes smaller at alpha=0.2, which is also confirmed by the results from PaddleClas.
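This is consistent with how Beta(alpha, alpha) behaves: at alpha=0.2 most of the mass sits near 0 and 1, so in most batches the dominant image keeps nearly all of its weight, while alpha=1.0 makes lam uniform and interpolates heavily every time. A quick illustrative check:

```python
import numpy as np

rng = np.random.default_rng(0)
for alpha in (0.2, 1.0):
    lam = rng.beta(alpha, alpha, size=100_000)
    # Fraction of draws where the dominant image keeps at least 90% weight.
    near_pure = np.mean(np.maximum(lam, 1.0 - lam) >= 0.9)
    print(f"alpha={alpha}: {near_pure:.0%} of batches are nearly unmixed")
```

Since CutMix pastes a whole region instead of blending pixel intensities, it arguably tolerates the heavy mixing that alpha=1.0 produces, which may be why its preferred alpha is larger than mixup's.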


**hellbell** commented on July 28, 2024

Thank you for sharing the results! These are great experiments.
Given your results, I agree that alpha should be 0.2 for mixup in ImageNet experiments. If there's a chance to revise or extend our paper, this information would be very useful :)
At the same time, I'm curious about the CutMix result with alpha=1.0 in the PaddleClas table. I'd guess its performance would be better than with alpha=0.2.
Thanks!


**rederyang** commented on July 28, 2024

I agree that alpha influences the performance of both CutMix and mixup, and that it may have a greater impact on mixup on ImageNet.
Thanks for your reply! :smiley:

