Hi, I am trying to train a ConvNeXt on CIFAR-10 for a research projec


Hyperparameter setting for training from scratch on CIFAR-10 about convnext HOT 9 OPEN

Yuancheng-Xu commented on August 25, 2024

Hyperparameter setting for training from scratch on CIFAR-10

from convnext.

Comments (9)

Yuancheng-Xu commented on August 25, 2024

Thanks a lot!

slerman12 commented on August 25, 2024

I was also wondering about this. It seems the 32x32 size of CIFAR-10 is incompatible with this model due to the down-sampling layers.
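
For anyone who wants to check the arithmetic, here is a minimal sketch in plain Python. It assumes the default layout in models/convnext.py (a 4x4 stride-4 stem followed by three 2x2 stride-2 downsampling layers, all without padding). The helper names `conv_out_size` and `trace_resolutions` are made up for illustration and are not part of the repository.

```python
def conv_out_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a convolution: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resolutions(input_size: int, downsamplers: list[tuple[int, int]]) -> list[int]:
    """Return the feature-map side length after each (kernel, stride) downsampling layer."""
    sizes = []
    size = input_size
    for kernel, stride in downsamplers:
        size = conv_out_size(size, kernel, stride)
        sizes.append(size)
    return sizes


# Assumed default layout: a 4x4 stride-4 stem, then three 2x2 stride-2 layers.
DEFAULT_DOWNSAMPLERS = [(4, 4), (2, 2), (2, 2), (2, 2)]

print(trace_resolutions(224, DEFAULT_DOWNSAMPLERS))  # [56, 28, 14, 7]
print(trace_resolutions(32, DEFAULT_DOWNSAMPLERS))   # [8, 4, 2, 1]
```

At 32x32 the last stage is left with a 1x1 feature map, so a 7x7 depthwise convolution with padding 3 covers one real pixel and 48 padded positions there.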

shamikbose commented on August 25, 2024

@Yuancheng-Xu It seems like it can. The downsampling layers should be set to a smaller kernel size and stride (2 and 2, respectively). Without this change, the output of the downsampling layers is effectively the same size as the kernel.
In addition, you might want to choose a smaller kernel and padding size for the convolutional layers in Block.
Here's a notebook showing the training progress: https://juliusruseckas.github.io/ml/convnext-cifar10.html

shamikbose commented on August 25, 2024

@Yuancheng-Xu I managed to get accuracy to 87% by making a few changes to the code in the link above. The basic changes are described in this repository: https://github.com/shamikbose/Fujitsu_Assessment
The main changes were as follows:

The downsampling convolutional layers were modified (4x4 -> 2x2) for the smaller image size in the dataset. This improved accuracy from 70% to 80%.

Keeping CIFAR-10 training recipes in mind, the architecture was modified to be a 3-block architecture instead of a 4-block one. This improved accuracy from 80% to 85%.

The kernel size was changed (7 -> 3). This improved accuracy from 85% to 87%.
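
As a rough illustration of what these three changes do to the feature-map sizes, here is a small sketch in plain Python. It is not taken from the linked repository: `SmallImageConfig` is a hypothetical name, and the sketch assumes that "3-block" means three stages, that the stem stride equals its kernel size, and that stages are separated by 2x2 stride-2 layers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmallImageConfig:
    """Hypothetical summary of the three changes (assumed interpretation)."""
    stem_kernel: int = 2    # was 4; stride is assumed equal to the kernel size
    num_stages: int = 3     # was 4
    block_kernel: int = 3   # depthwise kernel in each block, was 7

    @property
    def block_padding(self) -> int:
        # "Same" padding for an odd kernel keeps the resolution unchanged inside a stage.
        return self.block_kernel // 2

    def stage_resolutions(self, input_size: int) -> list[int]:
        """Side length of the feature map in each stage."""
        size = input_size // self.stem_kernel  # stem: kernel == stride, no padding
        sizes = [size]
        for _ in range(self.num_stages - 1):
            size //= 2  # assumed 2x2 stride-2 layer between stages
            sizes.append(size)
        return sizes


modified = SmallImageConfig()
original = SmallImageConfig(stem_kernel=4, num_stages=4, block_kernel=7)

print(modified.stage_resolutions(32))  # [16, 8, 4]
print(original.stage_resolutions(32))  # [8, 4, 2, 1]
print(modified.block_padding)          # 1
```

Under these assumptions the smallest feature map is 4x4 instead of 1x1, and a 3x3 kernel with padding 1 still fits inside it.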

iamsh4shank commented on August 25, 2024

Hey @shamikbose, I tried training on the ImageNet100 dataset with a custom input_size = 32, but the accuracy I am getting is too low. What could I change in the architecture (I already tried making the kernel and stride smaller)? Is there any other approach that might help me get good accuracy?

shamikbose commented on August 25, 2024

@iamsh4shank The parameters used for ImageNet100 are mentioned in the paper. You should be able to reproduce it using those values.

iamsh4shank commented on August 25, 2024

Actually, I guess those were for input_size 224, but after changing it to 32 the accuracy I get is really low.

shamikbose commented on August 25, 2024

With image size 32, try the parameters mentioned here #134 (comment)

iamsh4shank commented on August 25, 2024

I did try changing the Conv layer (https://github.com/facebookresearch/ConvNeXt/blob/main/models/convnext.py#L28) to kernel size 3 and padding 1. I also changed the downsampling layer (https://github.com/facebookresearch/ConvNeXt/blob/main/models/convnext.py#L74) to kernel size 2 and stride 2. It did not change the accuracy much. I am getting a test accuracy of around 4-5 percent.
