
Keras EfficientNetV2


Summary

V2 Model   Params   Top1   Input   ImageNet21K    Imagenet21k-ft1k    Imagenet
EffV2B0    7.1M     78.7   224     v2b0-21k.h5    v2b0-21k-ft1k.h5    v2b0-imagenet.h5
EffV2B1    8.1M     79.8   240     v2b1-21k.h5    v2b1-21k-ft1k.h5    v2b1-imagenet.h5
EffV2B2    10.1M    80.5   260     v2b2-21k.h5    v2b2-21k-ft1k.h5    v2b2-imagenet.h5
EffV2B3    14.4M    82.1   300     v2b3-21k.h5    v2b3-21k-ft1k.h5    v2b3-imagenet.h5
EffV2T     13.6M    82.5   320     -              -                   v2t-imagenet.h5
EffV2S     21.5M    84.9   384     v2s-21k.h5     v2s-21k-ft1k.h5     v2s-imagenet.h5
EffV2M     54.1M    86.2   480     v2m-21k.h5     v2m-21k-ft1k.h5     v2m-imagenet.h5
EffV2L     119.5M   86.9   480     v2l-21k.h5     v2l-21k-ft1k.h5     v2l-imagenet.h5
EffV2XL    206.8M   87.2   512     v2xl-21k.h5    v2xl-21k-ft1k.h5    -

V1 Model   Params   Top1   Input   noisy_student             Imagenet
EffV1B0    5.3M     78.8   224     v1-b0-noisy_student.h5    v1-b0-imagenet.h5
EffV1B1    7.8M     81.5   240     v1-b1-noisy_student.h5    v1-b1-imagenet.h5
EffV1B2    9.1M     82.4   260     v1-b2-noisy_student.h5    v1-b2-imagenet.h5
EffV1B3    12.2M    84.1   300     v1-b3-noisy_student.h5    v1-b3-imagenet.h5
EffV1B4    19.3M    85.3   380     v1-b4-noisy_student.h5    v1-b4-imagenet.h5
EffV1B5    30.4M    86.1   456     v1-b5-noisy_student.h5    v1-b5-imagenet.h5
EffV1B6    43.0M    86.4   528     v1-b6-noisy_student.h5    v1-b6-imagenet.h5
EffV1B7    66.3M    86.9   600     v1-b7-noisy_student.h5    v1-b7-imagenet.h5
EffV1L2    480.3M   88.4   800     v1-l2-noisy_student.h5    -
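
The Input column above is the eval resolution, which also serves as each model's default input_shape (see the training table below). A quick cross-check sketch, assuming the pip package from the Usage section is installed; the expected numbers come from the table:

    # Build EfficientNetV2S without loading weights and compare against the table.
    import keras_efficientnet_v2

    model = keras_efficientnet_v2.EfficientNetV2S(pretrained=None)
    print(model.input_shape)                               # expected (None, 384, 384, 3) per the table
    print(f"{model.count_params() / 1e6:.1f}M params")     # expected ~21.5M per the table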

Usage

  • This repo can be installed as a pip package, or cloned directly:
    pip install -U keras-efficientnet-v2
    # Or
    pip install -U git+https://github.com/leondgarse/keras_efficientnet_v2
  • Define a model and load pretrained weights. The pretrained parameter accepts one of [None, "imagenet", "imagenet21k", "imagenet21k-ft1k"]; the default is "imagenet". Model input values should be in the range [-1, 1].
    # Will download and load `imagenet` pretrained weights.
    # Model weights are loaded with `by_name=True, skip_mismatch=True`.
    import keras_efficientnet_v2
    model = keras_efficientnet_v2.EfficientNetV2S(pretrained="imagenet")
    
    # Run prediction
    import tensorflow as tf
    from tensorflow import keras
    from skimage.data import chelsea
    imm = tf.image.resize(chelsea(), model.input_shape[1:3]) # Chelsea the cat
    pred = model(tf.expand_dims(imm / 128. - 1., 0)).numpy()
    print(keras.applications.imagenet_utils.decode_predictions(pred)[0])
    # [('n02124075', 'Egyptian_cat', 0.8642886), ('n02123159', 'tiger_cat', 0.030793495), ...]
    Or download an h5 model and load it directly:
    mm = keras.models.load_model('efficientnetv2-b3-21k-ft1k.h5')
    For "imagenet21k" pretrained models, the actual num_classes is 21843 (a short sketch after this list shows how to keep the full 21k head).
  • Exclude the model's top layers by setting num_classes=0.
    import keras_efficientnet_v2
    model = keras_efficientnet_v2.EfficientNetV2B0(dropout=1e-6, num_classes=0, pretrained="imagenet21k")
    print(model.output_shape)
    # (None, 7, 7, 1280)
    
    model.save('efficientnetv2-b0-21k-notop.h5')
  • Use dynamic input resolution by setting input_shape=(None, None, 3).
    import numpy as np
    import keras_efficientnet_v2
    model = keras_efficientnet_v2.EfficientNetV2M(input_shape=(None, None, 3), drop_connect_rate=0.2, num_classes=0, pretrained="imagenet21k-ft1k")
    
    print(model(np.ones([1, 224, 224, 3])).shape)
    # (1, 7, 7, 1280)
    print(model(np.ones([1, 512, 512, 3])).shape)
    # (1, 16, 16, 1280)
  • Setting include_preprocessing=True adds pre-processing Rescaling + Normalization layers right after the Input, so input values should be in the range [0, 255]. The default False expects inputs in [-1, 1]. This works for both EfficientNetV2 and EfficientNetV1.
    import tensorflow as tf
    from tensorflow import keras
    import keras_efficientnet_v2
    model = keras_efficientnet_v2.EfficientNetV1B4(pretrained="noisy_student", include_preprocessing=True)
    
    from skimage.data import chelsea
    imm = tf.image.resize(chelsea(), model.input_shape[1:3]) # Chelsea the cat
    pred = model(tf.expand_dims(imm, 0)).numpy()  # value in range [0, 255]
    print(keras.applications.imagenet_utils.decode_predictions(pred)[0])
    # [('n02124075', 'Egyptian_cat', 0.68414235), ('n02123159', 'tiger_cat', 0.04486668), ...]
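
As noted above, the "imagenet21k" pretrained head has 21843 classes. A minimal sketch for keeping the full 21k classifier; passing num_classes=21843 explicitly is an assumption here, made so the classifier weights are not skipped by the by_name=True, skip_mismatch=True loading:

    import keras_efficientnet_v2

    # Keep the 21843-class head so its weights are loaded rather than skipped.
    model = keras_efficientnet_v2.EfficientNetV2B0(num_classes=21843, pretrained="imagenet21k")
    print(model.output_shape)  # expected (None, 21843)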

Training details from the article

  • Training configurations. The Eval size is used as the default input_shape for each model type.

    Model     Train size   Eval size   Dropout   Randaug   Mixup
    EffV2B0   192          224         0.2       0         0
    EffV2B1   192          240         0.2       0         0
    EffV2B2   208          260         0.3       0         0
    EffV2B3   240          300         0.3       0         0
    EffV2S    300          384         0.2       10        0
    EffV2M    384          480         0.3       15        0.2
    EffV2L    384          480         0.4       20        0.5
    EffV2XL   384          512         0.4       20        0.5
  • EfficientNetV2-S architecture

    Stage   Operator                 Stride   #Channels   #Layers
    0       Conv3x3                  2        24          1
    1       Fused-MBConv1, k3x3      1        24          2
    2       Fused-MBConv4, k3x3      2        48          4
    3       Fused-MBConv4, k3x3      2        64          4
    4       MBConv4, k3x3, SE0.25    2        128         6
    5       MBConv6, k3x3, SE0.25    1        160         9
    6       MBConv6, k3x3, SE0.25    2        256         15
    7       Conv1x1 & Pooling & FC   -        1280        1
  • Progressive training settings for EfficientNetV2

    Setting        S min   S max   M min   M max   L min   L max
    Image Size     128     300     128     380     128     380
    RandAugment    5       15      5       20      5       25
    Mixup alpha    0       0       0       0.2     0       0.4
    Dropout rate   0.1     0.3     0.1     0.4     0.1     0.5
  • ImageNet training details

    • RMSProp optimizer with decay 0.9 and momentum 0.9
    • batch norm momentum 0.99; weight decay 1e-5
    • Each model is trained for 350 epochs with total batch size 4096
    • Learning rate is first warmed up from 0 to 0.256, and then decayed by 0.97 every 2.4 epochs
    • Exponential moving average with 0.9999 decay rate
    • RandAugment (Cubuk et al., 2020)
    • Mixup (Zhang et al., 2018)
    • Dropout (Srivastava et al., 2014)
    • and stochastic depth (Huang et al., 2016) with 0.8 survival probability
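
A minimal sketch of the learning-rate schedule described above (warm up from 0 to 0.256, then decay by 0.97 every 2.4 epochs). The 5-epoch warmup length and the staircase decay are assumptions; the article only states the peak rate and decay factor:

    import tensorflow as tf

    def lr_schedule(epoch, lr=0.0, base_lr=0.256, decay_rate=0.97, decay_epochs=2.4, warmup_epochs=5):
        # Linear warmup from 0 to base_lr; `warmup_epochs=5` is assumed, the article
        # only says the learning rate is warmed up from 0.
        if epoch < warmup_epochs:
            return base_lr * (epoch + 1) / warmup_epochs
        # Staircase decay: multiply by decay_rate once every `decay_epochs` epochs.
        return base_lr * decay_rate ** ((epoch - warmup_epochs) // decay_epochs)

    # Usage with a compiled model:
    # model.fit(..., callbacks=[tf.keras.callbacks.LearningRateScheduler(lr_schedule)])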

Detailed conversion procedure

  • convert_effnetv2_model.py is a modified version of the original effnetv2_model.py. Check the differences with vimdiff convert_effnetv2_model.py ../automl/efficientnetv2/effnetv2_model.py
    • Delete some names, as they may cause conflicts in Keras.
    • Call .call directly on SE modules and other blocks, so they do not show up as nested blocks in model.summary().
    • Use an Add layer instead of utils.drop_connect, since with is_training=False, utils.drop_connect behaves like Add (see the sketch after the procedure commands below).
    • Add a num_classes parameter outside of mconfig.
    • Add a __main__ part so the file can be run as a script. Refer to it for conversion details.
  • Depends on the official repo's directory layout:
    ../
    ├── automl  # Official repo
    ├── keras_efficientnet_v2  # This one
  • Procedure
    # See help info
    CUDA_VISIBLE_DEVICES='-1' python convert_effnetv2_model.py -h
    
    # Convert by specific model_type and dataset type
    CUDA_VISIBLE_DEVICES='-1' python convert_effnetv2_model.py -m b0 -d imagenet21k
    
    # Convert by specific model_type and all its datasets ['imagenet', 'imagenet21k', 'imagenetft']
    CUDA_VISIBLE_DEVICES='-1' python convert_effnetv2_model.py -m s -d all
    
    # Convert all model_types and all datasets
    CUDA_VISIBLE_DEVICES='-1' python convert_effnetv2_model.py -m all -d all
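
On the Add layer vs utils.drop_connect point above: stochastic depth only drops the residual branch randomly during training, so with is_training=False it reduces to a plain addition. An illustrative sketch of that equivalence (not the converter's actual code):

    import tensorflow as tf

    def residual_add(inputs, branch, survival_prob=0.8, training=False):
        # At inference this is exactly `inputs + branch`, i.e. a keras Add layer.
        if not training:
            return inputs + branch
        # During training, randomly drop the branch per sample and rescale (stochastic depth).
        batch_size = tf.shape(branch)[0]
        keep = tf.floor(survival_prob + tf.random.uniform([batch_size, 1, 1, 1], dtype=branch.dtype))
        return inputs + branch * keep / survival_prob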

Progressive train test on cifar10

import json
import keras_efficientnet_v2
from tensorflow import keras
from keras_efficientnet_v2 import progressive_train_test

model = keras_efficientnet_v2.EfficientNetV2S(input_shape=(None, None, 3), num_classes=10, classifier_activation='softmax', dropout=0.1)
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

hhs = progressive_train_test.progressive_with_dropout_randaug(
    model,
    data_name="cifar10",
    lr_scheduler=None,
    total_epochs=36,
    batch_size=64,
    dropout_layer=-2,
    target_shapes=[128, 160, 192, 224], # [128, 185, 242, 300] for final shape (300, 300)
    dropouts=[0.1, 0.2, 0.3, 0.4],
    magnitudes=[5, 8, 12, 15],
)

with open("history_ev2s_imagenet_progressive_224.json", "w") as ff:
    json.dump(hhs, ff)
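
The target_shapes, dropouts and magnitudes lists above follow the paper's progressive schedule: each setting is interpolated linearly from its minimum to its maximum across the stages. A small sketch with a hypothetical helper (not part of the package) that reproduces the [128, 185, 242, 300] values mentioned in the comment:

    import numpy as np

    def progressive_stages(min_value, max_value, num_stages=4):
        # Linearly interpolate a training setting from its minimum to its maximum across stages.
        return np.linspace(min_value, max_value, num_stages)

    print(progressive_stages(128, 300).astype(int).tolist())  # [128, 185, 242, 300]
    print(progressive_stages(0.1, 0.4).round(2).tolist())     # [0.1, 0.2, 0.3, 0.4]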
