sail-sg / inceptionnext

InceptionNeXt: When Inception Meets ConvNeXt (CVPR 2024)

Home Page: https://arxiv.org/abs/2303.16900

License: Apache License 2.0

Python 95.14% Dockerfile 0.08% Shell 4.78%

convolutional-neural-networks

inceptionnext's Issues

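Why can a 7x7 depthwise convolution be decomposed into 3x3, 1x11, and 11x1 depthwise convolutions?

Hello! I would like to ask the question in my title: "Why can a 7x7 depthwise convolution be decomposed into 3x3, 1x11, and 11x1 depthwise convolutions?" Is there a rigorous mathematical derivation for this decomposition? Can it be shown that the kernel size is the same before and after the decomposition? Or was this decomposition chosen simply because it works well?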
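As a rough aid for this question: as the paper describes it, the 7x7 kernel is not factorized exactly. The channels are split into parallel groups (a small square kernel, two orthogonal band kernels, and an identity branch), which is a design choice rather than a mathematical equivalence. The sketch below only compares weight counts. The branch ratio of 1/8 and the function names are assumptions made for illustration, not code taken from the repository.

```python
def depthwise_params(channels: int, kh: int, kw: int) -> int:
    """Weights in a depthwise conv: one kh x kw filter per channel, no bias."""
    return channels * kh * kw


def inception_dw_params(channels: int, square: int = 3, band: int = 11,
                        branch_ratio: float = 0.125) -> int:
    """Weights when only three channel groups are convolved.

    Assumed layout: three groups of int(channels * branch_ratio) channels
    (square, 1 x band, band x 1); the remaining channels pass through as-is.
    """
    gc = int(channels * branch_ratio)
    return (depthwise_params(gc, square, square)
            + depthwise_params(gc, 1, band)
            + depthwise_params(gc, band, 1))


full = depthwise_params(96, 7, 7)   # 96 * 49 weights
split = inception_dw_params(96)     # 12 * (9 + 11 + 11) weights
print(full, split)
```

Under these assumptions the two operators are clearly not the same function: the split version touches only a fraction of the channels and covers a cross-shaped region instead of a full square.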

An error occurred when I load the pretrained weights

When I run the code below:

model = inceptionnext_base(pretrained=True)
inputs = torch.randn((1, 3, 640, 640))
for i in model(inputs):
    print(i.size())

RuntimeError: Error(s) in loading state_dict for MetaNeXt:
Unexpected key(s) in state_dict: "head.fc1.weight", "head.fc1.bias", "head.norm.weight", "head.norm.bias", "head.fc2.weight", "head.fc2.bias".
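The unexpected keys all start with "head.", which suggests the model being built has no classification head (or a different one) while the checkpoint still contains it. One possible workaround, sketched below, is to drop those entries before loading. The helper name `drop_keys_with_prefix` and the toy checkpoint are hypothetical; with a real model the filtered dict would then be passed to `model.load_state_dict(..., strict=False)`.

```python
from collections import OrderedDict


def drop_keys_with_prefix(state_dict, prefixes=("head.",)):
    """Return a copy of state_dict without entries whose key starts with a prefix."""
    prefixes = tuple(prefixes)
    return OrderedDict(
        (key, value) for key, value in state_dict.items()
        if not key.startswith(prefixes)
    )


# Toy stand-in for a real checkpoint; the values would normally be tensors.
checkpoint = OrderedDict([
    ("stem.0.weight", "w0"),
    ("stages.0.blocks.0.norm.weight", "w1"),
    ("head.fc1.weight", "w2"),
    ("head.fc2.bias", "w3"),
])

backbone_only = drop_keys_with_prefix(checkpoint)
print(list(backbone_only))
```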

distributed_train.sh

Nice work!
I would like to fine-tune the model, but I can't find the 'distributed_train.sh' file.

Use inceptionnext as my backbone

If I use inceptionnext as my backbone, do I need to do any special processing?

A Typo

Dear author, thank you for your impressive work. I noticed a possible minor typo in your paper (arXiv version), Section 3.2, "Complexity" part, line 3, where you wrote "incetion depthwise convolution" instead of "inception depthwise convolution". Maybe you can correct this in your final version. Thank you again for your contribution!

Training on my own dataset: accuracy is very low

Hi, thanks for your great work. I trained the model on my own dataset, but the accuracy is very low. With resnet-50 or convnext-tiny as the backbone, the accuracy reaches at least 90%, but with inceptionnext as the backbone it is only 67.9%, as shown in the following images.

This is my training script:

Since I have only 4 classes, I changed and added some code as follows:

Could you please tell me why this happens and give me some advice?

Why do I get a parameter mismatch error when I predict with saved parameters? The prediction accuracy is also very low. Is it a problem with how the parameters are loaded?

Parameter not found in model: epoch
Parameter not found in model: arch
Parameter not found in model: state_dict
Parameter not found in model: optimizer
Parameter not found in model: version
Parameter not found in model: args
Parameter not found in model: metric

---------------------my code-----------------------------
import os

import pandas as pd
import torch
from PIL import Image
from timm.models import create_model
from torchvision import transforms


def test(args):
    model = create_model(
        args.model,
        pretrained=args.pretrained,
        num_classes=args.num_classes,
        in_chans=3,
        global_pool=args.gp,
        scriptable=args.torchscript)
    # Folder path
    folder_path = '../apple/test'

    # Load the .pth file
    pretrained_path = './model_best.pth.tar'
    pretrained_dict = torch.load(pretrained_path)

    # Check parameter shapes and match them against the model
    model_dict = model.state_dict()
    for name, param in pretrained_dict.items():
        if name in model_dict:
            if param.shape == model_dict[name].shape:
                model_dict[name] = param
            else:
                print(f'Parameter shape mismatch: {name}')
        else:
            print(f'Parameter not found in model: {name}')
    # Move the model to the device and load the weights
    device = torch.device('cuda')
    model.load_state_dict(model_dict, strict=False)
    model.to(device)
    model.eval()
    transform = transforms.Compose([transforms.ToTensor(), transforms.Resize((224, 224))])
    vector = []
    out = []
    for i in range(9):
        vector.append(torch.full((1,), i, dtype=torch.int))
    # Run prediction
    # Iterate over the files in the folder
    for file_name in os.listdir(folder_path):
        file_path = os.path.join(folder_path, file_name)
        # Make sure it is an image file
        if file_path.endswith('.jpg') or file_path.endswith('.png'):
            # Read the image with PIL
            image = Image.open(file_path)
            image = transform(image)
            image = image.to(device)
            image = image.unsqueeze(0)
            output = model(image)
            # Collect the predicted label for the .csv file
            for i in range(9):
                output_argmax = output.argmax(1)
                vector_i = vector[i].to(output_argmax.device)
                if output_argmax == vector_i:
                    out.append('d{}'.format(i + 1))

    df = pd.DataFrame()
    df['uuid'] = [file_name for file_name in os.listdir(folder_path) if file_name.endswith('.jpg') or file_name.endswith('.png')]
    df['label'] = [i for i in out]
    df.to_csv('predictions.csv', index=False)
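The names printed in the messages (epoch, arch, state_dict, optimizer, ...) look like the top-level keys of a training checkpoint rather than parameter names. If that is the case, the loop above never reaches the actual weights, so none are copied into the model, which would also be consistent with the low accuracy. A minimal sketch of unwrapping such a checkpoint first is shown below. The helper name `unwrap_state_dict` is hypothetical, and the assumption is that the weights sit under the "state_dict" key.

```python
def unwrap_state_dict(checkpoint):
    """Return the weights dict, whether or not it is wrapped in a training checkpoint."""
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        return checkpoint["state_dict"]
    return checkpoint


# Toy stand-in for what torch.load('./model_best.pth.tar') might return.
training_checkpoint = {
    "epoch": 42,
    "arch": "inceptionnext_tiny",
    "state_dict": {"stem.0.weight": "w0", "head.fc2.bias": "w1"},
    "optimizer": {},
}

weights = unwrap_state_dict(training_checkpoint)
print(sorted(weights))
```

The matching loop would then iterate over `weights.items()` instead of the top-level checkpoint.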

About decomposed large kernel DWConv

I found a similar decomposition idea in SegNeXt. In that paper, the authors state: "On the other hand, there are some strip-like objects, such as human and telephone pole in the segmentation scenes. Thus, strip convolution can be a complement of grid convolutions and helps extract strip-like features". Did you observe a similar phenomenon on downstream tasks? Can you visualize whether an InceptionNeXt block captures strip-like features? I tried using this strip convolution idea in a semantic segmentation task but did not have any success 😢

May I ask if the relevant segmentation and detection code can be provided?

May I ask if the relevant segmentation and detection code can be provided? I made some modifications for segmentation following the PoolFormer format, but the results were not satisfactory. Could you share any details?

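Can it be ported into the YOLOv8 backbone, and how?

Hello, thank you for your work. Is inceptionnext a plug-and-play module?

Looking at the code, it seems to be a complete architecture rather than a plug-and-play module.

Hello, how can I add inceptionnext to my own baseline?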

Remove global average pooling out of "forward_head"

Great job on your work! I have a small suggestion regarding the code in inceptionnext.py. It would be more convenient if the line "x = x.mean((2, 3)) # global average pooling" were moved from the "self.forward_head" function to the end of the "self.forward_features" function. That way, when we use our own classification layer, we could simply comment out the line "x = self.forward_head(x)", without having to keep the "MlpHead" class and comment out the lines after "x = x.mean((2, 3)) # global average pooling". The current setup is a little inconvenient.

train error

Hi, thanks for your great work. When I use your code to train on my dataset, I get the following problem:
there is no module named utils, and I cannot find a file named utils.py.

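How can I run your code on my own computer?

Dear author, hello. I am an incoming first-year graduate student. My advisor asked me to use your code for feature extraction and fault classification, but when I downloaded your code and ran it in PyCharm I got the error shown below. How should this kind of error be resolved?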

D:\anaconda3\envs\pytorch\python.exe F:\算法总结\特征提取\InceptionNext\inceptionnext-main\models\inceptionnext.py
Traceback (most recent call last):
  File "F:\算法总结\特征提取\InceptionNext\inceptionnext-main\models\inceptionnext.py", line 12, in <module>
    from timm.data import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\timm\__init__.py", line 2, in <module>
    from .models import create_model, list_models, is_model, list_modules, model_entrypoint,
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\timm\models\__init__.py", line 28, in <module>
    from .maxxvit import *
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\timm\models\maxxvit.py", line 216, in <module>
    @dataclass
     ^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\dataclasses.py", line 1230, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\dataclasses.py", line 1220, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'timm.models.maxxvit.MaxxVitConvCfg'> for field conv_cfg is not allowed: use default_factory
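For context on this traceback: the error is raised by Python's own dataclasses module while importing timm, not by the inceptionnext code. Newer Python versions reject a dataclass instance used directly as a field default, so an older timm release combined with a newer interpreter can fail this way. Upgrading timm or using an older Python are the usual ways out, though which one fits here is a guess. The sketch below shows the pattern the message asks for, using hypothetical class names rather than timm's real ones.

```python
from dataclasses import dataclass, field


@dataclass
class ConvCfg:
    """Hypothetical nested config, standing in for a class like MaxxVitConvCfg."""
    kernel_size: int = 3


@dataclass
class ModelCfg:
    # default_factory builds a fresh ConvCfg per instance instead of sharing
    # one mutable default object across all instances.
    conv_cfg: ConvCfg = field(default_factory=ConvCfg)


first = ModelCfg()
second = ModelCfg()
first.conv_cfg.kernel_size = 7
print(first.conv_cfg.kernel_size, second.conv_cfg.kernel_size)
```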

inceptionnext performance

Hello, I tried to use inceptionnext as the encoder, but the experimental results are not ideal (the performance is even worse than convnext, so I think I may have made a mistake). Can you help me check whether there is a problem with my code?

Since the encoder is connected to the decoder at each layer, I need to extract the results of each layer.

original:

def forward(self, x):
    x = self.forward_features(x)
    x = self.forward_head(x)  # I removed this part of the code because of the encoder
    return x

I modified:

def forward(self, x):
    y = x
    y = self.stem(y)
    y = self.stages[0](y)
    x1 = y

    y = self.stages[1](y)
    x2 = y

    y = self.stages[2](y)
    x3 = y

    y = self.stages[3](y)
    x4 = y

    return x1, x2, x3, x4

Error on pre-trained weights

Hello, I tried to use inceptionnext as my backbone. The network is configured, but loading the pre-trained weights fails with:
TypeError: a bytes-like object is required, not '_io.BufferedReader'
Can you give me a little help?
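The loading code is not shown, so the cause can only be guessed. One common way to get exactly this message in Python is passing an open file object to a function that expects raw bytes, for example `pickle.loads(f)` instead of `pickle.load(f)`. The sketch below reproduces that situation with a throwaway file. It is an illustration of this one possibility, not a diagnosis of the code in this issue.

```python
import os
import pickle
import tempfile

payload = {"weights": [1.0, 2.0]}

# Write a small pickle file to disk so we can read it back through open().
with tempfile.NamedTemporaryFile(suffix=".pkl", delete=False) as tmp:
    pickle.dump(payload, tmp)
    path = tmp.name

message = ""
with open(path, "rb") as f:
    try:
        pickle.loads(f)          # wrong: loads() wants bytes, not a file object
    except TypeError as exc:
        message = str(exc)

with open(path, "rb") as f:
    restored = pickle.load(f)    # right: load() reads from the file object

os.remove(path)
print(message)
print(restored)
```

If the weights were saved with PyTorch, passing the file path straight to `torch.load` avoids handling the file object by hand.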