sail-sg / inceptionnext
InceptionNeXt: When Inception Meets ConvNeXt (CVPR 2024)
Home Page: https://arxiv.org/abs/2303.16900
License: Apache License 2.0
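Hello! I would like to ask the question in my title: why can a 7x7 depthwise convolution be decomposed into 3x3, 1x11, and 11x1 depthwise convolutions? Is there a rigorous mathematical derivation behind this decomposition? Can it be shown that the kernel size is the same before and after the decomposition? Or was this decomposition chosen simply because it works well in practice?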
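One way to look at this question is to count depthwise weights, which shows that the split is not a like-for-like factorization of the 7x7 kernel. The sketch below assumes the design described in the paper: the channels are divided into groups, three of which get a 3x3, a 1x11, and an 11x1 depthwise kernel respectively, while the remaining channels pass through unchanged. The `branch_ratio` of 1/8, the 96-channel example, and both helper names are assumptions made for illustration.

```python
def depthwise_params(channels, kh, kw):
    """Weights in a depthwise conv: one kh x kw kernel per channel, no bias."""
    return channels * kh * kw


def inception_depthwise_params(channels, square=3, band=11, branch_ratio=0.125):
    """Weights when only three channel groups are convolved (assumed design)."""
    gc = int(channels * branch_ratio)  # channels in each convolved branch
    return (
        depthwise_params(gc, square, square)  # 3x3 branch
        + depthwise_params(gc, 1, band)       # 1x11 branch
        + depthwise_params(gc, band, 1)       # 11x1 branch
    )  # the remaining channels are an identity branch with no weights


channels = 96  # illustrative width
print(depthwise_params(channels, 7, 7))      # 4704
print(inception_depthwise_params(channels))  # 372
```

Under these assumptions the two operators have very different weight counts and act on different channel subsets, so the branches cannot reproduce an arbitrary 7x7 kernel exactly; the split reads as an efficiency-motivated design rather than a derived identity.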
When I run the code below:
import torch
model = inceptionnext_base(pretrained=True)
inputs = torch.randn((1, 3, 640, 640))
for i in model(inputs):
    print(i.size())
RuntimeError: Error(s) in loading state_dict for MetaNeXt:
Unexpected key(s) in state_dict: "head.fc1.weight", "head.fc1.bias", "head.norm.weight", "head.norm.bias", "head.fc2.weight", "head.fc2.bias".
Nice work!
I would like to fine-tune the model, but I can't find the 'distributed_train.sh' file.
If I use InceptionNeXt as my backbone, do I need to do any special processing?
Dear author, thank you for your impressive work. I noticed a possible minor typo in your paper (arXiv version): in Section 3.2, "Complexity" part, line 3, you wrote "incetion depthwise convolution" instead of "inception depthwise convolution". Maybe you can correct this in your final version. Thank you again for your contribution!
Hi, thanks for your great work. I trained the model on my own dataset, but the accuracy is very low. With resnet-50 or convnext-tiny as the backbone, the accuracy reaches at least 90%, but with inceptionnext as the backbone it is only 67.9%, as shown in the following images.
My training script is as follows.
Since I have only 4 classes, I changed and added some code as follows.
Could you please tell me why this happens and give me some advice?
Parameter not found in model: epoch
Parameter not found in model: arch
Parameter not found in model: state_dict
Parameter not found in model: optimizer
Parameter not found in model: version
Parameter not found in model: args
Parameter not found in model: metric
---------------------my code-----------------------------
def test(args):
    model = create_model(
        args.model,
        pretrained=args.pretrained,
        num_classes=args.num_classes,
        in_chans=3,
        global_pool=args.gp,
        scriptable=args.torchscript)
    # Folder path
    folder_path = '../apple/test'
    # Load the .pth file
    pretrained_path = './model_best.pth.tar'
    pretrained_dict = torch.load(pretrained_path)
    # Check the model parameter shapes and match them
    model_dict = model.state_dict()
    for name, param in pretrained_dict.items():
        if name in model_dict:
            if param.shape == model_dict[name].shape:
                model_dict[name] = param
            else:
                print(f'Parameter shape mismatch: {name}')
        else:
            print(f'Parameter not found in model: {name}')
    # Create the model instance and load the weights
    device = torch.device('cuda')
    model.load_state_dict(model_dict, strict=False)
    model.to(device)
    model.eval()
    transform = transforms.Compose([transforms.ToTensor(), transforms.Resize((224, 224))])
    vector = []
    out = []
    for i in range(9):
        vector.append(torch.full((1,), i, dtype=torch.int))
    # Run prediction
    # Iterate over the files in the folder
    for file_name in os.listdir(folder_path):
        file_path = os.path.join(folder_path, file_name)
        # Make sure it is an image file
        if file_path.endswith('.jpg') or file_path.endswith('.png'):
            # Read the image with PIL
            image = Image.open(file_path)
            image = transform(image)
            image = image.to(device)
            image = image.unsqueeze(0)
            output = model(image)
            # Save the predictions to a .csv file
            for i in range(9):
                output_argmax = output.argmax(1)
                vector_i = vector[i].to(output_argmax.device)
                if output_argmax == vector_i:
                    out.append('d{}'.format(i + 1))
    df = pd.DataFrame()
    df['uuid'] = [file_name for file_name in os.listdir(folder_path) if file_name.endswith('.jpg') or file_name.endswith('.png')]
    df['label'] = [i for i in out]
    df.to_csv('predictions.csv', index=False)
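The names printed in the log (`epoch`, `arch`, `state_dict`, `optimizer`, and so on) suggest that `model_best.pth.tar` is a full training checkpoint with the weights nested under the `state_dict` key, so looping over the top-level dict never reaches the tensors. A minimal sketch of unwrapping it is shown here. `extract_state_dict` is a hypothetical helper, the dummy checkpoint stands in for the result of `torch.load`, and stripping a `module.` prefix is an assumption that only matters if the checkpoint was saved from a DataParallel-wrapped model.

```python
def extract_state_dict(checkpoint):
    """Return the weight mapping from a checkpoint (hypothetical helper)."""
    # Training checkpoints often nest the weights under 'state_dict';
    # a bare weights file is returned as-is.
    if isinstance(checkpoint, dict) and 'state_dict' in checkpoint:
        state_dict = checkpoint['state_dict']
    else:
        state_dict = checkpoint
    # Assumption: drop a leading 'module.' prefix if one is present.
    prefix = 'module.'
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Dummy stand-in for torch.load('./model_best.pth.tar'); values are placeholders.
checkpoint = {
    'epoch': 10,
    'arch': 'demo',
    'state_dict': {'module.stem.0.weight': [0.0], 'head.fc2.bias': [0.0]},
}
weights = extract_state_dict(checkpoint)
print(sorted(weights))  # ['head.fc2.bias', 'stem.0.weight']
```

In the script above, this would mean running the name and shape matching loop over `extract_state_dict(torch.load(pretrained_path, map_location='cpu'))` rather than over the raw checkpoint.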
I found a similar decomposition idea in SegNeXt. In that paper, the authors state: "On the other hand, there are some strip-like objects, such as human and telephone pole in the segmentation scenes. Thus, strip convolution can be a complement of grid convolutions and helps extract strip-like features". Did you observe a similar phenomenon on downstream tasks? Can you visualize whether an InceptionNeXt block captures strip-like features? I tried using this strip convolution idea in a semantic segmentation task but didn't have any success 😢
May I ask whether the relevant segmentation and detection code can be provided? I made some modifications for segmentation following the PoolFormer format, but the results were not satisfactory. Could you share any details?
Looking at the code, it seems to be a complete architecture rather than a plug-and-play module.
Great job on your work! I have a small suggestion regarding the code in inceptionnext.py. It would be more convenient if the line "x = x.mean((2, 3)) # global average pooling" were moved from "self.forward_head" to the end of "self.forward_features". That way, when we use our own classification layer, we could simply comment out the line "x = self.forward_head(x)", without needing to keep the "MlpHead" class and comment out the lines after "x = x.mean((2, 3)) # global average pooling". The current setup is a little inconvenient.
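Dear author, hello. I am an incoming first-year graduate student. My advisor asked me to use your code for feature extraction and fault classification, but when I downloaded your code and ran it in PyCharm, I got the error shown below. How should this kind of error be resolved?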
D:\anaconda3\envs\pytorch\python.exe F:\算法总结\特征提取\InceptionNext\inceptionnext-main\models\inceptionnext.py
Traceback (most recent call last):
  File "F:\算法总结\特征提取\InceptionNext\inceptionnext-main\models\inceptionnext.py", line 12, in <module>
    from timm.data import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\timm\__init__.py", line 2, in <module>
    from .models import create_model, list_models, is_model, list_modules, model_entrypoint,
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\timm\models\__init__.py", line 28, in <module>
    from .maxxvit import *
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\timm\models\maxxvit.py", line 216, in <module>
    @dataclass
     ^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\dataclasses.py", line 1230, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\dataclasses.py", line 1220, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'timm.models.maxxvit.MaxxVitConvCfg'> for field conv_cfg is not allowed: use default_factory
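The `ValueError` is raised by the standard-library `dataclasses` module, which from Python 3.11 rejects any unhashable default value, including an instance of another non-frozen dataclass. That suggests the installed timm release predates this change, so a newer timm or an older Python is likely to avoid it. The pattern the message asks for looks like the sketch below; `ConvCfg` and `ModelCfg` are hypothetical stand-ins, not timm's actual classes.

```python
from dataclasses import dataclass, field


@dataclass
class ConvCfg:
    """Hypothetical nested config, standing in for a class like MaxxVitConvCfg."""
    kernel_size: int = 3


@dataclass
class ModelCfg:
    # Writing `conv_cfg: ConvCfg = ConvCfg()` here is what Python 3.11+ rejects.
    # default_factory builds a fresh ConvCfg for every ModelCfg instance instead.
    conv_cfg: ConvCfg = field(default_factory=ConvCfg)


a = ModelCfg()
b = ModelCfg()
a.conv_cfg.kernel_size = 7
print(a.conv_cfg.kernel_size, b.conv_cfg.kernel_size)  # 7 3
```

Because each instance gets its own nested config, changing `a` does not leak into `b`, which is the sharing problem the check is meant to prevent.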
Hello, I tried to use InceptionNeXt as the encoder, but the experimental results are not ideal (the performance is even worse than ConvNeXt, so I think I may have made a mistake). Could you help me check whether there is a problem with my code?
Since the encoder is connected to the decoder at each stage, I need to extract the output of each stage.
Original:
def forward(self, x):
    x = self.forward_features(x)
    x = self.forward_head(x)  # I removed this part of the code because of the encoder
    return x
Modified:
def forward(self, x):
    y = x
    y = self.stem(y)
    y = self.stages[0](y)
    x1 = y
    y = self.stages[1](y)
    x2 = y
    y = self.stages[2](y)
    x3 = y
    y = self.stages[3](y)
    x4 = y
    return x1, x2, x3, x4
Hello, I tried to use InceptionNeXt as my backbone. The network is configured, but loading the pre-trained weights fails with:
TypeError: a bytes-like object is required, not '_io.BufferedReader'
Can you give me a little help?