
Comments (15)

Suncheng2022 commented on May 24, 2024

If you edit the file LayerNorm.cpp, the op is still called LayerNorm, but the custom op is called LayerNormalization according to "LayerNormalization not supported yet!", so maybe you should declare a new op class.

I know what you mean. I wrote two files named "LayerNormalization.h" and "LayerNormalization.cpp", modified src/CMakeLists.txt with ncnn_add_layer(LayerNormalization), and then compiled it again. But it doesn't seem to work.

Yeah, I ran into the same situation, but I don't know why it didn't work.

Help, please. @nihui


HHscut commented on May 24, 2024

What is the result when transferring the model into .param & .bin? Is some op not supported? I checked the output from different output layers and found it prints "NAN" after some middle layers, but I can't locate it. So maybe an unsupported op exists; can you upload the original model file (e.g. .onnx) so I can check the model structure further?


Suncheng2022 commented on May 24, 2024

What is the result when transferring the model into .param & .bin? Is some op not supported? I checked the output from different output layers and found it prints "NAN" after some middle layers, but I can't locate it. So maybe an unsupported op exists; can you upload the original model file (e.g. .onnx) so I can check the model structure further?

Thank you for your reply!
I have found the reason why the model outputs NaN. The original author implemented a custom LayerNorm operation, which is implemented in PyTorch as follows:

import torch
import torch.nn as nn


class LayerNorm2d_Sc(nn.Module):
    """The author's custom LayerNorm. In theory PyTorch can do the same thing by permuting
    dimensions, and I verified that, but I cannot reproduce it in ncnn yet."""

    def __init__(self, channels, eps=1e-6):
        super(LayerNorm2d_Sc, self).__init__()
        self.register_parameter('weight', nn.Parameter(torch.ones(channels)))
        self.register_parameter('bias', nn.Parameter(torch.zeros(channels)))
        self.eps = eps
        self.torch_layernorm = torch.nn.LayerNorm(channels, eps=eps, elementwise_affine=False)

    def forward(self, x):
        # I tried replacing this with PyTorch's own LayerNorm. Both the PyTorch code and the
        # exported onnx give correct results, but converting to ncnn fails.
        # C = x.shape[1]
        # x_ = x.clone()
        # x_ = x_.permute(0, 2, 3, 1)
        # y = self.torch_layernorm(x_)
        # y = y.permute(0, 3, 1, 2)
        # # y = self.weight.view(1, C, 1, 1) * y + self.bias.view(1, C, 1, 1)
        # return y

        # The original author's custom LayerNorm. PyTorch and the exported onnx both give correct
        # results, but after converting to ncnn, inference produces an all-black image.
        C = x.shape[1]
        x_ = x.clone()
        mu = x_.mean(dim=1, keepdim=True)
        var = (x_ - mu).pow(2).mean(dim=1, keepdim=True)
        y = (x_ - mu) / (var + self.eps).sqrt()
        y = self.weight.view(1, C, 1, 1) * y + self.bias.view(1, C, 1, 1)
        return y

I tried using numpy instead of PyTorch; the inference result was not completely black, but it was not correct either.
I saw in ncnn's wiki that layer implementations can be customized, and I am trying to add the author's custom LayerNorm. (If I understand correctly, the data processed by the ncnn model in C++ is laid out as WHC, and the output is also WHC. But in Python, the ncnn output seems to be CHW; at least I get correct results by treating it as CHW. Of course, I care more about the results in C++.)


HHscut commented on May 24, 2024

Hello!
1. In my practice, the data processed by the ncnn model in C++ is also CDHW, and the output is also CDHW. See the C++ code below that flattens the output; it means [Batch, Channel, Height, Width]:

// Flatten an ncnn::Mat into a flat float vector, iterating channel -> depth -> row -> column.
void pretty_print(const ncnn::Mat &m, std::vector<float> &vec_heap) {
    for (int q = 0; q < m.c; q++) {
        const float *ptr = m.channel(q);   // each channel is a contiguous d*h*w block
        for (int z = 0; z < m.d; z++) {
            for (int y = 0; y < m.h; y++) {
                for (int x = 0; x < m.w; x++) {
                    vec_heap.emplace_back(ptr[x]);
                }
                ptr += m.w;                // advance to the next row
            }
        }
    }
}

2. Your own LayerNorm2d_Sc works the same as the original one. If your LayerNorm2d_Sc works but fails when transferring to an ncnn model, maybe you can update the ncnn version and compile the layernorm operation (see #5262 (comment) for details). Could you post the error message?
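For context, here is a minimal usage sketch showing how a flatten helper like pretty_print above would typically be called, assuming pretty_print from the snippet above is in scope; the blob names "input"/"output" are placeholders, not the actual names in this model's .param file:

#include <vector>
#include "net.h"

// Hypothetical end-to-end call: load the converted model, run one BGR image, flatten the output.
std::vector<float> run_once(const unsigned char* bgr_pixels, int img_w, int img_h)
{
    ncnn::Net net;
    net.load_param("test_ncnn.param");
    net.load_model("test_ncnn.bin");

    ncnn::Mat in = ncnn::Mat::from_pixels(bgr_pixels, ncnn::Mat::PIXEL_BGR, img_w, img_h);

    ncnn::Extractor ex = net.create_extractor();
    ex.input("input", in);     // placeholder input blob name
    ncnn::Mat out;
    ex.extract("output", out); // placeholder output blob name

    std::vector<float> flat;
    pretty_print(out, flat);   // flat now holds the output in C(D)HW order
    return flat;
}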


HHscut commented on May 24, 2024

And as for the all-black image after inference with the converted ncnn model, maybe you need to re-normalize the output to [0, 256] to get the final output.
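A minimal sketch of that post-processing step, assuming the float output is in [0, 1] (the actual output range depends on how the model was trained, so the scale factor here is an assumption):

#include <algorithm>
#include "mat.h"

// Hypothetical post-processing: scale a float output blob from [0, 1] back to 0-255 pixel values,
// written out in planar order (one channel plane after another).
void denormalize_to_u8(const ncnn::Mat& out, unsigned char* pixels)
{
    int idx = 0;
    for (int q = 0; q < out.c; q++)
    {
        const float* ptr = out.channel(q);
        for (int i = 0; i < out.h * out.w; i++)
        {
            float v = ptr[i] * 255.f;                   // assumed [0, 1] -> [0, 255]
            v = std::min(std::max(v, 0.f), 255.f);      // clamp to the valid pixel range
            pixels[idx++] = (unsigned char)(v + 0.5f);  // round to nearest
        }
    }
}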


Suncheng2022 commented on May 24, 2024

What is the result when transferring the model into .param & .bin? Is some op not supported? I checked the output from different output layers and found it prints "NAN" after some middle layers, but I can't locate it. So maybe an unsupported op exists; can you upload the original model file (e.g. .onnx) so I can check the model structure further?
Here is the onnx from Pytorch w/o onnxsim.
model_trace_1.4M_512.onnx.zip


Suncheng2022 commented on May 24, 2024

Hello! 1. In my practice, the data processed by the ncnn model in C++ is also CDHW, and the output is also CDHW. See the C++ code below that flattens the output; it means [Batch, Channel, Height, Width]:

void pretty_print(const ncnn::Mat &m, std::vector<float> &vec_heap) {
    for (int q = 0; q < m.c; q++) {
        const float *ptr = m.channel(q);
        for (int z = 0; z < m.d; z++) {
            for (int y = 0; y < m.h; y++) {
                for (int x = 0; x < m.w; x++) {
                    vec_heap.emplace_back(ptr[x]);
                }
                ptr += m.w;
            }
        }
    }
}

2. Your own LayerNorm2d_Sc works the same as the original one. If your LayerNorm2d_Sc works but fails when transferring to an ncnn model, maybe you can update the ncnn version and compile the layernorm operation (see #5262 (comment) for details). Could you post the error message?

These are the steps I do to get ncnn: PyTorch model --> onnxsim --> ncnn. But I got "LayerNormalization not supported yet!" when converting to ncnn:

./onnx2ncnn model_trace_1.4M_512_sim.onnx test_ncnn.param test_ncnn.bin
LayerNormalization not supported yet!
  # axis=-1
  # epsilon=1e-06
LayerNormalization not supported yet!
  # axis=-1
  # epsilon=1e-06
LayerNormalization not supported yet!
  # axis=-1
  # epsilon=1e-06
... (the same three-line message is printed 20 times in total)

The number of errors reported may correspond to the number of custom LayerNorm operations.
In addition, I tried to extend ncnn's LayerNorm to implement the following:

// modified in src/layer/layernorm.cpp
else if (affine_size == channels)
{
    // layer norm across the channel dimension, computed independently for each spatial position
    #pragma omp parallel for num_threads(opt.num_threads)
    for (int i = 0; i < size; i++)
    {
        // mean
        float sum = 0.f;
        for (int q = 0; q < channels; q++)
        {
            sum += bottom_top_blob.channel(q)[i];
        }
        float mean = sum / channels;
        // var
        float sqsum = 0.f;
        float tmp = 0.f;
        for (int q = 0; q < channels; q++)
        {
            tmp = bottom_top_blob.channel(q)[i] - mean;
            sqsum += tmp * tmp;
        }
        float var = sqsum / channels;

        float a = 1.f / (sqrtf(var + eps));
        float b = -mean * a;
        for (int q = 0; q < channels; q++)
        {
            bottom_top_blob.channel(q)[i] = bottom_top_blob.channel(q)[i] * a + b;
        }
    }
}

And I execute these commands under ncnn/build:

cmake ..
make -j64
make install

When I converted the onnxsim file to ncnn again, I got the same error as above.

Thanks again for your reply, and I believe I can figure ncnn out with your help.^_^


HHscut commented on May 24, 2024

Haha, I got "LayerNormalization not supported yet!" when converting to ncnn too.


Suncheng2022 commented on May 24, 2024

Haha, I got "LayerNormalization not supported yet!" when converting to ncnn too.

I added the LayerNorm implementation in ncnn, so why is it still not supported? It feels like the conversion process does not call ncnn's LayerNorm.


HHscut commented on May 24, 2024

Haha, I got "LayerNormalization not supported yet!" when converting to ncnn too.

I added the LayerNorm implementation in ncnn, so why is it still not supported? It feels like the conversion process does not call ncnn's LayerNorm.

1. I didn't try to register my own op, but I think it should be an individual .h & .cpp file declaring the class LayerNormalization, and then in /ncnn/src/CMakeLists.txt line 169 add ncnn_add_layer(LayerNormalization).
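For reference, here is a minimal sketch of what such a LayerNormalization layer could look like if it is instead registered as a runtime custom layer (the custom-layer route described in ncnn's wiki), rather than compiled into the library via ncnn_add_layer. The math follows the per-pixel, across-channel normalization from the PyTorch snippet above; the affine weight/bias is omitted for brevity, and the parameter id used for eps is an assumption:

#include <math.h>
#include "layer.h"
#include "net.h"

// Sketch of a custom across-channel LayerNorm for ncnn (no affine weight/bias).
class LayerNormalization : public ncnn::Layer
{
public:
    LayerNormalization()
    {
        one_blob_only = true;
        support_inplace = true;
    }

    virtual int load_param(const ncnn::ParamDict& pd)
    {
        eps = pd.get(1, 1e-6f); // parameter id 1 for eps is an assumption
        return 0;
    }

    virtual int forward_inplace(ncnn::Mat& bottom_top_blob, const ncnn::Option& opt) const
    {
        const int channels = bottom_top_blob.c;
        const int size = bottom_top_blob.w * bottom_top_blob.h;

        #pragma omp parallel for num_threads(opt.num_threads)
        for (int i = 0; i < size; i++)
        {
            // mean and variance across channels for this spatial position
            float sum = 0.f;
            for (int q = 0; q < channels; q++)
                sum += bottom_top_blob.channel(q)[i];
            const float mean = sum / channels;

            float sqsum = 0.f;
            for (int q = 0; q < channels; q++)
            {
                const float t = bottom_top_blob.channel(q)[i] - mean;
                sqsum += t * t;
            }
            const float a = 1.f / sqrtf(sqsum / channels + eps);
            const float b = -mean * a;

            for (int q = 0; q < channels; q++)
                bottom_top_blob.channel(q)[i] = bottom_top_blob.channel(q)[i] * a + b;
        }
        return 0;
    }

public:
    float eps;
};

DEFINE_LAYER_CREATOR(LayerNormalization)

// At inference time, register the creator before loading the converted model:
//   ncnn::Net net;
//   net.register_custom_layer("LayerNormalization", LayerNormalization_layer_creator);
//   net.load_param("test_ncnn.param");
//   net.load_model("test_ncnn.bin");

Note that either route only covers the runtime side: the "LayerNormalization not supported yet!" message in this thread is printed by the onnx2ncnn converter, so until the converter emits a LayerNormalization layer into the .param file, a runtime implementation is never reached.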


Suncheng2022 commented on May 24, 2024

Haha, I got "LayerNormalization not supported yet!" when converting to ncnn too.

I added the LayerNorm implementation in ncnn, so why is it still not supported? It feels like the conversion process does not call ncnn's LayerNorm.

1. I didn't try to register my own op, but I think it should be an individual .h & .cpp file declaring the class LayerNormalization, and then in /ncnn/src/CMakeLists.txt line 169 add ncnn_add_layer(LayerNormalization).

I have tried both: supplementing the LayerNorm implementation in ncnn, and adding a LayerNormalization implementation following the "add custom layer" reference document, then recompiling.
When the onnx is converted to ncnn, the error is still reported and the LayerNormalization operation is not supported.
Did I compile it incorrectly? (The configuration step prints "Could NOT find protobuf (missing: protobuf_DIR)", but make and make install afterwards still succeed.)

1. LayerNorm in ncnn supports normalization along the channel dim:
(screenshot)
2. I added a new LayerNormalization implementation in ncnn, but it doesn't seem to work:
(screenshot)


HHscut commented on May 24, 2024

If you edit the file LayerNorm.cpp, the op is still called LayerNorm, but the custom op is called LayerNormalization according to "LayerNormalization not supported yet!", so maybe you should declare a new op class.


Suncheng2022 commented on May 24, 2024

If you edit the file LayerNorm.cpp, the op is still called LayerNorm, but the custom op is called LayerNormalization according to "LayerNormalization not supported yet!", so maybe you should declare a new op class.

I know what you mean. I wrote two files named "LayerNormalization.h" and "LayerNormalization.cpp", modified src/CMakeLists.txt with ncnn_add_layer(LayerNormalization), and then compiled it again. But it doesn't seem to work.


HHscut commented on May 24, 2024

If you edit the file LayerNorm.cpp, the op is still called LayerNorm, but the custom op is called LayerNormalization according to "LayerNormalization not supported yet!", so maybe you should declare a new op class.

I know what you mean. I wrote two files named "LayerNormalization.h" and "LayerNormalization.cpp", modified src/CMakeLists.txt with ncnn_add_layer(LayerNormalization), and then compiled it again. But it doesn't seem to work.

Yeah, I ran into the same situation, but I don't know why it didn't work.


Suncheng2022 commented on May 24, 2024

If you edit the file LayerNorm.cpp, the op is still called LayerNorm, but the custom op is called LayerNormalization according to "LayerNormalization not supported yet!", so maybe you should declare a new op class.

I know what you mean. I wrote two files named "LayerNormalization.h" and "LayerNormalization.cpp", modified src/CMakeLists.txt with ncnn_add_layer(LayerNormalization), and then compiled it again. But it doesn't seem to work.

Yeah, I ran into the same situation, but I don't know why it didn't work.

Thanks again, I won't give up and will solve this problem sooner or later. I must switch to ncnn, as it's perfect in my view.

