
Comments (4)

maleadt commented on August 30, 2024

Try running with JULIA_DEBUG=CUDNN (on latest CUDA.jl) and comparing the params to the error causes listed in https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnConvolutionBackwardFilter.
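For anyone following along, a minimal sketch of what that looks like in practice (JULIA_DEBUG is Julia's standard debug-logging switch; setting it before CUDA.jl is loaded is the safe option):

# From the shell: JULIA_DEBUG=CUDNN julia script.jl
# or from a fresh Julia session, before loading CUDA.jl:
ENV["JULIA_DEBUG"] = "CUDNN"

using CUDA
# ...re-run the code that triggers CUDNN_STATUS_BAD_PARAM; the cuDNN calls
# should then be logged together with their descriptors and parameters.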


DrChainsaw commented on August 30, 2024

@denizyuret, @maleadt sorry for poking.

Would it be easy for you to spot whether the function below is correct usage of the cuDNN API? If so, perhaps this issue belongs in CUDA.jl instead?

function ∇conv_filter!(dw::DenseCuArray{T}, x::DenseCuArray{T}, dy::DenseCuArray{T},
                       cdims::DenseConvDims; alpha=1, beta=0, algo=-1) where T<:CUDNNFloat
    if cudnnversion() < v"6"
        all(x -> x == 1, dilation(cdims)) || error("Only dilation = 1 is supported in cuDNN version < 6")
    end
    if algo != -1
        @warn "The algo option has been deprecated, the fastest algo is computed automatically" maxlog=1
    end
    alpha, beta = scalingParameter(T, alpha), scalingParameter(T, beta)
    # Build cuDNN descriptors from the Julia arrays and conv parameters.
    xDesc, yDesc, wDesc = cudnnTensorDescriptor(x), cudnnTensorDescriptor(dy), cudnnFilterDescriptor(dw)
    convDesc = cudnnConvolutionDescriptor(cdims, x)
    # Let cuDNN pick an algorithm, then run the backward-filter pass with a workspace of the reported size.
    p = cudnnConvolutionBwdFilterAlgoPerf(xDesc, x, yDesc, dy, convDesc, wDesc, dw)
    @workspace size=p.memory workspace->cudnnConvolutionBackwardFilter(handle(), alpha, xDesc, x, yDesc, dy, convDesc, p.algo, workspace, sizeof(workspace), beta, wDesc, dw)
    return dw
end
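
For context, a minimal sketch of the kind of call that ends up in this method (standard NNlib entry points; the shapes are just a small example, not my exact code):

using CUDA, NNlib   # with the CUDA backend (NNlibCUDA) providing the GPU methods

x  = CuArray(rand(Float16, 3, 3, 3, 1))    # W, H, C, N
w  = CuArray(rand(Float16, 3, 3, 3, 64))   # kW, kH, Cin, Cout
cdims = DenseConvDims(x, w)                # no padding, unit stride
dy = CuArray(ones(Float16, 1, 1, 64, 1))   # gradient w.r.t. the conv output
dw = similar(w)
∇conv_filter!(dw, x, dy, cdims)            # dispatches to the GPU method above

This is the path that ends up in cudnnConvolutionBackwardFilter.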


DrChainsaw commented on August 30, 2024

Thanks. That's some great debug output!

For some reason it did not print everything the first time; I had to rerun a couple of times before the relevant function appeared.

ERROR: CUDNNError: CUDNN_STATUS_BAD_PARAM (code 3)
┌ Debug: CuDNN (v8200) function cudnnConvolutionBackwardFilter() called:
│     handle: type=cudnnHandle_t; streamId=00000000B8F8EDC0;
│     alpha: type=CUDNN_DATA_FLOAT; val=1.000000;
│     xDesc: type=cudnnTensorDescriptor_t:
│         dataType: type=cudnnDataType_t; val=CUDNN_DATA_HALF (2);
│         nbDims: type=int; val=4;
│         dimA: type=int; val=[1,3,3,3];
│         strideA: type=int; val=[27,9,3,1];
│     xData: location=dev; addr=0000000203C01000;
│     dyDesc: type=cudnnTensorDescriptor_t:
│         dataType: type=cudnnDataType_t; val=CUDNN_DATA_HALF (2);
│         nbDims: type=int; val=4;
│         dimA: type=int; val=[1,64,1,1];
│         strideA: type=int; val=[64,1,1,1];
│     dyData: location=dev; addr=0000000203C01600;
│     convDesc: type=cudnnConvolutionDescriptor_t:
│         mode: type=cudnnConvolutionMode_t; val=CUDNN_CONVOLUTION (0);
│         dataType: type=cudnnDataType_t; val=CUDNN_DATA_HALF (2);
│         mathType: type=cudnnMathType_t; val=CUDNN_DEFAULT_MATH (0);
│         reorderType: type=int; val=0;
│         arrayLength: type=int; val=2;
│         padA: type=int; val=[0,0];
│         strideA: type=int; val=[1,1];
│         dilationA: type=int; val=[1,1];
│         groupCount: type=int; val=1;
│     algo: type=cudnnConvolutionBwdFilterAlgo_t; val=CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1 (1);
│     workSpace: location=dev; addr=0000000203C03000;
│     workSpaceSizeInBytes: type=unsigned long long; val=5400;
│     beta: type=CUDNN_DATA_FLOAT; val=0.000000;
│     dwDesc: type=cudnnFilterDescriptor_t:
│         dataType: type=cudnnDataType_t; val=CUDNN_DATA_HALF (2);
│         vect: type=int; val=0;
│         nbDims: type=int; val=4;
│         dimA: type=int; val=[64,3,3,3];
│         format: type=cudnnTensorFormat_t; val=CUDNN_TENSOR_NCHW (0);
│     dwData: location=dev; addr=0000000203C01A00;
│ Time: 2021-06-14T22:20:18.800688 (0d+0h+1m+4s since start)
│ Process=15972; Thread=14580; GPU=0; Handle=00000000E8189540; StreamId=00000000B8F8EDC0.
└ @ CUDA.CUDNN E:\Programs\julia\.julia\packages\CUDA\mVgLI\lib\cudnn\CUDNN.jl:123
Stacktrace:

I could not spot anything violating the conditions listed for BAD_PARAM. Looking at the table of supported algorithms, it seems the data types for xDesc, dyDesc, convDesc and dwDesc correspond to TRUE_HALF_CONFIG, which is not listed as supported by CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1, but shouldn't that have yielded CUDNN_STATUS_NOT_SUPPORTED rather than BAD_PARAM?

Attempt to be slightly more than useless by listing each condition together with what I think is the relevant part of the output (a quick numeric check of the shape conditions follows the list):

  • At least one of the following is NULL: handle, xDesc, dyDesc, convDesc, dwDesc, xData, dyData, dwData, alpha, beta
      handle: type=cudnnHandle_t; streamId=00000000B8F8EDC0;
      xDesc: type=cudnnTensorDescriptor_t: ...
      dyDesc: type=cudnnTensorDescriptor_t: ...
      convDesc: type=cudnnConvolutionDescriptor_t: ...
      dwDesc: type=cudnnFilterDescriptor_t: ...
      xData: location=dev; addr=0000000203C01000;
      dyData: location=dev; addr=0000000203C01600;
      dwData: location=dev; addr=0000000203C01A00;
      alpha: type=CUDNN_DATA_FLOAT; val=1.000000;
      beta: type=CUDNN_DATA_FLOAT; val=0.000000;
    or could one of those addrs be pointing to NULL?

  • xDesc and dyDesc have a non-matching number of dimensions
      xDesc: nbDims: type=int; val=4;
      dyDesc: nbDims: type=int; val=4;

  • xDesc and dwDesc have a non-matching number of dimensions
      xDesc: nbDims: type=int; val=4;
      dwDesc: nbDims: type=int; val=4;

  • xDesc has fewer than three dimensions
      xDesc: nbDims: type=int; val=4;

  • xDesc, dyDesc, and dwDesc have a non-matching data type.
      xDesc: dataType: type=cudnnDataType_t; val=CUDNN_DATA_HALF (2);
      dyDesc: dataType: type=cudnnDataType_t; val=CUDNN_DATA_HALF (2);
      dwDesc: dataType: type=cudnnDataType_t; val=CUDNN_DATA_HALF (2);

  • xDesc and dwDesc have a non-matching number of input feature maps per image (or group in case of grouped convolutions).
      xDesc: dimA: type=int; val=[1,3,3,3];
      dwDesc: dimA: type=int; val=[64,3,3,3];
              format: type=cudnnTensorFormat_t; val=CUDNN_TENSOR_NCHW (0);

  • yDesc or dwDesc indicate an output channel count that isn't a multiple of group count (if group count has been set in convDesc).
      convDesc: groupCount: type=int; val=1;
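
To back up the shape-related conditions with the numbers from the log, here is a quick back-of-the-envelope check in Julia. The NCHW/OIHW interpretation of dimA and the usual output-size formula are my assumptions, not something the log states:

# Dimensions reported by the debug log, interpreted as NCHW / OIHW.
x_dims  = (1, 3, 3, 3)     # xDesc dimA:  N=1, C=3,  H=3, W=3
dy_dims = (1, 64, 1, 1)    # dyDesc dimA: N=1, C=64, H=1, W=1
dw_dims = (64, 3, 3, 3)    # dwDesc dimA: Cout=64, Cin=3, kH=3, kW=3
pad, stride, dilation = (0, 0), (1, 1), (1, 1)   # from convDesc

# Usual convolution output size: floor((H + 2*pad - effective_kernel)/stride) + 1
eff_kh = dilation[1] * (dw_dims[3] - 1) + 1
eff_kw = dilation[2] * (dw_dims[4] - 1) + 1
out_h  = div(x_dims[3] + 2 * pad[1] - eff_kh, stride[1]) + 1   # -> 1
out_w  = div(x_dims[4] + 2 * pad[2] - eff_kw, stride[2]) + 1   # -> 1

@assert (out_h, out_w) == (dy_dims[3], dy_dims[4])   # dy spatial size matches
@assert x_dims[2] == dw_dims[2]                      # input channels match (3 == 3)
@assert dy_dims[2] == dw_dims[1]                     # output channels match (64 == 64)

All of those pass, so the shapes themselves look consistent to me.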


DrChainsaw commented on August 30, 2024

Did a bit of hacking around, and it seems like changing the algo to _WINOGRAD_NONFUSED (which seems to be the only one with support for TRUE_HALF_CONFIG) resulted in CUDNN_STATUS_NOT_SUPPORTED. I'm not sure why, since all the fine print in the last column of the table seems to be fulfilled.

Edit: never mind, I missed that the order of the returned algos is not deterministic. _WINOGRAD_NONFUSED works. By the way, it reports CUDA.CUDNN.CUDNN_STATUS_ALLOC_FAILED for the default workspace size, but it seems to succeed anyway (the data looks the same as with ALGO_1 and PSEUDO_HALF_CONFIG). This might explain why the small filter size works but not the large one. Could changing the workspace size in cudnnFindConvolutionAlgorithmWorkspaceSize fix this, or perhaps accepting CUDNN_STATUS_ALLOC_FAILED in cudnnConvolutionAlgoPerfChoose (which sounds risky)?

Changing the convDesc data type to Float32 so that the data type configuration becomes PSEUDO_HALF_CONFIG also works.

Is this the correct fix in cases when _WINOGRAD_NONFUSED is not applicable? The support for TRUE_HALF_CONFIG seems quite limited.
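
In case neither _WINOGRAD_NONFUSED nor touching convDesc is an option, the bluntest user-side workaround I can think of is to cast the data itself to Float32 and back. Sketch below; the helper name is made up, and unlike PSEUDO_HALF_CONFIG this converts whole arrays, so it costs extra memory and bandwidth:

using CUDA, NNlib

# Hypothetical helper: compute the filter gradient in Float32, then cast the
# result back to Float16, avoiding TRUE_HALF_CONFIG in cuDNN entirely.
function ∇conv_filter_via_f32(x::CuArray{Float16}, dy::CuArray{Float16}, cdims::DenseConvDims)
    dw32 = NNlib.∇conv_filter(Float32.(x), Float32.(dy), cdims)
    return Float16.(dw32)
end

The cleaner fix is presumably what was described above: keep the Float16 tensors but use a Float32 convolution compute type (PSEUDO_HALF_CONFIG).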


