
Comments (17)

hughperkins avatar hughperkins commented on June 29, 2024

So, not sure if this will answer your question, but here's some quick background on what im2col is. Actually, someone wrote a blog post about it and can explain it better than me; let me dig that out. This one: https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/

So, what happens in convolution is:

  • incoming multi-dimensional tensors get 'unrolled' into big fluffy 2d matrices. im2col does this: im means the incoming (image) tensor, and col means the unrolled 2d column matrix
  • then a standard GEMM is run on these matrices, to produce a new matrix
  • the new matrix is rolled back up to produce the output multi-dimensional tensor, this time using col2im, ie the other way around (see the sketch after this list)
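
To make the unroll / GEMM / re-roll idea concrete, here is a minimal single-channel, stride-1, no-padding sketch in plain C++. The helper names (im2col, gemm, conv_forward) are illustrative only, not the actual clnn/THCLNN API:

#include <cstddef>
#include <vector>

// Naive im2col for one input channel: each output pixel gets its own
// column of kH*kW input values (stride 1, no padding, for brevity).
static std::vector<float> im2col(const std::vector<float>& im,
                                 int H, int W, int kH, int kW) {
  int outH = H - kH + 1, outW = W - kW + 1;
  std::vector<float> col((size_t)kH * kW * outH * outW);
  for (int ky = 0; ky < kH; ++ky)
    for (int kx = 0; kx < kW; ++kx)
      for (int y = 0; y < outH; ++y)
        for (int x = 0; x < outW; ++x)
          col[((ky * kW + kx) * outH + y) * outW + x] = im[(y + ky) * W + (x + kx)];
  return col;
}

// Naive GEMM: C (M x N) = A (M x K) * B (K x N); in the real library this
// step is the BLAS call.
static std::vector<float> gemm(const std::vector<float>& A, const std::vector<float>& B,
                               int M, int K, int N) {
  std::vector<float> C((size_t)M * N, 0.f);
  for (int m = 0; m < M; ++m)
    for (int k = 0; k < K; ++k)
      for (int n = 0; n < N; ++n)
        C[m * N + n] += A[m * K + k] * B[k * N + n];
  return C;
}

// Forward convolution = im2col, then one GEMM: the weight is (1 x kH*kW),
// the unrolled columns are (kH*kW x outH*outW), the output is (1 x outH*outW).
std::vector<float> conv_forward(const std::vector<float>& im, const std::vector<float>& weight,
                                int H, int W, int kH, int kW) {
  int outH = H - kH + 1, outW = W - kW + 1;
  std::vector<float> col = im2col(im, H, W, kH, kW);
  return gemm(weight, col, 1, kH * kW, outH * outW);
}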

I haven't (yet) looked into what SpatialFullConvolution is doing, but presumably it does the exact same thing, just with slightly different geometry?


hughperkins avatar hughperkins commented on June 29, 2024

(So, to perhaps answer your question: im2col is not so much the 'kernel for spatialconvolution' as a library routine that is used by spatialconvolution to unroll the matrices; spatialconvolution then calls GEMM. So, the relationship between the files/libraries looks something like:

spatialconvolution
    -> im2col / col2im  (unroll / re-roll the tensors)
    -> GEMM


ViliusT avatar ViliusT commented on June 29, 2024

Hi, thanks for the informative reply!

SpatialFullConvolution is also known as "Deconvolution", "Fractionally strided convolution", and a host of other names;

In short, a forward pass on a Convolution is close to a backwards pass on a FullConvolution/Deconvolution and vice versa - you can see it at https://github.com/torch/cunn/blob/master/lib/THCUNN/SpatialFullConvolution.cu and https://github.com/torch/cunn/blob/master/lib/THCUNN/SpatialConvolutionMM.cu
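
(A tiny stand-alone illustration of that relationship, in plain C++ with 1-D tensors and stride 1 - the function names are made up and this is nothing to do with the actual THCUNN/THCLNN sources: full convolution's forward pass is the transpose of plain convolution's forward pass, i.e. the same scatter that convolution uses for its gradInput, which the dot-product check below confirms.)

#include <cassert>
#include <cstddef>
#include <vector>

// 1-D "valid" cross-correlation (what nn calls convolution), stride 1, no pad:
// y[i] = sum_k w[k] * x[i+k], so len(y) = len(x) - len(w) + 1.
static std::vector<float> conv_forward(const std::vector<float>& x,
                                       const std::vector<float>& w) {
  std::vector<float> y(x.size() - w.size() + 1, 0.f);
  for (size_t i = 0; i < y.size(); ++i)
    for (size_t k = 0; k < w.size(); ++k)
      y[i] += w[k] * x[i + k];
  return y;
}

// Full convolution ("deconvolution") forward: scatter each input value back
// over a whole kernel window. This is exactly the computation a plain
// convolution performs for its gradInput (backward) pass.
static std::vector<float> full_conv_forward(const std::vector<float>& x,
                                            const std::vector<float>& w) {
  std::vector<float> y(x.size() + w.size() - 1, 0.f);
  for (size_t i = 0; i < x.size(); ++i)
    for (size_t k = 0; k < w.size(); ++k)
      y[i + k] += w[k] * x[i];
  return y;
}

int main() {
  // The two ops are transposes of each other: <conv(x), g> == <x, fullconv(g)>.
  std::vector<float> x = {1, 2, 3, 4}, w = {0.5f, -1.f, 2.f}, g = {3, -2};
  std::vector<float> y = conv_forward(x, w);       // length 2, like g
  std::vector<float> z = full_conv_forward(g, w);  // length 4, like x
  float lhs = 0.f, rhs = 0.f;
  for (size_t i = 0; i < y.size(); ++i) lhs += y[i] * g[i];
  for (size_t i = 0; i < x.size(); ++i) rhs += x[i] * z[i];
  assert(lhs == rhs);  // 1.5 == 1.5, exact with these small values
  return 0;
}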

It seemed like a nearly perfect candidate for a porting attempt on my side, after a few simpler implementations I did on the Lua side with custom kernel calls.

Hmm..

So, looking at the code - https://github.com/hughperkins/clnn/blob/a7453eb724cb2fbd8845da239f93ab0657f49dc2/lib/THCLNN/SpatialConvolutionMM.cpp - it seems that I got the idea correctly, but the issue is somewhere else. I assumed this is because im2col.cpp has this line:

std::string uniqueName = "SpatialConvolutionMM::im2col";

and that I would somehow need to add
std::string uniqueName = "SpatialFullConvolution::im2col";
as well for it to be callable from SpatialFullConvolution - is that incorrect?


hughperkins avatar hughperkins commented on June 29, 2024

What makes you feel you need to modify im2col?


ViliusT avatar ViliusT commented on June 29, 2024

Evidently, lack of experience :)

clnn/THCLNN.lua, line 164 in a7453eb (-- build function table), seems to bind Cl function equivalents to standard THNN functions;

I assume that when the correct .cpp is added to CMakeLists.txt, it should automatically be added to THCLNN.h / THCLNN.lua when building - this is where it fails for me. I (wrongly?) assumed that this is due to im2col.cpp, since when comparing to the SpatialConvolutionMM implementation, this was the only difference I could spot - there was a reference in im2col to SpatialConvolutionMM in

    TemplatedKernel kernelBuilder(cl);
    kernel = kernelBuilder.buildKernel(uniqueName, "SpatialConvolutionMM.cl",
      getKernelTemplate(), "col2im_kernel");

(even though you mentioned in the guidelines that this filename is not important)
From lack of deeper understanding, I came to the conclusion that this caused the build to fail to include the calls in THCLNN.h; I'll have another look at the code and see if I missed something.


hughperkins avatar hughperkins commented on June 29, 2024

there was a reference in im2col to SpatialConvolutionMM

Ah, good point. Updated im2col in d1701e6 to no longer contain references to spatialconvolution :-)


hughperkins avatar hughperkins commented on June 29, 2024

Your branch builds ok for me:

$ git checkout -b vilius-spatialfullconv vilius/feature/spatialFullConv 
Branch vilius-spatialfullconv set up to track remote branch feature/spatialFullConv from vilius.
Switched to a new branch 'vilius-spatialfullconv'
$ luarocks make rocks/clnn-scm-1.rockspec 
cmake -E make_directory build && cd build && cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_PREFIX_PATH="/home/ubuntu/torch-cl/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/username/torch-cl/install/lib/luarocks/rocks/clnn/scm-1" && make -j$(getconf _NPROCESSORS_ONLN) install

-- Found Torch7 in /home/ubuntu/torch-cl/install
-- Configuring done
-- Generating done
-- Build files have been written to: /home/username/torch-cl/opencl/clnn/build
Scanning dependencies of target THCLNN
[  0%] Built target cog_clnn
Scanning dependencies of target clnn
[  6%] Building CXX object CMakeFiles/clnn.dir/init.cpp.o
[ 13%] Building CXX object lib/THCLNN/CMakeFiles/THCLNN.dir/SpatialFullConvolution.cpp.o
[ 20%] Building CXX object lib/THCLNN/CMakeFiles/THCLNN.dir/im2col.cpp.o
[ 26%] Linking CXX shared module libclnn.so
[ 33%] Built target clnn
[ 40%] Linking CXX shared module libTHCLNN.so
[100%] Built target THCLNN
Install the project...

Do you mean, it fails at runtime?


ViliusT avatar ViliusT commented on June 29, 2024

Yes; apologies - I explained it improperly.

Upon calling SpatialFullConvolution as a module via Lua with, say, a testing script like this:


-- requires needed to run this standalone under th
require 'nn'
require 'clnn'

local input = torch.zeros(3,12,12)
input = input:cl()
local gconv = nn.SpatialFullConvolution(3,1,3,3,2,2):cl()
print(gconv)
local out = gconv:forward(input)
print(out)

It fails with the following:

/home/vilius/torch-cl/install/bin/luajit: ...cl/install/share/lua/5.1/clnn/SpatialFullConvolution.lua:55: attempt to call field 'SpatialFullConvolution_updateOutput' (a nil value)
stack traceback:
    ...cl/install/share/lua/5.1/clnn/SpatialFullConvolution.lua:55: in function 'forward'
    TestQuick.lua:38: in main chunk
    [C]: in function 'dofile'
    ...s/torch-cl/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

The line in question is this function call:

 inputTensor.THNN.SpatialFullConvolution_updateOutput(
    inputTensor:cdata(),
    self.output:cdata(),
    self.weight:cdata(),
    THNN.optionalTensor(self.bias),
    self.finput:cdata(),
    self.fgradInput:cdata(),
    self.nInputPlane,
    self.nOutputPlane,
    self.kW, self.kH,
    self.dW, self.dH,
    self.padW, self.padH,
    adjW, adjH
  )

I take it this is because there is no binding in https://github.com/hughperkins/clnn/blob/a7453eb724cb2fbd8845da239f93ab0657f49dc2/THCLNN.lua for the function I wrote, and it is not created when I build; it is not there for SpatialFullConvolution after using the Python port script either.

In short, I am not sure how to make my function callable from Lua. I assume I need to bind it somehow, as the other functions are - but how do I achieve that?


ViliusT avatar ViliusT commented on June 29, 2024

And if I add them manually (see ViliusT@82321b7), upon trying to require 'clnn' I get


libthclnn_searchpath    /home/vilius/torch-cl/install/lib/lua/5.1/libTHCLNN.so  
not found: THNN_ClSpatialFullConvolution_updateOutput/home/vilius/torch-cl/install/share/lua/5.1/nn/THNN.lua:804: /home/vilius/torch-cl/install/lib/lua/5.1/libTHCLNN.so: undefined symbol: THNN_ClSpatialFullConvolution_updateOutput  
not found: THNN_ClSpatialFullConvolution_updateGradInput/home/vilius/torch-cl/install/share/lua/5.1/nn/THNN.lua:804: /home/vilius/torch-cl/install/lib/lua/5.1/libTHCLNN.so: undefined symbol: THNN_ClSpatialFullConvolution_updateGradInput    
not found: THNN_ClSpatialFullConvolution_accGradParameters/home/vilius/torch-cl/install/share/lua/5.1/nn/THNN.lua:804: /home/vilius/torch-cl/install/lib/lua/5.1/libTHCLNN.so: undefined symbol: THNN_ClSpatialFullConvolution_accGradParameters
/home/vilius/torch-cl/install/bin/luajit: /home/vilius/torch-cl/install/share/lua/5.1/trepl/init.lua:384: /home/vilius/torch-cl/install/share/lua/5.1/clnn/test.lua:1: attempt to call field 'TestSuite' (a nil value)
stack traceback:
    [C]: in function 'error'
    /home/vilius/torch-cl/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
    justatest.lua:4: in main chunk
    [C]: in function 'dofile'
    ...s/torch-cl/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670


hughperkins avatar hughperkins commented on June 29, 2024

Alright, so working through this, and remembering at the same time - this is realtime :-P Or at least, this is what I did, in no particular sequence, just whatever seemed an ok-ish guess at the time:

# check does the library contain our exported symbol?
$ nm -C ~/torch-cl/install/lib/libTHCl.so | grep SpatialFullConvolution
# and... nothing :-P

Seems like you haven't exported the symbol? That means it's probably not in the header file. Check the header file:

$ grep SpatialFull lib/THCLNN/THCLNN.h 
TH_API void THNN_ClSpatialFullConvolution_updateOutput(
TH_API void THNN_ClSpatialFullConvolution_updateGradInput(
TH_API void THNN_ClSpatialFullConvolution_accGradParameters(

Ok, it's in the header file. Is it in the CMakeLists.txt? :-P

$ grep SpatialFull lib/THCLNN/CMakeLists.txt  
set(src-cl Abs.cpp SpatialConvolutionMM.cpp SpatialFullConvolution.cpp ELU.cpp SpatialAveragePooling.cpp SpatialMaxPooling.cpp

Yes it is... hmmm... is it in SpatialFullConvolution.cpp?

$ grep ClSpatialFull lib/THCLNN/SpatialFullConvolution.cpp 
void THNN_ClSpatialFullConvolution_updateOutput(
void THNN_ClSpatialFullConvolution_updateGradInput(
void THNN_ClSpatialFullConvolution_accGradParameters(
std::string THNN_ClSpatialFullConvolution_getKernelTemplate() {

Yes, it is. Ummm...
Check that other ... oh, I'm looking in the wrong .so :-P I'm looking in the cltorch .so, not the clnn .so. Check the clnn .so:

$ nm -C ~/torch-cl/install/lib/lua/5.1/libTHCLNN.so | grep Convolution
00000000000034a0 t _GLOBAL__sub_I_SpatialConvolutionMM.cpp
00000000000034d0 t _GLOBAL__sub_I_SpatialFullConvolution.cpp
00000000000044b0 T THNN_ClSpatialConvolutionMM_accGradParameters
0000000000003ff0 T THNN_ClSpatialConvolutionMM_updateGradInput
00000000000039e0 T THNN_ClSpatialConvolutionMM_updateOutput
0000000000004a20 T THNN_ClSpatialFullConvolution_updateOutput(THClState*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, int, int, int, int, int, int, int, int, int, int)
0000000000004f00 T THNN_ClSpatialFullConvolution_updateGradInput(THClState*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, int, int, int, int, int, int, int, int, int, int)
00000000000052e0 T THNN_ClSpatialFullConvolution_accGradParameters(THClState*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, int, int, int, int, int, int, int, int, int, int, float)

Ok, there it is, it's exported. Is it a C export or a C++ export? If it's a C export (which is what we want), then when I remove the -C from nm, it should still show an unmangled symbol:

00000000000034a0 t _GLOBAL__sub_I_SpatialConvolutionMM.cpp
00000000000034d0 t _GLOBAL__sub_I_SpatialFullConvolution.cpp
00000000000044b0 T THNN_ClSpatialConvolutionMM_accGradParameters
0000000000003ff0 T THNN_ClSpatialConvolutionMM_updateGradInput
00000000000039e0 T THNN_ClSpatialConvolutionMM_updateOutput
0000000000004a20 T _Z42THNN_ClSpatialFullConvolution_updateOutputP9THClStateP10THClTensorS2_S2_S2_S2_S2_iiiiiiiiii
0000000000004f00 T _Z45THNN_ClSpatialFullConvolution_updateGradInputP9THClStateP10THClTensorS2_S2_S2_S2_iiiiiiiiii
00000000000052e0 T _Z47THNN_ClSpatialFullConvolution_accGradParametersP9THClStateP10THClTensorS2_S2_S2_S2_S2_iiiiiiiiiif

Looks mangled. Compare with the SpatialConvolutionMM symbols above, which aren't. That means that when we compile the .cpp file, we are missing an extern "C" on our symbols. Does the .cpp file include the appropriate .h header file?

$ grep include lib/THCLNN/SpatialFullConvolution.cpp
lib/THCLNN/SpatialFullConvolution.cpp:#include "utils.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "luaT.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THClTensor.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THClTensorMath.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THClBlas.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THClKernels.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "templates/TemplatedKernel.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "DeviceInfo.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "EasyCL.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THCLNN.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "im2col.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include <iostream>
lib/THCLNN/SpatialFullConvolution.cpp:#include <string>

It does not. I reckon if you add an include for SpatialFullConvolution.h and rebuild, it will magically get mildly further :-)
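
(For reference, here is the extern "C" / mangling difference in miniature, completely separate from the clnn sources - the file and function names below are made up. Compile it into a shared library and inspect it with nm; only the extern "C" function keeps its plain name:)

// mangle_demo.cpp -- build with: g++ -shared -fPIC mangle_demo.cpp -o libmangle_demo.so
//                    then:       nm -D libmangle_demo.so | grep demo_

extern "C" void demo_c_export(int x) { (void)x; }   // exported as: demo_c_export

void demo_cpp_export(int x) { (void)x; }            // exported mangled, e.g.: _Z15demo_cpp_exporti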


hughperkins avatar hughperkins commented on June 29, 2024

Oh wait, it does include it, it's the THCLNN.h file. Hmmm...


hughperkins avatar hughperkins commented on June 29, 2024

Let's compare the declaration with the definition. The declaration is:

TH_API void THNN_ClSpatialFullConvolution_updateOutput(
          THClState *state,
          THClTensor *input,
          THClTensor *output,
          THClTensor *weight,
          THClTensor *bias,
          THClTensor *columns,
          THClTensor *ones,
          int kW, int kH,
          int dW, int dH,
          int padW, int padH,
          int adjW, int adjH);

Definition (.cpp file) is:

void THNN_ClSpatialFullConvolution_updateOutput(
  THClState *state,
  THClTensor *input,
  THClTensor *output,
  THClTensor *weight,
  THClTensor *bias,
  THClTensor *columns,
  THClTensor *ones,
  int nInputPlane,
  int nOutputPlane,
  int kW, int kH,
  int dW, int dH,
  int padW, int padH,
  int adjW, int adjH) {

Ah, one has nInputPlane and nOutputPlane, and the other does not. So when the compiler builds the definition, it doesn't match the declaration - as far as C++ is concerned they are two different functions - and the extern "C" that the declaration gets from TH_API never applies to the definition, which is therefore exported as a mangled C++ symbol.

I think that if you reconcile the declaration with the definition, it will get further.
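
(The same thing in miniature, outside of clnn - demo.h and do_stuff are made-up names: because the parameter lists differ, the C++ compiler treats the definition as a separate overload, so the extern "C" on the declaration never reaches it.)

// demo.h (illustrative only)
extern "C" void do_stuff(int a, int b);   // declaration: C linkage, two ints

// demo.cpp
#include "demo.h"
void do_stuff(int a, int b, int c) { }    // definition: three ints -> a different
                                          // C++ overload, exported as a mangled
                                          // symbol like _Z8do_stuffiii, while the
                                          // plain C symbol do_stuff is never defined.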


hughperkins avatar hughperkins commented on June 29, 2024

(TH_API contains extern "C", by the way - in ~/torch-cl/pkg/torch/TH/THGeneral.h.in:

https://github.com/hughperkins/torch7/blob/master/lib/TH/THGeneral.h.in#L18-L32

#ifdef __cplusplus
# define TH_EXTERNC extern "C"
#else
# define TH_EXTERNC extern
#endif

#ifdef _WIN32
# ifdef TH_EXPORTS
#  define TH_API TH_EXTERNC __declspec(dllexport)
# else
#  define TH_API TH_EXTERNC __declspec(dllimport)
# endif
#else
# define TH_API TH_EXTERNC
#endif

)


ViliusT avatar ViliusT commented on June 29, 2024

Oh wow!

Before we dig further...

This commit - torch/nn@b339dad - it seems it is not in the torch-cl distribution?

Hence, there are no THNN_(SpatialFullConvolution_<...>) calls in THNN.lua in the nn package, and then THNN_ClSpatialFullConvolution_<..> doesn't have anything to bind with, no?


hughperkins avatar hughperkins commented on June 29, 2024

This commit - torch/nn@b339dad - it seems it is not in the torch-cl distribution?

Yes, so, good observation. So, this commit is dated 31st January, but it was merged to master on 18th February: torch/nn#601

Now, ... as a maintainer of an OpenCL port of torch, the THNN migration caused me a llloottt of pain:

Since I do have a full-time job, I couldn't really keep up with these changes, and clnn basically spent half its time not working. So, as you can see in issue 39, I finally forked https://github.com/torch/distro to https://github.com/hughperkins/distro-cl, and rolled everything back to ~21 February.

So, you might think that the 18th February merge should have got in, but the cut-off is "approximately" 21 February, and I guess it just slightly missed.

On the whole, in general, the path of least resistance is to pretend that any changes to torch from then on didn't happen, or occasionally cherry-pick anything really critical.

To what extent do you need any changes from after distro was forked to distro-cl, and to what extent can you get things working using the ~21 February version of distro?


ViliusT avatar ViliusT commented on June 29, 2024

I knew that distro-cl was forked at the end of February - since I saw that the commit was from January, I didn't double-check and just assumed that it was in; my mistake.

I'll see what I can do and get back to you.

And a big thanks for maintaining the project!


hughperkins avatar hughperkins commented on June 29, 2024

k, sounds good :-)

Not saying we can't port stuff from upstream, by the way. But my experience is that any single upstream commit contains a whole bunch of changes, and conversely a specific change you want might be spread over multiple commits, so bringing stuff in from upstream is a bit icky :-P

Nevertheless, it can be done, with a bit of effort.

