I noticed that the ConvLayer.forward is being benchmarked by …

Speed-up of ConvLayer.forward (convnetjs issue, 7 comments, closed)

from convnetjs.

Comments (7)

karpathy commented on June 7, 2024

This is awesome, thanks a lot for looking into it - I wasn't aware of type hinting.

I included build/convnet.js and the minified version for convenience, but I agree that it's a little messy with it there. Do you know what people usually do in these cases? I'd like to release the latest built version here on Github.

And we should definitely fold this into convnetjs, replacing the old code. I'll first wait to see if you happen to have suggestions on what we should do with built files. As a side note, I might also look into eventually converting the backprop part, the fully connected layer, and the pooling layer with the same tricks.

Lastly, I have a WebGL version of ConvNetJS in my local repo and it's almost nicely wired in. It's a forward_GPU function and it is extremely fast compared to this code.

Thanks,
Andrej

mdda commented on June 7, 2024

Looks like the right way to do binary/machine-generated assets is with "releases".
First, git rm ./build/convnet.*js, then add them to .gitignore.

Then, use a script like https://pypi.python.org/pypi/ghrelease/0.1.2 to upload the convnet.min.js asset individually.

Perhaps the version number could be (for instance) 2014.08.31, so as to avoid version increment anxiety...

It's not ideal, though, is it?

mdda commented on June 7, 2024

The type-hinting (particularly on stride) was a big win (and that should be placeable higher up the food chain, so that it works more generally too). The other win was re-ordering the loops, so that they execute more in order of memory placement (row-wise, rather than column-wise), since that apparently makes the code more cache-friendly. Surprisingly, factoring out in-loop constants (like the array arithmetic) didn't help much.
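
To make the two wins concrete, here is a minimal sketch (not the actual patch). It assumes a volume is a plain object `{sx, sy, depth, w}` with `w` laid out as `w[((sx * y) + x) * depth + d]`, and that the filter has the same depth as the input. The function name `convForwardSketch` is made up for illustration and handles a single filter only. The `|0` suffix is the integer hint, and the loops run y, then x, then depth, so the innermost reads walk through memory in order.

```javascript
// Hypothetical helper, not the convnetjs API: applies one filter to one input volume.
// Assumed layout: w[((sx * y) + x) * depth + d], i.e. row by row, depth innermost.
function convForwardSketch(V, f, stride, pad) {
  // "|0" hints to the JS engine that these values are 32-bit integers.
  var V_sx = V.sx | 0;
  var V_sy = V.sy | 0;
  var V_depth = V.depth | 0;
  var f_sx = f.sx | 0;
  var f_sy = f.sy | 0;
  var xy_stride = stride | 0;
  var p = pad | 0;
  var out_sx = (Math.floor((V_sx + 2 * p - f_sx) / xy_stride) + 1) | 0;
  var out_sy = (Math.floor((V_sy + 2 * p - f_sy) / xy_stride) + 1) | 0;
  var out = new Float64Array(out_sx * out_sy);

  // Outer loops go over rows (y) first, then columns (x), matching the layout.
  var y = -p | 0;
  for (var ay = 0; ay < out_sy; y += xy_stride, ay++) {
    var x = -p | 0;
    for (var ax = 0; ax < out_sx; x += xy_stride, ax++) {
      var a = 0.0;
      for (var fy = 0; fy < f_sy; fy++) {
        var oy = y + fy;
        if (oy < 0 || oy >= V_sy) continue; // row falls in the zero padding
        for (var fx = 0; fx < f_sx; fx++) {
          var ox = x + fx;
          if (ox < 0 || ox >= V_sx) continue; // column falls in the zero padding
          var fi = ((f_sx * fy) + fx) * V_depth;
          var vi = ((V_sx * oy) + ox) * V_depth;
          // Innermost loop reads consecutive elements of both arrays.
          for (var d = 0; d < V_depth; d++) {
            a += f.w[fi + d] * V.w[vi + d];
          }
        }
      }
      out[out_sx * ay + ax] = a;
    }
  }
  return out;
}
```

For a 3x3 single-channel input holding 1..9 and a 2x2 filter `[1, 0, 0, 1]`, with stride 1 and no padding, this returns `[6, 8, 12, 14]`.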

I'd love to have a look at the WebGL stuff, since that was already solidly in my plan for using convnet.js (FWIW, I'm one of the contributors for https://github.com/stackgl/shader-school). My main goal was to implement the back-prop step on the GPU, to reduce training time... I know there are other capable convnet modules out there, but I particularly want to retain client-side compatibility for the trained network.

If I can 'get in' on the WebGL side, I'd be happy to contribute to doing the (more basic) JS-side optimisations too.

karpathy commented on June 7, 2024

Thanks, I thought releases might be the preferred way. I noticed a few repos using them for this purpose but never fully read up on it.

My WebGL implementation is essentially a wrapper around some core functions inside jpcnn (a really nice library from Pete Warden). Among other things, he implemented gemm ("General Matrix Multiply", as seen in BLAS) in WebGL. This can be used to do very fast convolutions in a straightforward way: you reshape all patches into rows, take filters as columns, matrix multiply, and then reshape the result back into the correct output dimensions. It's a little wasteful in terms of space (because overlapping patches mean that image pixels get duplicated, often several times over, in the reshaped form), but it's an often-used strategy (in very early CNNs by LeCun and also, for example, in Caffe).
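
The reshape-and-multiply idea can be sketched on the CPU as follows. This is plain JavaScript for illustration, not jpcnn's WebGL gemm. The helper names (`im2col`, `matmul`, `convViaMatmul`) are made up here, and the sketch assumes no padding and the layout `w[((sx * y) + x) * depth + d]`.

```javascript
// Hypothetical helpers for illustration only; not the jpcnn or convnetjs API.

// Copy every filter-sized patch of the input into one row of a matrix.
function im2col(V, fsx, fsy, stride) {
  var out_sx = Math.floor((V.sx - fsx) / stride) + 1;
  var out_sy = Math.floor((V.sy - fsy) / stride) + 1;
  var patchLen = fsx * fsy * V.depth;
  var rows = out_sx * out_sy;
  var M = new Float64Array(rows * patchLen);
  var r = 0;
  for (var ay = 0; ay < out_sy; ay++) {
    for (var ax = 0; ax < out_sx; ax++, r++) {
      var c = 0;
      for (var fy = 0; fy < fsy; fy++) {
        for (var fx = 0; fx < fsx; fx++) {
          var vi = ((V.sx * (ay * stride + fy)) + (ax * stride + fx)) * V.depth;
          for (var d = 0; d < V.depth; d++, c++) {
            M[r * patchLen + c] = V.w[vi + d];
          }
        }
      }
    }
  }
  return { rows: rows, cols: patchLen, data: M, out_sx: out_sx, out_sy: out_sy };
}

// Naive row-major matrix multiply: C (n x m) = A (n x k) * B (k x m).
function matmul(A, n, k, B, m) {
  var C = new Float64Array(n * m);
  for (var i = 0; i < n; i++) {
    for (var j = 0; j < m; j++) {
      var s = 0.0;
      for (var t = 0; t < k; t++) s += A[i * k + t] * B[t * m + j];
      C[i * m + j] = s;
    }
  }
  return C;
}

// Convolution as patches-times-filters. `filters` is an array of Float64Array,
// each of length fsx * fsy * V.depth, stored in the same order as a patch row.
function convViaMatmul(V, filters, fsx, fsy, stride) {
  var P = im2col(V, fsx, fsy, stride);
  var nf = filters.length;
  // Filters become the columns of a (patchLen x nf) matrix.
  var F = new Float64Array(P.cols * nf);
  for (var j = 0; j < nf; j++) {
    for (var t = 0; t < P.cols; t++) F[t * nf + j] = filters[j][t];
  }
  // The product is (rows x nf): one row per output position, one column per
  // filter, which is already the layout w[((out_sx * y) + x) * nf + d].
  var w = matmul(P.data, P.rows, P.cols, F, nf);
  return { sx: P.out_sx, sy: P.out_sy, depth: nf, w: w };
}
```

The space cost is visible even at toy sizes: a 3x3 single-channel input with a 2x2 filter and stride 1 produces a 4x4 patch matrix, so 16 stored values for 9 input pixels.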

However, jpcnn only implements the forward pass, not backprop.

I'll look at cleaning all of this mess up today, and maybe fold in some of my preliminary WebGL functionality.

mdda commented on June 7, 2024

Great!

Thanks for pointing me towards Pete Warden, whose blog (http://petewarden.com/) is super-interesting and in-depth. And, in case anyone else is looking for the WebGL behind the jpcnn library, have a look at https://github.com/jetpacapp/DeepBeliefSDK/blob/gh-pages/JavascriptLibrary/jpcnn.js#L1954

I'll check back in a day or two, and see how 'mergeable' my patch will be by then.

All the Best
Martin
:-)

karpathy commented on June 7, 2024

Ok, I removed the built library from the repo and created a release instead. I also folded your optimizations into ConvLayer (slightly modified), and tweaked the backward pass so that the same optimizations are applied to backprop for the Conv layer.

Now moving on to incorporating the GPU code. It's a little tricky because it relies on jpcnn, which in turn relies on underscore. I'm not sure what the cleanest way to add these is. Should I include them in the compile like all the other convnetjs files? It would make the entire library quite a bit larger, but I'm not sure if there is any other way.

karpathy commented on June 7, 2024

Thanks Martin. End result of this issue: the ConvLayer is now twice as fast (both forward and backward pass). For future (dramatic) improvements we are moving towards WebGL. Closing the issue.
