Comments (15)

sekwiatkowski avatar sekwiatkowski commented on August 23, 2024

CUDA support and optimizations are being implemented on a step-by-step basis.

Currently, CublasProjectionLayer is a drop-in replacement for ProjectionLayer. The next step will be a kernel-based implementation of the sigmoid activation function.

Following that, I will introduce the idea of instructions (modelled as data classes) as a unified way to specify the layers of a neural network. A CPU interpreter and a CUDA interpreter will process the instructions to create architecture-specific networks. CUDA-based networks will minimize data transfer between the CPU host and the GPU device.
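For illustration, here is a minimal sketch of what instructions modelled as data classes and a CPU interpreter could look like. All names here are hypothetical, not komputation's actual API; a CUDA interpreter would map the same instructions to kernel launches and keep intermediate results on the device.

```kotlin
import kotlin.math.exp

// Hypothetical sketch: layers specified as backend-independent data.
sealed class Instruction
class Projection(val weights: Array<FloatArray>) : Instruction()
object Sigmoid : Instruction()

// A backend-specific layer produced by an interpreter.
fun interface Layer {
    fun forward(input: FloatArray): FloatArray
}

// The CPU interpreter maps each instruction to a plain JVM implementation.
// A CUDA interpreter would instead launch kernels and keep data on the device.
fun interpretOnCpu(instruction: Instruction): Layer = when (instruction) {
    is Projection -> Layer { input ->
        // Matrix-vector multiplication on the host.
        FloatArray(instruction.weights.size) { row ->
            var sum = 0.0f
            for (column in input.indices) {
                sum += instruction.weights[row][column] * input[column]
            }
            sum
        }
    }
    is Sigmoid -> Layer { input ->
        FloatArray(input.size) { i -> 1.0f / (1.0f + exp(-input[i])) }
    }
}
```

Because the instruction layer is pure data, the same network specification could be handed to either interpreter unchanged.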

You can find the changelog here: https://medium.com/@komputation

Starting with v0.7.1, it includes some notes on host/device communication.

zjuhasz avatar zjuhasz commented on August 23, 2024

Are there any plans for an OpenCL interpreter along the same lines as the CUDA interpreter?

sekwiatkowski avatar sekwiatkowski commented on August 23, 2024

Do you have a specific use case in mind for an OpenCL interpreter?

austin-sandia avatar austin-sandia commented on August 23, 2024

Nice progress, @sekwiatkowski!

Question: does your design allow for forward and backward propagation of a batch of training examples at once? This would be faster because it would allow you to do more matrix-matrix multiplication instead of just matrix-vector. I can't tell 100% yet whether or not you are doing it.
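For what it's worth, a hypothetical sketch of the shape argument (none of this is komputation code): stacking the batch into one matrix turns many matrix-vector products into a single matrix-matrix product, which an optimized BLAS such as cuBLAS can execute in one call.

```kotlin
// weights: (outputDim x inputDim); each example is a column of length inputDim.

// One matrix-vector product per example.
fun forwardOneByOne(weights: Array<FloatArray>, examples: List<FloatArray>) =
    examples.map { example ->
        FloatArray(weights.size) { row ->
            var sum = 0.0f
            for (k in example.indices) sum += weights[row][k] * example[k]
            sum
        }
    }

// The same work as one matrix-matrix product: the batch is a single
// (inputDim x batchSize) matrix, so one sgemm call can replace batchSize sgemv calls.
fun forwardBatched(weights: Array<FloatArray>, batch: Array<FloatArray>): Array<FloatArray> {
    val batchSize = batch[0].size
    return Array(weights.size) { row ->
        FloatArray(batchSize) { column ->
            var sum = 0.0f
            for (k in batch.indices) sum += weights[row][k] * batch[k][column]
            sum
        }
    }
}
```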

austin-sandia avatar austin-sandia commented on August 23, 2024

Ok, I see now in Network and CudaNetwork that it is definitely doing one training example at a time... am I correct that other libraries generally do multiple at once?

sekwiatkowski avatar sekwiatkowski commented on August 23, 2024

Parallel propagation for mini-batches will probably be added in v0.8 (sometime next week).

austin-sandia avatar austin-sandia commented on August 23, 2024

nice!

austin-sandia avatar austin-sandia commented on August 23, 2024

So... floats vs. doubles... never an easy question in Java. I get the sense that most people run networks with floats on the GPU. I'm sure you've thought about this to some extent... easier said than done.

sekwiatkowski avatar sekwiatkowski commented on August 23, 2024

Doubles will probably be replaced by floats in the near future, but I have to run some benchmarks first.
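As an aside, a rough sketch of the kind of host-side check such a benchmark might start from; a serious comparison should use JMH, and the GPU side matters even more, since consumer cards typically throttle FP64 throughput relative to FP32.

```kotlin
import kotlin.system.measureNanoTime

// Rough float-vs-double throughput check; use JMH for a serious benchmark.
fun main() {
    val size = 10_000_000
    val floats = FloatArray(size) { it.toFloat() }
    val doubles = DoubleArray(size) { it.toDouble() }

    var floatSum = 0.0f
    val floatTime = measureNanoTime {
        for (value in floats) floatSum += value * 0.5f
    }

    var doubleSum = 0.0
    val doubleTime = measureNanoTime {
        for (value in doubles) doubleSum += value * 0.5
    }

    // Print the sums so the JIT cannot eliminate the loops entirely.
    println("float:  ${floatTime / 1_000_000} ms (sum $floatSum)")
    println("double: ${doubleTime / 1_000_000} ms (sum $doubleSum)")
}
```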

austin-sandia avatar austin-sandia commented on August 23, 2024

So what are you thinking for the batch-appropriate data type? Some sort of tensor? You clearly need all of the data for a batch to be in one long array to get the speed benefit of converting 32 matrix-vector multiplies into one big matrix-matrix multiply.

Also, I am curious about the CPU convolution plan. I am assuming you are going to use cuDNN for the GPU. One thing I found that could be useful is this:
https://github.com/01org/daal/blob/daal_2018_beta_update1/examples/java/com/intel/daal/examples/neural_networks/Conv2DLayerDenseBatch.java

As far as I can tell, this Java code calls super-optimized Intel MKL-ish C++ code.

austin-sandia avatar austin-sandia commented on August 23, 2024

Also, the obvious solution for batching is to use a matrix with every row as an observation. This runs into problems once you consider convolution, though, since each observation then carries spatial structure (height, width, channels) that a flat row does not express.

sekwiatkowski avatar sekwiatkowski commented on August 23, 2024

I will start to work on parallel propagation now. I'm inclined towards using a one-dimensional array for the entire mini-batch.
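To illustrate one way that could look (hypothetical names, not a committed design): each example occupies a contiguous slice of a single FloatArray, so the whole batch can be handed to BLAS or copied to the device in one transfer.

```kotlin
// Hypothetical layout: the entire mini-batch in one flat array,
// example after example.
class Batch(private val dimension: Int, val size: Int) {
    val data = FloatArray(dimension * size)

    // Entry `row` of example `instance` sits at a fixed offset, so the
    // array is contiguous and can be passed to BLAS/CUDA as one block.
    fun get(instance: Int, row: Int): Float = data[instance * dimension + row]

    fun set(instance: Int, row: Int, value: Float) {
        data[instance * dimension + row] = value
    }
}
```

One contiguous array also means a single host-to-device copy per batch instead of one copy per example.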

austin-sandia avatar austin-sandia commented on August 23, 2024

Agreed, all of the mini-batch data should be next to each other.

Check out this thing, in particular the MKL part. It provides a Java interface to sgemm and friends in MKL but does not require the user to install the MKL library; i.e., MKL is bundled.

https://github.com/intel-analytics/BigDL-core/tree/162d95df3941976034691b266ae63401a580902b

```xml
<dependency>
    <groupId>com.intel.analytics.bigdl.native</groupId>
    <artifactId>mkl-java</artifactId>
    <version>0.1.0</version>
</dependency>
<dependency>
    <groupId>com.intel.analytics.bigdl.dnn.native</groupId>
    <artifactId>dnn-java</artifactId>
    <version>0.1.0</version>
</dependency>
```

zjuhasz avatar zjuhasz commented on August 23, 2024

> Do you have a specific use case in mind for an OpenCL interpreter?

Just for hardware compatibility. CUDA can only be used with Nvidia GPUs, correct?

sekwiatkowski avatar sekwiatkowski commented on August 23, 2024

For the time being, only CUDA will be supported. This may change if AMD overtakes Nvidia performance-wise or if OpenCL comes up in one of my projects. I'm also keeping an eye out for entirely new architectures.
