NNVM-Fusion: Implement GPU Kernel Fusion and Runtime Compilation Based on NNVM

NNVM-Fusion is a module that implements GPU kernel fusion and runtime compilation on top of NNVM. It can be used as an NNVM plugin in different deep learning systems to improve performance.

What is GPU kernel fusion and runtime compilation

GPU kernel fusion is an optimization that reduces the overhead of transferring data to and from global memory by fusing a sequence of kernels into a single, larger kernel, which improves performance and memory locality.
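To make the memory-traffic argument concrete, here is a minimal sketch in plain Python. It is a toy model, not NNVM-Fusion code: the functions `unfused` and `fused` and the "traffic" counter are hypothetical, and each list stands in for an array in global memory. Every unfused "kernel" reads and writes a full array, while the fused version reads each input element once and writes each output element once.

```python
def unfused(x, ops):
    """Run each op as its own 'kernel': one full read and one full write per op."""
    traffic = 0
    for op in ops:
        out = [op(v) for v in x]      # stands in for one kernel launch
        traffic += len(x) + len(out)  # elements read + elements written
        x = out
    return x, traffic


def fused(x, ops):
    """Run all ops in one 'kernel': intermediates never leave the local variable."""
    out = []
    for v in x:
        for op in ops:
            v = op(v)                 # stands in for a value kept in a register
        out.append(v)
    traffic = len(x) + len(out)       # one read and one write per element
    return out, traffic


# Hypothetical chain of three element-wise operations.
ops = [lambda v: v * 2.0, lambda v: v + 1.0, lambda v: max(v, 0.0)]
data = [-1.0, 0.5, 2.0]

result_unfused, traffic_unfused = unfused(data, ops)
result_fused, traffic_fused = fused(data, ops)
```

Both versions compute the same values; in this model the only difference is how many elements cross the "global memory" boundary, which is the cost that fusion targets.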

Fusion generates the CUDA code of the fused kernel, which must then be compiled at runtime. For this step, we wrap NVRTC.

How it works

This module builds on the well-defined concepts provided by NNVM, so it is implemented as three passes on the computation graph: {Fusion, CodeGen, RTCGen}.

  • Fusion Pass: detects patterns in the computation graph that can be fused, and generates ASTs (Abstract Syntax Trees) that express the structure of the fused code.
  • CodeGen Pass: uses the ASTs to generate the actual CUDA code.
  • RTCGen Pass: compiles the CUDA code into functions that can be called at runtime.
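The first two passes can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the NNVM-Fusion implementation: the graph encoding, the `ELEMWISE` table, and the functions `fusion_pass`, `leaves`, and `codegen_pass` are all hypothetical, and only chains of element-wise operators are fused. The RTCGen step is left out because it needs a CUDA toolchain; in a real system the generated string would be handed to NVRTC.

```python
# Hypothetical table mapping element-wise ops to CUDA expression templates.
ELEMWISE = {
    "add": "({0} + {1})",
    "mul": "({0} * {1})",
    "relu": "fmaxf({0}, 0.0f)",
}

# Hypothetical graph: node name -> (op, input names).
# Names that are not keys of the graph are treated as kernel inputs.
graph = {
    "t0": ("mul", ["x", "w"]),
    "t1": ("add", ["t0", "b"]),
    "y": ("relu", ["t1"]),
}


def fusion_pass(graph, output):
    """Build an AST by inlining every element-wise producer of `output`."""
    def build(name):
        if name in graph and graph[name][0] in ELEMWISE:
            op, inputs = graph[name]
            return (op, [build(i) for i in inputs])
        return name  # leaf: an input array of the fused kernel
    return build(output)


def leaves(ast):
    """Collect the distinct input names of an AST, in first-use order."""
    if isinstance(ast, str):
        return [ast]
    found = []
    for child in ast[1]:
        for leaf in leaves(child):
            if leaf not in found:
                found.append(leaf)
    return found


def codegen_pass(ast, kernel_name="fused_kernel"):
    """Emit CUDA source for one fused element-wise kernel."""
    def expr(node):
        if isinstance(node, str):
            return f"{node}[i]"
        op, args = node
        return ELEMWISE[op].format(*(expr(a) for a in args))

    params = ", ".join(f"const float* {name}" for name in leaves(ast))
    return (
        f'extern "C" __global__ void {kernel_name}({params}, float* out, int n) {{\n'
        "  int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
        f"  if (i < n) out[i] = {expr(ast)};\n"
        "}\n"
    )


ast = fusion_pass(graph, "y")
cuda_source = codegen_pass(ast)
```

Here the three graph nodes collapse into a single kernel whose body is one expression, so the intermediates `t0` and `t1` are never written to global memory.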

Performance

We benchmarked training performance on LeNet and ResNet, based on TinyFlow, comparing the training speed of CPU, GPU, and GPU with NNVM-Fusion. The results show that NNVM-Fusion improves GPU performance by 1.4x-1.5x on LeNet and 1.1x-1.3x on ResNet with a medium batch size. We also compared the training speed with the same models on TensorFlow. With NNVM-Fusion, TinyFlow's performance is on par with TensorFlow on ResNet, and better on LeNet.

[Figures: perf_lenet, perf_resnet]

Contributors

tqchen
