
dlmo's Introduction

DLMO: Deep Learning Memory Optimizer

How to use

Step 1: Get the traced function and profiling information

a. Trace:

# `func` is the function to be traced 
traced_func = parrots.aot.trace(func, input, target)

b. Enable profiling:

parrots.runtime.profile()
parrots.runtime.profile_memory()

c. Run it:

loss, acc = traced_func(input, target)

d. Save the IR and profiling information:

parrots.aot.save(traced_func, 'function.json')
# Quit parrots to obtain `profile_memory.json` and `profile.txt`

Step 2: Generate merged IR JSON

a. Put the original IR JSON (function.json), the timeline profiling result (profile.txt, renamed to timeline.txt), and the memory profiling result (profile_memory.json, renamed to memory.json) into one directory under data, like:

├── data
│   ├── resnet152-32
│   │   ├── function.json
│   │   ├── memory.json
│   │   └── timeline.txt
│   ├── resnet50-32
│   │   ├── function.json
│   │   ├── memory.json
│   │   └── timeline.txt
│   ├── processor.py
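
The copy-and-rename step above can be scripted. The following is a minimal sketch, not part of DLMO: the helper name `stage_case` and the assumption that all three Step 1 outputs sit in a single source directory are illustrative only.

```python
import shutil
from pathlib import Path

# File names produced in Step 1, mapped to the names Step 2 expects.
RENAMES = {
    "function.json": "function.json",
    "profile.txt": "timeline.txt",
    "profile_memory.json": "memory.json",
}


def stage_case(src_dir, data_dir, case_name):
    """Copy the Step 1 outputs into <data_dir>/<case_name>/ under their new names.

    `stage_case` is a hypothetical helper, not a DLMO function.
    """
    src = Path(src_dir)
    dest = Path(data_dir) / case_name

    # Check every input first so a missing file does not leave a half-filled directory.
    missing = [name for name in RENAMES if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing Step 1 outputs in {src}: {missing}")

    dest.mkdir(parents=True, exist_ok=True)
    for old_name, new_name in RENAMES.items():
        shutil.copyfile(src / old_name, dest / new_name)
    return dest
```

For example, `stage_case("run_output", "data", "resnet50-32")` would fill `data/resnet50-32/` with the three renamed files, where `run_output` stands for wherever the Step 1 files were written.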

b. Run processor.py (it may fail; if it does, you will need to debug it yourself):

python3 processor.py

c. You should get a layout like this, with a new pattern.json in each case directory:

├── data
│   ├── resnet152-32
│   │   ├── function.json
│   │   ├── memory.json
│   │   ├── pattern.json
│   │   └── timeline.txt
│   ├── resnet50-32
│   │   ├── function.json
│   │   ├── memory.json
│   │   ├── pattern.json
│   │   └── timeline.txt
│   ├── processor.py
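
Since processor.py may fail partway, it can help to confirm that every case directory ended up with all four files. The sketch below is one way to check this; `missing_files` is a hypothetical helper and the expected file set is taken from the tree above.

```python
from pathlib import Path

# Files each case directory should contain after Step 2, per the tree above.
EXPECTED = {"function.json", "memory.json", "pattern.json", "timeline.txt"}


def missing_files(data_dir):
    """Return {case_name: sorted missing file names} for incomplete case directories.

    `missing_files` is a hypothetical helper, not a DLMO function.
    """
    report = {}
    for case_dir in sorted(Path(data_dir).iterdir()):
        if not case_dir.is_dir():
            continue  # skip processor.py and other loose files
        present = {p.name for p in case_dir.iterdir()}
        missing = sorted(EXPECTED - present)
        if missing:
            report[case_dir.name] = missing
    return report
```

An empty dictionary from `missing_files("data")` means every case is ready for Step 3.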

Step 3: Compile and Run!

a. Compile with CMake:

mkdir build
cd build

# Choose one build type:
# Debug
cmake .. -DCMAKE_BUILD_TYPE=Debug
# Release (about 10x faster than the debug binary)
cmake .. -DCMAKE_BUILD_TYPE=Release

make

b. Run:

# Usage: dlmo <input> <output> <limit>
./dlmo ../data/resnet152-32/pattern.json optimized.json 2.1GiB
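
To run the optimizer over every case under data, the command line can be built from Python. This is a sketch under assumptions: the helper names, the `<case>-optimized.json` output naming, and the expectation that dlmo returns a non-zero exit status on failure are illustrative and not confirmed by the DLMO source.

```python
import subprocess
from pathlib import Path


def dlmo_commands(data_dir, out_dir, limit, binary="./dlmo"):
    """Build one `dlmo <input> <output> <limit>` argument list per case directory.

    Hypothetical helper: the output file naming is an assumption for illustration.
    """
    commands = []
    for pattern in sorted(Path(data_dir).glob("*/pattern.json")):
        output = Path(out_dir) / f"{pattern.parent.name}-optimized.json"
        commands.append([binary, str(pattern), str(output), limit])
    return commands


def run_all(commands):
    """Run each command in turn; raises CalledProcessError on a non-zero exit status."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

From the build directory, `run_all(dlmo_commands("../data", ".", "2.1GiB"))` would then invoke dlmo once per case with the same memory limit.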

c. The output should look like this (you may have to read the source code and adjust the parameters):

Running case ../data/resnet152-32/pattern.json (2356 operators) with optimizer (limit 2.100000 GiB) ...
 > Start back-tracing search from source (peak memory: 8.019928 GiB, total time: 229.738614 ms)
 > Progress (300): 2.660553 GiB, 256.848531 ms
 > Progress (600): 2.094578 GiB, 276.179152 ms
 > Progress (900): 2.094578 GiB, 276.605152 ms
 > Progress (1200): 2.082615 GiB, 276.739483 ms
 > Reach search limit, stop searching
 > Result:
   > Schedules searched: 1500
   > Time used: 26677.157000 ms
   > Best: {peak memory: 2.094578 GiB, total time: 274.821156 ms}
   > Satisfy memory: true
 > Writing result into path optimized.json ... OK!
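
To compare runs, the numbers in this log can be pulled out with a few regular expressions. The sketch below assumes the log keeps exactly the line formats shown above; `summarize` and the returned field names are hypothetical.

```python
import re

# Patterns for the three line shapes that appear in the sample log above.
START_RE = re.compile(r"peak memory: ([\d.]+) GiB, total time: ([\d.]+) ms\)")
PROGRESS_RE = re.compile(r">\s*Progress \((\d+)\): ([\d.]+) GiB, ([\d.]+) ms")
BEST_RE = re.compile(r"Best: \{peak memory: ([\d.]+) GiB, total time: ([\d.]+) ms\}")


def summarize(log_text):
    """Extract progress points and the memory/time trade-off from a dlmo log.

    `summarize` is a hypothetical helper, not part of DLMO.
    """
    start = START_RE.search(log_text)
    best = BEST_RE.search(log_text)
    if start is None or best is None:
        raise ValueError("log does not contain both a start line and a Best line")

    start_mem, start_time = map(float, start.groups())
    best_mem, best_time = map(float, best.groups())
    progress = [
        (int(count), float(mem), float(time))
        for count, mem, time in PROGRESS_RE.findall(log_text)
    ]
    return {
        "progress": progress,                           # (schedules, GiB, ms) tuples
        "memory_ratio": best_mem / start_mem,           # best peak / original peak
        "time_overhead": best_time / start_time - 1.0,  # fractional slowdown
    }
```

For the sample output above, the best schedule keeps peak memory at roughly 26% of the original 8.019928 GiB while total time grows by roughly 20%.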

d. The newly generated IR JSON is written to the path given as <output>.

dlmo's People

Contributors

lyriczhao
