
The Toy Compiler

This project is an implementation of a compiler using the Multi-Level Intermediate Representation (MLIR) framework. In the first stage, the project is constructed from the MLIR Toy examples and can be seen as a reconstruction of Toy; hence the name "Toy Compiler". Several improvements were made during the reconstruction:

  • The MLIR/LLVM project is added as a third-party package via find_package in the CMakeLists. This makes Toy Compiler an independent project rather than one built jointly with MLIR.
  • The project directories are redesigned with Triton as a reference. This makes the project structure much clearer and easier to extend.
  • .gitignore and .clang-format are added to turn the tutorial Toy into a real project.
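As a sketch of the first point, a minimal CMakeLists.txt fragment for consuming a prebuilt MLIR might look like the following (illustrative only; the project's actual file may differ):

```cmake
# Locate a prebuilt MLIR instead of building it in-tree with this project.
# CMake finds MLIRConfig.cmake through CMAKE_PREFIX_PATH,
# e.g. $LLVM_HOME_DIR/build/lib/cmake/mlir.
find_package(MLIR REQUIRED CONFIG)

# MLIRConfig.cmake exports include directories and CMake helper modules.
list(APPEND CMAKE_MODULE_PATH "${MLIR_CMAKE_DIR}" "${LLVM_CMAKE_DIR}")
include(AddMLIR)
include(AddLLVM)
include_directories(${MLIR_INCLUDE_DIRS} ${LLVM_INCLUDE_DIRS})
```

Consuming MLIR this way is the standard out-of-tree setup: the compiler can be rebuilt quickly without recompiling LLVM.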

Getting Started with the Toy Compiler

How to build the project

Clone the project.

git clone https://github.com/LeiChenIntel/toy-compiler.git

Move to the Toy Compiler base directory and update the submodules. The project currently depends on GoogleTest.

cd $TOY_COMPILER_HOME_DIR
git submodule update --init

Build the MLIR framework by following MLIR Getting Started.

The branch release/18.x is used. Note that using a different branch might cause errors.

More details can be found in the checklist.

For the Linux platform, build the project by

mkdir $TOY_COMPILER_HOME_DIR/build-x86_64
cd $TOY_COMPILER_HOME_DIR/build-x86_64
cmake -D CMAKE_PREFIX_PATH=$LLVM_HOME_DIR/build/lib/cmake/mlir -D CMAKE_BUILD_TYPE=Release ..
make -j$(nproc)
  • MLIRConfig.cmake must be under the path $LLVM_HOME_DIR/build/lib/cmake/mlir, or CMake generation will fail.
  • If you want to put the generated binaries under the bin and lib folders, run cmake --build . --target install instead of make -j$(nproc).

For the Windows platform, build the project by

mkdir $TOY_COMPILER_HOME_DIR/build-x86_64
cd $TOY_COMPILER_HOME_DIR/build-x86_64
cmake -G "Visual Studio 16 2019" -A x64 -D CMAKE_PREFIX_PATH="$LLVM_HOME_DIR/build/lib/cmake/mlir" -D CMAKE_BUILD_TYPE=Release ..
cmake --build . --target install

Other CMake options:

-DENABLE_TOY_BENCHMARKS: Enable benchmarks that compare the results of the loop-based and AVX implementations. Requires AVX support and a Release build. Default value ON.

-DENABLE_MATMUL_BENCHMARKS: Enable the matrix multiplication benchmark. Requires the OpenBLAS library and is only supported on Ubuntu. Default value OFF.
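For example, a configure line enabling the matmul benchmark (assuming OpenBLAS is already installed) could look like:

```shell
# Run from the build directory; paths follow the Linux build steps above.
cmake -D CMAKE_PREFIX_PATH=$LLVM_HOME_DIR/build/lib/cmake/mlir \
      -D CMAKE_BUILD_TYPE=Release \
      -D ENABLE_MATMUL_BENCHMARKS=ON ..
```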

How to run the target

toy-opt

toy-opt is designed to test a single pass with lit. An IR file name and a pass name are required to run toy-opt.

Here is an example of running the ConvertToyToMid pass with toy-opt,

./toy-opt toy_to_mid.mlir -convert-toy-to-mid

More usage information can be found by running,

./toy-opt -h

toy-translate

toy-translate is designed to translate one IR into another. Basically, toy-translate is a pass pipeline and may include many passes.

Options:

-emit: Choose the IR to be dumped, i.e., ast, mlir, mlir-mid.

-opt: Enable optimization passes such as canonicalization and common sub-expression elimination.

-lower-pat: Choose loop-based or vectorized lowering, i.e., loop, vector.

Here is an example of dumping MidIR with toy-translate,

./toy-translate add_mlir.toy -emit=mlir-mid -opt

LIT test

LIT is a test framework in MLIR that helps track the differences in each IR after a single pass or multiple passes.

cd $TOY_COMPILER_HOME_DIR/build-x86_64
cmake --build . --config Release --target check-toy-lit

Then the expected output looks like

[4/5] Running the toy lit tests

Testing Time: 0.04s
  Passed: 2
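A lit test here is just a .mlir file whose RUN line invokes a tool and whose CHECK lines are verified by FileCheck. A minimal illustrative sketch (the function and checks are hypothetical, not an actual test from the suite):

```mlir
// RUN: toy-opt %s -convert-toy-to-mid | FileCheck %s

// CHECK-LABEL: @test_constant
toy.func @test_constant() {
  // After the ConvertToyToMid pass, no toy.constant should remain.
  // CHECK-NOT: toy.constant
  %0 = toy.constant {value = dense<1.000000e+00> : tensor<f64>} : tensor<f64>
  toy.print(%0) : tensor<f64>
  toy.return
}
```

The check-toy-lit target discovers such files and reports each RUN line as a pass or failure.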

Unit test

Unit tests based on the GoogleTest framework are used in this project. unit-test can be run directly, and the output looks like

[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from MLIR
[ RUN      ] MLIR.BuiltinTypes
[       OK ] MLIR.BuiltinTypes (0 ms)
[----------] 1 test from MLIR (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (0 ms total)
[  PASSED  ] 1 test.
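A unit test in this setup is an ordinary C++ source file linked against GoogleTest. A minimal sketch that would produce output like the above (the test body is illustrative, not the project's actual MLIR.BuiltinTypes test):

```cpp
#include <gtest/gtest.h>

// Illustrative placeholder body; the real test would exercise MLIR builtin
// types rather than plain integer arithmetic.
TEST(MLIR, BuiltinTypes) {
  EXPECT_EQ(2 + 2, 4);
}

// Needed only if the binary is not linked against gtest_main.
int main(int argc, char **argv) {
  ::testing::InitGoogleTest(&argc, argv);
  return RUN_ALL_TESTS();
}
```

Compiling requires linking against the GoogleTest submodule (e.g. -lgtest), which the project's CMake setup handles.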

Examples

More details can be found on the examples page.

Benchmarks

More details can be found on the benchmarks page.


Open Issues

Code Refactor

There is a lot of redundant code now. A refactor is needed to make the code clean.

Add Element-wise Multiplication

Add the operation with the two-character symbol '.*'. This follows the format used in MATLAB. The single '*' is kept as the matrix multiplication operator.
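Under the proposed syntax, a Toy program would distinguish the two operators as follows (hypothetical, since the '.*' operator is not implemented yet):

```
def main() {
  var a<2, 2> = [1, 2, 3, 4];
  var b<2, 2> = [5, 6, 7, 8];
  var c = a .* b;  # element-wise product: [5, 12, 21, 32]
  var d = a * b;   # matrix multiplication
  print(c);
}
```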

Fix warnings

[Ubuntu] warning: missing field 'stride' initializer

Datatype Conversion Solution

After the multiple-precision feature is supported, a solution is needed to convert data between the various precisions, i.e., FP16->FP32, FP32->FP64, etc.

Constant + Reshape combine issue

When Reshape has different shapes for its input and output values, e.g.,

toy.func @test_splat() {
  %0 = toy.constant {value = dense<5.500000e+00> : tensor<f64>} : tensor<f64>
  %1 = toy.reshape(%0) : tensor<f64> -> tensor<2x2xf64>
  toy.print(%1) : tensor<2x2xf64>
  toy.return
}

the canonicalize pass cannot be applied in toy-opt. Running

./toy-opt test.mlir --canonicalize

triggers the error:

Assertion failed: newType.getNumElements() == curType.getNumElements() && "expected the same number of elements"

A solution is needed to handle these kinds of cases.

Create a Place to Reserve Inputs Information

In the Toy tutorial, a function is created as,

def main() {
  var a<2, 3> = [1, 2, 3, 4, 5, 6];
  var b<2, 3> = [1, 2, 3, 4, 5, 6];
  var c = a + b;
  print(c);
}

but this process should be optimized away by constant folding.
A more practical construction looks like,

def main(a) {
  var b<2, 3> = [1, 2, 3, 4, 5, 6];
  var c = a + b;
  print(c);
}

and the dumped IR is,

module {
  toy.func @main(%arg0: tensor<*xf64>) {
    %0 = toy.constant {value = dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64>} : tensor<6xf64>
    %1 = toy.reshape(%0) : tensor<6xf64> -> tensor<2x3xf64>
    %2 = toy.add(%arg0, %1) : tensor<*xf64>, tensor<2x3xf64> -> tensor<f64>
    toy.print(%2) : tensor<f64>
    toy.return
  }
}

At this stage, we need to create a place to reserve the input information of the main function for static compiling. (Handling unknown shapes and types is the separate story of dynamic compiling.)

Test Benchmark

A benchmark is needed to validate the quality of each optimization pass.
