rocmlir's Introduction

MLIR-based convolution and GEMM kernel generator for ROCm

This is the repository for an MLIR-based convolution and GEMM kernel generator targeting AMD hardware. This generator is mainly used from MIGraphX, but it can also be used on a standalone basis. (The ability to use this code via torch-mlir is being investigated as well.)

Building (and testing)

To build the system

mkdir build
cd build
cmake -G Ninja .. -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
ninja check-rocmlir

Note that we require building against a relatively recent clang. The above commands specify the ROCm clang release in order to match our standard development practice.

To build the tests without running them, use the check-rocmlir-build-only target instead.

To build the static library that is used by MIGraphX

mkdir build
cd build
cmake -G Ninja .. -DBUILD_FAT_LIBROCKCOMPILER=On -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
ninja

and to install it so MIGraphX can find it

cmake --install . --prefix [your/MIGraphX/deps/folder/path]

Standalone usage

For usage examples, see mlir/test/rocmlir-driver, especially the file sanity.mlir and the contents of the e2e_for_pr directory.

This project also includes code that translates from TOSA to kernels; see mlir/test/fusion for examples of how to invoke it.

In general (with all invocations given from the build directory)

  • ./bin/rocmlir-gen generates high-level convolution operations and host code. Many of the options control data layout, size, etc, but some other useful flags are:
    • -mfma=on (which enables mfma usage) (or -wmma=on for gfx11 targets)
    • -mfma=off (which disables mfma usage) (or -wmma=off for gfx11 targets)
    • -ph (which causes host code to be generated)
    • -pv (which makes the host code validate the results against a reference)
    • -pv_with_gpu (which uses a GPU validator instead)
    • -pr (which prints kernel results)
  • ./bin/rocmlir-driver is a wrapper around the kernel generation pipeline. Use -c (or --kernel-pipeline=full --host-pipeline=runner) to run the default pipeline.

The simplest way to run the result of this pipeline is to pass it to the rocm-run script at mlir/utils/widgets/rocm-run, which calls mlir-cpu-runner with the appropriate flags and infers the library paths correctly.

Alternatively, the result of the above pipeline can be passed directly to ./external/llvm-project/llvm/bin/mlir-cpu-runner.

mlir-cpu-runner needs to link the generated host code against libraries that map from MLIR operations to the HIP runtime. The required command-line arguments (if running from build/) are

./external/llvm-project/llvm/bin/mlir-cpu-runner --shared-libs=./external/llvm-project/llvm/lib/libmlir_rocm_runtime.so,./lib/libconv-validation-wrappers.so,./external/llvm-project/llvm/lib/libmlir_runner_utils.so --entry-point-result=void
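Putting the pieces above together, the following is a minimal sketch of a generate, compile, and run chain, assuming the current directory is build/ and that the tools are connected with pipes as in the tests under mlir/test/rocmlir-driver. The variable names (GEN_FLAGS, LLVM, SHARED_LIBS, PIPELINE) are illustrative only, and GEN_FLAGS leaves out the layout and size options a real case would need. The script prints the command line rather than running it, so it can be checked first.

```shell
# Sketch only: assemble the rocmlir-gen -> rocmlir-driver -> mlir-cpu-runner chain.
# Assumes the current directory is build/. Variable names are illustrative.

# Host code plus validation; append the layout/size options your case needs.
GEN_FLAGS="-ph -pv"

# Location of the bundled LLVM build, relative to build/.
LLVM=./external/llvm-project/llvm

# Libraries that map MLIR operations to the HIP runtime (comma-separated).
SHARED_LIBS="${LLVM}/lib/libmlir_rocm_runtime.so,./lib/libconv-validation-wrappers.so,${LLVM}/lib/libmlir_runner_utils.so"

PIPELINE="./bin/rocmlir-gen ${GEN_FLAGS} | ./bin/rocmlir-driver -c | ${LLVM}/bin/mlir-cpu-runner --shared-libs=${SHARED_LIBS} --entry-point-result=void"

# Print the command for inspection; to execute it, use: eval "${PIPELINE}"
echo "${PIPELINE}"
```

Keeping the library list in one variable makes it easier to adjust if the build tree is laid out differently.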

Adding --debug-only=serialize-to-blob to the rocmlir-driver invocation will cause the GCN assembly code for the kernels being executed to be dumped to standard error.
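Because the assembly goes to standard error, it can be separated from the driver's normal output with ordinary shell redirection. The sketch below assumes the current directory is build/; the file names kernel-asm.log and host.mlir are hypothetical, and the rocmlir-gen options are a placeholder that omits layout and size settings. As before, the command is printed rather than executed.

```shell
# Sketch only: send the dumped kernel assembly (stderr) to its own file.
# kernel-asm.log and host.mlir are hypothetical output names.
ASM_LOG=kernel-asm.log
HOST_OUT=host.mlir

# Placeholder generator flags; add the layout/size options your case needs.
GEN_FLAGS="-ph"

DUMP_CMD="./bin/rocmlir-gen ${GEN_FLAGS} | ./bin/rocmlir-driver -c --debug-only=serialize-to-blob 2> ${ASM_LOG} > ${HOST_OUT}"

# Print the command for inspection; to execute it, use: eval "${DUMP_CMD}"
echo "${DUMP_CMD}"
```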

Disabling MFMA/WMMA in tests

By default, we infer the use of GPU-specific acceleration instructions, like MFMA or WMMA, based on the features of the currently available GPU.

To disable this, add -DROCMLIR_GEN_FLAGS="-mfma=off -wmma=off" to the cmake invocations given above. Note that this will not affect behavior in production/static library builds, which do not use rocmlir-gen.
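For illustration, here is the test-build configure line from the Building section with that option appended. This is a sketch assuming the same ROCm clang location as above and that it is run from build/; the CLANG_DIR and CONFIGURE variable names are illustrative. The command is printed so the quoting of the flag value can be checked before running it.

```shell
# Sketch only: configure a test build with MFMA/WMMA disabled in rocmlir-gen.
# Assumes the current directory is build/ and ROCm clang lives in CLANG_DIR.
CLANG_DIR=/opt/rocm/llvm/bin

# The flag value contains a space, so it stays inside escaped quotes.
CONFIGURE="cmake -G Ninja .. -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_C_COMPILER=${CLANG_DIR}/clang -DCMAKE_CXX_COMPILER=${CLANG_DIR}/clang++ -DROCMLIR_GEN_FLAGS=\"-mfma=off -wmma=off\""

# Print the command for inspection.
# To execute it, use: eval "${CONFIGURE}" && ninja check-rocmlir
echo "${CONFIGURE}"
```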
