Comments (8)

SubjectNoi avatar SubjectNoi commented on May 25, 2024

export HL_NUM_THREADS=x may not satisfy my requirement, since I want to run my kernels in parallel (so, as I understand it, a global environment variable may not be enough to control the core affinity of each kernel).

from tiramisu.

rbaghdadi avatar rbaghdadi commented on May 25, 2024

Hi,

Tiramisu uses Halide as a backend so any method that Halide uses to solve this problem should also apply to Tiramisu (given that parallelization is handled by the Halide runtime).

When you called halide_set_num_threads(4), you called it in the code generator. You should actually call it in the wrapper code that calls the Tiramisu-generated code, because this call affects the Halide runtime. I have never used it, so I am not sure whether it will work.
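
As a rough illustration of calling it from the wrapper side, here is a minimal sketch. It assumes the Halide runtime function halide_set_num_threads(int), which returns the previous thread count. The names my_kernel and run_with_threads are hypothetical placeholders, not part of Tiramisu or Halide, and the fallback definition exists only so the sketch compiles on a machine without Halide installed.

```cpp
#if __has_include(<HalideRuntime.h>)
#include <HalideRuntime.h>  // real declaration of halide_set_num_threads
#else
// Stand-in ONLY so this sketch builds without Halide installed.
// It mimics the documented behavior: store n, return the previous value.
static int stub_num_threads = 0;
extern "C" int halide_set_num_threads(int n) {
    int old = stub_num_threads;
    stub_num_threads = n;
    return old;
}
#endif

// Hypothetical placeholder for the function generated by Tiramisu.
// In real wrapper code this would be declared in the generated header
// and would take halide_buffer_t* arguments.
static int my_kernel() { return 0; }

// Wrapper-side helper (hypothetical name): configure the runtime's
// thread pool first, then invoke the generated kernel.
int run_with_threads(int num_threads) {
    halide_set_num_threads(num_threads);
    return my_kernel();
}
```

The point is only the placement: the call sits in the program that invokes the generated function, not in the program that generates it.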

The other method is to split the loop that you want to parallelize. Let's say you want to parallelize over X cores: you can split the outermost loop with a factor of N/X, where N is the size of the outermost loop. The new outermost loop then has X iterations, so if you parallelize it and you have X cores, each thread takes one iteration.
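
To make the split concrete, here is a small sketch in plain C++ that does not use Tiramisu or Halide. It splits a loop of N iterations by a factor of about N/X and runs each of the X outer iterations on its own std::thread. The function name scale_split and the doubling kernel body are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Doubles every element of data, using num_chunks outer iterations
// (X in the text), each executed by one thread. Assumes num_chunks > 0.
void scale_split(std::vector<int>& data, std::size_t num_chunks) {
    const std::size_t n = data.size();
    // Split factor N/X, rounded up so that N need not be a multiple of X.
    const std::size_t factor = (n + num_chunks - 1) / num_chunks;

    std::vector<std::thread> workers;
    for (std::size_t outer = 0; outer < num_chunks; ++outer) {
        // Each outer iteration owns the inner range [begin, end).
        workers.emplace_back([&data, outer, factor, n] {
            const std::size_t begin = outer * factor;
            const std::size_t end = std::min(begin + factor, n);
            for (std::size_t i = begin; i < end; ++i) {
                data[i] *= 2;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
}
```

Calling scale_split(v, 4) on a vector of 8 elements gives each of the four threads two consecutive elements. In Tiramisu the same effect would come from splitting the outermost loop and marking the new outer loop as parallel, with the Halide runtime managing the threads.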

SubjectNoi avatar SubjectNoi commented on May 25, 2024

I've tried halide_set_num_threads(4) in the wrapper code in a single-threaded scenario, and it worked, thanks. Is it also possible to set the number of cores with this API inside OpenMP parallel sections, or from multiple std::thread instances?

rbaghdadi avatar rbaghdadi commented on May 25, 2024

This call controls the Halide runtime which plays the same role as OpenMP.

Basically the Halide runtime is responsible for parallelizing the loops and this call will control how it is doing that.

I don't know how it will interact with OpenMP or the native thread library used in C++.

If this is important to you, the best thing would be to look at the Halide runtime code and see exactly what it does, or to ask the Halide community. They might be better able to help with this.

SubjectNoi avatar SubjectNoi commented on May 25, 2024

OK, thanks a lot.

SubjectNoi avatar SubjectNoi commented on May 25, 2024

Sorry for reopening this issue. Is there an API to dump the original C code for the generated object file? Or should I check the Halide API, since, as I understand it, Tiramisu uses Halide as a backend?

rbaghdadi avatar rbaghdadi commented on May 25, 2024

Yes, we generate Halide IR and use Halide to lower this IR. Halide has a C code generator. You need to look at the function tiramisu::codegen() and find the call that asks Halide to generate an object file. You then need to slightly modify the options of that call to make it generate C (this is an option of the Halide code generation function). To figure out how to generate C in Halide, you need to look at the Halide documentation.

SubjectNoi avatar SubjectNoi commented on May 25, 2024

Thank you very much for the quick reply. In the end, I managed to use MPI to launch multiple Tiramisu kernels, each using the number of CPU cores I wanted. My assumption is that multiple threads (OpenMP, std::thread) share one JIT execution engine, so the two kernels still run serially even though I launch them simultaneously from parallel sections. With multiple processes (MPI), however, each process invokes an independent execution engine, so the kernels can actually be launched and executed simultaneously.
