Comments (8)
export HL_NUM_THREADS=x may not satisfy my requirement, since I want to execute my kernel in parallel. As I understand it, a global environment variable may not be enough to control the core affinity of the kernel.
from tiramisu.
Hi,
Tiramisu uses Halide as a backend so any method that Halide uses to solve this problem should also apply to Tiramisu (given that parallelization is handled by the Halide runtime).
When you used halide_set_num_threads(4), you called it in the code generator. You should instead call it in the wrapper code that calls the Tiramisu-generated code, because this call affects the Halide runtime. I have never used it myself, so I am not sure whether it will work.
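As a rough sketch of what calling it from the wrapper could look like: the stand-in definitions below exist only so the example compiles on its own. In a real build they would be replaced by including HalideRuntime.h and linking the generated object file. The names generated_kernel and run_kernel_with_threads are hypothetical, and a real Tiramisu-generated function would take halide_buffer_t* arguments.

```cpp
// Stand-in for the Halide runtime call (assumption: used only so this
// sketch is self-contained). The real halide_set_num_threads also
// returns the previous thread-count setting.
static int g_num_threads = 0;  // 0 models "not set yet"
extern "C" int halide_set_num_threads(int n) {
    int previous = g_num_threads;
    g_num_threads = n;
    return previous;
}

// Hypothetical stand-in for the Tiramisu-generated function.
// Like a Halide pipeline, it returns 0 on success.
extern "C" int generated_kernel() {
    return 0;
}

// Wrapper side: configure the runtime first, then call the kernel.
int run_kernel_with_threads(int num_threads) {
    halide_set_num_threads(num_threads);  // affects the runtime, not the code generator
    return generated_kernel();
}
```

Calling run_kernel_with_threads(4) from the wrapper's main would then set the thread count right before the kernel runs.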
The other method is to apply a loop split to the loop that you want to parallelize. Let's say you want to parallelize on X cores: you can split the outermost loop with a factor of N/X, where N is the size of the outermost loop. The new outermost loop will then have X iterations, so if you parallelize it and you have X cores, each thread will take one iteration.
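To make the split concrete, here is a minimal plain C++ illustration of the idea. It is not Tiramisu's scheduling API; the function name split_parallel_sum is hypothetical, and it assumes X divides N evenly.

```cpp
#include <thread>
#include <vector>

// Sums data[0..N) by splitting the loop into X outer iterations of
// N / X inner iterations each, and running every outer iteration on
// its own thread. Assumes X > 0 and that X divides N evenly.
long long split_parallel_sum(const std::vector<int>& data, int X) {
    const int N = static_cast<int>(data.size());
    const int factor = N / X;  // split factor: size of the inner loop
    std::vector<long long> partial(X, 0);
    std::vector<std::thread> workers;

    // New outermost loop: exactly X iterations, one per thread.
    for (int outer = 0; outer < X; ++outer) {
        workers.emplace_back([&data, &partial, factor, outer] {
            // Inner loop: the chunk of the original loop owned by this thread.
            for (int inner = 0; inner < factor; ++inner) {
                partial[outer] += data[outer * factor + inner];
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }

    long long total = 0;
    for (long long p : partial) {
        total += p;
    }
    return total;
}
```

With N = 16 and X = 4, the split factor is 4, so four threads each process four consecutive elements.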
from tiramisu.
I've tried halide_set_num_threads(4) in the wrapper code in a single-threaded scenario, and it worked, thanks. Is it also possible to set the number of cores with this API inside OpenMP parallel sections, or in multiple std::thread instances?
from tiramisu.
This call controls the Halide runtime which plays the same role as OpenMP.
Basically, the Halide runtime is responsible for parallelizing the loops, and this call controls how it does that.
I don't know how it will interact with OpenMP or the native thread library used in C++.
If this is important to you, the best thing would be to look at the Halide runtime code and see exactly what it does, or to ask the Halide community. They might be better able to help with this.
from tiramisu.
OK, thanks a lot.
from tiramisu.
Sorry for reopening this issue. Is there an API to dump the original C code of the generated object file? Or should I check the Halide API, since, as I understand it, Tiramisu uses Halide as a backend?
from tiramisu.
Yes, we generate Halide IR and use Halide to lower this IR. Halide has a C code generator. You need to look at the function tiramisu::codegen() and find the call that asks Halide to generate an object file. You then need to slightly modify the options of that call to make it generate C (this is an option of the Halide code generation function). To figure out how to generate C in Halide, you need to look at the Halide documentation.
from tiramisu.
Thank you very much for your rapid reply. In the end, I managed to use MPI to launch multiple Tiramisu kernels on the CPU cores I wanted. My assumption is that multiple threads (OMP, std::thread) share one JIT execution engine, so the two kernels still run serially even though I launch them simultaneously in parallel sections. Multiple processes (MPI), however, invoke independent execution engines, so I can actually launch and execute them simultaneously.
from tiramisu.
Related Issues (20)
- Will you plan to add autodiff like Halide?
- Does tiramisu support FPGA as backend hardware now?
- Is there any methods in Tiramisu for parallelizing or loop tiling that automatically resolves data dependency?
- The link in the readme to a VirtualBox VM is broken
- What's the difference between Tiramisu and TACO compiler
- unstructured weight sparsity mentioned by the paper
- Build a Python extention
- Can I adjust the CPU core number in Tiramisu compiler?
- [Bug] CPU convolution sample in benchmark runfailed when setting BATCH_SIZE=1
- questions about tiramisu capabilities
- Deep Learning Based Cost Model
- Trouble With Compiling Dependencies
- No module named 'TiramisuCodeGenerator'
- How to implement XOR operator of a expression
- Run distributed test
- Any luck building the autoscheduler tutorial with the latest Halide
- Conversion from Tiramisu DSL to C Language For-loop Code
- failed to build tiramisu