Comments (3)
Hi @vesuppi! Good question; this isn't really documented all that well yet! For Python applications, we have the PythonKernel class in kernel_tuner.kernelbuilder. This example shows the simplest way to use it:
https://github.com/KernelTuner/kernel_tuner/blob/master/examples/cuda/python_kernel.py
The idea is that you can either directly specify which configuration you want with the params= option of PythonKernel; for example, you could use get_best_config from kernel_tuner.util and pass its result as params.
A probably better way is to let Kernel Tuner figure out which configuration to use based on the tuning results of tune_kernel that have been stored to a file. In that case, use the results_file= option of PythonKernel and point it to a results file written with the store_results function from kernel_tuner.integration.
This may seem like an additional step, but this enables you to only tune once, store the results, and then run the application many times reusing the same tuning results. The selection for which kernel configuration to compile is made based on the GPU available at run time and the specified problem size.
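A minimal sketch of this tune-once, run-many workflow, assuming the store_results and PythonKernel signatures from the linked example (they may need adjusting for your kernel_tuner version). Tuning and running require a CUDA GPU with kernel_tuner installed, so those steps are wrapped in functions to call on the target machine:

```python
import numpy as np

# CUDA kernel from the vector_add example, with a tunable thread block size.
kernel_string = """
__global__ void vector_add(float *c, float *a, float *b, int n) {
    int i = blockIdx.x * block_size_x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}
"""

size = 1000000
tune_params = {"block_size_x": [64, 128, 256, 512]}


def make_args():
    # Fresh input/output buffers for the kernel.
    a = np.random.randn(size).astype(np.float32)
    b = np.random.randn(size).astype(np.float32)
    c = np.zeros_like(a)
    return [c, a, b, np.int32(size)]


def tune_once(results_file="vector_add_results.json"):
    # Step 1: tune on the target GPU and persist the results.
    # store_results signature assumed from kernel_tuner.integration.
    from kernel_tuner import tune_kernel
    from kernel_tuner.integration import store_results

    results, env = tune_kernel("vector_add", kernel_string, size,
                               make_args(), tune_params)
    store_results(results_file, "vector_add", kernel_string, tune_params,
                  size, results, env)


def run_many(results_file="vector_add_results.json"):
    # Step 2: in the application, let PythonKernel select the configuration
    # for the GPU and problem size at hand from the stored results.
    from kernel_tuner.kernelbuilder import PythonKernel

    c, a, b, n = make_args()
    vector_add = PythonKernel("vector_add", kernel_string, size,
                              [c, a, b, n], results_file=results_file)
    return vector_add(c, a, b, n)
```

The file name and the exact keyword arguments are assumptions; the point is that tune_once runs only when the hardware changes, while run_many is what the application calls every time.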
I see, thank you very much for the detailed explanation! I was able to get the best config using:
import torch
from kernel_tuner import tune_kernel, util

results, env = tune_kernel("vector_add", kernel_string, N, (c, a, b, torch.tensor(N)), tune_params)
best_config = util.get_best_config(results, 'time')
Haven't tried PythonKernel yet, but will do! Another side question: if we want to tune the size of the thread block, do the parameters have to be named "block_size_x" and "block_size_y"? Thanks!
You can indeed use other names for the thread block dimensions. You can specify them with the block_size_names= option of tune_kernel, an optional argument that takes a list of strings naming the x, y, and z thread block dimensions.
This test illustrates how to use this option:
https://github.com/KernelTuner/kernel_tuner/blob/master/test/test_runners.py#L65
The example kernel in this test uses 'block_dim_x' instead of 'block_size_x', but you can change it to anything you like.
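Along the lines of that test, here is a hedged sketch of renaming the x dimension to 'block_dim_x' (the kernel and parameter values are illustrative; running it needs a CUDA GPU with kernel_tuner installed, so the tuning call is wrapped in a function):

```python
import numpy as np

# The same vector_add kernel, but written with a custom thread block
# dimension name ("block_dim_x") instead of the default "block_size_x".
kernel_string = """
__global__ void vector_add(float *c, float *a, float *b, int n) {
    int i = blockIdx.x * block_dim_x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}
"""

size = 1000000
tune_params = {"block_dim_x": [32, 64, 128, 256]}


def tune():
    # Requires a CUDA GPU with kernel_tuner installed.
    from kernel_tuner import tune_kernel

    a = np.random.randn(size).astype(np.float32)
    b = np.random.randn(size).astype(np.float32)
    c = np.zeros_like(a)
    args = [c, a, b, np.int32(size)]

    # block_size_names tells Kernel Tuner which tunable parameter names
    # denote the x (and optionally y and z) thread block dimensions.
    return tune_kernel("vector_add", kernel_string, size, args, tune_params,
                       block_size_names=["block_dim_x"])
```

Only the names you actually use need to be listed; a one-element list renames just the x dimension.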