OpenCL is not well suited to DSP applications because task queuing introduces unpredictable latency.
The performance of the library could be improved considerably by dropping OpenCL and using remoteproc directly instead.
Currently, every DSP operation is executed once and returns after each execution. This causes delays and audio clipping when parameters are changed while the DSP is running. The problem could be solved by using OpenCL workgroup queuing of DSP tasks: create larger task workgroups and use a flag to signal when buffers are being filled or read, so that the DSP operates continuously without returning.
The biquad filter's delay coefficients have to be calculated before the first task execution to avoid clipping when parameters are changed in real time. More information is available here.
Currently, the multichannel crossover demo app only works with a generated sine wave. When real-time audio input via JACK is used, the output of the biquad filter is always the same.
The code of the demo app can be found here.