I'm currently looking into different pixelflut implementations, and I like yours a lot. So far I'm missing one feature, which makes me lean towards the Python implementation, and I was wondering whether it could be implemented here.
The Python implementation can periodically (I think every second) store the image to disk. This is very useful to me because I create timelapse videos after a pixelflut session.
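To make the request more concrete, here is a rough sketch of what I have in mind, using only the standard library. The `Canvas` struct, the `frame_XXXXXXXX.ppm` naming scheme and the choice of PPM as output format are all assumptions of mine for illustration; your actual framebuffer type and preferred image format will likely differ.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::PathBuf;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical RGB framebuffer (3 bytes per pixel, row-major).
/// The real canvas type in the project may look different.
pub struct Canvas {
    pub width: usize,
    pub height: usize,
    pub rgb: Vec<u8>,
}

/// Encode the canvas as a binary PPM (P6) image into any writer.
pub fn write_ppm<W: Write>(canvas: &Canvas, mut out: W) -> io::Result<()> {
    write!(out, "P6\n{} {}\n255\n", canvas.width, canvas.height)?;
    out.write_all(&canvas.rgb)?;
    out.flush()
}

/// Spawn a background thread that writes a numbered snapshot into `dir`
/// once per `interval`. The thread runs until the process exits.
pub fn spawn_snapshotter(
    canvas: Arc<Mutex<Canvas>>,
    dir: PathBuf,
    interval: Duration,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        let mut frame: u64 = 0;
        loop {
            thread::sleep(interval);
            // Copy the pixels while holding the lock, then release it
            // before doing any disk I/O so clients are not blocked.
            let snapshot = {
                let c = canvas.lock().unwrap();
                Canvas { width: c.width, height: c.height, rgb: c.rgb.clone() }
            };
            let path = dir.join(format!("frame_{:08}.ppm", frame));
            let result = File::create(&path)
                .and_then(|f| write_ppm(&snapshot, BufWriter::new(f)));
            match result {
                Ok(()) => frame += 1,
                Err(e) => eprintln!("snapshot failed: {}", e),
            }
        }
    })
}
```

Calling something like `spawn_snapshotter(canvas, PathBuf::from("frames"), Duration::from_secs(1))` at startup would give one frame per second, and the numbered files can then be fed straight into a video encoder for the timelapse.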
Right now you're spawning a thread for each CPU core (with thread::spawn(...)) and executing a worker on it. Then, for each worker, you create a tokio cpu_pool with 8 (hardcoded) threads. This means that on an 8-"core" system you will spawn 64 pool threads on top of the 8 worker threads, which causes a lot of context switching.
The tokio cpu_pool is actually designed to scale async I/O futures across available CPU cores. Removing the second layer of thread pooling decreases memory consumption by about half and increases throughput (on my laptop, binding to 127.0.0.1) by about 30%.
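To illustrate the thread counts and the flat layout I am suggesting, here is a small std-only sketch. It deliberately does not use tokio, and the function names (`nested_thread_count`, `run_flat_workers`) are made up for this example; it only shows the shape of "one worker thread per core, no inner pool", not how the actual server code would look.

```rust
use std::sync::Arc;
use std::thread;

/// Threads created by the nested layout: one worker thread per core,
/// plus a pool of `pool_size` threads for each of those workers.
pub fn nested_thread_count(cores: usize, pool_size: usize) -> usize {
    cores + cores * pool_size
}

/// Flat layout: spawn exactly one thread per core, run `work` directly
/// on each of them, and collect the results in worker order.
pub fn run_flat_workers<F, T>(cores: usize, work: F) -> Vec<T>
where
    F: Fn(usize) -> T + Send + Sync + 'static,
    T: Send + 'static,
{
    let work = Arc::new(work);
    let handles: Vec<thread::JoinHandle<T>> = (0..cores)
        .map(|id| {
            let work = Arc::clone(&work);
            // Each worker does its job itself instead of handing it
            // to a second pool of threads.
            thread::spawn(move || (*work)(id))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

With 8 cores and a pool size of 8, `nested_thread_count` gives 72 threads in total (64 pool threads plus the 8 workers), while the flat layout stays at 8. In a real build the core count could come from `std::thread::available_parallelism()` rather than being hardcoded.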