Hi Chris. Yes! I’d love to measure performance on some of my systems. Please let us know if you set up a Discord group specifically for that.
On the other hand, as you said, sampling processes certainly takes up CPU time and other resources, which adds some bloat and costs some performance. I was just too curious to compare the earlier and newer builds.
Thanks for the short explanation of the C2P (Chunky-2-Planar) conversion algorithm and how things work under the hood. It is always interesting to get a better understanding of what happens in the background. In the foreground, it is amazing to see much better performance on my A3660-based A4000 system (‘060 @ 50MHz, no accelerator FastRAM). That system is more constrained by the slower transfers to motherboard RAM, so these optimizations now translate into a better overall experience. It’s great! Thanks again.