This is a screenshot (Stanford voxel bunny) of the raymarching code (see previous post) running in software for debugging purposes. It renders at 256x224 resolution with no multithreading, and yet I'm getting 30+ frames per second and it looks good.
You may ask, "Why?" Compute shader code is much easier to debug when it is compiled from C++ and run single-threaded on the CPU.
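A minimal sketch of the idea, assuming the shader body is written as a plain C++ function that gets called once per pixel. The names (`shadePixel`, `dispatchOnCpu`) are hypothetical, and the kernel here is a placeholder gradient rather than the actual raymarcher:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Resolution taken from the post; everything else is illustrative.
constexpr int kWidth = 256;
constexpr int kHeight = 224;

// Stand-in for the compute shader body: one call per pixel, like one
// shader invocation per thread ID. A real kernel would march a ray
// through the voxel grid here; this placeholder just packs the pixel
// coordinates into an ARGB color.
std::uint32_t shadePixel(int x, int y) {
    const std::uint32_t r = static_cast<std::uint32_t>(x) & 0xFFu;
    const std::uint32_t g = static_cast<std::uint32_t>(y) & 0xFFu;
    return 0xFF000000u | (g << 8) | r;
}

// Emulates a compute dispatch on a single CPU thread. Pixels are
// processed one at a time in a fixed order, so a breakpoint inside
// shadePixel always hits the same (x, y) on every run.
void dispatchOnCpu(std::vector<std::uint32_t>& framebuffer) {
    assert(framebuffer.size() ==
           static_cast<std::size_t>(kWidth) * kHeight);
    for (int y = 0; y < kHeight; ++y) {
        for (int x = 0; x < kWidth; ++x) {
            framebuffer[static_cast<std::size_t>(y) * kWidth + x] =
                shadePixel(x, y);
        }
    }
}
```

With this layout, stepping through a single misbehaving pixel is a matter of calling `shadePixel` directly with its coordinates, or putting a conditional breakpoint inside the loop.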
I'm now porting the code back to a compute shader in my Voxel raymarching engine.