What's the response time like? With my specs, messages usually take 30-60 seconds to generate. If it's faster than what I'm getting now, I'll definitely start using it.
It shouldn't take more than a few seconds with your specs. Which model are you using? Is your GPU AMD or Nvidia? Are you using the new version 1.6.6b with optimized memory usage?
I imagine Gemma-4-Sparse should be very fast on any Vulkan GPU. If you're using Qwen 3.5 with a version before 1.6.6b, it might be slow because older versions don't have the memory optimization.