Glad to hear it's better! I really need to get this update out to users, but there's one bug I know about that I need to fix before I can launch.
Linux AMD ROCm support was added just in case I ever had any users on that setup; I wanted to make sure it was awesome for them! Glad to hear that day came faster than I expected.
Will definitely add some more models; I'm pretty behind. Any specific suggestions? Would love to hear what the best stuff is nowadays.
I'll learn more about IQ quants and KV cache offloading. Is that suggestion for the local LLMs or the cloud-hosted ones?
Anyways, happy it's better. If you want to chat more, I'm hammer_ai on Discord - it'd be fun to talk about which finetunes to add and any other suggestions you have.