I added $25 at Christmas and am nowhere near empty.

What's the response time like?
It usually takes 30-60 seconds for messages to generate with my specs. If it's faster than what I'm getting, then I'll definitely start using it.

It shouldn't take more than a few seconds with your specs. Which model are you using? Is your GPU AMD or Nvidia? Are you using the new version 1.6.6b with optimized memory usage?

It's an Intel Arc A770, so I suppose that's why.

I imagine Gemma-4-Sparse should be very fast on any Vulkan GPU. If you're using Qwen 3.5 with a version before 1.6.6b, it might be slow because of that.

I can show you a video/photo of the logs if you'd like. Perhaps one of my configs is wrong? I'm not too knowledgeable in this field.

What would you recommend I do/change to be able to use my own hardware instead of OpenRouter?

Which model are you running? Did you update to 1.6.6b yet?

I am. I got it the moment it was published.

After some research, it seems that the A770 is simply not well supported by llama.cpp. You could try running Gemma-4-Sparse to see if a sparse model is faster, if you aren't already doing that.