Try increasing the memory limit; the default of 2000 is too low. You can also point it at your own AI or at OpenRouter to use a model with much higher memory.
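Very roughly, and I'm not sure this is exactly how the app wires it up internally, but this is how those two settings usually map onto an OpenRouter-style (OpenAI-compatible) request. The model slug, the trimming logic, and the numbers below are just placeholders for illustration:

```python
# Rough sketch (not the app's actual code): how "memory" and "tokens"
# typically map onto an OpenRouter-style chat completion request.
import os
import time
import requests

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
API_KEY = os.environ["OPENROUTER_API_KEY"]

MEMORY_LIMIT = 8000   # "memory": roughly how much past context gets sent each turn
MAX_TOKENS = 512      # "tokens": cap on how long each generated reply can be

def generate(history: list[dict]) -> str:
    # Trim the oldest messages so the prompt stays under the memory budget
    # (a crude character-based trim, just to show the idea).
    trimmed = history[:]
    while sum(len(m["content"]) for m in trimmed) > MEMORY_LIMIT and len(trimmed) > 1:
        trimmed.pop(0)

    resp = requests.post(
        OPENROUTER_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "meta-llama/llama-3-70b-instruct",  # example slug only
            "messages": trimmed,
            "max_tokens": MAX_TOKENS,
        },
        timeout=120,
    )
    if resp.status_code == 429:
        # Rate limited by the provider: back off and retry. This is likely
        # the "cool down" behaviour described below.
        time.sleep(30)
        return generate(trimmed)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```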
So what exactly do the memory and token settings in the endpoint menu do? I've noticed that with meta-llama and a high memory setting I get pretty good results, but after a while the AI locks up with an error saying it can't connect, and I have to let it 'cool down' for a while before I can come back and keep going.