So what exactly do the memory and token settings in the endpoint menu do? I've noticed that with meta-llama and a high memory setting I get pretty good results, but after a while the AI locks up with an error saying it can't connect, and I have to let it 'cool down' for a while before I can come back and keep going.