Well, it works now, so that's what counts :)
I don't really use LLaMA because I haven't found a model that fits my needs or that my machine can handle. However, I just did a quick test and here's what I found: it's fast at translating single words or short sentences, but it gets a "Request timeout" when trying to translate longer, multi-sentence texts. This isn't a huge problem for me, though. If I really need to, I can run the same model through LM Studio on my machine. When I do that, it handles longer, multi-sentence texts just fine and even seems faster. I've never gotten a "Request timeout" with the smaller models I've tried there.
This is really useful, and if it wasn't fast enough before, it definitely is now. Thanks for that!
Keep up the great work 👍

