Heya,
I've done further testing, and I believe most of the problems I've had getting sensible output from LLM models come down to them being oddly specific about how you need to prompt for translations (and, at the same time, not specific at all...)
For example:
Using this model for Japanese to English: https://huggingface.co/lmg-anon/vntl-gemma2-2b-gguf
This is their example prompt (each tag on its own line, as on the model card):
<<METADATA>>
[character] Name: Uryuu Shingo (瓜生 新吾) | Gender: Male | Aliases: Onii-chan (お兄ちゃん)
[character] Name: Uryuu Sakuno (瓜生 桜乃) | Gender: Female
<<TRANSLATE>>
<<JAPANESE>>
[桜乃]: 『……ごめん』
<<ENGLISH>>
[Sakuno]: 『... Sorry.』<eos>
<<JAPANESE>>
[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は可愛いから、いろいろ心配しちゃってたんだぞ俺」
<<ENGLISH>>
I tried to prompt it like this before:
<<TRANSLATE>><<JAPANESE>>%text%<<ENGLISH>>
But it gave Japanese responses about 50% of the time and would also sometimes burn through a huge number of tokens.
Changing it to:
<<TRANSLATE>>\n<<JAPANESE>>\n%text%\n<<ENGLISH>>\n
And voila, it works pretty much as intended, although it sometimes skips translating parts of longer paragraphs. (The difference the \n makes is much easier to see when each tag sits on its own line, as in the model card's example.)
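To make the difference concrete, here is a minimal sketch of how the working newline-delimited template expands once the source line is substituted in. I'm assuming `%text%` is the app's placeholder token that gets string-replaced with the line to translate; the tag names come from the vntl model card.

```python
# The template that worked: every tag sits on its own line, and the
# source text occupies its own line between <<JAPANESE>> and <<ENGLISH>>.
TEMPLATE = "<<TRANSLATE>>\n<<JAPANESE>>\n%text%\n<<ENGLISH>>\n"

def build_prompt(text: str) -> str:
    """Substitute the source line into the template (hypothetical helper)."""
    return TEMPLATE.replace("%text%", text)

prompt = build_prompt("[桜乃]: 『……ごめん』")
print(prompt)
```

With the original no-newline variant, the model sees the tags and the text fused into one line, which plausibly falls outside what it saw during fine-tuning; with the `\n`s, each tag lands on its own line exactly as in the training format.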
Even just putting:
Translate the following sentence from Japanese to English:\n\n%text%\n\nTranslation:
as the prompt worked much better than the first option.
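The plain-instruction variant follows the same pattern. Again a sketch, assuming `%text%` is the app's substitution placeholder:

```python
# Plain-instruction prompt: no special tags, just an instruction,
# the source text, and a "Translation:" cue for the model to complete.
PLAIN_TEMPLATE = (
    "Translate the following sentence from Japanese to English:"
    "\n\n%text%\n\nTranslation:"
)

def build_plain_prompt(text: str) -> str:
    """Substitute the source line into the instruction template (hypothetical helper)."""
    return PLAIN_TEMPLATE.replace("%text%", text)

print(build_plain_prompt("『……ごめん』"))
```

The blank lines around the text serve the same role as the `\n`s in the tagged template: they keep the instruction, the source, and the completion cue visually separated the way the model expects.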
As for the other API issues - I'm working on them. I might have fixed them, but I need to do a bit more testing tomorrow before releasing a beta version. :)