The Q4_K_M model should be faster but lower quality, so these results are very surprising! There must be something wrong with it.
I'm glad the model is proving good enough for your use case; that's excellent news. Quality offline translation for the privacy-conscious is great :)
For this specific model, you could try the F32 (1.42GB) and F16 (711MB) versions. I suspect both will be too slow for you, but since the Q8_M model seems broken, I'd try F16 first. Maybe even give Q6_K a try.
There are other models out there, but I haven't had time to try any other EN-JP ones, so I'm not sure whether any are better.
Thank you for testing it out; the feedback is much appreciated!