Thanks for this guide! I ran a bunch of tests and can give some results, maybe it'll help others here too.
I experimented with the test text input (nice feature, I hadn't noticed it before!) and threw a bunch of somewhat tricky Japanese proverbs at both of the models you recommended. I got the most accurate translations from the Q4_K_M model, which took 3-4 seconds per generation; the Q8_0 model was about a second faster but noticeably worse. I then attached both models to a Japanese game I was running and got results similar to the tests, so Q4 just seems superior, and it isn't at all intensive even on my average-power computer.
Dumping screengrabs from the game into Google Translate is still a little more accurate than Q4, but Google Translate is also way more of a hassle and notably slower, with alt-tabbing required. Q4 is certainly getting the job done and is accurate enough to play a game comfortably, I'd say. (I tested with a Japanese RPG, which gave me a lot of varied NPCs to test with, and I got the gist of everything everyone was saying. I ran their dialogue through Google Translate too and it gave the same information, sometimes a bit more accurately, but either way, it works.)
Notably, and surprisingly to me: one thing Q4 did better than Google Translate (and this is likely a credit to the image recognition you built into GameTranslate) was translating fancy, tilted text. The RPG I was testing with tilts all character names at the top of their oddly shaped dialogue boxes; Google read one NPC as 'Penguin Hote Clerk' while Q4 more correctly read the text as 'A Penguin Hotel Employee'.
If you have any other suggestions, I'd personally be interested in a more intensive, more accurate model than Q4, seeing that Q4 doesn't seem to push my system much at all. (But I'm also happy sticking with Q4 for now.)