
I'm very much looking forward to testing it as soon as I can, probably tomorrow or Thursday. 


Thank you very much for your continuing work on this project!

Didn't actually expect a reply! Glad you're still around :)

I just reread the entire thread and now remember exactly what you were after. What I've added (in v0.5.8-alpha.1) is the ability to use the Manual and/or Automatic modes to have several windows (max 3, for performance reasons) running at the same time. The Automatic mode works just like before: it stays in place, removes the original text, and then displays the translation in-place.

Having read this thread completely, I do understand that what you'd ideally want is a completely separate panel where the translations show up automatically after selecting an area to translate. Unfortunately, that still doesn't exist. However, despite my negative outlook on this type of feature several months ago, I'd say I'm convinced and have changed my mind about it. There are indeed use cases for it, and I would like to add this type of translation window as well. I promise I will have a look at it, but I'm in the middle of prepping for a Steam release, so it may not come within the next few weeks. We'll see.

Until then, I recommend going to the 'General' tab and doing the following:

1. Disable "Show automatic border"

2. Enable "Locked window"

3. Enable "Click through"

These settings should help make the "Automatic" mode less disruptive to the gameplay experience.

Do have a look at the Keybindings section as well; there are several keys there to make things more flexible while in-game.

Well, I'll definitely use the Steam release as an opportunity to give you a bit more money for this system and add to its visibility through sales-figures.

It's been very useful as-is for my needs, particularly after you added the OpenAI-compatible LLM interface; that works really well with an 8B Aya-Expanse setup I have behind another proxy that maintains message-history for better context. That history-keeping could conceivably be integrated here if each individual translation panel could maintain its own state (%message[0]% for the most recent, %message[1]% for the second-most, with enough JSON awareness to delete the encapsulating object if a message is undefined), or perhaps via an ID substitution-variable (%panelId%) so my proxy could multiplex. Though honestly, I think I could get good enough results by just retaining history based on minimum input length.
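To make the JSON-awareness part concrete, here's roughly what I picture that substitution doing (purely a sketch of the idea; none of these names exist in the app):

```
# Sketch: expand %message[n]% placeholders in a message-template list from a
# history buffer, dropping any message object whose slot is empty so the
# request stays valid JSON. All names here are hypothetical.
import json
from collections import deque

history = deque(maxlen=2)  # index 0 = most recent capture

def expand_template(template, history):
    expanded = []
    for msg in template:
        text = msg.get("content", "")
        if text.startswith("%message[") and text.endswith("]%"):
            n = int(text[len("%message["):-2])
            if n >= len(history):
                continue  # undefined slot: delete the encapsulating object
            msg = dict(msg, content=history[n])
        expanded.append(msg)
    return expanded

template = [
    {"role": "user", "content": "%message[1]%"},  # second-most recent
    {"role": "user", "content": "%message[0]%"},  # most recent
    {"role": "user", "content": "Translate the text above."},
]
history.appendleft("<captured line>")
print(json.dumps(expand_template(template, history), indent=2))
```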

I don't necessarily want a separate panel where translations are aggregated; it just seemed like an easy shortcut for implementing multiple capture-points if you couldn't refactor the implementation: specify multiple regions to scan, have everything dump to a terminal window instead of trying to redraw over the UI elements, and let the user figure out which is which. From what you've said, though, it sounds like you've built something better and more sophisticated, so this lazy approach probably isn't needed.

Thank you, I really do appreciate that very much! :)

I'm glad to hear that it has been very useful for you; that's what keeps the lights on and keeps me working.

That would for sure be a cool thing to implement for more advanced uses like yours. So what you're suggesting is to provide each window's ID and/or message ID, either as a custom entry in the JSON body or just in the prompt message itself?

Like:

{
    "prompt": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a highly skilled professional translator.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n### Instruction:\nTranslate %source% to %target%.\n\n### Input:\n%text%\n\n### Response:\n<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n",
    "max_new_tokens": 256,
    "temperature": 0,
    "top_p": 1,
    "do_sample": false,
    "window_id": "%window_id%"
}

or

{
    "prompt": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a highly skilled professional translator.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nWindow: %window_id%\n\n### Instruction:\nTranslate %source% to %target%.\n\n### Input:\n%text%\n\n### Response:\n<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n",
    "max_new_tokens": 256,
    "temperature": 0,
    "top_p": 1,
    "do_sample": false
}

Not sure if the first option is even possible, but it'd be a bit cleaner maybe.

I do agree with the whole idea of a separate panel, though; I think it's something that should exist in the app, especially since the text-removal algorithm isn't always reliable (and uses extra processing power...!).

Sorry for the very slow reply, I've been extremely busy. Thank you for your valuable feedback! :)


No need to worry about how long replies take. Really. (I didn't get around to testing last week due to other things coming up, too, though I *will* do it this week)


As for the ID thing, yeah, it's more like your first example: just an extra value that can be passed along with the request for anyone who has a weird setup like mine. I don't think the LLM should ever see it, and some APIs might break on an unknown field, so it should stay optional. I also don't care if it's stable between sessions: it can be an integer or a UUID or something user-definable; my use-case just needs a way to differentiate sources.


What I'm doing is intercepting every JSON post, adding each one to a list, making a Chat Completion request based on that list, and returning the result.


I have GameTranslate post something simple like this:

```
{
  "text": "<captured data>"
}
```


And then a small Python script builds the real request and sends back the response from a llama.cpp server:

```
{
  "model": "aya-expanse",
  "temperature": 0.65, // other parameters, too, but they're not important to illustrating the idea
  "messages": [
    {
      "role": "system",
      "content": [{
        "type": "text",
        "text": "<my-system-prompt, basically telling it that it's a casual translator that is willing to use slang and assume a conversational tone>"
      }]
    },
    {
      "role": "user",
      "content": [{
        "type": "text",
        "text": "<old captured data, held in memory, each as their own message, to build chat-history; a separate request to this proxy resizes the list's maximum capacity, letting me reset it on scene-changes or when more or less history is appropriate; another lets me drop recent messages, in case I don't want them influencing the next production>"
      }]
    },
    {
      "role": "user",
      "content": [{
        "type": "text",
        "text": "Translate the following into English without adding any commentary or explanations:\n\n<new captured data>"
      }]
    }
  ]
}
```

The response is intercepted, and the newly captured data is appended to the "choice" block so I can see both the translation and the original text when the overlay is opaque (I also sometimes add a kana line here). That slightly modified result is then returned to GameTranslate to be presented.
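Stripped down, the whole proxy is roughly this shape (a sketch with the incidentals removed; the endpoint paths, buffer size, and the Flask/requests plumbing are just what I happen to use, nothing GameTranslate prescribes):

```
# Sketch of the proxy flow described above; content parts are simplified to
# plain strings, and error handling is omitted.
from collections import deque

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
LLAMA_URL = "http://localhost:8080/v1/chat/completions"  # llama.cpp server
SYSTEM_PROMPT = "<casual-translator system prompt>"

history = deque(maxlen=8)  # old captured lines, oldest first

@app.post("/translate")
def translate():
    text = request.get_json()["text"]  # the simple body GameTranslate posts
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages += [{"role": "user", "content": old} for old in history]
    messages.append({
        "role": "user",
        "content": "Translate the following into English without adding "
                   "any commentary or explanations:\n\n" + text,
    })
    resp = requests.post(LLAMA_URL, json={
        "model": "aya-expanse", "temperature": 0.65, "messages": messages,
    }).json()
    history.append(text)
    # Append the original text to the choice so both show up in the overlay.
    resp["choices"][0]["message"]["content"] += "\n" + text
    return jsonify(resp)

@app.post("/resize")
def resize():
    # Separate endpoint to resize/reset the history buffer on scene changes.
    global history
    history = deque(history, maxlen=request.get_json()["maxlen"])
    return "", 204
```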

Having the window-ID as an optional parameter that I can set in the initial request would let me easily direct the new text to an appropriate message-history list so that the main dialogue box doesn't get mixed up with a speaker-string. People who just pass things directly simply wouldn't set it, so there would be no impact on working setups.
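On my end, the routing would then just be a dictionary of those history lists keyed by the ID (again a sketch, using the window_id field name from your first example):

```
# Sketch: per-window history lists keyed by the optional window_id. The field
# is popped off so the upstream API never sees the unknown key, and setups
# that don't send it all share the "default" history.
from collections import defaultdict, deque

histories = defaultdict(lambda: deque(maxlen=8))

def history_for(body):
    return histories[body.pop("window_id", "default")]
```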


I do think you could integrate something like what I'm doing into GameTranslate itself (with a bit less flexibility: no control over runtime buffer-tweaking, because that would be very hard to present intuitively), but as illustrated by your use of Text Completion (which, yes, is more powerful), people have surely built their own workflows around this feature by now. Just having REST-JSON opened up a ton of possibilities.