Unsure whether to count this as a bug or not, but the AI likes to repeat itself... like, a lot. Here are the three main examples of what I mean:

- The AI will often start with "As you, *insert player name here*, stand in *insert location here*,..." and will do so on every page. If you're with an NPC, it'll also add their name and rank or species to said intro.

- Oftentimes the AI will start the next page with a partially summarized version of the last page, or just paste the entirety of the last page into the new one.

- The AI will sometimes explain what is happening twice or more. Example: "As you stab NPC in the hand, you stab her hand... *4 lines later* ... and you stab her right in the hand, piercing her flesh..." (This is not a new one, but I still figured I should mention it.)

This is a common issue with smaller AI models like Mistral Nemo; you should use a bigger model, such as one of the free models on OpenRouter. A quick fix is to edit the AI message to remove the redundancies (if you leave the repeating stuff in, the AI will look at your message history and likely write more repeating stuff later on...).
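If it helps, here's a minimal sketch of what pointing an OpenAI-compatible client at OpenRouter looks like. The model ID below is a placeholder (check openrouter.ai for which models are currently free), and you'd supply your own API key:

```python
# Minimal sketch: OpenRouter exposes an OpenAI-compatible endpoint,
# so the standard openai client works with a different base_url.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # OpenRouter's endpoint
    api_key=os.environ["OPENROUTER_API_KEY"],  # your own key
)

response = client.chat.completions.create(
    model="some-provider/some-model:free",  # placeholder model ID
    messages=[
        {"role": "system", "content": "You are the game's narrator."},
        {"role": "user", "content": "Continue the scene without repeating earlier text."},
    ],
)
print(response.choices[0].message.content)
```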

FieryLion summed it up pretty well. I've mostly played with models under 20b parameters, and they repeat themselves *frequently*. They'll basically blueprint or template part of a message and just keep pasting it back in like it's boilerplate. I had pretty good success with a 30b-parameter Gemma fork running locally (I'm on a 2070 Super with 8GB VRAM), although I only get 3-5 tokens per second (super, super slow). But the dialog quality was absolutely stunning compared to the 8b and 14b models. Qwen 2.5 is listed in several places as a "potato-friendly" model, but it's very unsophisticated: it can't really handle lewd/suggestive content in a way that would titillate or suspend disbelief, it's very repetitive and prone to "GPT-isms", and even if you set it to high temperature and sampling (to add a lot of randomness and creativity), it's still heavily constrained by how small it is.
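For anyone wondering what "high temperature and sampling" actually looks like in practice, here's a rough sketch using Hugging Face transformers with a small local model. The model name is just an example, and the exact values are things you'd tune for your own setup:

```python
# Sketch of common sampling knobs that discourage repetition.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B-Instruct"  # example; use whatever you run locally
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

inputs = tokenizer("As you stand in the tavern,", return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=1.1,           # higher = more randomness/creativity
    top_p=0.95,                # nucleus sampling trims unlikely tokens
    repetition_penalty=1.15,   # penalize tokens the model already produced
    no_repeat_ngram_size=4,    # hard-block verbatim 4-gram repeats
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Even with all of that, a small model will still loop eventually; the knobs just delay it.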

I _personally_ don't like running AI through a hosted endpoint, even though it would give me access to really good hardware and really fancy models. My two main reasons are privacy and cost. But if you want better-quality responses, you're going to have to pay for either much better hardware or a hosting solution (such as OpenRouter).

I think I found the issue! Even though the small AI model supports up to a 128K context limit in theory, its output rapidly degrades after around 8K tokens. You might be able to fix this behavior by reducing the memory setting from 10K back to around 5-6K.
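For anyone rolling their own frontend, the "reduce the memory" idea boils down to trimming the chat history to a fixed token budget before each request. This is just a sketch; the 4-characters-per-token estimate is a crude assumption, and a real tokenizer would count more accurately:

```python
# Keep only as many recent messages as fit in a fixed token budget.
def trim_history(messages, budget_tokens=5000,
                 count_tokens=lambda m: len(m["content"]) // 4):  # rough ~4 chars/token guess
    """Drop the oldest messages until the estimated token count fits the budget."""
    kept, total = [], 0
    for msg in reversed(messages):   # walk from newest to oldest
        cost = count_tokens(msg)
        if total + cost > budget_tokens:
            break                    # everything older gets dropped
        kept.append(msg)
        total += cost
    return list(reversed(kept))      # restore chronological order
```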