Post by Lakis in Release 1.7.0 - Bug Reports

A continuation of the error.

Linux Mint 22, 64 GB DDR5, AMD Ryzen 9 7950X, AMD Radeon 6900 XT with 16 GB VRAM.
Running the LLM service locally on this machine instead of through a third party.

Error: The LLM service has encountered an Error: An error occurred while sending the request.

Tried a few things, including rolling back to an older version of KoboldCPP and trying the unstable branch, just to rule that out. No joy. Going back to the Qwen Legacy 16 GB model still works just fine, so for now that's the model I'm using.

Three Eyes Software, 75 days ago

There's a log file at ~/.config/unity3d/Three Eyes Software/Silverpine/Kobold.log now.

Lakis, 75 days ago

Link to a fresh Kobold.log on Google Drive, since I can't upload the file directly.
https://drive.google.com/file/d/1fdCcfvb74Jubq3LcpijvfoPXI2eS5gg5/

Hopefully this will shed some light on the situation.

Three Eyes Software, 75 days ago

Your link requires a login. Use Pastebin or just post it here.

Lakis, 75 days ago

https://pastebin.com/GjXxt3Wt

Three Eyes Software, 75 days ago

The model itself is working completely fine. This seems to be a problem related to KoboldCPP's tokencount API. I'm sort of at a loss why this would be happening with specifically Gemma 4 series models.

The changelog for koboldcpp-1.112.2 says "Fix for /api/extra/tokencount". I see that you've manually updated to 1.113, but for sanity, try it again with 1.112.2 specifically. If this doesn't fix the issue I will add a fallback that simply guesses the token count if the API fails.
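
A fallback like the one described above could look roughly like the sketch below. It is not the game's actual code: the function names, the default port (5003), and the guess of about four characters per token are all assumptions made for illustration. The request and response shapes (`{"prompt": ...}` in, `{"value": N}` out) are what KoboldCPP's `/api/extra/tokencount` endpoint uses as far as I know, so check them against the version you run.

```shell
# Hypothetical helpers, not the game's implementation.
# The base URL and port are assumptions; override KOBOLD_URL as needed.
KOBOLD_URL="${KOBOLD_URL:-http://localhost:5003}"

# Rough guess: about 4 characters per token, rounded up (a heuristic only).
estimate_tokens() {
  local text="$1"
  echo $(( (${#text} + 3) / 4 ))
}

# Ask the backend for the token count; fall back to the guess on any failure.
# Only handles single-line text: newlines are not JSON-escaped in this sketch.
count_tokens() {
  local text="$1" body reply value=""
  # Escape backslashes and double quotes so the text fits in a JSON string.
  body=$(printf '%s' "$text" | sed 's/\\/\\\\/g; s/"/\\"/g')
  if reply=$(curl --silent --fail --max-time 5 \
      -H 'Content-Type: application/json' \
      -d "{\"prompt\": \"$body\"}" \
      "$KOBOLD_URL/api/extra/tokencount"); then
    # Pull the integer out of the "value" field of the reply.
    value=$(printf '%s' "$reply" |
      sed -n 's/.*"value"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
  fi
  if [ -n "$value" ]; then
    echo "$value"
  else
    estimate_tokens "$text"
  fi
}
```

Calling `count_tokens "Hello there"` prints the backend's count when the request succeeds and the estimate otherwise, so the caller never has to deal with the API error itself.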

Lakis, 74 days ago

Yeah, it's still a no-go.

Three Eyes Software, 73 days ago (1 edit)

I've uploaded a patch that should fix it. Now I'm also curious where you got the log in your original post from, seeing as logging should be disabled and there is no user-facing way to copy errors that happen during dialog.

Lakis, 73 days ago (4 edits)

I ran it via the Terminal and had it pipe into a txt file.

./Silverpine.x86_64 > /silverpine-adhoc-line.txt

Edit: Like so.

I don't show the game itself running, but I do show that the log does capture it from a new document.
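
For anyone who wants to capture output the same way, here is a minimal sketch. It assumes the game writes its messages to standard output or standard error when started from a terminal. The `run_logged` helper and the log path in the usage line are hypothetical names, not something from the game.

```shell
# Hypothetical helper: run a command and send both stdout and stderr to a log file.
# Usage: run_logged <logfile> <command> [args...]
run_logged() {
  local log="$1"
  shift
  # ">" redirects stdout to the file; "2>&1" then points stderr at the same file.
  "$@" > "$log" 2>&1
}
```

From the game folder this would be used as `run_logged "$HOME/silverpine.log" ./Silverpine.x86_64`. Writing to the home directory avoids needing permission to create files in `/`.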

Lakis, 72 days ago (1 edit)

Heads up, hotfix 1.7.0c, even with the token fallback, still doesn't resolve this particular issue.

Pastebin below.
https://pastebin.com/FjVVi1CE

In this log, I started a new game and attempted to talk to multiple NPCs. With the second NPC, I was informed that the fallback would be used for the token count.

Three Eyes Software, 72 days ago

The log shows that it fails during text completion too. This is some sort of mystery OS/driver/backend issue, since the only thing the game does differently between sending requests to Qwen and sending requests to Gemma is that it launches the backend with SWA enabled on Gemma.

You could try letting the game load the model, then killing the hidden Kobold process after it's done loading the model, and run this command in the StreamingAssets folder to "switch out" the backend process for one without SWA:

./koboldcpp-linux --model "Gemma-4-Sparse.gguf" --usevulkan --gpulayers -1 --quiet --multiuser 100 --contextsize 4096 --skiplauncher --port 5003 --nofastforward

If SWA isn't what causes the problem, I'm out of ideas. It's difficult for me to debug what's happening here because I don't have an AMD system to reproduce the bug with, if that's even the cause.
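
The kill-and-relaunch steps above can be wrapped in a small function, sketched below. Some of it is guesswork: it assumes the hidden backend's process name is exactly `koboldcpp-linux`, and the two-second pause for the port to be released is arbitrary. `swap_backend` and `KOBOLD_BIN` are made-up names. The launch flags are copied unchanged from the command in the post.

```shell
# Hypothetical wrapper; run it from the StreamingAssets folder after the game
# has finished loading the model.
KOBOLD_BIN="${KOBOLD_BIN:-./koboldcpp-linux}"

swap_backend() {
  # Stop the backend the game started. "-x" matches the exact process name
  # (assumed to be koboldcpp-linux); "|| true" continues if nothing matched.
  pkill -x koboldcpp-linux || true
  sleep 2   # assumed grace period so port 5003 is free again

  # Same flags as the command quoted in the post (no SWA option).
  "$KOBOLD_BIN" --model "Gemma-4-Sparse.gguf" --usevulkan --gpulayers -1 \
    --quiet --multiuser 100 --contextsize 4096 --skiplauncher \
    --port 5003 --nofastforward
}
```

If `pkill -x` finds nothing, check the real process name with `pgrep -a kobold` and adjust the pattern.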
