Post by Lakis in Release 1.7.0 - Bug Reports

The model itself is working completely fine. This seems to be a problem with KoboldCPP's tokencount API. I'm at a loss as to why this would happen specifically with Gemma 4 series models.

The changelog for koboldcpp-1.112.2 says "Fix for /api/extra/tokencount". I see that you've manually updated to 1.113, but as a sanity check, try it again with 1.112.2 specifically. If this doesn't fix the issue, I will add a fallback that simply guesses the token count if the API fails.
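For anyone curious what such a fallback could look like, here is a minimal Python sketch. It is not the game's actual code. It assumes the backend listens on port 5003 (the port used in the command further down the thread), that /api/extra/tokencount accepts a JSON body with a "prompt" field and answers with a "value" field, and that roughly four characters per token is an acceptable guess. The function names and the `opener` parameter are made up for illustration.

```python
import http.client
import json
import urllib.request


def estimate_tokens(text, chars_per_token=4.0):
    """Rough guess at the token count.

    The 4-characters-per-token ratio is an assumption, not a measured value.
    """
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def count_tokens(text, base_url="http://localhost:5003", timeout=5.0,
                 opener=urllib.request.urlopen):
    """Ask the backend for a token count; fall back to a guess on any failure.

    `opener` is a hypothetical hook so the HTTP call can be swapped out.
    """
    payload = json.dumps({"prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/api/extra/tokencount",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with opener(request, timeout=timeout) as response:
            # Assumed response shape: {"value": <int>, ...}
            return int(json.loads(response.read())["value"])
    except (OSError, http.client.HTTPException, ValueError, KeyError, TypeError):
        # Connection refused, timeout, HTTP error, malformed or unexpected JSON:
        # none of these should block the dialog, so guess instead.
        return estimate_tokens(text)
```

Calling `count_tokens("Hello there, traveler.")` returns the backend's count when the request succeeds and the character-based guess otherwise. The guess only has to be close enough to keep the prompt within the context size, so erring on the high side by lowering `chars_per_token` may be the safer choice.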

Lakis 27 days ago

Yeah, it's still a no-go.

Three Eyes Software 26 days ago (1 edit)

I've uploaded a patch that should fix it. Now I'm also curious where you got the log in your original post from, seeing as logging should be disabled and there is no user-facing way to copy errors that happen during dialog.

Lakis 26 days ago (4 edits)

I ran it via the terminal and redirected its output into a txt file.

./Silverpine.x86_64 > /silverpine-adhoc-line.txt

Edit: Like so.

I don't show the game itself running, but I do show that the log is captured into a new document.

Lakis 26 days ago (1 edit)

Heads up, hotfix 1.7.0c, even with the token fallback, still doesn't resolve this particular issue.

Pastebin below.
https://pastebin.com/FjVVi1CE

In this log, I started a new game and attempted to talk to multiple NPCs. At the second NPC, I was informed that the fallback would be used for the token count.

Three Eyes Software 25 days ago

The log shows that it fails during text completion too. This is some sort of mystery OS/driver/backend issue, since the only thing the game does differently between sending requests to Qwen and sending requests to Gemma is that it launches the backend with SWA enabled on Gemma.

You could try letting the game load the model, killing the hidden Kobold process once loading has finished, and then running this command in the StreamingAssets folder to "switch out" the backend process for one without SWA:

./koboldcpp-linux --model "Gemma-4-Sparse.gguf" --usevulkan --gpulayers -1 --quiet --multiuser 100 --contextsize 4096 --skiplauncher --port 5003 --nofastforward

If SWA isn't what causes the problem, I'm out of ideas. It's difficult for me to debug what's happening here because I don't have an AMD system to reproduce the bug with, if that's even the cause.

Lakis 25 days ago (1 edit)

No joy, segmentation faults.

Looking at the ad hoc error logs, it seems something tries to go well above the normal buffer size, and that's where it stops.

Pastebins are below.

https://pastebin.com/8U80du1R -- Silverpine ad hoc log, which I kept just in case anything was captured there.

https://pastebin.com/XxSBWaku -- KoboldCPP ad hoc log. The config folder log didn't update when running from the CLI, so I kept my own log.

In my sys-info I can see the core dump. Viewing it just shows "Segmentation fault" followed by a memory address.

Core was generated by `./koboldcpp-linux --model Gemma-4-Sparse.gguf --usevulkan --gpulayers -1 --quiet --multiuser 100 --contextsize 4096 --skiplauncher --port 5003 --nofastforward'.

Program terminated with signal SIGSEGV, Segmentation fault.

#0 0x00007f0e54ca095c in ?? ()

Which isn't exactly useful. Either way, I think I'll call it here and just keep using Qwen when I feel the urge to play!
Edit: Thank you for taking the time to try and help. I really appreciate it!
