Should be fixed now.
Three Eyes Software
Recent community posts
For advanced setups like this that are not supported by the game, you must start KoboldCPP manually with your desired parameters and then start the game, which will automatically detect the running process and start communicating with it. This is only available on Windows. The game will assume the process it's hooking into is running the same model that was last selected in the "Select an AI model for more information." dialog.
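If you want to verify from a script that your manually started instance is up before launching the game, you can query it the same way the game presumably does. A minimal Python sketch, assuming KoboldCPP's default port 5001 and its standard KoboldAI-compatible /api/v1/model endpoint (the helper name is mine):

```python
import json
import urllib.request
from urllib.error import URLError

def get_running_model(port=5001):
    """Return the model name reported by a running KoboldCPP instance, or None."""
    try:
        with urllib.request.urlopen(
            f"http://localhost:{port}/api/v1/model", timeout=3
        ) as resp:
            # The endpoint returns JSON like {"result": "<model name>"}
            return json.loads(resp.read())["result"]
    except (URLError, OSError):
        return None  # nothing answering on that port
```

If this returns a name, KoboldCPP is running and reachable; if it returns None, the game won't find anything to hook into either.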
How much VRAM do you have? If it's 12 GB or 16 GB, I highly recommend using Mistral-Small-3.2 instead. Before, there were manual offload values for Gemma that were very much guesstimates, which I got rid of, so the game now asks the backend to determine the correct amount, which it apparently fails to do for some reason in your case.
Please read the post again carefully:
If you bought the game before the time of this post
The whole reason I made this post is because hosting the server outside of the demo turned out not to be sustainable.
I'm merely honoring the expectation I created for people who bought the game before it. Please run the AI locally instead. If you can't run it locally, use OpenRouter.

The game shows you an error that tells you exactly what failed during the conversion. If your character is called "Example Custom Character", it will also not be converted.
Alternatively, you can simply move the folder of your custom player character definition somewhere else, and then manually recreate it in the new version's editor using the assets inside.
If the curl command fails, the issue connecting to localhost is caused by something else on your system, not the game itself. Perhaps another program is already running on port 5001.
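One quick way to check whether anything is already listening on port 5001 is a short Python sketch like this (the helper name is mine, not something from the game):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """True if something is already accepting connections on the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        # connect_ex returns 0 when the connection succeeds,
        # i.e. when some program is listening there.
        return s.connect_ex((host, port)) == 0

print(port_in_use(5001))
```

If this prints True while the game is closed, another program holds port 5001 and KoboldCPP can't bind to it.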
As for the slowness, the game is correctly offloading all layers to the GPU. I can't fully gauge the performance from the two successful API calls you posted since they have very few input/output tokens, but I've uploaded a new version that makes 5000 series RTX GPUs use a special backend setting which, when not selected, resulted in much slower (though not minutes-long) processing during my testing.
Open a command prompt in Silverpine_Data\StreamingAssets\KoboldCPP like this:

Then run this command:
koboldcpp.exe --model "Mistral-Small-3.2.gguf" --usecublas --gpulayers 999 --quiet --multiuser 100 --contextsize 4096 --skiplauncher
Then post this part of the output:

Then run this command in a separate command prompt:
curl -X POST "http://localhost:5001/api/v1/generate" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"max_context_length\": 4096,\"max_length\": 100,\"prompt\": \"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris laoreet nunc non vehicula accumsan. Etiam lacus nulla, malesuada nec ullamcorper vitae, malesuada eget elit. Cras vehicula tortor mauris, vitae vulputate est fringilla ac. Aenean urna libero, egestas eget tristique eget, tincidunt sit amet turpis. Pellentesque vitae nulla vitae metus mattis pulvinar. Suspendisse eu gravida magna. Nam metus diam, fermentum mattis pretium vestibulum, mollis non sem. Etiam hendrerit pharetra risus, vitae fermentum felis hendrerit at. \",\"quiet\": false,\"rep_pen\": 1.1,\"rep_pen_range\": 256,\"rep_pen_slope\": 1,\"temperature\": 0.5,\"tfs\": 1,\"top_a\": 0,\"top_k\": 100,\"top_p\": 0.9,\"typical\": 1}"
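If the curl quoting is a pain to work with, here's the same request as a rough Python equivalent (the prompt is shortened here; the endpoint and payload fields are taken from the curl command above, and I'm assuming KoboldCPP's usual {"results": [{"text": ...}]} response shape):

```python
import json
import urllib.request

# Same payload as the curl command, with a shortened placeholder prompt.
payload = {
    "max_context_length": 4096,
    "max_length": 100,
    "prompt": "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
    "quiet": False,
    "rep_pen": 1.1,
    "rep_pen_range": 256,
    "rep_pen_slope": 1,
    "temperature": 0.5,
    "tfs": 1,
    "top_a": 0,
    "top_k": 100,
    "top_p": 0.9,
    "typical": 1,
}

def generate(payload, url="http://localhost:5001/api/v1/generate"):
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # KoboldCPP returns the generated text under results[0].text
        return json.loads(resp.read())["results"][0]["text"]
```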
After a while something like this should pop up in the first command prompt:

Please post it too. I should be able to figure out the issue then.
Really, you're not meant to edit the exported NPC memories at all, and I can't imagine editing them not causing confusion in some way. NPCs receive information by being part of a conversation, or by having information shared with them by another NPC.
I see that people like having overarching plotlines, so I'd rather have a system that handles this better than have people edit the exported memories. I'm very much open to suggestions about this if you feel like there's something missing, or that maybe there should be more detailed/frequent/reliable communication.
The field that contains the used actions does nothing if you don't have "Native NPC Actions" enabled. Otherwise it tells the AI what actions were already used. Again, it's not meant to be edited.
I tried changing how the download works so it's more robust and doesn't immediately cancel when it encounters an error on an unstable connection.
Unfortunately, from what I've seen so far, this requires using HttpClient, which for some reason is very slow in Unity (about 1 MB/s).
You can download the required files manually by following these steps:
- Download https://github.com/LostRuins/koboldcpp/releases/download/v1.81.1/koboldcpp.exe and place it in Silverpine_Data\StreamingAssets\KoboldCPP
- Download https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF/resolve/main/Mi..., rename it to "Nemo.gguf" and place it in Silverpine_Data\StreamingAssets\KoboldCPP
- Create a file called koboldcpp.seal and edit it to contain this: {"url": "https://github.com/LostRuins/koboldcpp/releases/download/v1.81.1/koboldcpp.exe","size": 478166467}
- Create a file called Nemo.seal and edit it to contain this: {"url": "https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF/resolve/main/Mistral-Nemo-Instruct-2407-Q5_K_M.gguf?download=true","size": 8727635072}
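If you'd rather script the two .seal files than type the JSON by hand, a small Python sketch like this should do it (I'm assuming the .seal files go in the same KoboldCPP folder as the downloads; the url/size values are copied verbatim from the steps above):

```python
import json
from pathlib import Path

# Folder from the steps above, relative to the game's install directory.
kobold_dir = Path("Silverpine_Data/StreamingAssets/KoboldCPP")
kobold_dir.mkdir(parents=True, exist_ok=True)

seals = {
    "koboldcpp.seal": {
        "url": "https://github.com/LostRuins/koboldcpp/releases/download/v1.81.1/koboldcpp.exe",
        "size": 478166467,
    },
    "Nemo.seal": {
        "url": "https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF/resolve/main/Mistral-Nemo-Instruct-2407-Q5_K_M.gguf?download=true",
        "size": 8727635072,
    },
}

for name, content in seals.items():
    (kobold_dir / name).write_text(json.dumps(content))
```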

I will release an update when I have figured out how to make the download work better over VPN.
You have to download the ServerKey file and place it in AppData\LocalLow\Three Eyes Software\Silverpine on both your laptop and your computer. If it was working before 1.4.1, that's because authentication wasn't actually required until then. If it's not working now, you didn't place the file in the correct spot.


