Posted July 24, 2025 by Aleksandr Unconditional
#bugfix #build #grok #koboldcpp #ngrok #remote #sd #stable diffusion
✨ stable diffusion:
- changed folder structure: files are now distributed into cuda, darwin, vulkan folders, models in the models folder
- default model converted to gguf: ~1-2 seconds faster on gpu and about 10% faster on cpu, build is 2.7 gb smaller (yay)
- added support for amd cards: app auto-detects the card and uses vulkan instead of cuda
- improved location detection and memory
- updated loras in the win-local build
- rebuilt for mac and amd cards; increased graph sizes
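a rough sketch of how the backend choice described above could look. this is not the app's actual code: the function name `pick_backend_folder`, its inputs, and the fallback to vulkan for any non-nvidia card are assumptions for illustration; only the folder names (cuda, darwin, vulkan) come from the notes above.

```python
def pick_backend_folder(system: str, gpu_vendor: str) -> str:
    """Return the build folder (cuda, darwin or vulkan) to load binaries from.

    Hypothetical helper: `system` is an OS name such as platform.system()
    would return, `gpu_vendor` is whatever description string the app gets
    for the detected card.
    """
    if system.lower() == "darwin":
        # macOS builds live in their own folder
        return "darwin"
    if "nvidia" in gpu_vendor.lower():
        # NVIDIA cards keep using the cuda build
        return "cuda"
    # Assumption: everything else (AMD included) goes to the vulkan build
    return "vulkan"


if __name__ == "__main__":
    print(pick_backend_folder("Windows", "AMD Radeon RX 7800 XT"))
```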
✨ koboldcpp:
- updated to version 1.95.1
- koboldcpp remote: added ngrok support as an alternative to cloudflare
- fixed the bots that detect outfit changes and location, and the bot that generates image prompts
✨ added rounded mplus 1c font as default
✨ added photo studio location
✨ desktop: added button to hide character and text, hotkey shift+h
✨ added grok 4 model (still unstable but interesting in places)
🛠 lora: tweaked system prompt for image generation
🛠 input field improved for easier entry of large multiline text
🛠 adaptive font sizing adjusted: text now fits more often without scrolling
🛠 help section updated
🛠 quick menu now shows only icons
🛠 updated icons for saving images and phone
🛠 settings: tabs now hide when clicking anywhere on the tab bar, not just the icon
🛠 fixed save-image button in nvl mode
🛠 removed gemini recommendation text from main screen
🛠 suggestions (quick replies): now use the last 4 utterances instead of the entire dialogue
🛠 openrouter: fixed error when selecting models from dropdown
🛠 cosmos: added cosmosrp-3.0 and cosmosrp-3.5 models; removed deprecated ones
🛠 anthropic claude: fixed empty “: ” message error with small bots; suggestions (quick replies) now work
🛠 cohere: added suggestions support
🛠 rewrote the location detection bot: it now fires right after the trigger instead of on the next “turn”
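the suggestions (quick replies) change in the list above comes down to a sliding window over the dialogue. a minimal sketch, assuming the dialogue is a plain list of utterance strings; `suggestion_context` and its `window` parameter are hypothetical names, not the app's real api.

```python
def suggestion_context(dialogue: list[str], window: int = 4) -> list[str]:
    """Return only the last `window` utterances to use as context."""
    if window <= 0:
        # dialogue[-0:] would return the whole list, so handle this explicitly
        return []
    return dialogue[-window:]


if __name__ == "__main__":
    history = ["hi", "hello!", "where are we?", "at the cafe", "nice", "want coffee?"]
    print(suggestion_context(history))
```

a shorter context means a smaller request for every batch of suggestions, at the cost of the suggestion model not seeing anything older than the window.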
→ ngrok plugin (see koboldcpp-remote instructions in the help section)
for low-end machines, i made a more heavily compressed q5 quantization of the model; it should be a bit snappier