Question about performance before I buy this. I have a 5070 Ti with 16GB of VRAM (plus a 7950X and 96GB of RAM), which should be plenty to run the normal 8GB model. But in the demo at least, it takes a little under a minute to generate each conversation reply locally, and that's when nothing else is happening, like moving around or giving items. If it's trying to do anything like that, it takes much longer. Do you think that's normal, or might it be because I'm running it on Linux and losing performance somewhere? I could get used to it, but waiting around isn't much fun.
Thanks for working on a native version! Unfortunately, it fails when loading the model. Maybe it's trying to launch koboldcpp.exe rather than koboldcpp-linux? Or it could be that you're using Windows-style variables on the command line. I can't see any error message, so I tried diagnostic.bat, but that doesn't exactly work.
Instead, I turned diagnostic.bat into a Linux shell script, the contents of which I've pasted below. Shell scripts don't really have anything like goto, so I just put the questions into while true loops. This does load koboldcpp, so I think the problem is related to the command line arguments you're using.
#! /bin/bash
echo "Welcome to the diagnostic tool. This tool will launch Nemo via koboldcpp-linux with the same arguments as the game, while letting you view the error message in case of a crash."
#GPU_TYPE
while true
do
    read -r -p "Are you using an NVIDIA or AMD GPU? (Type \"NVIDIA\" or \"AMD\"): " GPU_TYPE
    # Quote the variable so an empty answer doesn't break the test
    if [ "$GPU_TYPE" = "NVIDIA" ] || [ "$GPU_TYPE" = "AMD" ]
    then
        break
    else
        echo "Invalid input. Please type \"NVIDIA\" or \"AMD\"."
    fi
done
#VRAM_QUESTION
while true
do
    read -r -p "Do you have 6, 8, or more than 10 GB of VRAM? (Type \"6\", \"8\", or \"10+\"): " VRAM
    # Quote the variable so an empty answer doesn't break the tests
    if [ "$VRAM" = "6" ]
    then
        GPULAYERS=17
        break
    elif [ "$VRAM" = "8" ]
    then
        GPULAYERS=27
        break
    elif [ "$VRAM" = "10+" ]
    then
        GPULAYERS=43
        break
    else
        echo "Invalid input. Please type \"6\", \"8\", or \"10+\"."
    fi
done
#LAUNCH_KOBOLDCPP
echo "Launching KoboldCPP with the specified settings..."
if [ "$GPU_TYPE" = "NVIDIA" ]
then
    GPU_ARG="--usecublas"
elif [ "$GPU_TYPE" = "AMD" ]
then
    GPU_ARG="--usevulkan"
fi
./koboldcpp-linux --model "Nemo.gguf" "$GPU_ARG" --gpulayers "$GPULAYERS" --quiet --multiuser 100 --contextsize 4096 --skiplauncher
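
In case it's useful, the VRAM question could also be handled with a case statement instead of the if/elif chain. This is only a rough sketch: layers_for_vram is a name I made up (it isn't part of the game or koboldcpp), and the layer counts are just the ones from the script above.

```shell
#!/bin/bash
# Hypothetical helper, not part of the game or koboldcpp.
# Maps the VRAM answer to the same --gpulayers values used in the script above.
layers_for_vram() {
    case "$1" in
        6)   echo 17 ;;
        8)   echo 27 ;;
        10+) echo 43 ;;
        *)   return 1 ;;   # anything else counts as invalid input
    esac
}
```

Inside the while true loop you would then run GPULAYERS=$(layers_for_vram "$VRAM") and break when it succeeds, or print the "Invalid input" message when it returns non-zero.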