For Nemo, it should take no more than 2-3 seconds on a 5070 Ti. I actually worked on porting the game to Linux today (the port also uses a Linux-native backend for the LLM), so try the native build.
Thanks for working on a native version! Unfortunately, it fails when loading the model. Maybe it's trying to launch koboldcpp.exe rather than koboldcpp-linux? Or it could be that you're using Windows-style variables on the command line. I couldn't see any error message, so I tried diagnostic.bat, but that doesn't really work on Linux.
Instead, I converted diagnostic.bat into a Linux shell script, which I've pasted below. Shell scripts don't really have anything like goto, so I put the questions into while true loops. This version does load koboldcpp, so I think the problem is in the command-line arguments you're using.
#!/bin/bash
echo "Welcome to the diagnostic tool. This tool will launch Nemo via koboldcpp-linux with the same arguments as the game, while letting you view the error message in case of a crash."
#GPU_TYPE
while true
do
read -r -p "Are you using an NVIDIA or AMD GPU? (Type \"NVIDIA\" or \"AMD\"): " GPU_TYPE
# Quote the variable so an empty answer doesn't break the test
if [ "$GPU_TYPE" == "NVIDIA" ] || [ "$GPU_TYPE" == "AMD" ]
then
break
else
echo "Invalid input. Please type \"NVIDIA\" or \"AMD\"."
fi
done
#VRAM_QUESTION
while true
do
read -r -p "Do you have 6, 8, or more than 10 GB of VRAM? (Type \"6\", \"8\", or \"10+\"): " VRAM
if [ "$VRAM" == "6" ]
then
GPULAYERS=17
break
elif [ "$VRAM" == "8" ]
then
GPULAYERS=27
break
elif [ "$VRAM" == "10+" ]
then
GPULAYERS=43
break
else
echo "Invalid input. Please type \"6\", \"8\", or \"10+\"."
fi
done
#LAUNCH_KOBOLDCPP
echo "Launching KoboldCPP with the specified settings..."
if [ "$GPU_TYPE" == "NVIDIA" ]
then
GPU_ARG="--usecublas"
elif [ "$GPU_TYPE" == "AMD" ]
then
GPU_ARG="--usevulkan"
fi
./koboldcpp-linux --model "Nemo.gguf" "$GPU_ARG" --gpulayers "$GPULAYERS" --quiet --multiuser 100 --contextsize 4096 --skiplauncher
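
If it helps with comparing against what the game passes, here is a rough sketch of the same answer-to-argument mapping pulled into one function with case statements, so the exact argument list can be printed and inspected without launching anything. The function name build_args is made up for this example; the flags and layer counts are only the ones already used in the script above, and nothing here is confirmed to match what the game itself does.

```shell
#!/bin/bash
# Hypothetical helper: prints one koboldcpp argument per line for the
# given GPU type and VRAM answer, or returns 1 on an unrecognized answer.
build_args() {
    local gpu_type="$1" vram="$2" gpu_arg layers

    # Same GPU mapping as the script above
    case "$gpu_type" in
        NVIDIA) gpu_arg="--usecublas" ;;
        AMD)    gpu_arg="--usevulkan" ;;
        *)      return 1 ;;
    esac

    # Same VRAM-to-layers mapping as the script above
    case "$vram" in
        6)   layers=17 ;;
        8)   layers=27 ;;
        10+) layers=43 ;;
        *)   return 1 ;;
    esac

    printf '%s\n' --model Nemo.gguf "$gpu_arg" --gpulayers "$layers" \
        --quiet --multiuser 100 --contextsize 4096 --skiplauncher
}
```

Running build_args NVIDIA 8 prints the arguments one per line, which makes them easy to diff against whatever command line the game builds. To launch with them and keep a copy of the output, something like mapfile -t ARGS < <(build_args NVIDIA 8) followed by ./koboldcpp-linux "${ARGS[@]}" 2>&1 | tee koboldcpp.log should work (the log file name is arbitrary).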