1.0.7 already incorporated all of those fixes, including the ones discussed here: https://github.com/ggml-org/llama.cpp/issues/12946

Generation works until a certain total number of tokens has been processed; after that point it starts producing broken output regardless of the size of the inputs. It also randomly produces CUDA errors that crash the backend.
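To help pinpoint where the breakage starts, one approach is to track the cumulative token count across requests and note the request at which it crosses a suspected threshold. This is a minimal sketch, not part of the actual backend: the `TokenBudgetTracker` class, the threshold value, and the sample token counts are all hypothetical.

```python
# Hypothetical helper: track cumulative tokens processed across requests
# so the request where outputs start breaking can be pinpointed.
SUSPECTED_BREAK_POINT = 8192  # placeholder; replace with the observed value

class TokenBudgetTracker:
    def __init__(self, limit=SUSPECTED_BREAK_POINT):
        self.limit = limit
        self.total = 0

    def record(self, n_tokens):
        """Add one request's token count; return True once the cumulative
        total has reached or passed the suspected breakage point."""
        self.total += n_tokens
        return self.total >= self.limit

# Example: with a limit of 100 tokens and four 30-token requests,
# the threshold is crossed on the fourth request (index 3).
tracker = TokenBudgetTracker(limit=100)
crossed_at = None
for i, n in enumerate([30, 30, 30, 30]):
    if tracker.record(n) and crossed_at is None:
        crossed_at = i
```

Logging `crossed_at` alongside the first broken generation would show whether the failures correlate with a fixed cumulative token count or with something else, such as a specific request size.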