Are you getting anything besides gibberish out of it? I tried their recommended command line and it's dog slow, even though I built their llama.cpp fork with AVX2 enabled. This is what I get:
$ ./build/bin/llama-cli -hf prism-ml/Bonsai-8B-gguf -p "Explain quantum computing in simple terms." -n 256 --temp 0.5 --top-p 0.85 --top-k 20 -ngl 99
> Explain quantum computing in simple terms.
\( ,
None ( no for the. (,./. all.2... the ..... by/
EDIT: It runs fine in their Colab notebook. Looking at the notebook, you have to run git checkout prism (in the llama.cpp repo) before you build. That instruction is missing if you go straight to their fork of llama.cpp. Works fine now.
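
For anyone else who hits this, here is a rough sketch of the fix as a shell snippet. Only the branch name prism comes from the notebook. The LLAMA_DIR path and the two helper function names are placeholders made up for illustration, and the build step assumes the fork still uses the usual llama.cpp CMake workflow.

```shell
# Sketch only. LLAMA_DIR is a hypothetical path to an existing clone of the fork.
LLAMA_DIR="${LLAMA_DIR:-./llama.cpp}"

# Switch the clone to the prism branch unless it is already checked out.
# (Hypothetical helper name, not part of llama.cpp.)
ensure_prism_branch() {
    dir="$1"
    current="$(git -C "$dir" rev-parse --abbrev-ref HEAD)" || return 1
    if [ "$current" != "prism" ]; then
        git -C "$dir" checkout prism || return 1
    fi
}

# Configure and build after switching branches.
# Assumption: the standard llama.cpp CMake steps apply to the fork.
build_llama() {
    dir="$1"
    cmake -S "$dir" -B "$dir/build" && cmake --build "$dir/build" --config Release
}

# Usage (not run automatically):
#   ensure_prism_branch "$LLAMA_DIR" && build_llama "$LLAMA_DIR"
```

If you already built from the default branch, rebuild after the checkout so the binary in build/bin actually comes from the prism branch.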