undefined

points

by vunderba5 hours ago |

comments

by misiti37802 hours ago|

[-]

what HW are you running them on ? are you using OLLAMA ?

by vunderba1 hours ago|

parent|

[-]

I'm using the default llama-server that is part of Gerganov's LLM inference system running on a headless machine with an nVidia 16GB GPU, but Ollama's a bit easier to ease into since they have a preset model library.

https://github.com/ggml-org/llama.cpp