Also, check your context size: Ollama defaults to 4K if you have <24 GB of VRAM, and you need 64K minimum if you want Claude to be able to at least lift a finger.
It's incomparably faster than any other model (i.e. it's actually usable without cope). Caching makes a huge difference.
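On the context size point, here is a rough sketch of asking for a bigger window on a single request through Ollama's REST API, using the `num_ctx` option. Assumptions on my part: the server is on the default `localhost:11434`, `my-model` is a placeholder for whatever you have pulled, and the helper names `build_request` and `generate` are made up for illustration. If your client talks to Ollama some other way, you may need to set the context length on the server or in a Modelfile (`PARAMETER num_ctx`) instead.

```python
import json
import urllib.request

# Assumed default local endpoint; change it if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"
MIN_CTX = 64 * 1024  # the 64K floor mentioned above


def build_request(model: str, prompt: str, num_ctx: int = MIN_CTX) -> dict:
    """Build a non-streaming generate payload with an explicit context window."""
    if num_ctx < MIN_CTX:
        raise ValueError(f"num_ctx={num_ctx} is below the {MIN_CTX} floor")
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},  # per-request context size override
    }


def generate(payload: dict) -> str:
    """POST the payload to a running Ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With a server running, `generate(build_request("my-model", "hello"))` should send the request with the larger window. Keep in mind a 64K context costs a lot more memory than 4K, so it may spill out of VRAM on smaller cards.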