One thing I haven't figured out: Subjectively, it feels like ollama's model loading was nearly instant, while I feel like I'm always waiting for llama.cpp to load models, but that doesn't make sense because it's ultimately the same software. Maybe I should try ollama again to convince myself that I'm not crazy and that ollama's model loading wasn't actually instant.