At API rates it is a fraction of the cost of Opus: $1.40/$4.40 vs. $5/$25 (input/output). That works out to 28% on input tokens and under 18% on output tokens.
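For anyone who wants to check the ratio against their own usage, here is a rough sketch. It assumes the figures quoted above are USD per million tokens in input/output order, and the 9:1 input-to-output workload is purely hypothetical. It also ignores prompt caching discounts, which can change the picture a lot for agentic coding.

```python
# Prices in USD, assumed to be per million tokens as (input, output).
CHEAP_PRICE = (1.40, 4.40)   # the cheaper model's quoted API rate
OPUS_PRICE = (5.00, 25.00)   # the Opus rate quoted above


def cost(price, input_tokens, output_tokens):
    """Return the USD cost of a workload at the given (input, output) price."""
    in_price, out_price = price
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def cost_ratio(input_tokens, output_tokens):
    """Return cheaper-model cost as a fraction of Opus cost for the same workload."""
    return (cost(CHEAP_PRICE, input_tokens, output_tokens)
            / cost(OPUS_PRICE, input_tokens, output_tokens))


# Per-token-type ratios: these bound the ratio for any input/output mix.
input_ratio = CHEAP_PRICE[0] / OPUS_PRICE[0]
output_ratio = CHEAP_PRICE[1] / OPUS_PRICE[1]

# Hypothetical workload: 9M input tokens, 1M output tokens.
blended = cost_ratio(9_000_000, 1_000_000)

print(f"input only:  {input_ratio:.1%}")
print(f"output only: {output_ratio:.1%}")
print(f"9:1 mix:     {blended:.1%}")
```

The blended figure always lands between the input-only and output-only ratios, so whether "less than 20%" holds depends on how output-heavy your sessions are.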
It might make sense if you have z.AI's subscription plan (which isn't great value either), but at API rates it's not competitive with a monthly coding subscription from OpenAI or Anthropic. I burned through almost $10 worth of tokens in just an hour of work.
You get access to a whole bunch of bleeding-edge open models, including GLM-5.2, Kimi K2.7, DeepSeek 4 Pro, etc. Inference runs on US/SG/EU cloud providers with zero-data-retention policies. The $20/mo tier is very generous, in my experience.
> Where are models hosted?
> Ollama hosts models and compute resources primarily in the United States. To serve global demand, we may route to Europe and Singapore for additional capacity.
> Is my prompt or response data trained on?
> Prompt or response data is never logged or trained on.
> Who does Ollama partner with to host models?
> Ollama collaborates with NVIDIA Cloud Providers (NCPs) to host open models.
> When Ollama partners with providers, we require no logging, no training, and zero data retention policies in place.