The z.ai coding plan is a fairly decent deal at ~$16/month USD, considering it's supposed to include a fair bit more usage than the comparable $20/month Claude plan. On the other hand, z.ai seems a bit on the slower side for raw model tok/sec throughput.

by chpatrick 9 hours ago

Its pricing is a lot cheaper if you can run it yourself.

by nijave 4 hours ago

Not this one. It's a SOTA-class model that needs more than 800 GiB of VRAM at fp8.

by jeremyjh 8 hours ago

What?

It is less than 20% of the cost of Opus at API rates: $1.40/$4.40 vs. $5/$25 (input/output).
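For anyone who wants to check that ratio, here is a rough sketch, assuming the two figures in each pair are input and output prices in the same unit. The ratio depends on how much of the billed traffic is output tokens; `cost_ratio` and `output_share` are names made up for this illustration, not anything from either provider.

```python
def cost_ratio(in_a, out_a, in_b, out_b, output_share):
    """Blended price of model A relative to model B.

    output_share is the assumed fraction of billed tokens that are
    output tokens (0.0 = all input, 1.0 = all output).
    """
    blended_a = (1 - output_share) * in_a + output_share * out_a
    blended_b = (1 - output_share) * in_b + output_share * out_b
    return blended_a / blended_b


# Prices quoted in the comment: 1.40/4.40 vs 5/25.
for share in (0.0, 0.1, 0.5, 1.0):
    ratio = cost_ratio(1.40, 4.40, 5.0, 25.0, share)
    print(f"output share {share:.0%}: {ratio:.1%} of the Opus price")
```

On these numbers the ratio only drops below 20% once output tokens make up a large share of the bill; for input-heavy traffic it is closer to 28%.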

by cmrdporcupine 7 hours ago

Not when you factor in token efficiency. It burns a lot more tokens to do the same job, so when I compared it to GPT5.5 I was frankly not much ahead on cost, and the thinking was weaker.

It maybe makes sense if you have z.AI's (not greatly priced) subscription plan, but it's not competitive against an OpenAI or Anthropic monthly coding subscription plan. I burned through almost $10 worth of tokens just doing an hour of work.
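To put the token-efficiency point in numbers, here is a minimal sketch. The token multiplier is hypothetical, since no measured figure is given above; `effective_cost_ratio` and `break_even_multiplier` are illustrative names only.

```python
def effective_cost_ratio(price_ratio, token_multiplier):
    """Cost of finishing the same job relative to the pricier model.

    price_ratio: per-token price of the cheaper model divided by the
        pricier model's price (e.g. 0.2 for "20% of the cost").
    token_multiplier: how many times more tokens the cheaper model
        uses for the same task (hypothetical; measure it yourself).
    """
    return price_ratio * token_multiplier


def break_even_multiplier(price_ratio):
    """Token multiplier at which the per-token discount is used up."""
    return 1.0 / price_ratio


# Hypothetical: 20% of the per-token price, but 3x the tokens per task.
print(effective_cost_ratio(0.2, 3.0))
print(break_even_multiplier(0.2))
```

Under those assumed numbers the job costs roughly 60% as much rather than 20%, and the discount disappears entirely at 5x the tokens.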

by Sanzig 5 hours ago

Take a look at Ollama Cloud: https://ollama.com/pricing

You get access to a whole bunch of bleeding edge open models including GLM-5.2, Kimi K2.7, DeepSeek 4 Pro, etc. Inference is run on US/SG/EU cloud providers with zero data retention policies. The $20/mo tier is very generous, in my experience.

by jeremyjh 2 hours ago

They don’t have a statement about where the GLM-5.2 model is run or about data retention for it. They do state that for others, like MiniMax.

by Sanzig 1 hour ago

There's a blanket statement at the bottom of the pricing page, which I would hope also applies to GLM-5.2:

> Where are models hosted?

> Ollama hosts models and compute resources primarily in the United States. To serve global demand, we may route to Europe and Singapore for additional capacity.

> Is my prompt or response data trained on?

> Prompt or response data is never logged or trained on.

> Who does Ollama partner with to host models?

> Ollama collaborates with NVIDIA Cloud Providers (NCPs) to host open models.

> When Ollama partners with providers, we require no logging, no training, and zero data retention policies in place.