A bit off-topic, but I’m on the legacy Lite plan (now discontinued), and it’s more than enough for hobby projects. The main draw is the generous request-based quota (18k requests/month) rather than a token-based one.

This means a 100k-token request counts the same as a 100-token one. I’ve made about 8k requests in the last two weeks, averaging around 80k tokens per request. It feels like they’re subsidizing this just to gather data on agentic workflows.

On the downside, generation speed is mediocre (15–30 tokens/s for GLM-5), and I’ve seen the model glitch or produce broken output about 10 times out of those 8k requests.
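
For anyone who wants to sanity-check those numbers, here is a rough back-of-the-envelope sketch using only the figures quoted above. The 14-day window, the 30-day month, and the steady request rate are my assumptions, and the variable names are just illustrative.

```python
# Figures quoted in the comment above (all approximate)
requests_made = 8_000            # requests over roughly two weeks
avg_tokens_per_request = 80_000  # average tokens per request
monthly_quota = 18_000           # request-based quota on the Lite plan
glitches = 10                    # broken or glitched outputs observed

# Assumptions: "two weeks" is 14 days, a month is 30 days, usage stays steady
days_elapsed = 14
days_per_month = 30

projected_monthly = requests_made / days_elapsed * days_per_month
total_tokens = requests_made * avg_tokens_per_request
glitch_rate = glitches / requests_made

print(f"Projected requests/month: {projected_monthly:,.0f} of {monthly_quota:,}")
print(f"Tokens processed so far: {total_tokens / 1e6:,.0f}M")
print(f"Glitch rate: {glitch_rate:.3%}")
```

Under those assumptions the pace lands just under the monthly quota, at roughly 640M tokens in two weeks and a glitch rate of about one in 800 requests.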