Agreed. I use Z.ai and the usage is fantastic. The only thing that tempers that recommendation is that it's often unreliable. Perhaps a few times per week it's unresponsive, and maybe more often than that it seems to become flaky.

It's very variable, though. Recently I've noticed it's more reliable, but there was a patch where it was nearly unusable some days.

I guess I won't complain at that price, and YMMV.

Agreed. They had a rough patch around the 4.7 to 5 upgrade, since the new architecture required a hardware migration. The 5 to 5.1 upgrade was much smoother (same architecture, new weights). As you say, it's a little rough around the edges, but still great value. One trick I learned is that it's max 2 parallel requests per user. You can put a billion tokens a month through it, but you need to manage your parallelism.
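
For what it's worth, here is a minimal sketch of what "manage your parallelism" could look like on the client side, using an asyncio semaphore to keep at most two requests in flight. The names `run_limited` and `fake_request` are made up for illustration, `fake_request` is only a stand-in for a real API call, and the limit of 2 is just the figure from this comment, not something taken from Z.ai's documentation.

```python
import asyncio

# Assumption: the "max 2 parallel requests per user" figure quoted above.
MAX_PARALLEL = 2


async def run_limited(jobs, limit=MAX_PARALLEL):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job):
        # Each job waits here until one of the `limit` slots is free.
        async with sem:
            return await job()

    # gather() returns results in the same order as `jobs`.
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fake_request(i):
    # Hypothetical stand-in for a real API call.
    await asyncio.sleep(0.01)
    return i * 2


if __name__ == "__main__":
    jobs = [lambda i=i: fake_request(i) for i in range(6)]
    print(asyncio.run(run_limited(jobs)))
```

Queue up as many jobs as you like. The semaphore only ever lets two of them run at once, so the rest wait their turn instead of getting rejected.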
That's if you're OK with a model provider that goes down all the time and has such a poor inference engine setup that, once you get past 50k tokens, you're going to get stuck in endless reasoning loops.