Indeed, at 30 tok/s it should pause for 20 seconds while the "thinking" tokens are streaming (and hidden); that's the real experience.
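For what it's worth, here is a rough sketch of that behaviour, not how any particular site does it. The function names, the default values, and the injectable `sleep` parameter are all made up for illustration; the only numbers taken from the comment above are 30 tok/s and 20 seconds, which works out to roughly 600 hidden tokens before the first visible one.

```python
import time
from typing import Callable, Iterable, Iterator


def hidden_token_count(tokens_per_second: float, thinking_seconds: float) -> int:
    """Number of tokens generated (but not shown) during the thinking pause."""
    return round(tokens_per_second * thinking_seconds)


def stream_with_hidden_thinking(
    visible_tokens: Iterable[str],
    tokens_per_second: float = 30.0,   # assumed generation speed
    thinking_seconds: float = 20.0,    # assumed length of the hidden phase
    sleep: Callable[[float], None] = time.sleep,  # injectable so it can be faked
) -> Iterator[str]:
    """Yield visible tokens at a fixed rate after one long silent pause."""
    # Hidden phase: tokens are being produced, but the user sees nothing.
    sleep(thinking_seconds)
    # Visible phase: one token every 1 / tokens_per_second seconds.
    delay = 1.0 / tokens_per_second
    for token in visible_tokens:
        sleep(delay)
        yield token
```

Looping over `stream_with_hidden_thinking("Hello there".split())` and printing each token with `flush=True` gives the 20-second stare at a blank screen followed by the actual answer.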
You should check out https://tokey.ai. I made it a few months ago, and it has all of these suggestions.
Yes, it should use actual output from some of the open models.