Have you used GPT instant or mini yourself? I think it’s pretty cynical to assume that this is “good enough for most people”, even if they don’t know the difference between that and better models.
reply
Say more. Why do you think this?
reply
They're awful and hallucinate a lot. I couldn't imagine using them even for prompts about TV shows, let alone for serious work. To repeat the parent's question: have you tried them yourself? Even compared to ChatGPT Thinking, they're little short of useless.
reply
They're essentially replying based on vibes instead of grounding their responses in extensive web searches, which is what the paid models/configurations generally do. This makes them wrong more often than they're right for anything but the most trivial requests, the kind that can easily be answered from memorized training data.

This is all on top of the (to me) insufferable tone of the non-thinking models, but that might well be how most users prefer to be talked to, and whether that's how these models should accordingly talk is a much more nuanced question.

Regardless of that, everybody deserves correct answers, even users on the free tier. If this makes the free tier uneconomical to serve for hours on end per user per day, then I'd much rather they limit the number of turns than dial down the quality like that.

reply
Frontier models have much better knowledge and usually hallucinate less. It's not about the coding capabilities, it's about how much you can trust the model.
reply
re: trust-

Have you tried the free version of ChatGPT? It is positively appalling. It’s like GPT 3.5 but prompted to write three times as much as necessary to seem useful. I wonder how many people have embarrassed themselves, lost their jobs, and been critically misinformed. All easy with state-of-the-art models but seemingly a guarantee with the bottom sub-slop tier.

Is the average person just talking to it about their day or something?

reply
I use the free version of ChatGPT (without logging in) when I have a one-off question that doesn't need a huge context. Real-world prompt:

  "when hostapd initializes 80211 iface over nl80211, what attributes correspond to selected standard version like ax or be?"
It works fine and avoids falling into the trap set by the misleading question. It probably works even better for more popular technologies. Yeah, it has a higher failure rate, but that's not a dealbreaker for non-autonomous use cases.
reply
Even the paid version of ChatGPT tends to use 1,000 words when 10 will do.

You can try asking ChatGPT and Claude the same question and comparing the answers. I can guarantee you that the ChatGPT answer won't fit on a single screen on a 32" 4k monitor.

Claude's will.

reply
If someone blindly submits chatbot output they deserve to be embarrassed and fired. But I don't think that's going to improve.
reply
The free version of ChatGPT is insanely crippled, so that's not surprising.
reply