
It depends. If they're using a small/medium local model as a 1:1 ChatGPT replacement as-is, they'll have a bad time. Even ChatGPT refers to external services to get more data.

But a local model + good harness with a robust toolset will work for people more often than not.

The model itself doesn't need to know who the president of Zambia was in 1968, because it has a tool it can use to look that up on Wikipedia.
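To make the "model + harness + tool" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `stub_model` stands in for a real local model, `FAKE_WIKI` stands in for an actual Wikipedia lookup (online API or local dump), and the JSON tool-call shape is made up rather than taken from any particular framework.

```python
import json
from typing import Callable, Dict

# Hypothetical stand-in for Wikipedia: a real harness would query an online
# API or a local dump instead of this in-memory table.
FAKE_WIKI: Dict[str, str] = {
    "president of zambia 1968": "Kenneth Kaunda (in office 1964-1991)",
}


def wiki_lookup(query: str) -> str:
    """Return a snippet for the query, or a clear 'not found' marker."""
    return FAKE_WIKI.get(query.strip().lower(), "NOT_FOUND")


# Tool registry: the harness, not the model, is responsible for the facts.
TOOLS: Dict[str, Callable[[str], str]] = {"wiki_lookup": wiki_lookup}


def stub_model(prompt: str) -> str:
    """Placeholder for a small local model (assumption, not a real LLM).

    It asks for a lookup on any new question and phrases an answer once
    the harness hands it a tool result.
    """
    if prompt.startswith("TOOL_RESULT:"):
        return "Answer: " + prompt[len("TOOL_RESULT:"):].strip()
    return json.dumps({"tool": "wiki_lookup", "query": prompt})


def run_harness(question: str) -> str:
    """One round trip: model requests a tool, harness runs it, model answers."""
    reply = stub_model(question)
    try:
        call = json.loads(reply)
    except json.JSONDecodeError:
        return reply  # the model answered directly, no tool call needed
    if not isinstance(call, dict):
        return reply
    tool = TOOLS.get(call.get("tool", ""))
    if tool is None:
        return "Answer: unknown tool requested"
    result = tool(call.get("query", ""))
    return stub_model("TOOL_RESULT: " + result)


if __name__ == "__main__":
    print(run_harness("President of Zambia 1968"))
```

The point of the split is that the model only has to decide when to look something up and how to phrase the result; the quality of the answer then depends mostly on the tool behind the registry.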

by ZeroGravitas 13 hours ago

You can install the complete text of Wikipedia locally too.

These offline copies have usually been intended for e-reader/off-grid/post-zombie-apocalypse situations, but I'd guess someone is already working on an LLM-friendly way to install one.

It would be interesting to know the tradeoffs. The Tiananmen Square example suggests why you'd maybe want the factual knowledge to come from a separate source.

by zozbot234 12 hours ago

The Wikipedia folks are now working on a language-independent representation for their encyclopedic content, one that's intended to be rigorously compositional and semantics-aware, loosely comparable to Uniform Meaning Representation (UMR) as known in linguistics. If successful, it may end up interacting in very interesting ways with multi-language capable LLMs. Very early experiments (nowhere near as capable as UMR yet, but exercising the underlying software infrastructure) are at https://abstract.wikipedia.org, whilst a direct comparison of the projected design is given by https://commons.wikimedia.org/wiki/File:Abstract_Wikipedia_N... https://elemwala.toolforge.org/static/nlgsig-nov2025.html

by selcuka 16 hours ago

Any citations? Because that was my impression, too. I want frontier model performance for my coding assistant, but "most users" could do with smaller/faster models.

ChatGPT free falls back to GPT-5.2 Mini after a few interactions.

by lxgr 14 hours ago

Have you used GPT instant or mini yourself? I think it’s pretty cynical to assume that this is “good enough for most people”, even if they don’t know the difference between that and better models.

by throwaway27448 13 hours ago

Say more. Why do you think this?

by embedding-shape 11 hours ago

They're awful and hallucinate a lot; I couldn't imagine using them even for prompts about TV shows, let alone for serious work. Repeating the question from the parent: have you tried those yourself? Even compared to ChatGPT Thinking, they're nothing short of useless.

by lxgr 8 hours ago

They're essentially replying based on vibes, instead of grounding their responses in extensive web searches, which is what the paid models/configurations generally do. This makes them wrong more often than they're right for anything but the most trivial requests that can be easily responded to out of memorized training data.

This is all on top of the (to me) insufferable tone of the non-thinking models, but that might well be how most users prefer to be talked to, and whether that's how these models should accordingly talk is a much more nuanced question.

Regardless of that, everybody deserves correct answers, even users on the free tier. If this makes the free tier uneconomical to serve for hours on end per user per day, then I'd much rather they limit the number of turns than dial down the quality like that.

by asutekku 15 hours ago

Frontier models have much better knowledge and usually hallucinate less. It's not about the coding capabilities, it's about how much you can trust the model.

by Barbing 15 hours ago

re: trust-

Have you tried the free version of ChatGPT? It is positively appalling. It’s like GPT 3.5 but prompted to write three times as much as necessary to seem useful. I wonder how many people have embarrassed themselves, lost their jobs, and been critically misinformed. All easy with state-of-the-art models but seemingly a guarantee with the bottom sub-slop tier.

Is the average person just talking to it about their day or something?

by PhilipRoman 9 hours ago

I use the free version of ChatGPT (without logging in) when I need some one-off question without a huge context. Real world prompt:

  "when hostapd initializes 80211 iface over nl80211, what attributes correspond to selected standard version like ax or be?"

It works fine and avoids falling into the trap set by the misleading question. It probably works even better for more popular technologies. Yeah, it has a higher failure rate, but that's not a dealbreaker for non-autonomous use cases.

by theshrike79 13 hours ago

Even the paid version of ChatGPT tends to use 1,000 words when 10 will do.

You can try asking it and Claude the same question and compare the answers. I can guarantee you that the ChatGPT answer won't fit on a single screen on a 32" 4k monitor.

Claude's will.

by throwaway27448 13 hours ago

If someone blindly submits chatbot output they deserve to be embarrassed and fired. But I don't think that's going to improve.

by jychang 14 hours ago

The free version of ChatGPT is insanely crippled, so that's not surprising.

by helsinkiandrew 14 hours ago

> unfortunately, this is not the case

Most users are fixing grammar/spelling, summarising/converting/rewriting text, creating funny icons, and looking up simple facts. None of this comes anywhere near needing frontier model performance.

I've a feeling that if/when Apple release their onboard LLM/Siri improvements that can call out if needed, the vast majority of people will be happy with what they get for free that's running on their phone.

by drob518 10 hours ago

“You are the smartest high school student that has ever lived and on the college track to Harvard or another Ivy League school. Write a 10 page history term paper about Tiananmen Square and the specific events that took place there. Include a bibliography and use footnotes to cite sources.”

by 13 hours ago

deleted

by blitzar 13 hours ago

"Hey dingus, set timer for 30 minutes"

by cyanydeez 11 hours ago

eh, it's weird how the tech world wants to build trillions of data centers for... what, escaping the permanent underclass?