> they don't have a use-by date

For quite a lot of use cases, the current systems arguably do get worse over time if not continually updated. The knowledge cutoff date will start to hurt more and more as the weights age in a hypothetical scenario where you are stuck with them forever.

Coding, one of the most popular use cases today, would not be great if, say, it only understood Java up to a version from years ago, etc.

https://en.wikipedia.org/wiki/Knowledge_cutoff

reply
One solution is to not advance anything, of course. I'm not even joking: is there going to be a successor to React? I suspect not; with the vast amount of training data for React now, it's going to look silly to move to something else with less support. What was the last new popular programming language, Rust? Will there be another one? I suspect not, for the same reason. The irony of all this AI acceleration talk is that it'll work best if we don't accelerate the underlying tech at all.
reply
There probably won't be new stuff so much as trends in how stuff is done, and updates around optimizing those trends.
reply
Will programming languages evolve into less human-oriented written code and more just calls to a trusted AI?

Or will human-readable code be less and less of a thing as AI learns its own, more terse language to talk to other AIs?

reply
Yes. I am seeing a big push to use vanilla JS for single-file HTML apps that are easy to build, deploy, and distribute because they have no build step. I could see component libraries emerging that make it easier to build from chat interfaces with less ceremony.
reply
I'm not sure the trade-off in code readability is worth it as of now.
reply
Name/post content combo on point
reply
A lot of the language work is scratching the itch of engineers and developers. I think you’re correct, and React is the new COBOL.
reply
Humans are notoriously bad at predicting the future. On that note, your prediction is laughable. React as the end-all be-all of UI… lol
reply
Programmers won't be allowed to exist in the future. Vibe coding will be the only approach people can apply.
reply
Nobody is unaware of the knowledge cutoff, and sharing the Wikipedia article is not helping anyone. Your point is easily rebutted by taking whatever open-weights/open-source model has an outdated cutoff and training or fine-tuning it on more data, which is always going to be viable given a modicum of compute.
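
For the open-weights case that's a well-trodden path; a LoRA fine-tune over post-cutoff text is cheap enough to run on a single GPU. A minimal sketch using Hugging Face transformers/peft (the model name and data file are placeholders, not recommendations):

    # Minimal LoRA fine-tune of an open-weights model on post-cutoff text.
    # "some-open-weights-model" and new_docs.txt are placeholders.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    model_name = "some-open-weights-model"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Freeze the base weights; train only small LoRA adapter matrices.
    model = get_peft_model(
        model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

    # Any corpus of newer material: release notes, docs, changelogs.
    data = load_dataset("text", data_files={"train": "new_docs.txt"})
    tokenized = data.map(
        lambda ex: tokenizer(ex["text"], truncation=True),
        batched=True, remove_columns=["text"])

    Trainer(
        model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=1,
                               per_device_train_batch_size=1),
        train_dataset=tokenized["train"],
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    ).train()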
reply
You could learn how to code... a whole generation did it before...
reply
> Coding, one of the most popular use cases today, would not be great if, say, it only understood Java up to a version from years ago, etc.

This LLM, trained entirely on pre-1930s texts, was able to code Python programs when given only a short example:

https://talkie-lm.com/introducing-talkie

reply
I genuinely don't understand how this can possibly be a problem long-term.

It feels very obvious that the solution is to have a smaller model that can be trained exclusively on Java information to augment the older model (a rough sketch of the idea appears below). If the architecture doesn't support it currently, then that's what the architecture will look like in the future.

Otherwise you'd be arguing that, to serve users who want an up-to-date LLM on topic X, you have to retrain the model on the entire corpus all over again.

It's simply ludicrous to have a coding LLM that needs to be retrained on the latest published poems and pastry recipes to generate Java.
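
If adapters end up being the mechanism, "augmenting the older model" could be as simple as attaching a small domain adapter at load time. A sketch assuming peft-style adapters, with both checkpoint names hypothetical:

    # Attach a Java-only adapter to frozen base weights (names hypothetical).
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained("frozen-base-model")
    model = PeftModel.from_pretrained(base, "adapters/java-updates")
    # The base model's general knowledge is untouched; only the adapter
    # carries the post-cutoff Java knowledge.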

reply
Small models are more useful for "doing stuff" than "knowing stuff" to begin with. Add in an agentic harness and a small model can happily read more current information on demand (including from e.g. a local wikipedia snapshot).
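
For illustration, the "tool" half of that harness can be tiny. A sketch assuming the snapshot is a directory of plain-text articles (the layout and paths here are made up):

    # A lookup tool an agent loop could expose to a small local model,
    # so fresh facts come from a snapshot instead of baked-in weights.
    # wiki_snapshot/ with one .txt file per article is an assumption.
    from pathlib import Path

    WIKI_DIR = Path("wiki_snapshot")

    def lookup(topic: str, max_chars: int = 2000) -> str:
        """Return the start of the best-matching article as context."""
        hits = [p for p in WIKI_DIR.glob("*.txt")
                if topic.lower() in p.stem.lower()]
        if not hits:
            return f"No local article found for {topic!r}."
        return hits[0].read_text()[:max_chars]

    # The agent loop registers lookup() as a callable tool and feeds its
    # return value back into the model's context on each call.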
reply
Ha, yes. I used to think this was not a notable issue, but just today I was getting Qwen 3.5 to fix my network drivers and it immediately freaked out: "kernel 6.17, what the fuck? That doesn't exist yet!" It almost had a mental breakdown over that detail and derailed the conversation toward checking what's wrong with the kernel version reporting, lol.
reply
FOMO. A new model comes out weekly and the HN crowd debates the minutiae of the changes.

Pockets are too deep; it will only change once everyone is out of money.

reply
What is really amusing to me is how, N months ago, the latest SOTA was incredible, but is now utterly unusable. Feels like there is a model reality-distortion field in play where people can only acknowledge the flaws in retrospect.
reply
They’re really not good enough, unless you consider 64 GB of memory or more to be consumer-grade.
reply
I’m pretty happy with what a 32GB Mac Studio can do for a lot of tasks. They’re the things I’d throw a model like Haiku at, but still genuinely useful. We don’t have an answer to frontier models in the consumer range yet, but we’re not totally trapped.

Side note though, it’s the speed that bothers me more than the reasoning. Qwen 3.5 is awesome, but my Claude subscription can tear through similar workloads an order of magnitude faster than my local LLM can when using Haiku. That’ll matter a lot to some people.

reply
Yeah, this is the real killer. Slower and more expensive is tough.
reply
> They're good enough for 95% of use cases

They're not at all, not even close. Especially when you consider the use cases for people who are paying for LLM services today.

reply
Hardware. Frontier labs are driving up demand so much that it's priced significantly above cost, making it far less affordable. Just look at Nvidia's profit margins.
reply
The use cases in the future will be nothing like the use cases from today.
reply
Maybe. The use cases people primarily use LLMs for (documents, coding, design, research) existed decades ago with different tooling. Who knows if the future will have a slew of new problems that require new models or will continue to be similar?
reply
> What stops you from running the best open weighted LLMs currently available on consumer grade hardware for the rest of time?

Uh… the hardware requirements? And stop acting like some dog shit 8B model the average Joe can run on a laptop is even close to being comparable to what Claude or even Codex can currently do.

I have pretty good hardware and I’ve tinkered with the best sub-150B models you can use and they are awful compared to Anthropic/OAI/Grok.

reply
What if the harness and loops get sufficiently better, though? CC is using Haiku for code-base grepping and such; you don't see a local commodity model being "good enough" for the 80% case when matched with better harnesses and tool calls?

Honest question; I'm very interested in this, but too casual as of now to know any better.

reply
The vast majority of average users don't use LLMs for coding, and for those purposes, local LLMs with low parameter counts are a far cry from SOTA models.
reply
> And stop acting like some dog shit 8B model the average Joe can run on a laptop is even close to being comparable to what Claude or even Codex can currently do.

I'm not, you've actually illustrated my point. LLMs in 2022 were very impressive. By 2024 the general public was finding them an acceptable replacement for many research-driven tasks and massive shortcuts for other tasks (coding, image work, document preparation, etc.).

Those models are absolutely runnable on consumer hardware now, and we were extremely happy with the results. It's no different from how we used to think CRTs or early smartphones were amazing, but going back to them now they seem awful.

We're long past "danger". If what we have is the best we'll ever have open source, we're already in an excellent position.

reply
> LLMs in 2022 were very impressive.

No, they weren't. They were a gimmick; it is only in the past six or so months that frontier models have started to do stuff beyond mere gimmicks when it comes to coding, and you could make the argument that Mythos has been the first 'holy shit' moment that stepped us beyond 'yeah, that's really neat, but...'

> Those models are absolutely runnable on consumer hardware now,

A sub-50B model is awful and can't even write proper English sentences half the time, to say nothing of how bad its world knowledge is. Try the 32B Gemma 4 local model for a week, then go back to Claude, then get back to me.

> We're long past "danger". If what we have is the best we'll ever have open source, we're already in an excellent position.

Not sure what to tell you other than that you and I have very different standards. What we have locally right now is barely more than a glorified autocomplete, and it feels worse than using ChatGPT 2 years ago because the context window is smaller and it doesn't have good webhooks on consumer setups.

I'd also say you clearly have no clue what 'consumer hardware' means, or what consumers who can even get this stuff running locally would have to do to make it rival the frontier models in terms of usability (most consumers aren't going to boot into Ubuntu and run this thing from a command line), to say nothing of the hardware requirements. I'd love to never use Claude or Gemini or ChatGPT again, for both privacy and money reasons, but the gap in output quality and depth of thinking and writing ability between even the very best local models and the distributed frontier models is many orders of magnitude, and those 'very best' local models require a top-of-the-line machine that 99.9999% of consumers don't have and would never consider buying. The cloud models all have like a trillion(!) parameters now. It isn't even close.

I sure hope the local side of things massively improves over the next 2-3 years, but based on how this has gone, my guess is that in 3 years you'll be lucky, if you have very top-of-the-line hardware, to get the benchmark performance we had 6 months ago with the frontier models. The distributed hardware/memory gap is just too big.

reply
95% of use cases? What are you smoking?
reply
There are very good open-weight models (such as DeepSeek v4 Flash) that can run on consumer-level hardware.

Note that we are talking about 95% of everyone's use cases, not your specific use cases (which could require better models all the time).

reply