Do you find the results vary based on whether it uses RAG to hit the internet vs the data being in the weights itself? I'm not sure I've really noticed a difference, but I don't often prompt about current events or anything.
reply
I've noticed that LLMs aren't familiar with many recent technologies because of the knowledge cutoff, so those technologies might not show up in recommendations even when they're a better match for the request.
reply
Oh, that's a good point, yeah.

If I told it I'm shopping for a budget-level Mac, it may not recommend the Neo. I'm sure software moves even faster, too: as more code is 'written' blindly, new stacks may never see adoption.

reply
deleted
reply
What's surprising about that? Most of the minor version updates from all the labs are post-training updates that don't change the knowledge cutoff.
reply
Thanks for letting me know; I'll be waiting for the major update.
reply
It's been like this since GPT-3.5. It isn't really a limitation; it's generally considered a natural outcome of the training process.

So there's no major update in the sense you might be thinking of. Most of the time there isn't even an announcement when (or if) the training cutoff is updated; it's just another line in the release notes.

A six-month lag seems to be the standard across frontier models.

reply
I've actually started worrying that the amount of false data LLMs produce on the public internet might lead to the knowledge cutoff becoming permanently (and silently) frozen. As in, we can't trust data after 2025 because it would poison training data at scale, and models will only cover major events without capturing the finer details.
reply
I agree. That's why you should write as much as you can now if you want to get it into the LLMs (https://gwern.net/blog/2024/writing-online). You never know when the window will slam shut and LLM training will go 'hermetic', with labs focusing on 'civilization in a datacenter' where only extremely vetted, whitelisted data gets included in the 'seed' and everything is reconstructed from scratch for its training value & safety.
reply