I did chuckle at the 100% Rust Linux kernel. I like Rust, but that felt like a clever joke by the AI.
You know what I'd really like, that would justify a version bump? CRDT. Automatically syncing local changes to a remote service, so e.g. an Android app could store data locally on SQLite, but the user could also log into a web site on their desktop and all the data is right there. The remote service need not be SQLite - in fact I'd prefer postgres. The service would also have to merge databases from all users into a single database... Or should I actually use postgres for authorisation but open each user's data in a replicated SQLite file? This is such a common issue, I'm surprised there isn't a canonical solution yet.
Even a product that does this behind the scenes, by wrapping SQLite and exposing SQLite's wrapped interface, would be great. I'd pay for that.
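To make the idea concrete, here's a rough sketch of the simplest version I can think of: a per-row last-writer-wins merge between two SQLite databases. Everything in it is made up for illustration (the `notes` table, the `updated_at` and `device` columns, the `merge_row` and `sync` helpers), it uses two in-memory SQLite databases in place of a real phone and server, and it ignores deletes and clock skew entirely. A real CRDT library would do a lot more.

```python
import sqlite3

# Hypothetical schema: one row per note, stamped with a logical time
# and the device that wrote it (used only as a tie-breaker).
SCHEMA = """
CREATE TABLE notes (
    id TEXT PRIMARY KEY,
    body TEXT,
    updated_at INTEGER,
    device TEXT
)
"""

def make_db():
    """Create an in-memory database standing in for a phone or a server."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db

def merge_row(db, row):
    """Apply a remote row only if it is newer than the local one.

    Comparing (updated_at, device) tuples gives a total order, so every
    replica picks the same winner regardless of merge order.
    """
    row_id, _body, updated_at, device = row
    local = db.execute(
        "SELECT updated_at, device FROM notes WHERE id = ?", (row_id,)
    ).fetchone()
    if local is None or (updated_at, device) > local:
        db.execute("INSERT OR REPLACE INTO notes VALUES (?, ?, ?, ?)", row)

def sync(a, b):
    """Exchange all rows in both directions (no change tracking, just a sketch)."""
    for src, dst in ((a, b), (b, a)):
        rows = src.execute(
            "SELECT id, body, updated_at, device FROM notes"
        ).fetchall()
        for row in rows:
            merge_row(dst, row)

def dump(db):
    """Return the whole table in a stable order, for comparing replicas."""
    return db.execute(
        "SELECT id, body, updated_at, device FROM notes ORDER BY id"
    ).fetchall()

# Usage: both sides edit note n1 while offline, the phone also adds n2.
phone, server = make_db(), make_db()
merge_row(phone, ("n1", "from phone", 1, "phone"))
merge_row(phone, ("n2", "only on phone", 1, "phone"))
merge_row(server, ("n1", "from desktop", 2, "desktop"))
sync(phone, server)
```

After `sync`, both databases hold the same rows, and the later edit to `n1` wins on both sides. Shipping the whole table every time obviously doesn't scale, and "last writer wins" silently drops the losing edit, which is exactly the part I'd happily pay someone else to get right.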
Usually my memory regarding such things is quite good, but this one I keep forgetting, so much so that I don't remember what the issue is actually about xD
Beautifully self-serving while being a benefit to others.
Same thing with picking nails up in the road to prevent my/everyone’s flat tire.
See other comment where OP shared the prompt. They included a current copy of the front page for context. So it’s not so surprising that ziggy42 for example is in the generated page.
And for other usernames that are real but not currently on the home page, the LLM definitely has plenty of occurrences of HN comments and stories in its training data, so it’s not really surprising that it is able to include real usernames of people who post a lot. Their names will be occurring over and over in the training data.
edit: It looks like it probably is a thing, given it does sometimes output names like that. So the pattern is probably just so rare in the training data that the LLM almost always prefers to use actual separators like underscore.
lower|case|un|se|parated|name
- IBM to acquire OpenAI (Rumor) (bloomberg.com)
- Jepsen: NATS 4.2 (Still losing messages?) (jepsen.io)
- AI progress is stalling. Human equivalence was a mirage (garymarcus.com)
(Especially in datasets before this year?)
I’d bet half or more - but I’m not checking.
The thing is, most of the models were heavily post-trained to limit this...