(eg: "Cite?")
RLHF = Reinforcement Learning from Human Feedback
https://en.wikipedia.org/wiki/Reinforcement_learning_from_hu...
I'm still waiting for models based on the curt and abrasive stereotype of Eastern European programmers, as contrast to the sickeningly cheerful AIs we have today that couldn't sound more West Coast if they tried.
The syncopath style is clearly categorized as more liberal (do what you feel is good).
RLHF is "ask a human to score lots of LLM answers". So the claim is that the AI companies are hiring cheap (~poor) people from convenient locations (CA, since that's where the rest of the company is).
If you adjust your mindset slightly when searching online, it's not hard to find communities of people looking for quick side work and this was huge during the covid lockdown era. There were people helping train LLMs for all kinds of purposes from education to customer service. Those startups quickly cashed out a few years ago and sold to the big players we have now.
I don't get why this is hard for people to believe (or remember)?
This sounds like something Elon would say to make Grok seem "totally more amazeballs," except "anti-woke" Grok suffers from the same behavior