There are journalists being hired to write Atlantic-worthy articles that exist only as LLM training data, because they're getting paid more than the Atlantic would pay them for it.
It's insane.
Yes, they are hiring the experts themselves. To create new knowledge above and beyond what's on the internet. To be locked away as LLM training data.
The defining characteristic of all this new data is that it is targeted at LLMs' weak points.
It's not just more data, it's custom tutorials built for what LLMs struggle with.
1) Identify the gaps
2) Determine how to fix them
3) Implement a fix (especially if that fix is: identify and find experts)
4) And judge the result
How do they know [person] is an expert in [some field]? How do they find that person? How many experts are necessary to give the right information? How do we evaluate the results, especially if it's novel?
You can find a lot of people who disagree on many topics, and those turtles go all the way down.
I don't disagree that your work will help reduce hallucinations and improve model performance! It will.
I predict (I hope I'm wrong!) that we're going to hit some asymptote that is not at 0% hallucinations (and I would even put a substantial nonzero probability on the "overall" hallucination rate bottoming out at some minimum and then slowly growing because we just can't keep up with the new garbage we throw at it).
You just stumbled upon billion-dollar businesses: Mercor, micro1, Scale AI, Surge AI, etc.
They have a PhD from a top school, they are a licensed attorney, they are a licensed physician, a board-certified cardiologist, etc.
They are constantly recruiting from these populations with well-paying side gigs.
> 4) And judge the result
That's what they pay the experts for, and they have other experts peer-review that work.
> You can find a lot of people who disagree on many topics, and those turtles go all the way down.
Which is why everything has to be well-calibrated and not just a hot take: a well-reasoned opinion any expert would find fair.
No one really cares about hallucinations on point facts these days, though; it is much more about complex reasoning tasks. Can they raise the bar on the complexity of software LLMs can build on their own? Can they get to a point where LLMs can begin to replace physicians? Financial advisors? Actuaries? Etc.
I wonder if extracting those static reasoning chains makes sense given Rich Sutton's "The Bitter Lesson" and Geoffrey Hinton's "People should stop training radiologists now." I guess as long as participants make money they won't stop, though I'm not sure they do; so far it is more about the expectation of profitability, as I understand it.
Given exposure to enough reasoning chains, training data that is designed around adversarial reasoning might be key to teaching models to reason beyond what they could gather from static data.
The boundary is pretty thin there though. E.g., Gemini recently told me that a certain paper claims that two frameworks are mathematically equivalent, while the paper shows the opposite, and yesterday Google's AI overview told me that no World Cup matches were scheduled for that day despite there being several of them. The model probably used complex reasoning to arrive at both (incorrect) answers, but superficially they look like basic errors of fact.
You write the prompt, then write rubrics to judge the responses, and you find something the model fails at. Congratulations, you just earned $500. Now do it again.
But be careful: they are watching you and they don't want you giving away their secrets!
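For anyone curious what "write a prompt, write rubrics, find a failure" might look like mechanically, here is a minimal sketch. Everything in it is an assumption for illustration: the names `RubricItem`, `score_response`, and `is_model_failure`, the weights, and the 0.8 pass threshold are all hypothetical, and the keyword checks are only a stand-in for the human expert (or grader model) who would actually judge each criterion. It is not how any particular vendor does it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RubricItem:
    """One criterion an expert writes for judging a response (hypothetical shape)."""
    description: str
    check: Callable[[str], bool]  # stand-in for a human or grader-model judgment
    weight: float = 1.0


def score_response(response: str, rubric: List[RubricItem]) -> float:
    """Return the weighted fraction of rubric items the response satisfies."""
    total = sum(item.weight for item in rubric)
    if total <= 0:
        raise ValueError("rubric must have positive total weight")
    earned = sum(item.weight for item in rubric if item.check(response))
    return earned / total


def is_model_failure(response: str, rubric: List[RubricItem],
                     pass_threshold: float = 0.8) -> bool:
    """A response counts as a failure if it scores below the (assumed) threshold."""
    return score_response(response, rubric) < pass_threshold


# Toy rubric for a prompt like "Does the paper claim the two frameworks are equivalent?"
rubric = [
    RubricItem("States that the frameworks are not equivalent",
               lambda r: "not equivalent" in r.lower(), weight=2.0),
    RubricItem("Points to where the paper shows this",
               lambda r: "section" in r.lower()),
]

bad = "The paper claims the two frameworks are mathematically equivalent."
good = "The paper shows the frameworks are not equivalent (see Section 3)."

print(score_response(bad, rubric), is_model_failure(bad, rubric))
print(score_response(good, rubric), is_model_failure(good, rubric))
```

The point of weighting is that getting the core claim wrong should sink the response even if the peripheral criteria pass; in practice the hard part is writing criteria any expert would find fair, not the arithmetic.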
2. What criteria do such vendors typically require?
"As a side gig, I write novel software that solves problems no existing software does,"
and
"Yes, they are hiring the experts themselves. To create new knowledge above and beyond what's on the internet. To be locked away as LLM training data."
More likely you're joking and/or paranoid! 8-))
This is actually really easy to do if you step out of web/GUI/CRUD and into something where you will almost never find public code because it's a trade secret. For example, manufacturing.
Anyone writing software for long enough has a long list of these things in the back of their head that are great fodder for LLM training data.