Hacker News
by wwind123 13 hours ago
by ben_w 12 hours ago
We could call this "reinforcement learning from human feedback" (RLHF) :)
https://en.wikipedia.org/wiki/Reinforcement_learning_from_hu...