Hacker News
new
past
comments
ask
show
jobs
points
by
SubiculumCode
5 hours ago
|
comments
by
fridder
5 hours ago
|
[-]
Yeah. I usually do this by telling it to be adversarial and find gaps and holes. Not fool proof but it does seem to increase the quality. It has helped when using local models in particular.
reply
by
SubiculumCode
4 hours ago
|
parent
|
[-]
Yeah, you have to shortcut the RL-trained people pleasing
reply