Hacker News
by anilgulecha, 3 hours ago
kostaj, 3 hours ago:
Will add a human-labelled expected response and measure against it in follow-up research. This one only captures the disagreement between the models, not which model is right or wrong.