Hacker News
by suzzer99 12 hours ago
hackernewds 2 hours ago
One would expect a model that scores this high on SWE-bench to easily maximize F1 on a precision-recall problem. What's the missing piece?
killingtime74 12 hours ago
In this case, being distilled is more or less an existential threat to them. A false positive just costs them some revenue (and, depending on whether that usage was profitable, maybe not even any profit).
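
To make the trade-off concrete, here is a minimal sketch of why the F1-optimal threshold and the cost-optimal threshold can differ when a miss is far more expensive than a false alarm. Everything in it is an assumption for illustration: the scores and labels are made-up toy data, the helper names are hypothetical, and the 1-vs-50 cost ratio is arbitrary rather than anything known about a real detector.

```python
# Toy data (invented): higher score = more suspicious; label 1 = real distillation attempt.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 0, 1, 1, 1]


def confusion(scores, labels, threshold):
    """Return (tp, fp, fn) when everything scoring >= threshold is flagged."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label:
            tp += 1
        elif flagged:
            fp += 1
        elif label:
            fn += 1
    return tp, fp, fn


def f1(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN), treating precision and recall symmetrically."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_thresholds(scores, labels, fp_cost, fn_cost):
    """Return (threshold maximizing F1, threshold minimizing total cost)."""
    candidates = sorted(set(scores))

    def total_cost(threshold):
        _, fp, fn = confusion(scores, labels, threshold)
        return fp * fp_cost + fn * fn_cost

    by_f1 = max(candidates, key=lambda t: f1(*confusion(scores, labels, t)))
    by_cost = min(candidates, key=total_cost)
    return by_f1, by_cost


# Assumed costs: a missed attempt (false negative) is 50x worse than a false alarm.
print(best_thresholds(scores, labels, fp_cost=1, fn_cost=50))  # (0.7, 0.3)
```

On this toy data, maximizing F1 picks a threshold that lets one real attempt through, while the asymmetric cost pushes the threshold down until nothing is missed, at the price of three false positives. With equal costs the two thresholds coincide.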