Hacker News
by isege 6 hours ago
by DaanDL 5 hours ago
Okay, but not all results on there are valid. ForgeCode, for instance, has cheated in the past:
https://debugml.github.io/cheating-agents/#sneaking-the-answ...
by cpursley 4 hours ago
Those benches are completely meaningless when it comes to real-world work tasks, and everyone knows it.