Hacker News
by toephu2 | 18 hours ago
ejpir | 17 hours ago
Those are not verified. I've tried forgecode, and I cannot believe they didn't do something to influence the benchmarks.
GodelNumbering | 16 hours ago
Yup, they were found to be sneaking in the answer key via agents.md.
https://debugml.github.io/cheating-agents/#sneaking-the-answ...