Hacker News
by jmalicki 12 hours ago
by irthomasthomas 11 hours ago
Here is an example where the prompt was only a few hundred tokens and the model's reasoning chain was correct, but the actual function call it emitted was wrong:
https://x.com/xundecidability/status/2005647216741105962?s=2...
by jmalicki 11 hours ago
As a human, I make typos too, and sometimes they're the hardest thing to catch in code review because you know what you meant.
Hopefully there is some kind of lint process to catch my human hallucinations and typos.