Not true at all with frontier models from the last six months or so. Today's frontier models produce better code than 90% of junior to mid-level human developers.
You say that, but it's been better than most employees for a year or so (for specific tasks, of course; it's still not better than "an employee").
Just like a real employee!
And just like a real employee, this makes it work worse.

(Old study, I wonder if it holds up on newer models? https://arxiv.org/pdf/2402.14531)

Interesting. I've actually found that swearing at the dumbass bots gives better results, though it might just be the catharsis of telling it it's a dumbass.