Even Gemini with no memory does hilarious things. Like, if you ask it how heavy the average man is, you usually get the right answer but occasionally you get a table that says:

- 20-29: 190 pounds

- 30-39: 375 pounds

- 40-49: 750 pounds

- 50-59: 4900 pounds

Yet somehow people believe LLMs are on the cusp of replacing mathematicians, traders, lawyers and whatnot. At least for code you can write tests, but even then, how are you gonna trust something that can casually make such obvious mistakes?

by drnick15 hours ago|

> how are you gonna trust something that can casually make such obvious mistakes?

In many cases, a human can review the content generated, and still save a huge amount of time. LLMs are incredibly good at generating contracts, random business emails, and doing pointless homework for students.

by gf0001 hours ago|

And humans are incredibly bad at "skimming through this long text to check for errors", so this is not a happy pairing.

As for the homework, there is obviously a huge category that is pointless. But it should not be that way: the fundamental idea behind homework is sound, and the only way to learn something properly is by doing exercises and thinking it through yourself.

by nickjj10 hours ago|

Yeah, ChatGPT's paid version is wildly inaccurate on very important and very basic things. I never got on board with AI to begin with, but nowadays I don't even load it unless I'm really stuck on something programming-related.

by dyauspitr12 hours ago|

So what? That might happen one out of 100 times. Even if it’s 1 in 10, who cares? Math is verifiable. You’ve just saved yourself weeks or months of work.

by icedchai11 hours ago|

You don't think these errors compound? Generated code has hundreds of little decisions. Yes, it "usually" works.

by russfink8 hours ago|

LLMs: sometimes wrong but never in doubt.

by dyauspitr10 hours ago|

Not in my experience. With a proper TDD framework it does better than most programmers at a company, who anecdotally have a bug every 2-3 tasks.

by tranceylc7 hours ago|

The kinds of mistakes it makes are usually strange and inhuman, though. Like getting hard parts correct while also getting something fundamental about the same problem wrong. And not in the “easy to miss or type wrong” way.

I wish I had an example saved for you, but it happens to me pretty frequently. Not only that, but it also usually does testing incorrectly at a fundamental level, or builds tests around incorrect assumptions.

by coldtea6 hours ago|

Yes, just use random results. You’ve just saved yourself weeks or months of work of gathering actual results.