> Claude is a phd level mathematician

Unfortunately, it is not, and many of its attempts at mathematical proofs have major flaws. You shouldn't trust its proofs unless you are already able to evaluate them--which I think is pretty much all the OP is saying.

To be fair, many proof attempts by mathematicians also have major flaws. Most get caught before publication.

But that's the important difference. Mathematicians have the tools and processes to catch the flaws; random people using Claude don't.

Trust isn’t binary: I can trust things I don’t fully understand well enough to use them. OP was talking about needing to understand, which is a much higher bar than being able to validate something well enough to use it for a task.

I definitely wouldn't put math I didn't understand in my code just because Claude says so. I'm not surprised that everyone agreed; that's why shit is going to hit the fan pretty badly pretty soon due to AI coding.

There is one exception to this: If the AI also delivers the proof of why the math is correct, in a machine-checked format, and I understand the correctness theorem (not necessarily its proof). Then I would use it without hesitation.
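
To make that concrete, here is a minimal Lean 4 sketch of the split between "theorem I have to understand" and "proof I only need the checker to accept". The function `double` and the theorem name `double_correct` are made up for illustration, and I'm assuming a recent Lean 4 toolchain where the `omega` tactic is available; real code would need a much richer specification.

```lean
-- Hypothetical function under review (imagine it was AI-generated).
def double (n : Nat) : Nat := n + n

-- The correctness theorem. This statement is the part a reviewer
-- has to read and understand.
theorem double_correct (n : Nat) : double n = 2 * n := by
  -- The proof only has to be accepted by the checker;
  -- the reviewer does not need to follow it.
  unfold double
  omega
```

If the theorem statement says what I actually need, and Lean accepts the file, I don't have to audit the proof steps. The risk moves to whether the theorem states the right property.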

I always found it weird, when helping people with Excel formulas, how few people even try to check maths they don't understand, let alone try to understand it.

I struggle to remember even relatively simple maths like working out "what percentage of X is Y", so when I write a formula like that I put in some simple values like 12 and 6, or 10,000 and 2,456, just to confirm I haven't got the values backwards or something. I've been shown sheets where someone put in a formula they didn't understand, checked it with numbers they couldn't easily eyeball, and just assumed it was right because the result was roughly in their ballpark, or because they had no idea what the end result should be.

Then again I've also seen sheets where a 10% discount column always had a larger number than the standard price so even obviously wrong things aren't always checked.
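
The kind of sanity check described above could be sketched in Python like this. The helper names `percent_of` and `discounted_price` are invented for the example (they are not from any spreadsheet or library); the point is only that easy-to-eyeball inputs catch swapped arguments, and a simple invariant catches the "discount is bigger than the price" case.

```python
def percent_of(part: float, whole: float) -> float:
    """Return what percentage of `whole` is `part` (hypothetical helper)."""
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return part / whole * 100


def discounted_price(price: float, discount_percent: float) -> float:
    """Return `price` reduced by `discount_percent` percent (hypothetical helper)."""
    return price * (1 - discount_percent / 100)


# Easy-to-eyeball values: 6 is obviously half of 12.
assert percent_of(6, 12) == 50.0
# If the arguments were swapped, the answer would be 200, not 50.
assert percent_of(12, 6) == 200.0
# The larger pair from the comment: 2,456 out of 10,000 is about 24.56%.
assert abs(percent_of(2456, 10000) - 24.56) < 1e-9

# Invariant: a positive discount must never raise the price.
for price in (1.0, 19.99, 100.0, 2500.0):
    assert discounted_price(price, 10) < price
```

None of this proves the formula is right, but it is cheap and it fails loudly on exactly the mistakes described here.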

I don't disagree, but whoever has never put math they don't fully understand in their code gets to cast the first stone.

I've reached solutions by trial and error too, and tried to rationalize them later, quite a few times. And it's easier to rationalize a working solution, however adversarial you claim to be in your rationalization.

I don't see using gen AI for the (not so) “brute force” exploration of the solution space as that different from trial and error followed by post hoc rationalization.

How did you test that the solution is correct? Is the set of possible inputs a low-ish finite number?

Normally with mathematical problems you have to prove the solution correct. Testing is not sufficient, unless you can test all possible inputs exhaustively.
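
For the finite case, exhaustive testing can look like the sketch below. It checks a well-known bit trick for "is this a power of two" against a slow, obvious reference over every 16-bit unsigned value. The function names and the 16-bit bound are assumptions chosen for the example; the check says nothing about inputs outside that range.

```python
def is_power_of_two_fast(n: int) -> bool:
    """Bit trick under test: a power of two has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def is_power_of_two_reference(n: int) -> bool:
    """Slow but obviously correct reference: divide by 2 until odd."""
    if n <= 0:
        return False
    while n % 2 == 0:
        n //= 2
    return n == 1


def check_exhaustively(bits: int = 16) -> int:
    """Compare both implementations on every value in [0, 2**bits)."""
    checked = 0
    for n in range(2 ** bits):
        assert is_power_of_two_fast(n) == is_power_of_two_reference(n), n
        checked += 1
    return checked


if __name__ == "__main__":
    print(f"checked {check_exhaustively()} inputs, no mismatches")
```

This only works because the input set is small enough to enumerate. For unbounded integers or floating-point ranges, the loop cannot cover everything and a proof is needed instead.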

How do you know what it spat out is correct though?

If it’s beyond our ability to review and we blindly trust it’s correct based on a limited set of tests… we’re asking for trouble.

> Claude is a phd level mathematician, I am not

I’m going to guess that this is Gell-Mann amnesia more than anything, and it’s going to get a lot of organizations into a lot of weird places.

> Claude is a phd level mathematician

... that can't even count.

You do realize you can ask Claude about the things you don't understand?

"PhD level" just means you finished a bachelor's and a master's degree and are now doing a bit of original research as an employed research assistant.

Claude isn't "PhD level" anything. Calling it that shows a complete lack of understanding. Claude has read every single textbook in existence, so it can surface knowledge locked away in book chapters that people haven't read in years (nobody really reads those dense books on niche topics from start to finish).

Since Claude has infinite patience, you can just keep asking until you get it.
