Of course, once LLMs are really good at that, they can be set loose on the entire historical math literature, all 3.5M papers worth. And then LLMs can be trained on those formalizations (the ones that turn out, upon attempted formalization, to have been correct).

How good do you think AI will be at proving new results given that training set?

Math is going to change, and change massively. There's a lot of whistling past the graveyard going on from those who are frightened by this prospect.

reply
Exactly this. LLMs really aren't built for discovering new mathematics, especially _interesting_ new mathematics. They're built to try the most obvious patterns. When that works, it's pretty much by definition not interesting.

What LLMs are good at is organizing concepts, filling in detail, and remembering to check corner cases. So their use should help mathematicians to get a better handle on what's terra firma and what's still exploration. Which is great. Proof by it-convinced-other-mathematicians doesn't have a flawless track record. Sometimes major theorems turn out to be wrong or wrong-as-stated. Sometimes they're right, but there's never been a complete or completely correct proof in the literature. The latter case is actually quite common, and formal proof is just what's needed.
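To make "formal proof" concrete: in an interactive theorem prover, the kernel mechanically checks every step, so a claim is either machine-verified or rejected — there is no "it convinced other mathematicians" middle ground. A minimal Lean 4 sketch (using only the core library, nothing Mathlib-specific):

```lean
-- The statement and proof are both checked by Lean's kernel.
-- If the proof term didn't actually establish the statement,
-- the file simply wouldn't compile.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Formalizing a real research paper is this, scaled up enormously: every lemma, every "clearly", every corner case has to be spelled out until the kernel accepts it, which is exactly where gaps in published proofs get exposed.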

reply
LLMs and interactive theorem provers are vastly different. There are AI models that come up with workable formal proofs for ITPs, but these aren't your usual frontier models; they're specifically trained for this task.
reply
Sure, ITPs are far older than LLMs, but that's a pedantic distraction. What everyone is talking about here (both the comments and the article) is ITPs enriched with LLMs to make "smart" proof assistants. The LLMs used with ITPs are not vastly different from the usual chatbots and coding assistants: a different reinforcement learning problem, but no fundamental change in architecture.
reply