undefined

upvote

points

by vatsachak6 hours ago |

upvote

by trostaft5 hours ago|

[-]

> A difficult part was constructing a chess board on which to play math (Lean). Now it's just pattern recognition and computation.

However, this was not verified in Lean. This was purely plain language in and out. I think, in many ways, this is a quite exciting demonstration of exactly the opposite of the point you're making. Verification comes in when you want to offload checking proofs to computers as well. As it stands, this proof was hand-verified by a group of mathematicians in the field.

reply

upvote

by vatsachak4 hours ago|

[-]

Yeah, but I wouldn't be surprised if they train the model on verification assisted by Lean.

reply

upvote

by trostaft4 hours ago|

[-]

Arguing similarly to how stockfish, the chess engine, trains I would not be surprised if this is more common in the future. I don't know if they use any proof verification tools during their reinforcement learning procedure right now, as far as I know they've been focusing more on COT based strategies (w/o Lean). But I'm hardly an LLM expert, I don't know.

reply

upvote

by vatsachak2 minutes ago|

[-]

They most definitely threw in rl with formal verification somewhere between GPT 4 and now. The models are better at not hallucinating. I don't think their IMO team are only show ponies...

reply

upvote

by ComplexSystems4 hours ago|

[-]

That may be true for now, but it seems clear enough that letting the model use Lean in its internal reasoning process would be a great idea

reply

upvote

by trostaft4 hours ago|

[-]

That I'd agree with! I really need to get around to learning Lean myself. It might be interesting to try and formalize some missing theoretical pieces from my field (or likely start smaller).

reply

upvote

by Terr_5 hours ago|

[-]

> manage a McDonald's

Dystopia vibes from the fictional "Manna" management system [0] used at a hamburger franchise, which involved a lot of "reverse centaur" automation.

> At any given moment Manna had a list of things that it needed to do. There were orders coming in from the cash registers, so Manna directed employees to prepare those meals. There were also toilets to be scrubbed on a regular basis, floors to mop, tables to wipe, sidewalks to sweep, buns to defrost, inventory to rotate, windows to wash and so on. Manna kept track of the hundreds of tasks that needed to get done, and assigned each task to an employee one at a time. [...]

> At the end of the shift Manna always said the same thing. “You are done for today. Thank you for your help.” Then you took off your headset and put it back on the rack to recharge. The first few minutes off the headset were always disorienting — there had been this voice in your head telling you exactly what to do in minute detail for six or eight hours. You had to turn your brain back on to get out of the restaurant.

[0] https://en.wikipedia.org/wiki/Manna_(novel)

reply

upvote

by kmeisthax5 hours ago|

[-]

Casual reminder that the author's proposed solution to the labor-automation dystopia is to invent a second identity-verification dystopia. Also casual reminder that the author wanted the death penalty to anyone over the age of 65.

reply

upvote

by embedding-shape2 hours ago|

[-]

I was curious about this book but now you've absolutely sold me on it, sounds like I'm in for a ride!

reply

upvote

by Lerc5 hours ago|

[-]

I disagree. It will be able to perform work deserving if a fields medal before it is capable of running a McDonalds. I think it will be running a McDonalds well before either of those things happen, and a fields medal long after both have happened.

reply

upvote

by c7b5 hours ago|

[-]

One could hardly ask for a task better suited for LLMs than producing math in Lean. Running a restaurant is so much fuzzier, from the definition of what it even means to the relation of inputs to outputs and evaluating success.

reply

upvote

by vatsachak4 hours ago|

[-]

Not necessarily. Obviously playing Kasparov on the board requires more planning ability than managing a McDonald's but look at where chess bots are now.

There's much more to being human than our "cognitive abilities"

reply

upvote

by baq4 hours ago|

[-]

Conjecture: the first AI to successfully manage a McDonald’s will be a Gemini.

reply

upvote

by edbaskerville5 hours ago|

[-]

I just visited a McDonald's for the first time in a while. The self-order kiosk UI is quite bad. I think this is evidence in favor of the idea that an incompetent AI will soon be incompetently running a McDonald's.

reply

upvote

by jldugger1 hours ago|

[-]

>The self-order kiosk UI is quite bad.

Most repeat customers use the app, which sports the digital equivalent of a loyalty program, and various coupons. And lets you save your 'usual' order with customizations etc. Plus the annoying push notifications for FreeFrydays or whatever. And upsells, new product launches, etc.

My recollection is that the kiosk is just a weak facsimile of the app. And wasn't terrible, but everyone's standards vary.

reply

upvote

by Silamoth5 hours ago|

[-]

Out of curiosity, what issue did you have with the McDonald’s self-order kiosk? I actually think McDonald’s has the best kiosk I’ve ever encountered. The little animation that plays when you add an item to your cart is a little annoying (but I think they’ve sped that up). But otherwise, it’s everything I’d want. It shows you all the items, tells you every ingredient, and lets you add or remove ingredients. I have a better experience ordering through the kiosk than I do talking to a cashier.

reply

upvote

by ndiddy4 hours ago|

[-]

It takes longer than ordering with a cashier, it keeps trying to upsell you, and it's always out of receipt paper because unsurprisingly the company that isn't willing to pay a person to take orders is also not willing to pay a person to maintain the kiosks.

reply

upvote

by Sohcahtoa822 hours ago|

[-]

> It takes longer than ordering with a cashier

Depends on what you're ordering and who the cashier is.

If your order is the happy path of no customizations of a combo with an experienced cashier, it can be done in seconds, for sure. "Medium #4 with a Diet Coke", pay, done.

But if you customize your burger or ordering a lot of items a la carte and you're dealing with a new cashier that has weak English skills, good fucking luck. You'll likely need to wait for them to figure out they need to call someone over to help, have to repeat your order, and you end up spending far more time.

> it keeps trying to upsell you

Yeah, I'll agree that's obnoxious, especially when it's trying to upsell you something that's already on your order. I ordered a combo. I don't need you to add another fry.

reply

upvote

by Silamoth3 hours ago|

[-]

Hmm. I’ve never really had those issues. It’s also much faster and easier than ordering with a human. I guess it does try to upsell you, but humans often do, too. And to me, it’s worth it to just click “No” in exchange for the added convenience (mostly in getting my order right).

I have had them run out of receipts, but it’s never mattered for me. If I’m dining in, the plastic number you carry to your table makes sure I get my food. And if I’m taking it to-go, they always find me anyways.

reply

upvote

by ryandrake2 hours ago|

[-]

> It’s also much faster and easier than ordering with a human.

I'm not sure how that could be. I can walk up to the counter and say "Big Mac Large Fry Small Coke" faster than you can navigate the first screen of the kiosk, and a skilled counter worker can key that in and be done before I even get my credit card out.

reply

upvote

by Silamoth1 hours ago|

[-]

The problem, I’m a picky eater. I never order something that simple. I always need it with “No X” or “Only Y”. Cashiers often struggle with that, even if they understand me well (which they don’t always). It’s easier for me to see everything an item comes with and make sure I’m entering my order correctly.

reply

upvote

by hunterpayne25 minutes ago|

[-]

Then why are you eating at McDonald's?

reply

upvote

by marknutter2 hours ago|

[-]

It's easily one of the most intuitive and straightforward kiosks out there today and you don't have to wait for one of the cashiers to notice you nor worry about them punching in your order incorrectly.

reply

upvote

by Silamoth2 hours ago|

[-]

Glad someone else feels the same way! Knowing that I enter my order in correctly is the biggest win there for me as a picky eater. The cashier is just entering it into a computer anyways, so it makes sense for me to enter it in myself. I honestly wonder why more restaurants don’t do this. It’s not that hard to wrap a halfway decent UI around the system you already have.

reply

upvote

by 5 hours ago|

[-]

deleted

reply

upvote

by evenhash5 hours ago|

[-]

The proof is not written in Lean, though. It’s written in English and requires validation by human experts to confirm that it’s not gibberish.

reply

upvote

by vatsachak4 hours ago|

[-]

Yeah, but I wouldn't be surprised if they train the model on verification assisted by Lean

reply

upvote

by auggierose3 hours ago|

[-]

> A difficult part was constructing a chess board on which to play math

We have that chess board for quite a while now, over 40 years. And no, there is nothing special about Lean here, it is just herd mentality. Also, we don't know how much training with Lean helped this particular model.

reply

upvote

by sigmoid105 hours ago|

[-]

Managing a McDonalds is a question of integration and modalities at this point. I don't think anyone still doubts that these models lack the reasoning capability or world knowledge needed for the job. So it's less of a fundamental technical problem and more of a process engineering issue.

reply

upvote

by dap1 hours ago|

[-]

Have you not seen:

https://www.anthropic.com/research/project-vend-1 https://www.wsj.com/tech/ai/anthropic-claude-ai-vending-mach...

(Two different examples of a similar idea)

reply

upvote

by andy12_4 hours ago|

[-]

I disagree. Even frontier models still achieve way worse results than the human baseline in VendingBench. As long as models can't manage optimally something as simple as a vending machine, they have no hope of managing a McDonalds.

reply

upvote

by throw-the-towel5 hours ago|

[-]

The capability they lack is being able to be sued.

reply

upvote

by pear015 hours ago|

[-]

Police officers are human. In the United States in the vast majority of cases you can't sue the police, only the community responsible for them.

https://en.wikipedia.org/wiki/Qualified_immunity

Assuming you can still sue McDonalds I am not sure if this is a problem in the robotic llm case. I'm also trying to imagine a case where you would want to sue the llm and not the company. Given robots/llm don't have free will I'm not sure the problem with qualified immunity making police unaccountable applies.

There already exist a lot of similar conventions in corporate law. Generally, a main advantage of incorporation is protecting the people making the decisions from personal lawsuits.

reply

upvote

by nemomarx5 hours ago|

[-]

McDonald's are franchises - you generally want to sue the local owner or threaten them in addition to the holding company.

That only requires someone own the ai managed McDonald's though. so long as they can't avoid responsibility by pointing to the AI I don't see why you couldn't sue them.

reply

upvote

by lancekey3 hours ago|

[-]

25/75%. Plenty of stores are owned directly by McDonalds corp.

reply

upvote

by logicchains5 hours ago|

[-]

>Police officers are human. In the United States in the vast majority of cases you can't sue the police, only the community responsible for them.

Police are a monopoly; nobody has a choice about which police company to use. McDonalds are not a monopoly, and many customers would prefer to eat at competitors run by entities that could be sued or jailed if they did anything particularly egregious.

reply

upvote

by pear014 hours ago|

[-]

You are missing the point. The point is you can still sue the McDonalds. With the police there is a human intuition to also want to sue the officer, given the officer is a human being who has free will and thus made a choice to violate your rights.

The same intuition applies if you walk into McDonald's and a person there mistreats you. You want that person held responsible.

But the LLM is not a person. What is there to even sue? It just seems like it would simply pass through to the corporate entity without the same tension of feeling like we let a human get away with something. Because there is no human, just a corporation and the robot servicing the place.

Put another way - if the LLM is not a person, what is the advantage of a personal lawsuit?

Just sue the McDonalds. Even in a case where the LLM is extremely misaligned and acts in a way where you might normally personally sue the McDonald's employee, I'm just not sure the human intuition about "holding someone accountable" would have its normal force because again - the LLM is not a person.

So given we already have the notions of incorporation and indemnification it doesn't make sense to say what is precluding LLMs from running McDonald's is they can't be sued. If McDonald's can still be sued, then not only is there no problem, there is very likely not even a change in the status quo.

reply

upvote

by parineum51 minutes ago|

[-]

> given the officer is a human being who has free will and thus made a choice to violate your rights.

The purpose of qualified immunity is for when an officer does something that turns out to be illegal but they were both told to by their superiors and did not think it was in violation at the time.

An officer making a choice to violate your rights would not be eligible for qualified immunity.

reply

upvote

by KalMann4 hours ago|

[-]

I think your analogy is good but I don't believe modern LLMs use Lean or any lean-like structure in their proofs. At least recent open source ones like DeepSeek can do advanced math without it (maybe the most cutting edge ones are doing it I can't say).

reply

upvote

by vatsachak37 seconds ago|

[-]

They are most likely using them in training. I doubt their IMO team are show ponies

reply

upvote

by volkercraig4 hours ago|

[-]

> we'll see more specialized math AI resembling StockFish soon

Heuristically weighted directed graphs? Wow amazing I'm sure nobody has done that before.

reply

upvote

by vatsachak4 hours ago|

[-]

My claim is that LLMs waste a lot of time training on all available data.

Math is a sequence of formal rules applied to construct a proof tree. Therefore an AI trained on these rules could be far more efficient, and search far deeper into proof space

reply

upvote

by red75prime3 hours ago|

[-]

It has been tried. Lenat's Automated Mathematician, for example. The problem is that the system succumbs to combinatorial explosion, not knowing which directions are interesting/promising/productive. LLMs seem to pick up some kind of intuition from the data they are fed. The generated data might not have the needed "human touch" or whatever it is.

reply

upvote

by forinti5 hours ago|

[-]

AI is already too old for that.

reply

upvote

by whimsicalism5 hours ago|

[-]

the only thing keeping the mcdonalds from happening will be political, likewise the same with fields medal

reply

upvote

by segmondy5 hours ago|

[-]

our local AI models are already capable of running McDonalds.

reply

upvote

by ori_b4 hours ago|

[-]

We're automating art and science so that we can flip burgers. This future sucks.

reply

upvote

by dyauspitr19 minutes ago|

[-]

No, we’re not going to be flipping burgers either, they will have physical robots for that. 20 years down the line I wonder what work all of us will be doing.

reply

upvote

by vatsachak4 hours ago|

[-]

Math is a very specialized subset of art and science more amenable to automation.

reply

upvote

by ori_b2 hours ago|

[-]

The first thing we automated passably was art, even before programming. Were you not paying attention?

This future still sucks. The tech industry is making the world a worse place.

reply

upvote

by soupspaces5 hours ago|

[-]

Lee Sedol, Move 37 https://www.reddit.com/r/singularity/comments/1l0z5yk/the_mo... Edit: I wasn't necessarily disagreeing. But on second thought the chessboard in this math analogy is being built, not just played in. This Hardy quote comes to mind https://www.goodreads.com/quotes/902543-it-proof-by-contradi...

reply

upvote

by vatsachak4 hours ago|

[-]

My claim is that we haven't even witnessed the move 37 of math yet. I am claiming that math AI is going to get even better

reply

upvote

by dyauspitr4 hours ago|

[-]

Nonsense. Have you been watching the figure live stream? Or the Unitree video from yesterday with real time novel action generation? We’re less than a year away. If you can cook a burger, assemble a sandwich and clean up surfaces you’re all of the way there.

reply

upvote

by vatsachak4 hours ago|

[-]

Fair. Let's see in a year. I'm willing to bet that nothing happens.

reply

upvote

by dyauspitr4 hours ago|

[-]

Yeah, it’s gonna be an exciting year. I still think you’ll need one human in there, but that’s about it.

reply

upvote

by 5 hours ago|

[-]

deleted

reply