Interesting how the concept of a clean-room implementation changes when the agent has already been trained on the entire internet.
To the best of my knowledge, there's no Rust-based compiler that comes anywhere close to 99% on the GCC torture test suite, or that is able to compile Doom. So even if it saw the internals of GCC and a lot of other compilers, the ability to recreate this step-by-step in Rust is extremely impressive to me.
The impressiveness of converting C to Rust by any means is kind of contingent on how much unnecessary 'unsafe' there is in the end result, though.
None - all references to 'unsafe' are in comments about the codegen: https://github.com/search?q=repo%3Aanthropics%2Fclaudes-c-co...
Agreed, but the next step is having an AI agent actually run the business and get the business context it needs, as a human would. Obviously we're not quite there, but with the rapid progress on benchmarks like Vending-Bench [0], and especially with this team's approach, it doesn't seem far-fetched anymore.

As a particular near-term step, I imagine that it won't be long before we see a SaaS company using an AI product manager that can spawn agents to directly interview users as they use the app, independently propose and (after getting approval) run small product experiments, and come up with validated recommendations for changing the product roadmap. I still remember Tay, and wouldn't give something like that the keys to the kingdom any time soon, but as long as there's a human decision maker at the end, I think that the tech is already here.

[0] https://andonlabs.com/evals/vending-bench-2
