upvote
Ever since AlphaEvolve - the idea that if you build a harness which can evaluate solutions and give LLMs a database where they can keep storing their work and then sample from it - they do find non-trivial solutions over time leaning from their own past ideas.

It is the ultimate manifestation of test-time scaling. I think karpathy just popularised it.

reply
I didn't dig into what the actual repository was doing, but personally, I took some inspiration from the idea after reading about it and realizing that I might have been underestimating the ability of LLMs. I put a bit more work into a performance harness I was using locally and just set some agents to brainstorming and they did seem to find some great stuff. So I don't really have a stance one way or another on this specific repo, but the general idea seems like a really good one.
reply
Karpathy embedded within an organization is way more impressive than him out on his own with hot takes and little projects. I hope he does great things for Anthropic.
reply
Absolutely, I wasn’t saying that him being at Anthropic wasn’t going to be effective, I just think his little projects wouldn’t be very interesting if his name wasn’t attached to them.
reply

    > Am I the only one who wasn’t particularly impressed by AutoResearch?
isn't it just a nerfed AlphaEvolve? https://arxiv.org/abs/2506.13131
reply
Inefficient variants with $100m+ worth of compute will still probably outperform the best team of researchers
reply