Source: https://www.axios.com/2026/05/19/anthropic-openai-karpathy-a...
> Excited to welcome Andrej to the Pretraining team! He'll be building a team focused on using Claude to accelerate pretraining research itself. I can’t think of anyone better suited to do it — looking forward to what we build together!
I can’t help but consider this mostly a very inefficient variant of hyperparameter optimization, but someone correct me if I’m wrong; I may be looking at this too pessimistically.
> Am I the only one who wasn’t particularly impressed by AutoResearch?
isn't it just a nerfed AlphaEvolve? https://arxiv.org/abs/2506.13131
Many people are still deluded and think he is the same person who wrote the informal AI tutorials in plain HTML. He isn't, he is selling stuff now.
Sure, it can always not work out but that's no more a risk with him than any high-profile hire who doesn't really need the money and will always have other options.
Is that a serious question? He already promoted vibe coding and AI hype. Now he is literally there to promote Anthropic and its IPO price.
When he was at OpenAI it wasn't overtly commercial yet. At Tesla he had a way lower profile. Now he is the vibe coding Jesus for deluded software engineers. The impact is much larger.
?
He was literally rolled out in front of camera as Tesla's AI prodigy at multiple streamed events designed to appeal to techy consumers and dev recruitment. He's definitely been one of AI's public personas for a long time now, and his employers have regularly aided/directed/utilized him accordingly.
(I do understand that for Anthropic it's a brand boost as well, just like signing other prominent researchers, as it was with LeCun and Meta etc).
It might be Elon who went and said that and said they don’t need lidar, but as director of AI and auto vision Karpathy bears the responsibility for those features.
That I also want to know. He bailed out of Tesla right when the limitations of his "LIDAR-less cameras only self driving" system were becoming obvious, and nobody asked him about the hindsight of this obvious fuckup.
>but as director of AI and auto vision Karpathy bears the responsibility for those features.
Exactly. You lead the R&D, so it's on you. If your boss makes stupid decisions in public overriding your best judgement, then leave and go somewhere where your decisions will be respected. The ML market was red hot for people like him back then so it's not like he didn't have alternatives.
Although I doubt Elon forced that idea on him, since he's the one who was confidently claiming that vision only is better since Lidar pollutes the sensor fusion data.
Did he never experience optical illusions? I don't get it.
Elon makes it so easy to hate him as much as to admire. No comparison.
Except the good companies probably don't make you go through silly, outdated interview practices without the tools you can actually use on the job today, right?
I even share his concern about struggling to keep pace with the rate of change lately, and agree that my working in a frontier lab or any other such environment would certainly help with that!
I have a weird background mix of analytic philosophy, linguistics/NLP, propaganda research, and long-term institutional data science/strategy work, which unfortunately does not make ATS systems especially low-friction as I try to jump industries.
So I keep busy the best I can: lately building tooling around runtime observability, intent legibility, and intervention in LLM systems.
Some small public artifacts finally going up: https://huggingface.co/spaces/anotheruserishere/Cartogemma
Eh. Worth a shot!
There's a choice to be made between helpfully defeating someone's ATS and searching for more clueful employers. I'll probably be walking paper resumes into local offices next time around anyhow.
I learned speed cubing from badmefisto when I was in middle school, ~16yr ago (today my ao100 is ~15s).
I never knew it was Karpathy. What an insane knowledge drop. Thanks for sharing!
Have you considered the possibility that someone you regard as extremely intelligent is speaking from real-life experience and direct proximity when they say another person is smart?
Or perhaps your bias toward Musk makes that impossible to even consider.
In hindsight, it's easy to assess that Gates was a charming moron, Jobs was an overeager egoist, and that Altman is a sociopathic liar. All of the white knights defending their boy genius narrative are contradicted by their asinine philosophies, and in Elon's case he's simply undermined by all of his broken promises, random accusations and manic paranoia.
He is both, but it’s irrelevant in this context.
From all the interviews and stuff that came out within the past few years, it's pretty clear that Musk's only contribution to anything is just throwing money at stuff while also scamming suppliers, governments, and so on. The dude has no technical knowledge of his own, he just lies and lies about what actually happens, and takes credit for accomplishments that aren't his.
When you hear someone who is supposedly smart talk about a dumb person like he is smart, it raises questions on whether that person that is doing the talking is actually smart.
edit: typo
But I don't think it's fair to say Elon is stupid / a bad engineer. When John Carmack speaks well of his talent, I take that seriously.
— Ram, Tron (1982)
Works where archive.ph is blocked, no CAPTCHA, no Javascript, no DDoS directed at blogger
https://assets.msn.com/content/view/v2/Detail/en-in/AA23AbWR...
x=https://assets.msn.com/content/view/v2/Detail/en-in/AA23AbWR/
#tnftp -4o"|grep -o '<p>.*</p>'|tr -d '\134'" $x > 1.htm
#links 1.htm
curl -HAccept: -HUser-Agent: $x|grep -o '<p>.*</p>'|tr -d '\134' > 1.htm
firefox ./1.htm
1. Copernican Revolution -> We aren't the center of the universe
2. The Darwinian Revolution -> We aren't the pinnacle of life
3. The Freudian Revolution -> We aren't even in control of our own minds
4. The "Intuitive AI" Revolution -> We aren't the only form of intelligence
I think even a month ago I would've read this article and scoffed, but having used Claude Code almost exclusively at work for the last couple months it seems pretty undeniable that in-context learning and a good enough harness are all you need to displace most "thinking" jobs that require just a bachelor's. The hundreds of billions of dollars pouring into data center build-out basically hinge on this thesis, and frankly I trust the judgements of the billionaires financing these deals more than LLM naysayers on Hacker News (not to mention the non-public info they have access to). You don't need to reach superintelligence to still deeply, deeply affect society, and I think Anthropic was the first to build products that are actually good enough and, critically, hands-off enough to do just this. Every day it's clearer and clearer to me that "I was born into a poor family but am relatively intelligent and good at learning things, therefore I can find success" is exactly what will ultimately be eliminated as the outcome of this unless we get the government to step in and regulate.
I could go on and on, but the main point I'm trying to make is that you should definitely examine unease you feel about Anthropic, consider framing that unease in the context of Hinton's argument, and ask yourself what the implications may be.
2. Most entry level jobs for current graduates in white collar fields. (See hiring rates for these positions)
3. Thousands of layoffs (mostly attributed to AI use; while not 100%, Anthropic's specific marketing push has a huge influence on this - unlike OAI and other labs)
4. All low-code products/startups
5. Web agencies who did small websites for local businesses
While the AI industry push is there for all of the above, Anthropic's marketing/PR is specifically directed towards forced adoption of AI and burning tokens, unlike that of other labs.
Hmmm… maybe. I think not. It really depends on your other claims below, with which I mostly disagree.
2. Most entry level jobs for current graduates in white collar fields. (See hiring rates for these positions)
Maybe a small amount. Entry level white collar jobs have a low hiring rate for other reasons, imho.
3. Thousands of layoffs (mostly attributed to AI use; while not 100%, Anthropic's specific marketing push has a huge influence on this - unlike OAI and other labs)
What they say and what the actual reasons are aren't the same, imho. Correcting for over-hiring is the actual main reason.
4. All low-code products/startups
Low-code and no-code products in the hands of someone who doesn’t have a developer’s mind and/or experience usually end up as a mess, and quickly become an unusable mess.
I know of exactly two people who have successfully used AI to make a low-code/no-code product. One is just highly motivated and wicked smart. The other did a minor in CS a long time ago (works in a different field). Everyone else shows me a pile of garbage and asks me how to fix it (answer: throw it away and start from scratch).
5. Web agencies who did small websites for local businesses
As with 4 above, the only site a local business can make for themselves is one that functions as a business card… at best. Usually it looks more like a business card that a kindergartner made. They simply don’t understand what makes a website good for their business, therefore they cannot direct AI to make it for them.
There’s a lot to criticize about AI, imho, but these aren’t on the list.
So much of what you'd previously pay a "real" freelance developer or web "agency" to build is no less "garbage" than what engineers would call the average vibe-coded web app.
Claude in particular is today really surprisingly good at taking examples and a layperson's description of a website and building something that looks good and is functional.
For obvious reasons, I think many developers/engineers don't want to accept this. They'd prefer to believe that there's something special about their craft that means something produced by AI isn't good enough. But the honest will acknowledge that spaghetti code and crap pre-dated AI.
I know I can code and get better results than most people can with an LLM but I've come to realize that it doesn't matter and people just want to see results (even if they are kind of wrong).
In other words, with the website example, I've realized that even if the agency can do something 10x better, most people will choose to "buy" the AI website just because it's free or super cheap, and that makes me sad
For who?
Similar sentiment is shared by other startup founders: check on X for all the VCs talking about moats against big labs.
2. Sure, that's one thing.
3. Coefficient Bio is not a thing. They don't have a product. Ever. It's just Anthropic hired 10 people for a ridiculous amount of premium bonus. (Time will prove it's a bad decision, btw)
4. (snorts)
This doesn't automatically make them the great virtuous team. It just means the rest of the pack are toxic as all hell.
I am working on a short story on this topic which is set in the 2100s, where most humans have internalized the concept of 'having enough' after the great conflict. But some specimens have started to show signs of this syndrome again, and neuroscientists and psychologists are struggling to understand where it originated.
There are several. They're in China, releasing competitive open-weight models on a regular basis.
We can only blame ourselves for everything that happens as a result. For instance, the effect of US government sanctions on high-performance GPUs has been to force Chinese researchers to do more with less. It will be years before they can bring their own fabs up to speed, but they now understand that a Manhattan Project level of effort is called for, and their AI labs aren't going to drag their feet in the meantime. This is how we ended up with a 27B model that can run with the big dogs from only one generation ago.
I hope they keep releasing weights, but don't know how optimistic to be about that.
>I believe deeply in the existential importance of using AI to defend the United States and other democracies, and to defeat our autocratic adversaries.
There is no universe where this can be described as anything close to ethical.
The idea of "defend[ing] the United States and other democracies" and "defeat[ing] our autocratic adversaries" are always the stated reasons for US military action. Iraq was certainly an "autocratic adversary" and hundreds of thousands of people died from the war there. Vietnam was about "defending democracies" and resulted in millions of people dying. These are atrocities on an incomprehensible scale.
The ethical objection is very simple. War is evil, and the military is in the business of war.
Especially given the context of these press releases was right at the height of "we'll have Greenland one way or another" pronouncements.
Anthropic showed their belly same as OpenAI anyways.
Anthropic played a really well orchestrated marketing gimmick so that they would be in the headlines for a couple days bringing awareness to non-tech people on how they are supposedly the good guys. They then backpedaled all of this and are in contract with the DoD once the headlines passed.
But this obviously worked as you now believe they are the good guys
Their red lines are still in place. They are the only AI company with those red lines.
[1] https://www.obsolete.pub/p/exclusive-anthropic-is-quietly-ba...
[2] https://edition.cnn.com/2026/02/25/tech/anthropic-safety-pol...
[3] https://www.wsj.com/tech/ai/anthropic-dials-back-ai-safety-c...
[0] https://www.cnbc.com/2026/05/01/pentagon-anthropic-blacklist...
[1] https://www.techradar.com/ai-platforms-assistants/anthropic-...
This good guy ("AI Safety") versus bad guy is all marketing gimmicks. I'm old enough that it reminds me of Google "don't be evil".
What I find worse is that some people actually believe Anthropic are really the good guys.
AI safety is important. My point is: you should have zero trust in those companies to actually care about AI Safety besides the marketing and PR aspect of it. Incentives matter.
Why should I trust that your assessment is correct? Is it likely to ever be correct in the case of a doctor/mechanical engineer/athlete/economist/whatever? So why do so many people insist that an incredibly intelligent AI researcher has fallen into some obvious trap?
The only time my reality has changed is when I spend time at a computer or on my phone and even then, it's a fraction of the total time. So no, it's not a "totally different reality" for me.
Like specifically what has he done?
- At Stanford, led research on the first (to my knowledge) crop of joint image/text models. Super widely cited work.
- At Tesla, led their whole self driving effort for a while, came up with critical techniques that allowed them to make progress (e.g., the concept of "auto labelling": using a much larger NN to generate training data with which to train smaller models that could fit in the on-device compute. IIRC, Elon said they would not have been able to make progress without this insight).
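For readers unfamiliar with the "auto labelling" idea mentioned in the bullet above, here is a toy sketch of the general pattern: an expensive "teacher" labels raw data offline, and a much smaller "student" is trained on those labels. This is not Tesla's pipeline. The teacher here is a hard-coded rule standing in for a large network, the student is a nearest-centroid classifier, and every name (`teacher_predict`, `auto_label`, `train_centroid_student`, `student_predict`) is hypothetical.

```python
import random

def teacher_predict(x):
    """Stand-in for a large, expensive offline model (an assumption:
    in practice this would be a big neural network, not a fixed rule)."""
    return 1 if x[0] + x[1] > 1.0 else 0

def auto_label(unlabeled, teacher):
    """Run the teacher over unlabeled samples to produce (input, label) pairs."""
    return [(x, teacher(x)) for x in unlabeled]

def train_centroid_student(labeled):
    """Fit a tiny nearest-centroid classifier, standing in for the small
    model that has to fit in on-device compute."""
    sums, counts = {}, {}
    for x, y in labeled:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def student_predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))

rng = random.Random(0)  # fixed seed so the run is repeatable
unlabeled = [(rng.random(), rng.random()) for _ in range(2000)]
labeled = auto_label(unlabeled, teacher_predict)   # no human labels involved
centroids = train_centroid_student(labeled)

# Measure how often the cheap student agrees with the expensive teacher.
held_out = [(rng.random(), rng.random()) for _ in range(500)]
agreement = sum(
    student_predict(centroids, x) == teacher_predict(x) for x in held_out
) / len(held_out)
print(f"student/teacher agreement: {agreement:.2%}")
```

The point of the pattern is that the teacher never has to run in the car: it only has to be good enough offline to produce training data at a scale human labellers could not.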
I'm not sure his educational efforts fit the mold of what you're looking for, but if so, the course on neural networks he designed at Stanford (and made available online), as well as his blog posts (the most famous of which, to my knowledge, is "the unreasonable effectiveness of LSTMs"), made a huge impact on educating a generation of tinkerers and researchers.
I can guarantee you this was built-in from day #1
I'm guessing you're not a developer if you don't then automatically think of edge cases like "what if car # 1 isn't in the preceding frame" ... (then you look at some relevant test data and see it was there, unlabelled ...)
---------------------
EDIT: It looks like you deleted the part of your post I quoted below. So feel free to ignore my question about it, I guess.
---------------------
Not sure what you mean by
> Shows how much you know
Do you mean that the fact that I misremembered a word in the title suggests that I know very little about Karpathy's contributions to the field of neural networks?
I was more looking for signal that him + Anthropic might yield something beyond a step-change from Opus 4.7 (disappointing so far). We have not gotten to use Mythos yet, I wonder if that will become Opus 5 or something.
Karpathy pausing Eureka reads more like "I miss being at the frontier and can't replicate that as a solo founder" than "my moat got eaten." Different decision.
It's good that there are avenues today for people to make tens or hundreds of $m in salaried positions at companies so that they don't have to do that stuff to get paid their value if they don't genuinely want to.
Which raises the question: what can he do at Anthropic that he couldn't on his own?
And then there's the uncertainty, will the AI "wars" be some winner-takes-all situation? Will the smaller labs eventually be acquired by the bigger ones, will they simply wash away if there's a crash?
I don't know. If you can land some exceptional gig at the big firms, maybe the financials are good enough to not start your own lab. Minimizing risk, and all that.
EDIT: Assuming such a startup would focus on frontier models.
This is my assumption.
> there's the uncertainty, will the AI "wars" be some winner-takes-all situation? Will the smaller labs eventually be acquired by the bigger ones, will they simply wash away if there's a crash?
He's Andrej Karpathy. He could wait to let the winner surface. Obviously better to get in with the winner earlier. But worse to get on the wrong team versus on the right team late.
And 2 years is probably pretty average for the whole tech industry.
maybe for a fungible CRUD engineer. I think Karpathy is in a different league and I'm certainly surprised to hear this fact. I would expect someone like him to sit within a certain lab for a long time
My impression with no inside knowledge, but understanding what Elon companies are like, is that he was assigned essentially an impossible task at Tesla and tried his very best, but it could not be done, and he semi-burned out. It makes sense for him to be getting back on the horse now.
The Elon approach to management as I see it is to assign what normally would be totally unreasonable goals to a small group of extremely bright people, and they work their asses off and somehow find a way. Sometimes this works, and sometimes it doesn't. If it works and the impossible was in fact, just barely possible, you dominate the market, everyone gets rich, and the people see it as the most exciting, intense, and rewarding part of their career. If it doesn't, they get depressed, divorced, and looking for other work. The Elon magic is threading the needle closely enough that a lot of the seemingly impossible things are in fact possible with enough hard work and brainpower, but although Elon is extremely good at this, the nature of the thing is that you can't predict which side you'll wind up on fully accurately.
I expect the people with low market value to be the ones sticking around labs for long periods of time, they don't have the option to move and they aren't getting poached.
Yes, and it is a problem
> maybe for a fungible CRUD engineer
And there's the cause.
We're in a meat grinder, and there is no $100M payout in sight for most of us
I mean, you always have to take the previous employers' statements with a grain of salt, but if they say they really employed for just that project, it's also good info.
I am not saying any of these don't have valid answers. What I am saying is that we would prefer juniors who are committed and do the hard work when the work gets hard. And, at least where I work now, this gets recognized, and they become seniors in time.
Two years is more than long enough to join a startup, build 3 things, and see that your equity is never going to be worth anything, and find a new job. This isn’t anomalous or weird.
And I work in games and 2-3 years at each company is pretty normal; with some exceptions, people just finish a project and then move (or are let go, unfortunately). YMMV of course.
Yeah, being laid off every 2-3 years is a lot different from job hopping and shows exactly why the games industry is in its own little pocket of screwed in this market. Especially with games taking 3-5+ years to be made. How do you keep institutional knowledge when you kick it all out and basically start from scratch every cycle?
-sincerely, another game dev
- barely qualified, leaving to avoid getting PIP'ed
- overqualified/under-leveled, and moving is faster than getting promoted
1) advancements in AI are made by large teams of brilliant people (and individuals who take outsized credit)
2) AGI is a definitionless buzzword
3) advancements in AI will need significant changes in either how the model works or fundamentally new, not-yet-existing datasets.
Claude Code was one person's idea as a pet project and now it's singlehandedly 5x'd Anthropic's valuation. Sometimes single people matter, that's life.
Anthropic is a large company, with thousands of employees, and seems to be 100% (maybe 200%) LLM and scale pilled. All the advances from one model generation to the next are the result of dozens of experiments first at small scale then at larger scale, all competing for the same "development compute" portion of their overall "development + inference" compute resource.
In this environment, even if there are researchers who have ideas not on the "LLM + scale is all you need" path that Dario seems hell bent on, there seems to be not much chance that these ideas can compete for resources and compute with the mainstream experiments that the company believes their future depends on.
Maybe an individual developer like Sutskever, engaged purely in research rather than manning a barely turnable oil tanker, can make a difference, but at a company like Anthropic it seems way less likely. Cherny's baby is 100% aligned with Anthropic's mission of selling LLM tokens. Someone else fighting the mission, trying to pivot Anthropic in a new better direction is not likely to have such luck.
I strongly suggest you learn your recent history, and how these things generally work
im also going to guess that whatever research he does would be free roam research that primarily serves to market the fact that claude was able to help perform the research.
the visible stuff he's been working on has been mostly agent soft skills. off the top of my head: autoresearch and his wiki knowledge stuff. nothing particularly groundbreaking, but it has helped devs expand their understanding of the utility that these models can provide.
not a diss to andrej i know he's reading this now
I think Andrej has the experience (and now resources) to productionize this research into something very interesting.
p.s. called it
> Karpathy will help launch a new team focused on using Claude itself to accelerate pretraining research — an increasingly important frontier as AI companies race to automate parts of AI development. (https://www.axios.com/2026/05/19/anthropic-openai-karpathy-a...)
They just became "famous" because Karpathy is effectively an AI celebrity, so he could throw shit at a wall and post it on X and it would get 10k Github stars.
But seriously, people have been using the models to tweak hyperparameters, or using LLMs to help create a second brain using markdown or json files or 100X other combinations of files, for a long time already.
That implies Karpathy is either dumb or desperate and he is neither of those by a long shot.
Generally, when a "good" developer has a huge public presence and reputation, that's quite valuable to a company when they're competing in a tough space. Many a time, more so than the (very high) technical skill of the developer in question.
I've seen large funded companies gather good popular developers like pokemon cards and just have them go around give talks and write blog posts. It creates an aura around them which makes things like hiring, fund raising etc. much easier.
So, it's not really a statement about Karpathy himself. It's more about the company hiring him.
There’s a lot of value for the business world in learning AI from someone who has been at the top of their game but now is doing a general service by being a great educator and translator between the fields.
His recent Wiki approach may be simple to devs but is certainly an aha moment for the rest of the peanut gallery paying attention!
This kind of thing happens to big names in software all the time. Carmack going to Facebook is a prime example - he joined with the idea of using all those resources to build world-changing tech, and instead he ended up headlining conferences, and fighting a losing battle against the corporate types who were put in charge of Oculus.
Andrej seems like a great guy, but him joining Anthropic feels a bit like a transactional relationship (rich old guy marries hot young chick). Anthropic get a "glorified marketer", and he gets a front row seat at SOTA LLM dev 2026. I don't think they hired him expecting he's going to change the direction/pace of their research.
Scandalous!
A regular marriage is transactional to some extent too right, but not quite the same as Anna Nicole Smith marrying a 90yr old.
As an aside, an Indian guy I used to work with once explained to me how traditional Indian arranged marriages, like his own, work, and they are HIGHLY transactional. It's not just a matter of same caste, same social status etc, but an explicit trade off. In my co-worker's case he cheerfully told me how his wife was very dark skinned, therefore considered not that attractive/desirable (to other Indians!), but her family had money and social status so it was considered a fair trade for a nice looking boy like himself!
No it implies that he is more valuable for being famous than the hands-on work he can produce. This is the IC endgame
"Improve yourself, no mistakes" in a loop. Woooah sooo revolutionary...
Last thing I saw Karpathy talk about was this, which I find hard to believe that it came from a smart person.
And, my objection was that he clearly had no understanding of the supply-chain risk he was worsening by advocating widespread use of Obsidian for agentic engineering tasks.
Since his announcement, Obsidian has taken proactive steps to mitigate the risks, or at least study the threat model. Hopefully, they will implement proper RBAC or something before someone else with his visibility announces an even more irresponsible half-baked idea.
But he has always been known for his communication rather than his research. He got famous by putting out a (very well made) course on machine learning that was available to the public. Since graduating he hasn't exactly delivered on revolutionary new stuff at the businesses that employed him but he has continued to be extremely good at communicating thoughts about the current and future state of AI. Businesses want that and he knows that he can deliver that.
There are things that you can only explore and learn in those places, for obvious reasons.
I don't know his personal life goals, but he's a great communicator and educator. If this decision keeps him more up to date and allows him to create even more relevant content, then that is something everyone will benefit from. I understand the risk of being biased toward one company over another, but if you look at the content he has created so far, he always talks principles first and specific tools later.
I think people here should give him the benefit of the doubt.
his value to Anthropic is his influence: he has over 2 million followers, and the value is that he is the top influencer for AI right now, like it or not, just like Selena Gomez might be the top one for women age 21-29...
Every AI nerd I know reposts his posts and projects (very thoughtful ones, mind you) like religion
meanwhile in the real world:
claude --permission-mode=auto --model=opus -p '/onboard --user=karpathy'
All we hear is Altman, Musk, ...
Reason? What is the value of that other than entertainment? And it's not in the interest of companies to make celebrities who then become poach targets (if they can avoid it, they will; yes, there are exceptions, as noted elsewhere in this thread).
And if you did 'hear' (via articles), to what extent would what was said even be correct vs. a writer just fluffing things up to the max?
Tech is not sports, where you can actually see the superlatives and know that the person on whom praise is being lavished actually won or threw or caught and so on. (Or even music, where you can hear it and see the stadium that is packed with fans.)
I suppose that with modern ML they can just toss it in the blender and reap the benefits ...
Apollo Go 100k driverless rides a day.
Tesla 0-5? driverless rides a day.
Sorry I'm out of the loop... What inflection point are you referring to?
Around the time Karpathy left, Ilya Sutskever, another OpenAI founder, started playing with Google's new "Transformer" architecture, which was the beginning of the "GPT" series: GPT-1, GPT-2 and eventually ChatGPT (GPT-3.5 + RLHF). In retrospect OpenAI's early Transformer experiments and GPT-1 were the inflection point that moved OpenAI from a company that wanted to build AI, as soon as anyone else did, to one that was actually doing so, although I think it would be revisionist for anyone to claim they knew what they were doing at the time. The early GPT-1 and GPT-2 papers read more like "wow, this is a bit unexpected, look at all of the things it can do!".
So pretty sure the original poster is talking about 2017.
not everyone does things to be rich.
And Tesla is not a good place for scientific development. Tesla is structured around a narcissistic mindset: results-driven, cynical, and position-based. This doesn’t bode well for long-term science.
I don't see how he could be helping Anthropic
OpenAI’s hiring recently has been much stronger, whether through luck or structure. The “no-name” guys have actual taste. I love that. I don’t care that they’re no-names.
I don’t know Karpathy personally, I won’t speak bad about a man I don’t know. I hope he makes CC better. I just read this as hype. My point is that there’s nothing he has that an empowered no-name product manager doesn’t. It’s like Alex Wang at Meta. That acq didn’t redeem Meta. They actually lost LeCun. Where’s Llama today?
Regardless of what Anthropic’s share price is, OpenAI has been fucking killing it recently. I don’t take particular pleasure in saying that; I’ve been a Google and Gemini guy for years
My lens is meritocratic. My experience is as an extremely heavy user of both companies’ full suite of products in the range of 5 digits per month. My interest is better products, not hype.
Can you cite specifics? "I won't speak bad about someone, but also won't speak good about others" resulted in a comment that seems to contribute nothing
I’m hoping Karpathy will make Claude Code better, in the meantime I’m super happy seeing a small product manager like Tibo fucking crushing it on Codex
My point is that product velocity is visible in shipped workflow improvements, not prestige hires
Prestige is fickle, look at academia today
Joking aside, there are small communities pushing codex and AI to the bleeding edge of what's possible.
Here I'll give you an example. The last few updates from Boris at CC have been tweaks to the system prompt to make it use less compute, effectively making the system dumber, making it tell you to go to bed. I mean come on! Tibo has been impressing me, bc they're building the things these small communities are building.
One of the things these bleeding edge guys and girls have been working on is a /goal feature, essentially ralph loops. Codex released it as a feature the other day. I can't help but be impressed. As an ex-pm, this is product management.
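For anyone who hasn't run into the term: a "ralph loop", as the comment above uses it, is roughly "feed the agent the same prompt over and over until a check passes". Below is a minimal sketch of that control flow only. It says nothing about how Codex actually implements /goal, which I don't know. `run_goal_loop`, `make_stub_agent`, and the stub's "failing=N" output format are all made up for illustration, and a real setup would call an actual coding agent and run a real test suite instead.

```python
from typing import Callable, Tuple

def run_goal_loop(
    agent_step: Callable[[str, int], str],
    goal_met: Callable[[str], bool],
    prompt: str,
    max_iterations: int = 10,
) -> Tuple[bool, int, str]:
    """Re-run the same prompt until goal_met passes or the budget runs out.

    Returns (succeeded, iterations_used, last_output).
    """
    last_output = ""
    for iteration in range(1, max_iterations + 1):
        last_output = agent_step(prompt, iteration)
        if goal_met(last_output):
            return True, iteration, last_output
    return False, max_iterations, last_output

def make_stub_agent(failing_tests: int) -> Callable[[str, int], str]:
    """Hypothetical agent that 'fixes' one failing test per invocation."""
    state = {"failing": failing_tests}

    def step(prompt: str, iteration: int) -> str:
        state["failing"] = max(0, state["failing"] - 1)
        return f"failing={state['failing']}"

    return step

done, iterations, output = run_goal_loop(
    make_stub_agent(3),
    goal_met=lambda out: out == "failing=0",
    prompt="Make the test suite pass.",
)
print(done, iterations, output)
```

The iteration cap is the part that matters in practice: without it, a goal the agent can't reach just burns tokens forever.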
Then you take a look at what the Chinese are doing on their own forums, and it just makes what Google and Anthropic are doing look outdated. OpenAI feels competitive, which I like. What's coming will not be kind to us, we adapt or we die.
I am sure there is an element of reality in its capabilities, but there's also a significant amount of "We don't have the compute to handle this at scale", and "look look, we have the best model. It's so good that you can't even compare it to other models. That is how good we are."
The Claude maximalists that can never see any wrong in anything and the users that care about actual capability
These guys are going to be in for a rude awakening when the Chinese are steamrolling us with data centers you can see from space and better models, Amodei will tell you that himself
Adapt or die
What codex is a few steps away from doing is changing fundamentally a lot of workflows.
Remote codex with their computer use is basically you at your computer doing things, 24/7.
Then they added gpt images 2.0
what codex can do, in a few more product iterations, is show you visually side by side “would you prefer this (A) or that (B)” in a series of questions. This is what some open source researchers have been up to. That’s no longer guessing.
I’m not trying to hype a company i have no stake in, but they’ve been killing it.
It’s extremely compute intensive, but also very satisfying.
Example 1, just off the top of my head: Composer 2.5 released today. Go look at their benchmark.
Composer 2.5 and Opus 4.7 ranked around the same, meanwhile gpt-5.5 was miles ahead.
You wouldn’t have caught me dead using a gpt model 2 years ago
They are all going to get their lunch eaten by the Chinese.
In the USA with access to most of the world's capital, they've succumbed to the temptation of "bigger, faster, harder"
Whilst the Chinese, with enough capital only, have had to think.
The Chinese models are already miles ahead on cost/inference basis and will probably pass all the USAnian companies in five years
The age of USAnian engineering dominance is coming to an end.
Let's all hope she goes quietly - not at the moment
He had both the technical and executive authority to determine if the product was fit for customer usage. He had direct executive responsibility for the product on the road between 2017-2022.
If he, the lead architect and responsible executive, felt the product was dangerous and was then overridden, he cannot get away with claiming he was “just following orders”. He had a moral duty to refuse to sign off or to quit; otherwise he is clearly complicit in deploying a dangerous product for his own self-enrichment.
When people talk about engineering ethics, this is literally a completely uncontroversial textbook example. The only way you accept this is if you do not want ethics in engineering.
Furthermore, he was extremely hireable, with numerous job opportunities available to him. He would not be destitute or even particularly worse off if he did quit for ethical reasons. Any self-preservation defense is also invalid.
[1] https://techcrunch.com/2017/06/20/tesla-hires-deep-learning-...
He heard Elon say “I drive with eyes, so cars just need eyes” & shipped?
:( happy to have my impressions corrected (but I was kind of pretending it’s a 2026 scenario where you could slap Lidar, ship a Waymo, if you were just willing to spend the friggin MONEY - 2017 was too early for most any “self” driving IIRC)
-
*edit - in a scenario where his refusal to skip Lidar catalyzed change
"2. an ability to apply engineering design to produce solutions that meet specified needs with consideration of public health, safety, and welfare, as well as global, cultural, social, environmental, and economic factors." "4. an ability to recognize ethical and professional responsibilities in engineering situations and make informed judgments, which must consider the impact of engineering solutions in global, economic, environmental, and societal contexts."
https://www.abet.org/accreditation/accreditation-criteria/cr...
Unfortunately, rather important courses like engineering ethics have become lumped in with mandatory DEI objectives and similar 'grievance studies' requirements, classes which many suffer through quietly, regurgitating the Correct responses while they count the minutes until they can get back to more substantive classwork. Some undergraduates may unfortunately gloss over ethics just as they gloss over lectures on privilege.
When rumors started that GPT-4 design would be kept secret, he likely wanted to know what architecture it would be. Perhaps he left Tesla, waited out the non-compete clause, and joined OpenAI to learn its details.
When Mythos dropped, there were hints that it had a new architecture. He might similarly want to know how it works.
Either way, there is enough cross-lab hiring that those secrets eventually get known, but only by the labs.
It actually feels like a signal that it is in a tapering phase.
As in, if it were in a growth phase, a freeform, solo setup where you collab with whoever you want would be more beneficial. But in a tapering phase you'd want structure and to be in the private formal meetings.
just an idea
Growth is when you want to have institutional support, to be at the tip, backed by infinite money and best compute infra, and benefit disproportionally from compounding. Conversely tapering is when you're best flying under the radar, and there's plenty of value both in ideas and in hardware, as the leading players shed excess they can't support anymore, ...
But to your point, then the growth is not in the ideas that can be generated with AI, and more in the structure. Which feels like a different stage. Maybe "growth" wasn't a good word.
Stuff is still happening and you need to be part of a big lab to see it. NanoGPT is fun but at some point you need that datacenter.
I also feel like there are many ways he can access compute for use of his own ideas.
So, does Anthropic pivot to military tech or pretend to do so before the IPO?
Or is this simply a deal where he uses his formidable influencer skills for Anthropic and gets to cash in on the IPO?
OpenAI looks a lot more like early Yahoo -- earlier, quite a spectacle at first, definitely a game-changer and disruptor, but overspent, less focused, and subject to slow collapse under its own fragmentation and lack of overwhelming clarity of mission and purpose.
All that said, history rhymes but does not repeat, and trying to map present-day companies onto previous generations is an exercise in futility. The future is fundamentally unique.
Of course, there could be some future lab or startup which completely revolutionizes the field by going for some approach that doesn't require a boatload of money to train a model, but for now, we're stuck with the LLMs and the costs they come with.
Looking familiar: VTI or VOO, VTSAX or VFIAX
From Karpathy's various interviews I get the impression that he wants to leave the door open to working for Musk again at some point, perhaps on TeslaBot.
With Musk for now regarding Anthropic as a partner (or at least an enemy of his enemy), that seems to mean that Karpathy joining them is less likely to anger Musk than might otherwise have been the case. Who knows, maybe Karpathy was involved in brokering this data center deal?
Compare and contrast with working at OpenAI, Google, etc.?
- best harness overall (well maybe until like a month ago when gpt5.5 and codex came out)
- acquires bun
- acquires stainless for SDKs
- deal with Elon for compute
- karpathy
what else did I miss?
All gone, for shitty typeahead
2. Mixed for the entire bun ecosystem, especially with the Rust, Anthropic-focused rewrite
3. Good, because Anthropic's SDK was one of the worst ones to use.
4. Deal with the guy that has a shit ton of compute around wasting money because no-one uses Grok and was frequently calling Anthropic "Misanthropic".
https://i.redd.it/kp4uy1egspjg1.png
5. Glorified marketer whose greatest achievement in pushing AI forward was probably teaching CS 231n and coining the term vibe coding.
Yeah, on a roll.
1. Claude Code is widely used and beloved despite not benchmaxxing on the terminalbench like these harnesses that nobody has ever heard of or uses.
5. Karpathy's contributions are way more than CS 231n and coining vibe coding. In terms of pedagogy, his "zero to hero" videos, nanoGPT, etc, are all great. For actual work, he also built a great org at Tesla.
(I also assume they gave him a ton of independence in R&D)
You don't need to live in the bay area, most civilized places on earth let you live a comfortable life with a 10th of the salary, plus you are not selling your soul.
We are doing interesting R&D in other fields too, in places you would never believe.
Do you really want to be the person that hands this kind of tech to corporates? Or that does anything to benefit those corporates?
Maybe the IPO potential was just too great to ignore and maybe AGI (A Giant IPO) is around the corner.
I, personally, don't think there will be a better time for researchers to make so much money in a few years in any future of LLMs.
Leaving OpenAI to work for Elon Musk was a poor move, and AFAIK his work on CV at Tesla did not bring anything groundbreaking, unfortunately probably the opposite (the bet on camera-only driven system did not pay off) and his talks about the approach would indicate that his whole idea to make it work was nothing more than hill-climbing.
Also, his over-reaction to the whole Claw thing was a bit ridiculous, in my opinion.
I don't see him as a Scientist in the field, but more as an efficient tinkerer.
This is a pretty unsubstantiated claim. Tesla is now launching robotaxis at a fraction of the cost of Waymos, in part because they don't need all the Lidar.
But Tesla has been promising full self-driving "next year" for quite a long time now, and it seems they are stuck at the "95% there" stage basically forever.
Anthropic: Okay, let's add two zeroes
Andrej: I am very excited to join Anthropic!
(I do not blame him, I think this is reasonable, I find the whole money-falling-from-the-sky thing amusing :-)
I hope he still gets to do some educative stuff on the side too
But - unpopular opinion - I believe Anthropic is one open-source model away (that can code well) from a massive revenue/stock crash. We're already seeing Claude's cost escalate to astronomical levels. Most coding work is medium difficulty in the grand scheme of things. So the future is an open source model small enough to fit in your local 16GB VRAM, giving you a Claude Code like experience for zero token cost. That's going to wipe out most of Anthropic's current revenue base. It does have several cool initiatives in the pipeline, but bad things happen once your bread and butter is threatened (just ask OpenAI).
(If they were just burning Capex and nobody wanted to use their product or their gross margins were bad then I'd agree with you)
One thing is that the companies are holding on because of competitive advantage, and I think another is that AI is such a politically polarizing topic that actually being open about everything is risky for the companies, wanting to avoid controversy.
I have no idea if Andrej "sold out" but perhaps he realizes that if he wants to work on the cutting edge alongside talented people, with a seemingly endless budget, Anthropic is a good choice.
I chose my employers for the same reason; the compensation was secondary.
There's some poetry there that I am unable to capture with words.
When I left MS, a full Windows build was about 18M LOC. The fact that 18 million lines of code, written by tens of thousands of engineers, worked at all was a mini miracle.
With regard to compensation: like Karpathy, I had already earned enough to be comfortable for the rest of my life. Once money stopped being the primary driver, I was able to focus on what made me happy. Building things, even if you don't like them, brought me happiness and fulfillment. I hope Andrej finds the same at Anthropic.
If money was not an issue he could just build that environment for himself.
I mean, short gig, a few million dollars for Karpathy, so it makes sense for him, but others should read Cloudflare's report about the super scary model that Anthropic wouldn't release because they love humanity more than their balance sheet.
"However, it turns out, my deep passion can easily be put on hold with money. Also I'm not really sure what the definition of passionate is."
“According to reports from The Wall Street Journal and The Guardian, the AI model Claude, developed by Anthropic, was used in the initial U.S.-Israel military operations against Iran in late February 2026. The system, integrated into a platform developed by defense contractor Palantir, assisted with intelligence analysis, scenario planning, and targeting for strikes that resulted in the death of Iran's Supreme Leader, Ayatollah Ali Khamenei”
https://biggo.com/news/202603032121_Anthropic_Claude_AI_Used...
Skynet is winning.
But what is the solution? I don’t think it is safe for a society built on free speech and other liberal values to have a couple extremely powerful companies controlling all our information and imposing their rules and their politics on top of us. It was bad enough under the FAANG companies. This will be worse.
Personally I’m not comfortable with how much power Anthropic is accumulating. And with them partnering shamelessly with Elon Musk to use a datacenter powered by potentially illegal natural gas turbines, I feel like Dario is just not trustworthy.
Andrej Karpathy - @karpathy
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
May 19, 2026 · 3:05 PM UTC
Anthropic should capitalize on this opportunity to undercut their competitor’s platform. They don’t come around often.
(vs an HN blog author),
networks are communication and dissemination tools,
not principled stances.
Super offending to my sensibilities seeing the extent of slop in replies, and this is months and months ago now. Unbridled poorly prompted GPT-4o replies.
The main posts from smart/funny people are just as good as they would be written elsewhere, yes, but like at a restaurant, atmosphere’s pretty important too… don’t want to eat a tomahawk steak on an airport runway (whether or not the airport’s associated with My Heart Goes Out To You non-Roman non-salutes)
If you can screenshot all the good stuff and put it on Mastodon, thank you! ;) (hehe no perfect solutions to this thang)
[1] omg someone prompted GPT to use uncommon words and lowercase letters, and they posted the stupidest model output I’ve ever seen as “their” reaction… it was super disrespectful to make humans read even those few contrived sentences
Karpathy’s so smart and he has to deal with the reply quality we see on XCancel there… one click away from Hacker News and suddenly every reaction is trash instead of insight and deep insight we see here
(Plus the trash I post, but we’ve got some range, not monotonous spam & model output)
… Memphis/Southaven residents will get more air pollution.
Nobody who isn't fully on board with white supremacy (et tu, Andrej?) should be using Twitter at this point.
My “entertainment”, or intrigue, comes from the ability to impact my life.
Other people’s sporting struggles to hold my attention longer than the play itself, for that reason.