Assuming 10x on the speed of dev, is the VS Code repo a decent example? Recently they've been all in on AI-augmented development, so I'm thinking they'd be a reasonable subject.
How do you isolate what counts as the "development" part of their delivery cycle (is that the dev inner loop, and does that show up in commit frequency?) to measure it and see if it's running at 10x?
https://github.com/microsoft/vscode/graphs/contributors?from...
AI is not delivering 10x shareholder value, anywhere. Software developers have quite the level of hubris about how important they are to companies. Yes our work is very complex and takes a certain mindset to do it well. It takes a lot of other roles to have a successful business, many of those roles will use AI to help draft slide decks, emails, etc. and that's the limit for them.
Look at recent companies doing layoffs claiming it's because of AI, like Cloudflare and Coinbase. Do their reported financials paint the picture that they are crushing it with AI? No, it's net losses in the hundreds of millions.
One of the latest things I made with Claude was a tool that allowed me to move a bunch of very low traffic Cloud Run services to a single VPS without losing any of the Cloud Run benefits such as easy Docker-based deployment and automatic certificate provisioning. I thought about making something like that for quite some time, and Claude finally made it possible, which makes me quite happy.
The fun thing here is that no other soul genuinely cares about it, or any other code I might publish. The code, especially AI generated, is so cheap that if anyone wants to repeat my steps to get rid of Cloud Run services, they will probably vibe-code their own tool instead of figuring out how to use mine, just like I did that instead of spending time on learning Dokku or similar solutions.
So, yes, 10x and more, but no one cares about the result, which makes the whole 10x measurement less useful.
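For context, one common way to get the two Cloud Run benefits mentioned above on a single VPS (this is not the commenter's actual tool, and the service names are hypothetical) is to run a single Caddy instance in front of the Docker containers; Caddy provisions and renews TLS certificates automatically per hostname:

```
# Caddyfile sketch (hypothetical hostnames and ports).
# Each former Cloud Run service runs as a local Docker container;
# Caddy terminates TLS and proxies by hostname.
svc-a.example.com {
    reverse_proxy localhost:8081
}
svc-b.example.com {
    reverse_proxy localhost:8082
}
```

Deployment then stays Docker-based: rebuild the image, restart the container, and the certificate side needs no manual intervention.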
I build things I never would have. My tooling is better and more robust than ever. I verify and test my work better than ever. I fix more bugs than I used to simply because no one needs to care if it fits into a cycle. I explore and solve more problems in more parts of the application, even if I don’t write code. I take better care of our infrastructure. Performance goes up, bugs go down, AWS resources scale back, costs go down. I’ve paid for my AI usage in scaled back resources several times over at this point.
It might not be 10x but it’s a significant multiple.
It's when they practically ignore the rabbit holes that it's suspect. I'm definitely seeing speedups. I troubleshot a Linux system yesterday with minimal effort using a local LLM. It likely would have taken me a few hours to locate all the docs and testing procedures; the LLM did it with only a few prompts. To ensure it did it correctly, I had to interrogate it a few times before letting it proceed.
Humans make really bad scientists, and it takes a lot of effort to properly catalog and provide statistics for these things.
There is an improvement, but I doubt any random dev can give a real estimate, since before LLMs they couldn't really give you a real estimate anyway. I do know that when I encounter a bug now, debugging is almost immediately possible.
Direct github link: https://github.com/open-noodle/gallery
Nothing wrong with forks though.
I've always been a backend engineer, never front end. And almost every team I've been on has lacked any front end skills at all, so all our tools end up being a mash of scripts, maybe sometimes an API.
Now we are all front-end engineers creating UIs for things we could never do before, and this pushes API-first development, so the CLI + UI are just calling APIs. Nothing new here, but this used to be what whole teams did; now a single person does it.
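A minimal sketch of that API-first shape (hypothetical endpoint, Python stdlib only): one HTTP API, with the "CLI" as just another client of it, hitting the same endpoint the web UI would call from the browser.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Api(BaseHTTPRequestHandler):
    """The single source of truth: both the CLI and the UI call this."""
    def do_GET(self):
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet

# Bind to an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), Api)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The "CLI": a plain HTTP client against the same API the UI would use.
port = server.server_address[1]
with urllib.request.urlopen(f"http://127.0.0.1:{port}/status") as resp:
    result = json.loads(resp.read())["status"]
print(result)
server.shutdown()
```

The point of the shape is that no logic lives in the CLI or the UI; both are thin clients, so adding a second front end costs almost nothing.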
1. I would not have attempted this without AI assistance because it's a big project.
2. I have built a functional program that I am able to use for real work in a handful of weeks, working part time on this (like literally a few hours per day prompting Claude and Kimi).
3. Had I decided to do this without AI assistance it would have been months of work.
https://github.com/KeibiSoft/KeibiDrop
Two years ago it took me around 2k hours to build a cross-platform FUSE vault, without using AI-assisted tools.
The pain was debugging through logs and system traces. And understanding how things work.
Now I managed to ship this one much faster, as an after-hours project. I started it in May 2025, and around the end of November 2025 I started using Claude on it.
Just dumping logs into Claude and explaining the attack vector for the problems saved me the FML moments of grinding through walls of syscalls on 3 platforms.
I would say it's much easier to progress and ship with the same rigour, while minimizing the time, focus, and brainpower involved, so I can put the energy somewhere else.
Trying to fix syntax errors in string interpolation on a 5-minute-delay loop is hell.
So my agent just listens for green checks and no PR comments and loops until those conditions are met.
It might tend to deviate and waste time; it needs guiding once in a while, and you have to check what it is spewing out and point it in the correct direction.
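That "loop until green" control flow is simple to sketch. This is a hypothetical version, not the commenter's agent, with the CI query injected as a callable so the GitHub specifics (e.g. a CLI or REST call checking PR status and review comments) stay out of the loop logic:

```python
import time

def wait_for_green(check_status, poll_seconds=300, sleep=time.sleep):
    """Poll until the PR is 'green': all checks passed and no new
    review comments. check_status() returns (checks_green, no_comments);
    in real use it would query the CI/PR API."""
    while True:
        checks_green, no_comments = check_status()
        if checks_green and no_comments:
            return
        sleep(poll_seconds)

# Simulated run: red checks, then green-but-commented, then fully green.
states = iter([(False, True), (True, False), (True, True)])
waits = []  # record each sleep instead of actually sleeping
wait_for_green(lambda: next(states), sleep=waits.append)
print(len(waits))  # two poll delays before the exit condition was met
```

Injecting `sleep` also makes the loop testable without waiting out the 5-minute delay.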
If I had to write the code myself, it would take around 8 hours of constant writing to get around 1k LoC. For tricky FUSE-level stuff, I might need to spend 3 weeks for 10 LoC. Very easy to burn out and build up pain.
Complete frontend + backend + database.
Yes, it is an internal app, but it works and everyone loves it.
Does that count as an example?
(Also I absolutely expect him to need help at some point, but so far it has taken his project from absolutely impossible to 3 weeks of work in between work, renovating his house and being a dad for the first time so I was very impressed.)
We decided to integrate our SaaS into Microsoft Business Central and NetSuite as plugins into those systems. BC has its own programming language, called AL, that has a lot of idiosyncrasies from any other language I've worked with. And NetSuite plugins are written in SuiteScript, which is a custom JS runtime with a ton of APIs to learn.
In the "before", it would've taken 5 developers a year or more to build those integrations. I did both by myself in well under a year. Thank you Claude.
When concrete things like that start to happen, then I will start to believe in the 10x claim.
I don't think we'll see AAA game velocity change until asset generation progresses quite a bit, not to mention stuff like rigging. Even then, there's still a layer between code and engine where you have to wire everything together which an LLM will struggle with.
Replacing some old COBOL is probably more of a management decision based on appetite for change and politics rather than development speed.
Aren't there some measurable things like github repo creation, PRs, app store additions, etc. that can be correlated to LLM adoption? Didn't Show HN have to get throttled after LLMs arrived?
Take LLMs out of that safe space and suddenly they are no silver bullet; in fact, they are useless.
So of course those making the 10x claim mean it within the safe space where LLMs can handle all the required activities. You can't have it both ways: 10x, and tasks that are difficult and confusing for LLMs.
How many people are writing crud apps using mainstream languages vs COBOL though? You don't need 100% silver bullet 1-shot everything, just to recognize the signals that for many use cases, there's a significant shift happening. The safe space is expanding and velocity is increasing.
AI requires a large amount of fragile resources to work, as opposed to just an editor, a keyboard, and a human.
In some sense it's a bit like the bitcoin revolution, which slowed down once transaction times ballooned. And blockchains didn't replace databases as expected, probably for very good reasons: resources required vs. results delivered.
I personally agree that AI is great technology for some great new tools. But we still haven't found its limits on cost vs. results. What happened with bitcoin and blockchains is still an open question for AI.
So increasing individual output by itself is not enough to affect the argument. It could, if you also reduced the number of people needed for a project, where "people" means everyone involved in the project, not just SWEs. But there are strong forces in large orgs pulling toward larger project sizes: budgeting overhead, and the usual "large orgs optimize for legibility" kind of argument.
IMO the only way this will change is when new companies will challenge existing big guys. I think AI will help achieve this (e.g. agentic e-commerce challenging the existing players), but it will take time.
At _this_ moment, AI is in the business of producing things, if you like at a factor of 10 or more. But what comes afterwards, when all this mush of code has to deliver _reliable_ results? That will take not man-months but man-years or decades, to fix these billions, maybe trillions, of opaque probabilistic LOC. You have to take the mean of these two stages, if nothing qualitative happens to the models.
I’m being glib, but there’s a whole class of software (e.g. simple CRUD apps) that just doesn’t have any marginal value anymore. So it doesn’t matter if it’s 10x faster or 100x faster: 100 x $0 is still $0.
Which is what I’m seeing at my job. All of these “afternoon vibe code” projects never actually get users because everyone just vibe-codes their own.
The things that the software does might have value, but the marginal utility of your software is effectively 0.
But after people's expectations adjusted it was just back on the treadmill.
I don't think we've found a new steady-state yet, but I have some gut feeling guesses about where it's going to be.
In my experience, stuff like Rails had negligible impact in my field, because companies would always require solid backing from some big-name vendor (MS, Oracle, IBM, Sun back in the day, or even SAP).
So most if not all of the smaller silver bullets did not even make a blip on the radar... and stuff like Java or .NET, while definitely better than C or COBOL, did not really deliver in terms of a productivity boost (in part because, as noted in the message I am answering, expectations kept growing at the same pace).
I remember when coding was free as in beer and freedom!
Clearly... it still wasn't a silver bullet, because output as a metric is a bad one. I thought it was the only one managers valued... but apparently Anthropic has finally convinced devs to value it too? I guess it definitely hits that dopamine receptor hard.
We also seem to fall into these ruts of not understanding what is meant by labor productivity. When an economist is presenting the common outputs / inputs measure, they don't mean raw quantity of output. They're talking about the value added by outputs divided by the value of inputs. Churning out software faster that doesn't earn anybody additional revenue is not making us more productive. It's disheartening that even c-suites with business education don't seem to understand this. That's not to say there is no productivity gain. Plenty of AI-adjacent hyperscalers are seeing ridiculous growth right now, but no non-startup is seeing revenue 10x what it was the year before, not even NVIDIA.
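A toy calculation (the numbers are entirely made up) shows why raw output doesn't move the economist's measure:

```python
def labor_productivity(value_added, input_cost):
    """The value-based measure: value added by outputs / value of inputs."""
    return value_added / input_cost

# Same revenue attributable to the software, same payroll and tooling
# cost, but 10x the code shipped:
before = labor_productivity(value_added=1_000_000, input_cost=500_000)
after = labor_productivity(value_added=1_000_000, input_cost=500_000)
print(after / before)  # 1.0: ten times the code, zero productivity gain
```

Neither term of the ratio mentions lines of code or feature count; only value added and input cost appear, which is the whole point.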
A lot of this is just basic diminishing marginal utility. There is only so much value to be added. When we talk about ultimately delivering value, software is usually either a semi-automated controller or human-decision augmenter for some kind of physical manufacturing process, or entertainment. Everything else is an intermediate input. We can only be so entertained. For physical goods, we have food, space, and clothing at a level that seemingly satisfies just about everyone, with the remaining gaps being problems of distribution. Unless your software manages to solve borders, bigotry, cultural incompatibility, poverty, mental illness, physical illness, or violence, I'm not sure what the other big rocks are. Software can absolutely be a key part of the infrastructure that facilitates distribution. That's exactly what the Internet is, along with all the backend business and logistics systems out there. But without hitting the true big rocks, where is the 10x value supposed to come from? We're talking incremental gains simply because we're not in the dark ages anymore and incremental gains are all that's left. Short of Star Trek style replicators and transporters, I'm not sure what could realistically multiply global value by 10.
Without the value, then sure, you may be churning out 10 times as many discrete projects used by at least one person, or 10 times as many lines of code, but that was never the point. Your personalized notes and grocery ordering apps you share with your wife might excite you for a few weeks, but I can assure you they aren't going to revolutionize your life.
When I measure software dev, delivery of code isn't even a metric I care about. It is a key part of the process, to be sure, but I care about results - Did we ship? Did it work? Do we have happier customers and a smaller bug list?
In my experience, while I can answer "yes" to those questions on people who use AI assistance surgically, applying it where its strengths lie... I can answer an emphatic "No" for the teams I've worked with who are "AI-first", making the AI usage itself part of their goals.
I too can vastly increase my speed of development when I stop caring about the quality.
Features are harder to show the limits of, but have you ever had a client or boss who didn't know what they wanted, they just kept asking for stuff? 100 sequential tickets to change the contrast of some button can be closed in record time, but the final impact is still just the final one of the sequence.
Or have you experienced bike-shedding* from coworkers in meetings? It doesn't matter what metaphorical colour the metaphorical bike shed gets painted.
Or, as a user, had a mandatory update that either didn't seem to do anything at all, or worse moved things in the UX around so you couldn't find features you actually did use? Something I get with many apps and operating systems; I'd say MacOS's UX peaked back when versions were named after cats. Non-UX stuff got better since then, but the UX (even the creation of SwiftUI as an attempt to replace UIKit and AppKit) feels like it was CV-driven development, not something that benefits me as a user.
You can add a lot of features and close a lot of tickets while adding zero-to-negative business value. When code was expensive, that cost could be used directly as a reason to say "let's delay this"; now you have to explain more directly to the boss or the client why they're asking for an actively bad thing instead of it being a replacement of an expensive gamble with a cheap gamble. This is not something most of us are trained to do well, I think. Worse, even those of us who are skilled at that kind of client interactions, the fact of code suddenly being cheap means that many of us have mis-trained instincts on what's actually important, in exactly the way that those customers and bosses should be suspicious of.
Extreme example, but it exemplifies the point.
there are entire C corps of monkeys out there
If you’re 10x more productive, someone is willing to pay you 10x as much as they were last year, because you’re producing 10x as much value as before.
Has your salary increased 10x?
That's too simplistic, because the rest of the economy isn't static. Everyone is getting access to AI tooling; if the whole field gets a productivity increase, the baseline changes, and you don't just become 10x more valuable. The previous work is now way less valuable than it was before. It's also not clear to me that productivity gains from AI convert 1:1 into profit gains.
You should expect this to be reflected in the labour market somewhere. Maybe not your own salary, but in somebody’s salary.
Lean development theory teaches us that in a multi-workstream, multi-stage development process, developers should be kept at roughly 65-75% utilization. Otherwise, contra-intuitively, work queue lengths increase exponentially the closer utilization gets to 100%. The reason is that slack in the system absorbs and smooths perturbations and variability, which are inevitable.
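The textbook M/M/1 queue makes that non-linearity concrete: the expected number of items in the system is rho / (1 - rho), which explodes as utilization rho approaches 1.

```python
def expected_in_system(rho):
    """Expected number of items in an M/M/1 queue at utilization rho."""
    assert 0 <= rho < 1, "formula only holds below full utilization"
    return rho / (1 - rho)

for rho in (0.65, 0.75, 0.90, 0.95, 0.99):
    print(f"utilization {rho:.0%}: ~{expected_in_system(rho):.1f} in system")
```

Going from 75% to 95% utilization grows the queue more than sixfold (3 to 19), which is the lean argument for keeping slack in one line. Real development pipelines aren't M/M/1, but the hockey-stick shape of the curve is the same.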
Furthermore, underutilization is also highly comparable to stock market options: their value increases as variability increases. Slack enables quick pivots with less advance notice. It builds continuous responsiveness into the system. And as the Agile Manifesto tells us, excellent software development is more characterized by the ability to respond to change than the mere ability to follow a plan. Customers appreciate responsiveness from software vendors; it builds trust, which is increasing in value all the more with the rise of AI.
But AI-driven development threatens to increase, not decrease individual engineer utilization. More is expected, more is possible, and frankly, once you learn how to guardrail the AI and give it no trust to design well analytically, the speed a senior engineer can achieve while writing great code with AI assistance often feels intoxicating.
I think we're going to go through a whole new spat of hard, counterintuitive lessons similar to those many 1960s and 70s developers like Fred Brooks and his IBM team learned the hard way.
What this article explains is why, despite your feelings of untouchable success, on average the experience of using software just keeps getting worse and worse, making this the worst era for software quality I've ever lived through.
Didn't we already do this with every company looking to hire "rockstar programmers"? I don't recall that that ended well.
I don't think anyone has really wrestled with the implications of that yet. We've started talking about "deskilling" and "cognitive debt", but mostly in the context of "programmers are going to forget how to structure code, how to use the syntax of their languages, and so on." I'm not worried about that, as it's the same sort of thing we've seen for decades: compilers, higher-level languages, better abstractions, and so on.
The fact that LLMs are able to wrestle with essential complexity means that using them is going to push us further and further from the actual problems we're trying to solve. Right now, it's the wrestling with problems that helps us understand what those problems are. As our organizations adopt LLMs that are able to take on _those_ problems - that is, customer problems, not problems of data, scaling, and so forth - will we hit a brick wall where we lose that understanding? Where we keep shipping stuff but it gets further and further from what our customers need? How do we avoid that?
> AI is the silver bullet - my output is genuinely 10X what it was before claude code existed.
Those are not the same.
You can add 5 different features to a project and still provide less value than the 5-line diff that resolves a performance bottleneck.
I don't know if, overall, it's a 10x improvement or 6x or 14x, but it's a serious contender. Part of it is that LLMs are very uneven in their performance across domains. If all I built were simple landing pages, it might be a 100x improvement. If I work on more complex, proprietary code where there aren't great examples in the training data, then it might be a 10% improvement (it helps me write better comments or something).
You still have to read the output of your LLM. Learning by reading alone and not doing is not nearly as effective.
Also, I know that there will be a lot of boilerplate applications that just don't look good or don't seem to have been thought out well early on.
Folks will use that as a cope mechanism, but huge changes are coming.
The premise is that software development has been mostly "essential complexity" rather than "accidental complexity." But I think anyone who has worked as an SE in the past decade will have found the opposite to be true.
It's not only that software development is full of accidental complexity; programmers (and the decision makers above them) have always been actively creating it. Making a GUI program hasn't gotten easier since Visual Basic. In fact, with each JavaScript framework and technique that wraps around the DOM render engine, it has gotten harder over the years, until LLMs made it easier again (by creating a permanent dependency on LLMs; if you intend to edit the code manually afterwards, it became even harder!).