At work, the devs up the chain now do everything with AI – not just coding – then task me with cleaning it up. It is painful and time consuming, the code base is a mess. In one case I had to merge a feature from one team into the main code base, but the feature was AI coded so it did not obey the API design of the main project. It also included a ton of stuff you don’t need in the first pass - a ton of error checking and hand-rolled parsing, etc, that I had to spend over a week unrolling so that I could trim it down and redesign it to work in the main codebase. It was a slog, and it also made me look bad because it took me forever compared to the team who originally churned it out almost instantly. AI tools are not good at this kind of design deconflicting task, so while it’s easy to get the initial concept out the gate almost instantly, you can’t just magically fit it into the bigger codebase without facing the technical debt you’ve generated.
In my personal projects, I get to experience a bit of the fun I think others are having. You can very quickly build out new features, explore new ideas, etc. You have to be thoughtful about the design because the codebase can get messy and hard to build on. Often I design the APIs and then have Claude critique them and implement them.
I think the future is bleak for people in my spot professionally – not junior, but also not leading the team. I think the middle will be hollowed out and replaced with principals who set direction, coordinate, and execute. A privileged few will be hired and developed to become leaders eventually (or strike gold with their own projects), but everyone in between is in trouble.
If they're handing you broken code call them out on it. Say this doesn't do what it says it does, did you want me to create a story for redoing all this work?
This has to be the most thankless job for the near future. It's hard and you get about as much credit as the worker who cleans up the job site after the contractors are done, even though you're actually fixing structural defects.
And god forbid you introduce a regression bug cleaning up some horrible redundant spaghetti code.
Professionally, I have had almost no luck with it, outside of summarizing design docs or literally just finding something in the code that a simple search might not find: such is this team's code that does X?
I am yet to successfully prompt it and get a working commit.
Further, I will add that I also don't know any ICs personally who have successfully used it. Though, there's endless posts of people talking about how they're now 10x more productive, and everyone needs to do x y an z now. I just don't know any of these people.
Non-professionally, it's amazing how well it does on a small greenfield task, and I have seen that 10x improvement in velocity. But, at work, close to 0 so far.
Of the posts I've seen at work, they typically tend to be teams doing something new / greenfield-ish or a refactor. So I'm not surprised by their results.
I see it generating between 50% to 90% accuracy in both small and large tasks, as in the PRs it generates range between being 50% usable code that a human can tweak, to 90% solution (with the occasional 100% wow, it actually did it, no comments, let's merge)
I also found it to be a skillset, some engineers seem to find it easier to articulate what they want and some have it easier to think while writing code.
Meta, despite competing with these, is open to let their devs use better off the shelf tools.
The suggestions are correct about 40% of the time, so I'm actually surprised when they're right, rather than becoming reliant on them. It saves me maybe 10 minutes a day.
I'm enjoying myself so much. Projects I've been thinking about for years are now a couple of hours of hacking around. I'm readjusting my mental model of what's possible as a single developer. And I'm finally learning Go!
The biggest challenge right now is keeping up with the review workload. For low stakes projects (small single-purpose HTML+JS tools for example) I'm comfortable not reviewing the code, but if it's software I plan to have other people use I'm not willing to take that risk. I have a stack of neat prototypes and maybe-production-quality features that I can't ship yet because I've not done that review work.
I mainly work as an individual or with one other person - I'm not working as part of a larger team.
Where it consistently fails: anything involving the interaction between systems. If a bug spans a queue producer and its consumer, or the fix requires understanding how a frontend state change propagates through API calls to a cache invalidation - the model gives you a confident answer that addresses one layer and quietly ignores the rest. You end up debugging its fix instead of the original issue.
My stack: Claude Code (Opus) for investigation and bug triage in a ~60k LOC codebase, Cursor for greenfield work. Dropped autocomplete entirely after a month - it interrupted my thinking more than it helped.
A couple "win" examples: add in-text links to every term in this paragraph that appears elsewhere on the page, plus corresponding anchors in the relevant page parts. Or, replace any static text on this page with any corresponding dynamic elements from this reference URL.
Lose examples: constant, but edit format glitches (not matching searched text; even the venerable Opus 4.6 constantly screws this up), unnecessary intermediate variables, ridiculously over-cautious exception-handling, failing to see opportunities to isolate repeated code into a function, or to utilize an existing function that exactly implements said N lines of code, etc.
It works really well (using Claude Code and Opus 4.6 primarily). Incremental changes tend to be well done and mostly one-shotted provided I use plan mode first, and larger changes are achievable by careful planning with split phases.
We have skills that map to different team roles, and 5 different skills used for code review. This usually gets you 90% there before opening a PR.
Adopting the tool made me more ambitious, in the sense that it lets me try approaches I would normally discard because of gaps in my knowledge and expertise. This doesn't mean blindly offloading work, but rather isolating parts where I can confidently assess risk, and then proceed with radically different implementations guided by metrics. For example, we needed to have a way to extract redlines from PDF documents, and in a couple of days went from a prototype with embedded Python to an embedded Rust version with a robust test oracle against hundreds of document.
I don't have multiple agents running at the same time working on different worktrees, as I find that distracting. When the agent is implementing I usually still think about the problem at hand and consider other angles that end up in subsequent revisions.
Other things I've tried which work well: share an Obsidian note with the agent, and collaboratively iterate on it while working on a bug investigation.
I still write a percentage of code by hand when I need to clearly visualise the implementation in my head (e.g. if I'm working on some algo improvement), or if the agent loses its way halfway through because they're just spitballing ideas without much grounding (rare occurrence).
I find Elixir very well suited for AI-assisted development because it's a relatively small language with strong idioms.
Mostly using Gemini Flash 3 at a FAANG.
To be clear, this is not vibecoding. I have a strong sense of the architecture I want, and explicitly keep Claude on the desired path much like I would a junior programmer. I also insist on sensible unit and E2E test coverage with every incremental commit.
I will say that after several months of this the signalling between UI components is getting a bit spaghettilike, but that would’ve happened anyway, and I bet Claude will be good at restructuring it when I get around to that.
I also work in a giant Rails monolith with 15 years of accumulated cruft. In that area, I don’t write a whole lot, but CC Opus 4.6 is fantastic for reading the code. Like, ask “what are all the ways you can authenticate an API endpoint?” and it churns away for 5 minutes and writes a nice summary of all four that it found, what uses them, where they’re implemented, etc.
"Implement JWT token verification and role checking in Spring Boot. Secure some endpoints with Oauth2, some with API key, some public."
C# and Java are so old, whatever solutions you find are 5 years out of date. Having an agent implement and verify the foundation is the perfect fit. There's no design, just ever-chaning framework magic. I'd do the same "Google and debug" cycle, but 10 times slower.
Professionally I hardly use the tools for coding, since I’m in an architecture role and mostly write design docs and do reviews. And I write the occasional prototype.
I have started building tools to integrate copilot (Opus) better with $CORP. This way I can ask it questions across confluence and github.
Leveraging Claude for a project feels very addictive to me. I have to make a conscious effort to stop and I end up working on multiple projects at the same time.
I also use it as a final check on all my manually written code before sending it for code review.
With all that said, I have this weird feeling that my ability to quickly understand and write code is no longer noticeable, nor necessary.
Everyone now ships tons of code and even if I do the same without any LLM, the default perception will be that it has been generated.
I am not depressed about it yet, but it will surely take a while to embrace the new reality in its entirety
For throw away code, I might let the agent do some stuff. For example, we needed to test timing on DNS name resolution on a large number of systems to try and track down if that was causing our intermittent failures. I let an agent write that and was able to get results faster than if I did it myself, and I ultimately didn’t have to care about the how… I just needed something to show to the network team to prove it was their problem.
For larger projects that need to plugin to the legacy code base, which I’ll need to maintain for years, I still prefer to do things myself, using AI here and there as previously mentioned to help with little things. It can also help finding bugs more quickly (no more spending hours looking for a comma).
I had an agent refactor something I was making for a larger project. It did it, and it worked, but it didn’t write it in a way that made sense to my brain. I think others on my team would have also had trouble supporting it too. It took something relatively simple and added so many layers to it that it was hard to keep all the context in my head to make simple edits or explain to someone else how it worked. I might borrow some of the ideas it had, but will ultimately write my own solution that I think will be easier for other people to read and maintain.
Borrowing some of these ideas and doing it myself also allows me to continue to learn and grow, so I have more tools in my tool belt. With the DNS thing that was totally vibe coded, there were some new things in there I hadn’t done before. While the code made sense when I skimmed through it, I didn’t learn anything from that effort. I couldn’t do anything it did again without asking AI to do it again. Long-term, I think this would be a problem.
Other people on my team have been using AI to write their docs. This has been awful. Usually they don’t write anything at all, but at least then I know they didn’t writing anything. The AI docs are simply wrong, 100% hallucinations. I have to waste time checking the doc against the code to figure that out and then go to the person that did it to make them fix it. Sometimes no doc is better than a bad doc.
People having easy access to LLMs makes this job much harder. LLMs can create what looks at the surface like expert-written code, but suffers from below-the-surface issues that will reveal themselves as intermittent issues or subtle bugs after being deployed.
Inexperienced devs create huge commits full of such code, and then expect me to waste an entire day searching for such issues, which is miserable.
If the models don't improve significantly in the future, I expect that most high-stakes software teams will fire all the inexperienced devs and have super-experienced engineers work with the bots directly.
What works:
-Just pasting the error and askig what's going on here.
-"How do I X in Y considering Z?"
-Single-use scripts.
-Tab (most of the time), although that doesn't seem to be Claude.
What doesn't:
-Asking it to actually code. It's not going to do the whole thing and even if, it will take shortcuts, occasionally removing legitimate parts of the application.
-Tests. Obvious cases it can handle, but once you reach a certain threshold of coverage, it starts producing nonsense.
Overall, it's amazing at pattern matching, but doesn't actually understand what it's doing. I had a coworker like this - same vibe.
But even with Opus 4.6 max / GPT 5.4 high it takes time, you need to provide the right context, add skills / subagents, include tribal knowledge, have a clear workflow, just like you onboard a new developer. But once you get there, you can definitely get it to do larger and larger tasks, and you definitely get (at least the illusion) that it "understands" that it's doing.
It's not perfect, but definitely can code entire features, that pass rigorous code review (by more than one human + security scanners + several AI code reviewers that review every single line and ensure the author also understands what they wrote)
another teammate added a length check to an input field, and his request was merged near instantly, even though it had zero unit testing. this team is incredibly cooked in the long term, i just need to ensure that i survive the short term somehow.
As a sidenote, I highly doubt they are cooked longterm. Using AI is not exactly skilled labor. If they want or need I'm sure they could learn patterns/workflows in like an afternoon. As things go on it will only get easier to use.
Like yeah sorry... not everyone has to be a risk-taker. Many people like to observe and await to see what new techniques emerge that can be exploited.
People like you are making sunk expenditures whilst the models are evolving... they can just wait until the models get to 'steady-state' to figure out the optimal workflow. They will have lost out on far less.
Stack: go, python Team size: 8 Experience, mixed.
I'm using a code review agent which sometimes catches a critical big humans miss, so that is very useful.
Using it to get to know a code base is also very useful. A question like 'which functions touch this table' or 'describe the flow of this API endpoint' are usually answered correctly. This is a huge time saver when I need to work on a code base i'm less familiar with.
For coding, agents are fine for simple straightforward tasks, but I find the tools are very myopic: they prefer very local changes (adding new helper functions all over the place, even when such helpers already exist)
For harder problems I find agents get stuck in loops, and coming up with the right prompts and guardrails can be slower than just writing the code.
I also hates how slow and unpredictable the agents can be. At times it feels like gambling. Will the agents actually fix my tests, or fuck up the code base? Who knows, let's check in 5 minutes.
IMO the worst thing is that juniors can now come up with large change sets, that seem good at a glance but then turn out to be fundamentally flawed, and it takes tons of time to review
The manager & a senior dev on my first day told me to "Don't try to write code yourself, you should be using AI". I got encouraged to use spec-driven development and frameworks like superpowers, gsd, etc.
I'm definitely moving faster using AI in this way, but I legitimately have no idea what the fuck I am doing. I'm making PRs I don't know shit about, I don't understand how it works because there is an emphasis on speed, so instead of ramping up in a languages / technologies I've never used, I'm just shipping a ton of code I didn't write and have no real way to vet like someone who has been working with it regularly and actually has mastered it.
This time last year, I was still using AI, but using it as a pair programming utility, where I got help learn to things I don't know, probe topics / concepts I need exposure to, and reason through problems that arose.
I can't control the direction of how these tools are going to evolve & be used, but I would love if someone could explain to me how I can continue to grow if this actually is the future of development. Because while I am faster, the hope seems to be AI / Agents / LLMs will only ever get better and I will never need to have an original thought or use crtical thinking.
I have just about 4 years of professional experience. I had about 10 - 12 months of the start of my career where I used google to learn things before LLMs became sole singular focus.
I wake up every day with existential dread of what the future looks like.
The people forcing it down you do not care about the long-term ramifications.
It’s a lot of fun for exploring ideas. I’ve built things very fast that I would not have done at all otherwise. I have rewritten a huge chunk of semi-outdated docs into something useful with a couple of Prompts in a day. Claude does all the annoying dependency update breaks the build kinds of things. And the reviews are extremely useful and a perfect combination with human review as they catch things extremely well that humans are bad at catching.
But in the production codebase changes must be made with much more consideration. Claude tends to perform well ob some tasks but for other I end up wasting time because I just don’t know up front how the feature must look so I cannot write a spec at the level of precision that claude needs and changing code manually is more efficient for this kind of discovery for me than dealing with large chunks of constantly changing code.
And then there’s the fact that claude produces things that work and do the thing described in the prompt extremely well but they are always also wring in sone way. When I let AI build a large chunk of code and actually go through the code there’s always a mess somewhere that ai review doesn’t see because it looks completely plausible but contains some horrible security issue or complete inconsistency with the rest of the codebase or, you know, that custom yaml parser nobody asked for and that you don’t want your day job to depend on.
Claude recently tried to replace a html sanitizer with a custom regex that perfectly fit all our tests as well as the spec I wrote
I'm learning all the time and it's fun, exasperating, tremendously empowering and very definitely a new world.
It may sound absurd to review an implementation with the same model you used to write it, but it works extremely well. You can optionally crank the "effort" knob (if your model has one) to "max" for the code review.
I just got started using Claude very recently. I have not been in the loop how much better it got. Now it's obvious that no one will write code by hand. I genuinely fear for my ability to make a living as soon as 2 years from now, if not sooner. I figure the only way is to enter the red queen race and ship some good products. This is the positive I see. If I put 30h/week into something, I have productivity of 3 people. If it's a weekend project at 10h/week, I have now what used to be that full week of productivity. The economics of developing products solo have vastly changed for the better.
I have to very much be in the loop and constantly guiding it with clarifying questions but it has made running multiple projects in parallel much easier and has handled many tedious tasks.
I find it the most exciting time for me as a builder, I can just get more things done.
Professionally, I'm dreading for our future, but I'm sure it will be better than I fear, worse than I hope.
From a toolset, I use the usual, Cursor (Super expensive if you go with Opus 4.6 max, but their computer use is game changing, although soon will become a commodity), Claude code (pro max plan) - is my new favorite. Trying out codex, and even copilot as it's practically free if you have enterprise GitHub. I'm going to probably move to Claude Code, I'm paying way too much for Cursor, and I don't really need tab completion anymore... once Claude Code will have a decent computer use environment, I'll probably cancel my Cursor account. Or... I'll just use my own with OpenClaw, but I'm not going to give it any work / personal access, only access to stuff that is publicly available (e.g. run sanity as a regular user). Playing with skills, subagents, agent teams, etc... it's all just markdowns and json files all the way down...
About our professional future:
I'm not going to start learning to be a plumber / electrician / A/C repair etc, and I am not going to recommend my children to do so either, but I am not sure I will push them to learn Computer Science, unless they really want to do Computer Science.
What excites me the most right now is my experiments with OpenClaw / NanoClaw, I'm just having a blast.
tl;dr most exciting yet terrifying times of my life.
While the final impact LLMs will have is yet to be determined (the hype cycle has to calm down, we need time to see impacts in production software, and there is inevitably going to be some kind of collapse in the market at some point), its undoubtable that it will improve overall productivity (though I think it's going to be far more nuanced then most people think). But with that productivity improvement will come a substantial increase in complexity and demand for work. We see this playout every single time some tool comes along and makes engineers in any field more productive. Those changes will also take time, but I suspect we're going to see a larger number of smaller teams working on more projects.
And ultimately, this change is coming for basically all industries. The only industries that might remain totally unaffected are ones that rely entirely on manual labor, but even then the actual business side of the business will also be impacted. At the end of the day I think it's better to be in a position to understand and (even to a small degree) influence the way things are going, instead of just being along for the ride.
If the only value someone brings is the ability to take a spec from someone else and churn out a module/component/class/whatever, they should be very very worried right now. But that doesn't describe a single software engineer I know.
I run a small lab that does large data analytics and web products for a couple large clients. I have 5 developers who I manage directly, I write a lot of code myself and I interact directly with my clients. I have been a web developer for long enough to have written code in coldfusion, php, asp, asp.net, rails, node and javascript through microsoft frontpage exports, to jquery,to backbone, angular and react and in a lot of different frameworks. I feel this breadth of watching the internet develop in stages has given me a decent if imperfect understanding of many of the tradeoffs that can be made in developing for the web.
My work lately is on an analytics / cms / data management / gis platform that is used by a couple of our clients and we've been developing for a couple of years before any ai was used on it all. Its a react front end built on react-router-7 that can be SPA or SSR and a node api server.
I had tried AI coding a couple times over the past few years both for small toy projects and on my work and it felt to me less productive than writing code by hand until this January when I tried Claude Code with Opus 4.5. Since then I have written very few features by hand although I am often actively writing parts of them, or debugging by hand.
I am maybe in a slightly unique place in that part of my job is coming up with tasks for other developers and making sure their code integrates back, I've been doing this for 10 years plus, and personally my sucess rate with getting someone to write a new feature that will get use is maybe a bit over 50%, that is maybe generous? Figuring out what to do next in a project that will create value for users is the hard part of my job whether I am delegating to developers or to an AI and that hasn't changed.
That being said I can move through things significantly faster and more consistently using AI, and get them out to clients for testing to see if they are going to work. Its also been great for tasks which I know my developers will groan if I assign to them. In the last couple months I've been able to
- create a new version of our server that is free from years of cruft of the monorepo api we use across all our projects. - implement sqlite compatablity for the server (in addition to original postgres support) - Implement local first sync from scratch for the project - Test out a large number of optimization strategies, not all of which worked out but which would have taken me so much longer and been so much more onerous the cost benefit ratio of engaging them would have been not worth it - Tons of small features I would have assigned to someone else but are now less effort to just have the AI implement.
I think the biggest plus though is the amount of documentation that has accrued in our repo since using starting to use these tools. I find AI is pretty great at summarizing different sections of the code and with a little bit of conversation I can get it more or less exactly how I want it. This has been hugely useful to me on a number of occasions and something I would have always liked to have been doing but on a small team that is always under pressure to create results for our clients its something that didn't cross the immediate threshold of the cost benefit ratio.
In my own use of AI, I keep the bottleneck at my own understanding of the code, its important to me that I maintain a thorough understanding of the codebase. I couple possibly go faster by giving it a longer leash, but that trade off doesn't seem wise to me at this point, first because I'm already moving so much faster than I was very recently and secondly because it doesn't seem very far from the next bottleneck which is deciding what is the next useful thing to implement. For the most part, I find the AI has me moving in the right direction almost all the time but I think this is partly for me because I am already practiced in communicating the programmers what to implement next and I have a deep understanding of the code base, but also because I spend more than half of the time using AI adding context, plans and documentation to the repo.I have encouraged my team to use these tools but I am not forcing it down anyone's throat, although its interesting to give people tasks that I am confident I could finish much quicker and much more to my personal taste than assigning it. The reactions from my team are pretty mixed, one of the strongest contributors doesn't find a lot of gains from it. One has found similar productivity gains to myself, others are very against it and hate it.
I think one of the things it will change for me is, I can no longer just create the stories for everyone, learning how to choose on what to work on is going to be the most important skill in my opinion so over the next couple months I am going to be shifting so everyone on my team has direct client interactions and I am going to try to shift away from writing stories to having meetings where I help them decide on their own what to work on. Still part of the reason that I can afford to do this is because I can now get as much or more work done than I was able to with my whole team at this time last year.
That's a big difference in one way, and I am optimistic that the platform I am working on will be a lot better and able to compete with large legacy platforms that it wouldn't have been able to compete with in the past, but still it just tightens the loop of trying new things and getting feedback and the hardest part of the business is still communication with clients and building relationships that create value.
But in the end, the thing that pisses me off was a manager who used it to write tickets. If the product owner doesn't give a shit about the product enough to think and write about what they want, you'll never be successful as a developer.
Otherwise it's pretty cool stuff.
One of the things that has helped the most is all the documentation I wrote inside the repository before I started using AI. It was intended for consumption by other engineers, but I think Cursor has consumed it more than any human. I've even managed to make improvements not by having AI update it, but asking AI "What unanswered questions do you have based on reading the documentation?" It has helped me fill in gaps and add clarity.
Another thing I've gotten a ton of value with is having it author diagrams. I've had it create diagrams with both the mermaid syntax and AWSDAC (Diagram-as-Code). I've always found crafting diagrams a painstaking process. I have it make a first pass by analyzing my code + configuration, then make corrections and adjustments by explaining the changes I want.
In my own PRs, I have been in the habit of posting my Cursor Plan document and Transcript so that others can learn from it. I've also encouraged other team members to do the same.
I feel bad for any teams that are being mandated to use a certain amount of AI. It seems to me that the only way to make it work is by having teams experiment with it and figure out how to best use it given their product and the team's capacity. AI is like a pair of Wile-E-Coyote rocket skates. It'll get you somewhere fast, but unless you've cleared the road of debris and pointed in exactly the right direction, you're going to careen off a cliff or into a wall.
This fellow is one of the few mature software engineers I have ever met who is rigorously and consistently productive in a very challenging mature code base year in and year out. or WAS .. yes this is from coughgooglecough in California
https://repo.autonoma.ca/treetrek
There's still some work to do on the rendering side of model objects. Developing the syntax highlighting rules for 40 languages and file formats in about 10 minutes was amazing to see.
https://repo.autonoma.ca/repo/treetrek/tree/HEAD/render/rule...
Edit, great example. What is your long term maintenance strategy, do you keep the original prompts around so you can refine them later or do you dig into the source?
Would love to see more of your workflow.