One of two main reasons why I'm wary of LLMs. The other is fear of skill atrophy. These two problems compound. Skill atrophy is less bad if the replacement for the previous skill does not depend on a potentially less-than-friendly party.
It was an experiment to see if I could enter a mature codebase I had zero knowledge of, look at it entirely through an AI, and come to understand it.
And it worked! Even though I've only worked on the codebase through Claude, whenever I pick up a ticket nowadays I know what file I'll be editing and how it relates to the rest of the code. If anything, I have a significantly better understanding of the codebase than I would without AI at this point in my onboarding.
I’ve been 90% vibe coding for a year or so now, and I’ve learned so much about networking just from spinning up a bunch of docker containers and helping GPT or Claude fix niggling issues.
I essentially have an expert (well, maybe not an expert, but an entity far more capable than I am on my own) whose shoulder I can look over, who I can ask as many questions as I want, and who will explain every step of the process to me if I want.
I’m finally able to create things on my computer that I’ve been dreaming about for years.
I usually learn way more by having Claude do a task and then quizzing it about what it did than by figuring out how to do it myself. When I have to figure out how to do the thing, it takes much more time, so when I'm done I have to move on immediately. When Claude does the task in ten minutes I now have several hours I can dedicate entirely to understanding.
When you have a headache, do you avoid taking ibuprofen because one day it may not be available anymore? Two hundred years ago, if you gave someone ibuprofen and told them it was the solution for 99% of the cases where they felt some kind of pain, they might be suspicious. Surely that's too good to be true.
But it's not. Ibuprofen really is a free lunch, and so is AI. It's weird to experience, but these kinds of technologies come around pretty often, they just become ubiquitous so quickly that we forget how we got by without them.
If that happened at this point, it would be after societal collapse.
Every now and then I pause before I ask an LLM to undo something it just did or answer something I know it answered already, somewhere. And then I remember oh yeah, it's an LLM, it's not going to get upset.
My syntax writing skills may well be atrophying, but I'll just do a leetcode by hand once in a while.
>I have a significantly better understanding of the codebase than I would without AI at this point in my onboarding
One of the pitfalls of using AI to learn is the same one I'd see students hit pre-AI with tutoring services. They'd have tutors explain the homework to them and even work through the problems with them. Thing is, any time you see a problem or concept solved, your brain is tricked into thinking you understand the topic well enough to do it yourself. It's why people think their job interview questions are much easier than they really are; things just seem obvious once you've seen the solution. Anyone who's read a tutorial, felt like they understood it well, and then struggled for a while to actually start using the tool to make something new knows the feeling very well. That Todo List app in the tutorial seemed so simple, but the author was constantly making decisions that you didn't have to think about as you read it.
So I guess my question would be: If you were on a plane flight with no wifi, and you wanted to do some dev work locally on your laptop, how comfortable would you be vs if you had done all that work yourself rather than via Claude?
Probably about as comfortable as I would be if I also didn't have my laptop and instead had to sketch out the codebase in a notebook. There's no sense preparing for a scenario where AI isn't available - local models are progressing so quickly that some kind of AI is always going to be available.
I've worked with people who will look at code they don't understand, say "llm says this", and express zero intention of learning something. Might even push back. Be proud of their ignorance.
It's like, why even review that PR in the first place if you don't even know what you're working with?
A good dev would've read deeper into the concern and maybe noticed potential flaws, and if they had doubts about what the concern meant, would have asked for clarification. Not just fed the concern into AI and flung the response back. Please, in this day and age of AI, give people the benefit of the doubt: someone raising a concern has probably already checked with AI if they had any doubts of their own...
I spent years cultivating expertise in C++ and .NET. And I found that time both valuable and enjoyable. But that's because it was a path to solve problems for my team, give guidance, and do so with both breadth and depth.
Now I focus on problems at a higher level of abstraction. I am certain there's still value in understanding ownership semantics and using reflection effectively, but they're broadly less relevant concerns.
And no, I don't understand them at all. Taking responsibility for something, improving it, and stewarding it into production is a fantastic feeling, and much better than reading the comment section. :)
We have gone multi-cloud with disaster recovery on our infrastructure. Something I would not have done yet, had we not had LLMs.
I am learning at an incredible rate with LLMs.
But I’m so much more detached from the code. I don’t feel that ‘deep neural connection’ that comes from actually spending days locked in a refactor or debugging a really complex issue.
I don’t know how I feel about it.
Sure, you don't know the code by heart, but people debugging compiled code at the assembly level already deal with that.
The big difference is being able to unleash scripts that invalidate an enormous number of hypotheses very fast, and that can analyze the data.
Doing that by hand used to take hours, so it was a last-resort approach. Now it's very cheap, so testing many hypotheses is way cheaper!
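The "script that invalidates hypotheses" workflow can be sketched in a few lines of plain Python. Everything here is made up for illustration: the buggy function stands in for whatever code is under suspicion, and each hypothesis is just a predicate checked against a batch of inputs.

```python
# Hypothetical sketch: throw many inputs at a suspect function and report
# which bug hypotheses the evidence supports, instead of testing them by hand.
def buggy_count_words(s):
    # Stand-in buggy code: split(" ") mishandles "" and runs of spaces.
    return len(s.split(" "))

hypotheses = {
    "breaks on empty string":    lambda s, out: s == "" and out != 0,
    "breaks on repeated spaces": lambda s, out: "  " in s and out != len(s.split()),
    "breaks on single spaces":   lambda s, out: s.count(" ") == 1 and out != len(s.split()),
}

inputs = ["", "a b", "a  b", "hello world", "x   y z"]

for name, predicate in hypotheses.items():
    evidence = [s for s in inputs if predicate(s, buggy_count_words(s))]
    print(f"{name}: {'SUPPORTED' if evidence else 'invalidated'} {evidence!r}")
```

Two hypotheses come back supported with concrete counterexamples and one is invalidated in milliseconds; scaling `inputs` to thousands of cases is what makes this cheap compared to manual checking.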
I feel like my "debugging ability" in terms of value delivered has gone way up. The skill itself is changing in ways I can't quite pin down, but the value I'm delivering in debugging sessions has gone way up.
But if you don't and there's no PR process (side projects), the motivation to form that connection is quite low.
No, because you can get LLMs to produce high quality code that has gone through an infinite number of refinement/polish cycles and is far more exhaustive than the code you would have written yourself.
Once you hit that point, you find yourself in a directional/steering position divorced from the code since no matter what direction you take, you'll get high quality code.
You very much decide how you employ LLMs.
Nobody is holding a gun to your head to make you use them. In a certain sense.
So if you use them in a way that increases your inherent risk, then you are using them incredibly wrong.
I understand why a designer might read this post and not be happy about it. If you don't think your management values or appreciates design skill, you'd worry they're going to glaze over the bullet points about design productivity, and jump straight to the one where PMs and marketers can build prototypes and ignore you. But that's not what the sales pitch is focused on.
Neither of those is necessarily a synonym for why you personally use them
If you don't know what's going on through the whole process, good luck with the end product.
This all bumps up against the fact that most people default to “you use the tool wrong” and/or “you should only use it to do things where you already have firm grasp or at least foundational knowledge.”
It also bumps against the fact that the average person is using LLMs as a replacement for standard Google search.
The latent assumption here is that learning is zero sum.
That you can take a 30-year-old from 1856, bring them into the present day, and they will learn whatever subject as fast as a present-day 20-year-old.
That teachers don't matter.
That engagement doesn't matter.
Learning is not zero sum. Some cultural backgrounds make learning easier, some mentoring makes it easier, and some techniques increase engagement in ways that increase learning speed.
That’s product atrophy, not skill atrophy.
Could you do it again without the help of an LLM?
If no, then can you really claim to have learned anything?
And yes. If LLMs disappear, then we need to hire a lot of people to maintain the infrastructure.
Which naturally is a part of the risk modeling.
Not what I asked, but thanks for playing.
> Could you do it again without the help of an LLM?
Well, yes?
What do you think "learning" means? If you cannot do something without the teacher, you haven't learned that thing.
If your child says they've learned their multiplication tables but they can't actually multiply any numbers you give them, do they actually know how to do multiplication? I would say no.
It’s quite possible to be deep into solving a problem with an LLM guiding you where you’re reading and learning from what it says. This is not really that different from googling random blogs and learning from Stack Overflow.
Assuming everyone just sits there dribbling whilst Claude is in YOLO mode isn’t always correct.
> Could you do it again on your own?
Can you see how nonsensical your stance is? You're straight up accusing the GP of lying about learning at an increased rate, OR suggesting that if they couldn't learn it (presumably at the same rate) on their own, they're not learning anything.
It's not very wise to project your own experiences onto others.
Not everyone learns at the same pace, and not everyone has the same fault-tolerance threshold. In my experience some people are what I call "Japanese learners": perfecting by watching. They will learn with AI but would never try it themselves out of fear of getting something wrong, even when they understand most of it. Others, whom I call "western learners," will start right away and "get their hands dirty" without much knowledge, and also get it wrong right away. Both are valid learning strategies fitting different personalities.
What an interesting paradox-like situation.
Well, if the internet is down, so is our revenue, buddy. Engineering throughput would be the last of our concerns.
I don't believe it. Having something else do the work for you is not learning, no matter how much you tell yourself it is.
Having other people do work for you is how people get to focus on things they actually care about.
Do you use a compiler you didn't write yourself? If so can you really say you've ever learned anything about computers?
Open your eyes, and you might become a believer.
Indeed, quite weird and no imagination.
It does seem like there is a cult of people who categorically see LLMs as being poor at everything, without that view being founded in any experience other than their one afternoon in 2023 playing around with them.
Can’t you be satisfied with outcompeting “non believers”? What motivates you to argue on the internet about it? Deep down are you insecure about your reliance on these tools or something, and want everyone else to be as well?
It feels so off to be rebuilding serious SaaS apps in days for production, only to be told it's not possible.
And not even just understanding, but verifying that they’ve implemented the optimal solution.
When future humans rediscover mathematics.
And don’t get me started on memory management. Nobody even knows how to use malloc(), let alone brk()/mmap(). Everything is relying on automatic memory management.
I mean when was the last time you actually used your magnetized needle? I know I am pretty rusty with mine.
Yeah, exactly.
It’s like saying clothing manufacturers are paying the “loom tax” when they could have been weaving by hand…
Where producing 2x the t-shirts will get you ~2x the revenue, it's quite unlikely that 10x the code will get you even close to 2x revenue.
With how much of this industry operates on 'Vendor Lock-in' there's a very real chance the multiplier ends up 0x. AI doesn't add anything when you can already 10x the prices on the grounds of "Fuck you. What are you gonna do about it?"
Open source libraries and projects together with open source AI is the only way to avoid the existential risks of closed source AI.
I don't know about 10x, but this could only happen if PMs suddenly got really lazy or the engineers actually got at least 1.5x faster. My gut says it's even more than that, because we're now also consistently up to date on our dependencies and completing massive refactors we'd been putting off for years.
There are lots of reasons this could be the case. Quality suddenly changed, the nature of the work changed, engineers leveled up... But for this to have happened consistently across a bunch of engineering teams is quite the coincidence if not this one thing we are all talking about.
The evangelists told us 20 years ago that if we weren't doing TDD then we weren't really professional programmers at all. The evangelists told us 10 years ago that if we were still running stuff locally then we must be paying a fortune for IT admin or not spending our time on the work that mattered. The evangelists this week tell us that we need to be using agents to write all our code or we'll get left in the dust by our competitors who are.
I'm still waiting for my flying car. Would settle for some graphics software on Linux that matches the state of the art on Windows or even reliable high-quality video calls and online chat rooms that don't make continental drift look fast.
This doesn't happen. Literally zero evidence of this.
If the actual rate is .9x then it matters a lot.
Or even if it's like 1.1x, is the cost worth the return?
Would it matter?
Frontier labs are incentivized to keep it that way, and they're investing billions to make AI = API the default. But that's a business model, not a technical inevitability.
I've had to tune out of the LLM scene because it's just a huge mess. It feels impossible to actually get benchmarks, it's insanely hard to get a grasp on what everyone is talking about, there are bots galore championing whatever model; it's just way too much craze, hype, and misinformation. What I do know is we can't keep draining lakes with datacenters, and we can't keep letting companies that are willing to heel-turn on a whim control the output of all other companies. That's not going to work; we collectively have to find a way to make local inference the path forward.
Everyone's foot is on the gas: all orgs, all execs, all people working jobs. There's no putting this stuff down, and it's exhausting, but we have to be using Claude like _right now_. Pretty much every company is already completely locked into OpenAI/Gemini/Claude, and some unfortunate ones into Copilot. This was a utility vendor lock-in capture that happened faster than anything I've ever seen in my life, and I'm already desperate for a way to get my org out of it.
I get choice paralysis when you show me a prompt box-- I don't know what I can reasonably ask for and how to best phrase it, so I just panic. It doesn't help when we see articles saying people are getting better outcomes by adding things like "and no bugs plz owo"
I'm sure this is by design-- anything with clear boundaries and best practices would discourage gacha style experimentation. Can you trust anyone who sells you a metered service to give you good guidance on how to use it efficiently?
I don't know how else to phrase it: this feels like such an unstable landscape. "Beta" software and services are running rampant in every industry/company/org, and there's absolutely no single resource we can turn to to help stay ahead of and plan for the rapidly evolving landscape. Every, and I mean every, company is incredibly irresponsible for using this stuff, including my own. Once again though, the cat's already out of the bag. Now we fight for our lives trying to contain it and ensure things are well understood and implemented properly... which seems to be the steepest uphill battle of my life.
But it requires that one does not do something stupid.
E.g., for recurring tasks: keep the task specification in the source code and just ask Claude to execute it.
The same goes for all documentation, etc.
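The "task spec lives in the source" pattern can be sketched without depending on any particular agent tooling: the recurring task is written down as a comment block in the repo, and a few lines of code extract it into the prompt you hand to the agent. The file name, `# TASK:` marker, and the task itself are all made up for illustration.

```python
# Hypothetical sketch: a recurring-task spec kept as a comment block in
# source, extracted into a prompt for an agent to execute verbatim.
import re
from pathlib import Path

SOURCE = Path("billing_demo.py")
SOURCE.write_text('''\
# TASK: regenerate-invoices
#   1. Re-run the monthly invoice job for the previous month.
#   2. Diff the totals against last month's report.
#   3. Flag any customer whose total moved more than 10%.
def run_invoices():
    ...
''')

def extract_task(path, name):
    # Grab the run of comment lines immediately under the named marker.
    match = re.search(rf"# TASK: {name}\n((?:#.*\n)*)", path.read_text())
    steps = [line.lstrip("# ").rstrip() for line in match.group(1).splitlines()]
    return "\n".join(steps)

prompt = ("Execute this recurring task exactly as specified:\n"
          + extract_task(SOURCE, "regenerate-invoices"))
print(prompt)
```

The point of the pattern is that the spec is versioned alongside the code it describes, so "just ask Claude to execute it" means the same thing every time anyone runs it.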
I've said it before and I'll say it again, local models are "there" in terms of true productive usage for complex coding tasks. Like, for real, there.
The issue right now is that buying the compute to run the top-end local models is absurdly expensive. Both in general, and because you're outbidding LLM companies for limited hardware resources.
If you have a $10K budget, you can legitimately run last year's SOTA agentic models locally and do hard things well. But most people don't or won't, nor does it make cost-effective sense vs. currently subsidized API costs.
So my point is: if you have the attitude that unless it's the bleeding edge, it may as well not exist, then local models are never going to be good enough. But the truth is they're now well exceeding what they need to be to be huge productivity tools, and they would have been bleeding edge fairly recently.
Don't you understand that by choosing the best model we can, we are, collectively, step by step, devaluing what our time is worth? Do you really think we can all keep our fancy paychecks while continuing to use AI?
There were always jobs that required those "many more skills" but didn't require any programming skills.
We call those people Business Analysts and you could have been doing it for decades now. You didn't, because those jobs paid half what a decent/average programmer made.
Now you are willingly jumping into that position without realising that the gap between your current salary and your new value (i.e. half your salary, or less) will eventually close.
Early last year or late last year?
opus 4.5 was quite a leap
I fear that this may not be feasible in the long term. The open-model free ride is not guaranteed to continue forever; some labs offer models for free for publicity after raising millions in VC money, but that's not a sustainable business model. Models cost millions or billions in infrastructure to train. It's not like open-source software, where people can volunteer their time for free; here we are talking about spending real money upfront on something that will become obsolete in months.
Current AI model "production" is more akin to an industrial endeavor than open-source arrangements we saw in the past. Until we see some breakthrough, I'm bearish on "open models will eventually save us from reliance on big companies".
If you mean obsolete in the sense of "no longer fit for purpose" I don't think that's true. They may become obsolete in terms of "can't do hottest new thing" but that's true of pretty much any technology. A capable local model that can do X will always be able to do X, it just may not be able to do Y. But if X is good enough to solve your problem, why is a newer better model needed?
I think if we were able to achieve ~Opus 4.6 level quality in a local model that would probably be "good enough" for a vast number of tasks. I think it's debatable whether newer models are always better - 4.7 seems to be somewhat of a regression for example.
Google just released Gemma 4, perhaps that'd be worth a try?
model                          elo    $/M
-----------------------------------------
glm-5.1                        1538   2.60
glm-4.7                        1440   1.41
minimax-m2.7                   1422   0.97
minimax-m2.1-preview           1392   0.78
minimax-m2.5                   1386   0.77
deepseek-v3.2-thinking         1369   0.38
mimo-v2-flash (non-thinking)   1337   0.24

https://arena.ai/leaderboard/code?viewBy=plot&license=open-s...

So far, Qwen 3.6 created a functionally equivalent Golang implementation that works against the flat file backend within the last 2 days. I'm extremely impressed.
I don't know if it's bun-related, but in Task Manager it's the thing that's almost always at the top of CPU usage. Turns out, for me, bun is not production-ready at all.
Wish Zed editor had something like BigPickle which is free to use without limits.
I think companies that are shelling out the money for these enterprise accounts could honestly just buy some H100 GPUs and host the models themselves on premises. Github CoPilot enterprise charges $40 per user per month (this can vary depending on your plan of course), but at this price for 1000 users that comes out to $480,000 a year. Maybe I'm missing something, but that's roughly what you're going to be spending to get a full fledged hosting setup for LLMs.
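The back-of-envelope math is easy to check. The SaaS figures below are the comment's own assumptions ($40/user/month, 1000 users); the on-prem numbers are purely illustrative placeholders, not real hardware quotes:

```python
# Rough cost comparison using the comment's assumptions.
seats = 1000
copilot_per_seat_monthly = 40           # $/user/month, as quoted in the comment

saas_annual = seats * copilot_per_seat_monthly * 12
print(f"SaaS: ${saas_annual:,}/year")   # $480,000/year, matching the comment

# Hypothetical on-prem alternative: one multi-GPU server amortized over
# 3 years plus operating overhead. Both figures are made-up placeholders.
server_cost = 300_000                   # illustrative hardware price
ops_annual = 100_000                    # illustrative power/hosting/staff cost
onprem_annual = server_cost / 3 + ops_annual
print(f"On-prem: ${onprem_annual:,.0f}/year")
```

Whether on-prem actually wins depends entirely on the placeholder numbers (utilization, staffing, and model quality at that hardware tier), which is the part the comment concedes it may be missing.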
made a HN post of my X article on the lock-in factor and how we should embrace the modular unix philosophy as a way out: https://news.ycombinator.com/item?id=47774312
My manager doesn't even want us to use Copilot locally. Now we are supposed to only use the GitHub Copilot cloud agent: one shot from prompt to PR. With people like that selling vendor lock-in for them, companies like GitHub, OpenAI, Anthropic, etc. don't even need sales and marketing departments!
One-shotting has a very specific meaning, and agentic workflows are not it?
What is the implied meaning I should understand from them using one shot?
They might refer to the lack of humans in the loop.
I'm still surprised top CS schools are not investing in having their students build models. I know some are, but when's the last time we talked about a model made by a college or university (maintained by the university and useful for all) rather than a model made by some company?
It's disgusting that OpenAI still calls itself "Open AI" when they aren't truly open.
1. Opencode
2. Fireworks AI: GLM 5.1
And it is SIGNIFICANTLY cheaper than Claude. I'm waiting eagerly for something new from Deepseek. They are going to really show us magic.
If you have HPC or supercompute already, you have much of the expertise on staff to run models locally, and between Apple Silicon and Exo there are some amazing solutions out there.
Now, if only the rumors about Exo expanding to Nvidia are true..
Training and inference cost money, so we would have to pay for them.