I do wonder how much all the engineering put into these coding tools may actually in some cases degrade coding performance relative to simpler instructions and terminal access. Not to mention that the monthly subscription pricing structure incentivizes building the harness to reduce token use. How much of that token efficiency is to the benefit of the user? Someone needs to be doing research comparing e.g. Claude Code vs generic code assist via API access with some minimal tooling and instructions.
I tend to agree about the legacy workarounds being actively harmful though. I tried out Zed agent for a while and I was SHOCKED at how bad its edit tool is compared to the search-and-replace tool in pi. I didn't find a single frontier model capable of using it reliably. By forking, it completely decouples models' thinking from their edits and then erases the evidence from their context. Agents ended up believing that a less capable subagent was making editing mistakes.
just call it something like "[month][year]edition" and work on next release
users spend effort arriving to narrow peak of performace, but every change keeps moving the peak sideways
Well, according to this story, instructions refined by trial and error over months might be good for one LLM on Tuesday, and then be bad for the same LLM on Wednesday.
The constraints of (b) limit them from raising the price, so that means meeting (a) by making it worse, and maybe eventually doing a price discrimination play with premium tiers that are faster and smarter for 10x the cost. But anything done now that erodes the market's trust in their delivery makes that eventual premium tier a harder sell.
And idk about the pricing thing. Right now I waste multiple dollars on a 40 minute response that is useless. Why would I ever use this product?
The background being that we scrapped working on a feature and then started again a sprint later.
In my cynicism I find it more likely that a massively unprofitable LLM company tries to reduce costs at any price than everyone else suffering from a collective delusion.
This is the whole point of AI. Its a black box that they can completely control.
And I hope we will eventually reach a point where models become "good enough" for certain tasks, and we won't have to replace them every 6 months.
(That would be similar to the evolution of other technologies like personal computers and smartphones.)