It frustrates me too. It really feels like the next breakthrough will come when someone gets agents working "natively" with LSP on large codebases.

Anthropic added LSP support to Claude Code, but the current implementation is worse than useless: changes aren't reflected fast enough, so it's constantly working on outdated views and compilation caches, and it gets in a right muddle between its "internal" state (its understanding in context), the real-world file, and the LSP.

If it could just leverage LSP to apply refactorings it would be amazing, but it feels like the LSP can't keep up, and I don't know if that's an LSP problem or a claude problem.

So we binned the LSP plugin and we're back to watching a machine find/replace. Waiting on that is slower than LSP, but it's an "Action => Wait" pattern, which the tooling understands, while LSP is "Possibly wait for LSP to catch up => Action", which it doesn't understand nearly as well.

I suspect the LSP plugins also need better skills that pair with them so it reaches for them more often.

It hurts my soul to see it reach for find/replace to rename a class, complete with mistakes made in complex solutions where you might have name clashes in different namespaces. That's something the LSP handles without a problem, but it can trip up an LLM.
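To make the name-clash failure concrete, here is a toy sketch. The source snippet, the `Account`/`Customer` names, and both helper functions are made up for illustration, and `scoped_rename` is only a line-based stand-in: a real language server resolves symbols rather than matching lines.

```python
import re

# Hypothetical source with the same class name in two namespaces.
SOURCE = """\
namespace Billing { class Account { } }
namespace Auth { class Account { } }
namespace Billing { class Invoice { Account owner; } }
"""


def textual_rename(src: str, old: str, new: str) -> str:
    """Find/replace on whole words: renames every match, in every namespace."""
    return re.sub(rf"\b{re.escape(old)}\b", new, src)


def scoped_rename(src: str, namespace: str, old: str, new: str) -> str:
    """Toy stand-in for a semantic rename: only touch the target namespace."""
    out = []
    for line in src.splitlines(keepends=True):
        if line.startswith(f"namespace {namespace} "):
            line = re.sub(rf"\b{re.escape(old)}\b", new, line)
        out.append(line)
    return "".join(out)
```

Renaming `Billing.Account` with `textual_rename` also rewrites `Auth.Account`, which is exactly the kind of mistake described above; the scoped version leaves it alone.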

I wonder, is the problem here that LSP is updating too slowly all the time? Or just that there’s a chance it will update very slowly, and you never really know if you’ll hit that chance, so your model always has to do the “long time wait” just in case? In the latter case, it seems like it ought to be possible for LSP to report that it is still processing, somehow…
reply
I'm not an expert, but my reading of the spec is that LSP can carry generic `$/`-prefixed notifications, but there isn't a specific standard for readiness reporting beyond `initialize` / `initialized`. The spec treats that handshake as a single first-time initialization, so it isn't suitable for monitoring ongoing staleness or readiness after a file change is detected.

There are notifications (e.g. `textDocument/didChange`) that you can send to the LSP to help it along, but again you might end up racing the notification from the client making the change against any file-watchers you might have running.
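For anyone who hasn't looked at the wire format, here is a rough sketch of what sending that notification involves. The function names are mine, and it assumes full-document sync (the whole text sent in one `contentChanges` entry) rather than incremental ranges.

```python
import json


def did_change(uri: str, version: int, text: str) -> dict:
    """Build a full-text textDocument/didChange notification."""
    return {
        "jsonrpc": "2.0",
        "method": "textDocument/didChange",
        "params": {
            # The version must increase with every change the client sends.
            "textDocument": {"uri": uri, "version": version},
            # A change with only `text` replaces the whole document.
            "contentChanges": [{"text": text}],
        },
    }


def frame(message: dict) -> bytes:
    """Wrap a JSON-RPC message in the LSP base-protocol header."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body
```

The `version` field is the interesting part for the race described above: it is the only thing in the message that lets either side tell which view of the document a later response refers to.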

I suspect the answer will come in the form of more powerful LSP implementations with generous memory caches. Disk changes become just another buffered input that can be disregarded if already stale, and the disk is no longer seen as the source of truth. The LSP becomes the real source of truth, so everything can coordinate through it, operating mostly out of memory.
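A minimal sketch of that idea, purely as an illustration: `DocumentStore` and its methods are hypothetical, not part of any existing language server, and it uses content hashes to decide that a disk event is just an echo of something the in-memory buffer has already seen.

```python
import hashlib
from dataclasses import dataclass, field


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Buffer:
    text: str
    version: int
    seen: set = field(default_factory=set)  # hashes of every version held so far


class DocumentStore:
    """In-memory buffers are the source of truth; disk is one more input."""

    def __init__(self) -> None:
        self._buffers: dict[str, Buffer] = {}

    def apply_edit(self, uri: str, text: str) -> int:
        """Record an edit made through the store and return the new version."""
        buf = self._buffers.get(uri)
        if buf is None:
            buf = Buffer(text="", version=0)
            self._buffers[uri] = buf
        buf.text = text
        buf.version += 1
        buf.seen.add(digest(text))
        return buf.version

    def on_disk_event(self, uri: str, disk_text: str) -> bool:
        """Handle a file-watcher event; return False if it was stale."""
        buf = self._buffers.get(uri)
        if buf is not None and digest(disk_text) in buf.seen:
            return False  # an echo of content we already hold or have replaced
        self.apply_edit(uri, disk_text)
        return True

    def read(self, uri: str) -> tuple[str, int]:
        buf = self._buffers[uri]
        return buf.text, buf.version
```

One known weakness of this sketch: a genuine external revert to older content looks identical to a stale echo and gets ignored, so a real implementation would need something stronger than hashes alone.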

Another avenue for better success will be more research into faster compilation and better incremental compilation for languages with slower compilation.

Maybe one day we'll even get AI agents directly manipulating syntax trees, and the code to get there being written back as merely a side-effect, but that seems like sci-fi compared to the current state of play. LSP is still very document based, and of course LLMs are also trained on oodles of source.

I work in Unity, and I got so frustrated with Claude constantly doing gross nested bash/grep/awk/sed loops that took forever that I finally described (and had Claude implement and install) a tool that gathers all this info from a Unity forest of scenes in a single pass. It answers all the questions Claude ever wanted to ask about a Unity project in 50ms instead of ten 30-second iterations. It still took a lot of coaching to get it to actually use this tool, but it seems like I’ve convinced it.
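Not the actual tool, but a rough sketch of the single-pass shape, assuming scenes are saved as text (`.unity` YAML files) and that indexing the `m_Name:` fields is enough for the questions being asked. The function names and the choice of field are assumptions for illustration.

```python
import os
import re
from collections import defaultdict

# Matches lines such as "  m_Name: Player"; empty names are skipped.
NAME_LINE = re.compile(r"^\s*m_Name:\s*(\S.*)$")


def index_lines(index: dict, path: str, lines) -> None:
    """Add every object name found in one scene's lines to the index."""
    for line in lines:
        match = NAME_LINE.match(line)
        if match:
            index[match.group(1).strip()].add(path)


def build_index(root: str) -> dict:
    """Walk the project once and map each object name to the scenes using it."""
    index = defaultdict(set)
    for dirpath, _dirs, files in os.walk(root):
        for filename in files:
            if filename.endswith(".unity"):
                path = os.path.join(dirpath, filename)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    index_lines(index, path, handle)
    return index
```

After one `build_index` call, every "which scenes contain X?" question is a dictionary lookup instead of another grep over the whole tree.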
reply
Haha yep I’m experimenting with Unreal engine and Codex and it spent 10 minutes while I was AFK confidently trying to build a scene. I load it up and fall through the world. I say “can’t you write a tool to screenshot so you know you’ve done a reasonable job correctly?” and now it does that.

It reminds me of working with a junior dev who was pushing his code to dev, then waiting for it to build for every update because he couldn’t get it to build locally. 5 minutes of my time fixing his config surely saved him hours over the project. He wasn’t a bad dev either!

You have to do a lot of the meta thinking for the agents, because they’ll take an “everything looks like a nail if you have a hammer” approach with their toolkit.

Writing an entire local generated asset pipeline using flux and hunyuan3D-2.1 was a really fun experience. I’ve done software for years but never game dev and it’s just so much fun even if it’s junky little games to impress my kids and get them involved in the creative process.

Shouldn't it be possible to simply state in the contract to use that tool only? I've had good success with that in my coding.
reply
If it helps, I've found that using context (CLAUDE.md etc.) is way less effective for this type of pattern than using a PreToolUse hook to capture "bad patterns". The hook either transparently rewrites them to "do the right thing" if that is possible statically, or, if not, rejects the tool use with a message that tells the agent "how" to use the intended tooling itself.
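A rough sketch of the rejecting variant, under stated assumptions: the hook receives the pending tool call as JSON on stdin with `tool_name` and `tool_input` fields, and exiting with code 2 blocks the call and hands stderr back to the agent. Check both against the current hooks documentation before relying on them. The regex and the "project's rename tool" message are hypothetical.

```python
import json
import re
import sys

# Hypothetical "bad pattern": in-place sed/perl rewrites used as a rename.
IN_PLACE_REWRITE = re.compile(r"\bsed\b.*\s-i\b|\bperl\b.*\s-pi\b")


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one pending tool call."""
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    if IN_PLACE_REWRITE.search(command):
        # Assumed convention: exit code 2 blocks the call, stderr goes to the agent.
        return 2, (
            "In-place sed/perl rewrites are blocked in this repo. "
            "Use the project's rename tool instead."
        )
    return 0, ""


def main() -> int:
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

A real hook script would end with `sys.exit(main())`. Keeping the policy in `decide` makes it easy to test the patterns without spawning the agent.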
reply
tool_call is just a fancy wrapper around a black box that executes console commands. Said commands are now the actual backbone of all agentic AI. It feels like the Linux people are incredibly vindicated in the single responsibility principle.
reply
Codex did take control of Chrome to run a skill I’d given it for a website without an API the other day. It can do it, but it’s excruciatingly slow compared to the tool calls for sure.