upvote
Let me give you a counterexample. I'm working on a product for the national market, and i need to do all financial tasks, invoicing, submit to national fiscal databse etc. through a local accounting firm. So i integrate their API in the backend; this is a 100% custom API developed by this small european firm, with a few dozen restful enpoints supporting various accounting operations, and I need to use it programmatically to maintain sync for legal compliance. No LLM ever heard of it. It has a few hundred KB of HTML documentation that Claude can ingest perfectly fine and generate a curl command for, but i don't want to blow my token use and context on every interaction.

So I naturally felt the need to (tell Claude to) build a MCP for this accounting API, and now I ask it to do accounting tasks, and then it just does them. It's really ducking sweet.

Another thing I did was, after a particularly grueling accounting month close out, I've told Claude to extract the general tasks that we accomplished, and build a skill that does it at the end of the month, and now it's like having a junior accountant in at my disposal - it just DOES the things a professional would charge me thousands for.

So both custom project MCPs and skills are super useful in my experience.

reply
Your use is maybe more vanilla than you think. I think you are just getting shit done. Which is good.

Claude and an mcp and skill is plain to me. Writing your own agent connecting to LLMs to try to be better than Claude code, using Ralph loops and so on is the rabbit hole.

reply
What exactly does it do that a professional would charge you thousands for?

(I'm genuinely asking)

reply
its not though if you're working in a massive codebase or on a distributed system that has many interconnected parts.

skills that teach the agent how to pipe data, build requests, trace them through a system and datasources, then update code based on those results are a step function improvement in development.

ai has fundamentally changed how productive i am working on a 10m line codebase, and i'd guess less than 5% of that is due to code gen thats intended to go to prod. Nearly all of it is the ability to rapidly build tools and toolchains to test and verify what i'm doing.

reply
But... plain Claude does that. At least for my codebase, which is nowhere close to your 10m line. But we do processing on lots of data (~100TB) and Claude definitely builds one-off tools and scripts to analyze it, which works pretty great in my experience.

What sort of skills are you referring to?

reply
I think people are looking at skills the wrong way. It's not like it gives it some kind of superpowers it couldn't do otherwise. Ideally you'll have Claude write the skills anyway. It's just a shortcut so you don't have to keep rewriting a prompt all over again and/or have Claude keep figuring out how to do the same thing repeatedly. You can save lots of time, tokens and manual guidance by having well thought skills. Some people use these to "larp" some kind of different job roles etc and I don't think that's productive use of skills unless the prompts are truly exceptional.
reply
If you build up and save some of those scripts, skills help Claude remember how and when to use them.

Skills are crazy useful to tell Claude how to debug your particular project, especially when you have a library of useful scripts for doing so.

reply
This resonates with me. Sometimes I build up some artifacts within the context of a task, but these almost always get thrown away. There are primarily three reason I prefer a vanilla setup.

1. I have many and sometimes contradictory workflows: exploration, prototyping, bug fixing debugging, feature work, pr management, etc. When I'm prototyping, I want reward hacking, I don't care about tests or lint's, and it's the exact opposite when I manage prs.

2. I see hard to explain and quantify problems with over configuration. The quality goes down, it loses track faster, it gets caught in loops. This is totally anecdotal, but I've seen it across a number of projects. My hypothesis is that is related to attention, specifically since these get added to the system prompt, they pull the distribution by constantly being attended to.

3. The models keep getting better. Similar to 2, sometime model gains are canceled out by previously necessary instructions. I hear the anthropic folks clear their claude.md every 30 days or so to alleviate this.

reply
[flagged]
reply
Lots of money being made by luring people into this trap.

The reality is that if you actually know what you want, and can communicate it well (where the productivity app can be helpful), then you can do a lot with AI.

My experience is that most people don't actually know what they want. Or they don't understand what goes into what they want. Asking for a plan is a shortcut to gaining that understanding.

reply
Or, as I like to put it: I need to activate my personal transformers on my inner embeddings space to figure what is it I really want. And still, quite often, I think in terms of the programming language I'm used to and the library I'm familiar with.

So, to really create something new that I care about, LLMs don't help much.

They are still useful for plenty of other tasks.

reply
This. At work I have described this phenomenon as the equivalent of tinkering with the margins and fonts in your word processor instead of just writing your paper.
reply
> Plain Claude, ask it to write a plan, review plan, then tell it to execute still works the best in my experience.

Working on an unspecified codebase of unknown size using unconfigured tooling with unstated goals found that less configuration worked better than more.

reply
Emacs init file bikeshedding comes to mind…
reply
but now you can build your AI agent toolkit to work on your init file for you
reply
My init.el file went from some 300 lines to under 50 with Claude's assistance. Some of that had to do with updating Emacs, but I really only use Emacs for Org mode so that contribution was minimal.
reply
if you work on platforms, frameworks, tools that are public knowledge, then yeah. If there’s nothing unique to your project or how to write code in it, build it, deploy it, operate it, yeah.

But for some projects there will be things Claude doesn’t know about, or things that you repeatedly want done a specific way and don’t want to type it in every prompt.

reply