I use mainly Opus 4.6.

I did the same thing and created a skill for summarizing a troubleshooting conversation. It works decently, as long as my own input in the troubleshooting is minimal, i.e. running with dangerously-skip-permissions. As soon as I need to take manual steps, or especially if the conversation is in Desktop/Web, it degrades very quickly and just assumes steps I've taken (e.g. if it gave me two options to fix something and I come back saying it's fixed, the summary will just sort of arbitrarily pick one as the solution). It also generally doesn't consider the previous state of the system (e.g. what was already installed/configured/set up) when writing such a summary, which maybe makes it somewhat reusable for me, but certainly not for others.

Now you could say, "these are all things you can prompt away", and, I mean, to an extent, probably. But once you're talking about taking something like this online, you're not working with the top 1% of proompters. The average Claude session is not the diligent little worker bee you'd want it to be. These models are still, at their core, chaos goblins. I think Moltbook showed that quite clearly.

I think having your model consider someone else's "fix" to your problem as a primary source is bad. Period. Maybe it won't be bad in 3 generations when models can distinguish noise and nonsense from useful information, but they really can't right now.

reply
Isn't what you've just described, the part about the web, just the context bloat problem?

I'm not sure I quite get the same experience as you with the "assumes steps it never took" issue. Do you think it's because of the skills you've used?

I also disagree that having at least some solution to a similar problem is inherently bad. Usually it directs the LLM down a path that was already verified, at least if we're talking about skills.

reply
The steps they say they took and the steps they actually took are not the same thing.
reply