I haven't done particularly rigorous testing, but I've done this a lot with Claude to get a feel for it. What I've noticed is that for certain open-ended tasks, Claude is extremely primeable: it will pick up on minor differences in the wording of your prompt and run hard with them.

It can be frustrating. The AI pretends to be a human, so a part of my brain expects it to commit and have a "parti pris" like a human would, and the exercise is a good reminder of the feedback loop. My mental model is that until the first three or four messages have gone by, many finer points of the model's personality are still underdetermined. I'd suggest that as the mechanism behind "role-based prompting", and it explains the "savant sleeper agent" thing you describe. You want to get the state into the right attractor on the manifold.

These machines are pretty incredible, but for conversation-driven workflows you really have to be in the driver's seat. A human has a property that the AI does not have, at least under current architectures: we are regulated by the outside world. A bit of a tangent, but I can see how AI psychosis arises from these dynamics.

One thing I learned from raw API LLM usage is how drastically the results can vary from call to call with exactly the same input. I think that, on average, people using agents underestimate how much the results of a given turn or command vary, and so overindex on "X technique worked well" or "if I do Y then this will happen" or even "it did Z task well last time so it will this time too" or "{Model} is great at {thing}".
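To put a rough number on that, here is a minimal sketch. It assumes each call is an independent pass/fail trial with a fixed underlying success rate, which is a simplification, and `wilson_interval` is just an illustrative helper name. It computes an approximate 95% Wilson score interval for the success rate you can infer from a handful of runs:

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for an observed success rate."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Same technique, different numbers of repeated runs.
    for s, n in [(3, 3), (8, 10), (80, 100)]:
        lo, hi = wilson_interval(s, n)
        print(f"{s}/{n} successes: plausible rate {lo:.2f} to {hi:.2f}")
```

Under those assumptions, "it worked three times out of three" is still consistent with a true success rate as low as about 0.44, which is why a few good runs say little about what the next one will do.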

  > We’ve been calling some of these “magic words” at work, specific technical terms or references/techniques that you need only mention to get vast improvements in outcome.
Any chance you could share some of these? Seems like something we could all benefit from.

Sure. My company has been working on a broad swathe of infrastructure projects and developer tools, which requires prompting models to seek out other tools/APIs/docs/examples, but in a way where we can't just dump all the context on the model up front. We also often need the models to look up technical documentation and specs, and sometimes to build custom parsers for specific documentation websites that only make the data available embedded across 200+ pages of HTML.

First, I almost always try to seed every new project or context/domain with canonical technical specifications or examples I found elsewhere. When I set up this project recently, I linked to a bunch of the official Apple docs for sysctl and told it to use a specific technique for calling assembly code from Go, one that in my experience it almost never knows about or realizes it can use (similarly for sysctl: I knew it kinda sorta knew about it, but not in its entirety): https://github.com/accretional/sysctl/commit/da52438233e5b33...

The other thing I did was tell it to enumerate all the test cases ahead of time rather than to just directly implement them; again this is something where you have to explicitly tell it to go digging for information where it has blind spots and get it to set up properly grounded self-eval in a way that it can test against. I usually tell it to take notes as it works or commit notes to itself that will persist over sessions: https://github.com/accretional/sysctl/blob/main/FINDINGS_2.m...

Once we get back to working on this project we'll just have it implement / validate the rest of the sysctl feature support against the full inventory we had it uncover: https://github.com/accretional/sysctl/blob/main/cmd/darwin-n...
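The enumerate-first idea from the last two paragraphs can be sketched roughly like this. This is not what the repo does; `Case`, `INVENTORY`, `IMPLEMENTED` and `coverage_report` are hypothetical names, the three keys are only illustrative entries, and the stub readers return made-up values. The point is just that the inventory is written down before any implementation, so progress can be checked against it:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Case:
    key: str   # name of the thing to support (illustrative sysctl-style key)
    kind: str  # expected value type, written down before any implementation


# Step 1: the full inventory, enumerated up front.
INVENTORY = [
    Case("hw.ncpu", "int"),
    Case("hw.memsize", "int"),
    Case("kern.ostype", "str"),
]

# Step 2: whatever has been implemented so far (hypothetical stub readers).
IMPLEMENTED: dict[str, Callable[[], object]] = {
    "hw.ncpu": lambda: 8,
    "kern.ostype": lambda: "Darwin",
}


def coverage_report(inventory, implemented):
    """Split the inventory into passing, failing, and not-yet-implemented keys."""
    report = {"passing": [], "failing": [], "missing": []}
    for case in inventory:
        reader = implemented.get(case.key)
        if reader is None:
            report["missing"].append(case.key)
        elif type(reader()).__name__ == case.kind:
            report["passing"].append(case.key)
        else:
            report["failing"].append(case.key)
    return report


if __name__ == "__main__":
    print(coverage_report(INVENTORY, IMPLEMENTED))
```

A report like this is also the kind of thing that can be committed as a note, so the next session starts from the "missing" list instead of rediscovering it.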

Another thing we do is have it specify an API that it can produce against; then in other projects we have them consume the API via reflection (and our special sauce we've been working on is the ability to discover and integrate against these automatically across thousands of APIs from many providers, which we've got working and can share if you're interested in using it as an early customer): https://github.com/accretional/sysctl/blob/main/proto/sysctl... This isn't the greatest example because it doesn't actually fully specify the sysctl keys yet. But I did have it create a knowledge base trying to cover the 1000+ keys as best as it could, to reference as it continued: https://github.com/accretional/sysctl/tree/main/macos-sysctl...

We have a better example in e.g. https://github.com/accretional/proto-sqlite/tree/main/lang where we were able to encode the entire sqlite grammar into a grpc interface, so that you could e.g. find the exact structure of (and sanitize) a select statement: https://github.com/accretional/proto-sqlite/blob/main/lang/p... This way integration and discovery become a matter of telling it "use reflection against this endpoint to discover the sql interface, then implement against it", and we can model formats/input validation as formal grammars via EBNF (all magic words) rather than ad hoc.
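As a toy illustration of validating input against a formal grammar instead of ad hoc string checks, here is a minimal sketch. The three-rule EBNF in the comment is made up for this example and is nowhere near the real sqlite grammar, and `tokenize` and `parse_select` are hypothetical helpers, not anything from the linked repos:

```python
import re

# Toy grammar in EBNF (an illustration only, not the real SQLite grammar):
#   select  = "SELECT" columns "FROM" ident ;
#   columns = "*" | ident { "," ident } ;
#   ident   = ( letter | "_" ) { letter | digit | "_" } ;

TOKEN = re.compile(r"\s*(?:([A-Za-z_][A-Za-z0-9_]*)|([*,]))")
KEYWORDS = {"SELECT", "FROM"}


def tokenize(text: str) -> list[str]:
    """Split input into identifiers/keywords and the symbols '*' and ','."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character near offset {pos}")
        tokens.append(m.group(1) or m.group(2))
        pos = m.end()
    return tokens


def parse_select(text: str) -> dict:
    """Accept only statements derivable from the toy grammar above."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect_keyword(word: str) -> None:
        nonlocal pos
        tok = peek()
        if tok is None or tok.upper() != word:
            raise ValueError(f"expected {word}, got {tok!r}")
        pos += 1

    def ident() -> str:
        nonlocal pos
        tok = peek()
        if tok is None or tok in {"*", ","} or tok.upper() in KEYWORDS:
            raise ValueError(f"expected identifier, got {tok!r}")
        pos += 1
        return tok

    expect_keyword("SELECT")
    if peek() == "*":
        pos += 1
        columns = ["*"]
    else:
        columns = [ident()]
        while peek() == ",":
            pos += 1
            columns.append(ident())
    expect_keyword("FROM")
    table = ident()
    if pos != len(tokens):
        raise ValueError(f"unexpected trailing input: {tokens[pos]!r}")
    return {"columns": columns, "table": table}


if __name__ == "__main__":
    print(parse_select("SELECT id, name FROM users"))
```

Anything the grammar does not derive is rejected outright, so an input like `SELECT * FROM users; DROP TABLE users` fails at the `;` instead of being passed along.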

We also tell it to set up and use a browser automation/testing toolkit and always run it at the end of testing workflows (often in a way that auto-opens screenshots on our local machines and commits them to git) via tools like https://github.com/accretional/chromerpc#headlessbrowser-aut... so that whenever we produce UIs it can evaluate its own output and iterate without direct human intervention. This is another case where knowledge discovery becomes a problem, so we tell the models to use reflection to discover the browser automation APIs. That ends up giving us things like this, where it records user journeys through sites and creates visualizations without us having to debug them or do them ourselves: https://github.com/accretional/proto-css/tree/main/chrome-te...

Thank you very much. I'm going to re-read this evening. Have a great day!

If the benefits of using the model you've come to know well outweigh the disadvantages, you can continue using it even after the release of a successor model, right?

Yes! That's exactly right. I have very real experience with this. I got introduced to Anthropic's family of models with Claude 3.5 and fell in love with the specific personality of Sonnet. I can't remember whether Opus was public yet back then, but I remember very clearly trying out Opus several times when it became touted as best-in-class and actually recoiling from how foreign it felt. My problem was that it was way too eager and pretty hard to steer. I returned to Sonnet and I've used ONLY Sonnet ever since. I have/had access to Fable and Opus4.8 but I never once tried them. In the early days with Sonnet3/4.5 I also bought ChatGPT, and I remember thinking that it was a great teacher but a lazy coder. You'd get the scaffolding and then '# rest of code block' instead of a full implementation, so unless you wanted to learn the concept, weigh trade-offs, ask clarifying questions or jump into a rabbit hole... you had to go code it yourself. ChatGPT in general is a very good teacher, so much so that the free version is enough, and I use it in combination with the most advanced Sonnet model for actual day-to-day SWE. Whenever there's an Opus release I'm actually very excited, because it means there's a smarter Sonnet model OTW. I'll be veryyy very sad if the Sonnet line gets sunset. There have been no Sonnet upgrades since, even as other family lines get improved.

Do note that I only use LLMs in the ChatUI; I never use agents. I don't believe having a blackbox codebase managed by entities with a half-life of 'delete conversation' or 200k tokens is a responsible idea. In ChatUI, I lay the ground rules, kill assumptions about our working relationship, give it foundational context on the problem and codebase we're working on, explain the problem, and then we have a conversation about it in which I gradually disclose more context, in a logical order, as it becomes relevant. So, to directly answer your question: maybe I'm missing out on a ton of upside by not using the absolute best, but I'd say familiarizing yourself with a specific model has all the benefits of having a human friend you've grown up with... except your buddy's a savant and would absolutely love to help!
