The problem with these exercises is always: I have limited time and capacity to do things, and a fairly unlimited number of problems that I can think of to solve. Coding is not a problem I want to solve. Prompt engineering is not a problem I want to solve.

If I do things for the love of it, the rules are different, of course. But otherwise I simply accept that there are many things improving around me that I have no intimate knowledge of and probably never will; I let other people work them out, and happily lean on their work to do the next thing I care about that isn't already solved.

> Well, surely none of this is relevant today? Well, it turns out all of it still is! E.g., as a small fix, the "grep def" trick (or your language's equivalent) can be trivially added as a startup hook to Claude Code, and suddenly it doesn't have to spend half your token budget poking around the codebase, because -- get this -- it can just see where everything is... (What a concept, right?)

Hahaha yeah. This is very true. I find myself making ad hoc versions of this in static markdown files to get around it. Just another example of the kind of low-hanging fruit harnesses are leaving on the table. A version of this that uses tree-sitter grammars to map a codebase, and does it on every startup of an agent, would be awesome.
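For a Python codebase, the ad hoc version is basically a one-liner: grep out the def/class lines with file and line number, and hand the result to the agent at startup. A minimal sketch (assuming GNU grep and a Python project; the demo repo under /tmp here is just for illustration):

```shell
# Set up a tiny demo repo so the sketch is self-contained.
mkdir -p /tmp/demo_repo
printf 'def foo():\n    pass\n\nclass Bar:\n    def baz(self):\n        pass\n' > /tmp/demo_repo/mod.py

# The actual trick: dump every def/class with file:line so the
# agent can see where everything is without poking around.
grep -rn --include='*.py' -E '^(def |class |[[:space:]]+def )' /tmp/demo_repo > /tmp/demo_repo/CODE_MAP.txt
cat /tmp/demo_repo/CODE_MAP.txt
```

Wired into a startup hook (or regenerated into a static markdown file on every agent launch), this gives the model a cheap map of the codebase for a few hundred tokens instead of a grep spree.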

> My Weird Hill is that we should be building things with GPT-4.

I disagree; IMO using the best models we have is a good way to avoid wasting time. But that doesn't mean we shouldn't also be frugal and clever with our harnesses!

To clarify, I didn't mean we should be using ancient models in production, I meant in R&D.

Anthropic says "do the simplest thing that works." If it works with the LLMs we had 3 years ago, doesn't that make it simpler?

The newer LLMs mostly seem to work around the poor system design. (Like spawning 50 subagents on a grep spree because you forgot to tell it where anything is...) But then you get poor design in prod!
