To be clear, that's not an estimated price, it's the actual price I paid across all the real games. My hope is you'll see it trend down over time as I find more ways to make the harness token-efficient :)
That's even more interesting, then! It would be cool if you added a price-to-performance column. Even if it's just for this one task, it would still be interesting.
Performance is tricky to measure. Right now the best measure of performance I've got is the "blunder index", but that's currently flagging a lot of stuff that I really don't consider to be true blunders. I think my top priority for the next few evenings is going to be iterating on the blunder-annotator, and that'll help me identify which issues in the actual gameplay code to focus on. And the blunder index isn't really defined in such a way that you can do meaningful arithmetic on it :)