I apologize for the potato quality of these links, but I have been working tirelessly to wrap my head around how to reason about agents and LLMs. They are more than just a black box.

The first tries to answer what happens when I give the models harder and harder arithmetic problems, to the point where Sonnet will burn 200k tokens over 20 minutes. [0]

The other is a very deep dive into the math of a reasoning model, approached the only way I could think to: with data visualizations, watching the model's computation in real time in relation to all of its parts. [1]

Two things I've learned. First, an agent that reverse engineers a website and an agent that does arithmetic behave the same way: for a given agent and task, the probability of solving the intended task is a distribution, not a guarantee. Second, models have a blind spot -- a red-team adversary bug-hunter agent will not surface a bug if the same model originally wrote the code.

Understanding that, and knowing I can verify at the end or use majority voting (MoV), I can use agents to automate extremely complicated tasks with a quantifiable degree of reliability.
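To make the MoV point concrete, here's a minimal sketch of the idea: since each agent run is a draw from a distribution, you sample the agent several times and trust the answer most runs agree on. Everything here (the `run_agent` callable, the toy "flaky agent") is hypothetical, not from either linked post.

```python
from collections import Counter

def majority_vote(run_agent, task, n=5):
    """Run a stochastic agent n times on the same task and return the
    most common answer, plus the fraction of runs that agreed on it.

    run_agent: any function task -> hashable answer (hypothetical interface).
    """
    answers = [run_agent(task) for _ in range(n)]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / n

# Toy demo: a flaky "agent" that answers correctly 4 runs out of 5.
canned = iter([4, 4, 5, 4, 4])
answer, agreement = majority_vote(lambda task: next(canned), task="2 + 2")
# answer is 4 with 0.8 agreement, even though one run was wrong.
```

The agreement fraction is the "amount of certainty": if an agent solves a task correctly with probability p > 0.5 per run, majority voting over more runs pushes the overall success rate toward 1.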

[0] https://adamsohn.com/reliably-incorrect/

[1] https://adamsohn.com/grpo/
