undefined

points

[-]

If you think statefull LLMs would be easier to handle then stateless... Then I think you haven't done a lot of software engineering

by zsyllepsis3 hours ago|

parent|

[-]

Maybe a charitable reading of the parent comment, but my interpretation of it was that while the _models_ are stateless, modern deployments of these models for inference rely on state.

For example, tiered pricing for cached context relies on state, even if the models don’t.

by zahlman2 hours ago|

parent|

[-]

For that matter, the agent harness accumulating "chat history" is state.

by seanmcdirmid2 hours ago|

parent|

prev|

[-]

Of course you can pass in your own state, but I always wondered about an LLM that has conversation context stay resident in GPU memory somehow.

Or maybe this already effectively covered by context caching and the gains would be minimal (stateless, but if you pass in the same context or the same head context, it’s already in GPU memory and doesn’t need to be loaded?).

by random32 hours ago|

parent|

prev|

[-]

This. lol. If you think state makes things easier you're in for a big surprise.

by tossandthrow4 hours ago|

prev|

[-]

That does not seem to be related to llms? It is more about the harness that utilizes them, right?

by aerhardt52 minutes ago|

parent|

[-]

The parent comment is AI generated drivel, that’s why. Incredible that it has generated such a lively discussion.

by talkin4 hours ago|

parent|

prev|

[-]

It costs tokens, so it helps the business model, so it’s not a bug but a feature.