by pllbnk 18 hours ago

by lelanthran 18 hours ago

> If LLMs stop improving at the pace of the last few years (I believe they already are slowing down)

Depending on how you measure "improvement" they already have or they never will :-/

Measuring capability of the model as a ratio of context length, you reach the limits at around 300k-400k tokens of context; after that you have diminishing returns. We passed this point.

Measuring capability purely by output, smarter harnesses in the future may unlock even more improvements in outputs; basically a twist on the "Sufficiently Smart Compiler" (https://wiki.c2.com/?SufficientlySmartCompiler=)

That's the two extremes but there's more on the spectrum in between.

by rgbrenner 17 hours ago

300k-400k isn’t the current limit if you create modules and/or organize the code reasonably, for the same reason we do this for humans: it allows us to interact with a component without loading the internals into our context.

you can also execute larger tasks than this using subagents to divide the work so each segment doesn’t exceed the usable context window. i regularly execute tasks that require hundreds of subagents, for example.

in practice the context window is effectively unlimited or at least exceptionally high — 100m+ tokens. it just requires you to structure the work so it can be done effectively — not so dissimilar to what you would do for a person
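
A minimal sketch of the subagent split described above, assuming each module's size in tokens can be estimated up front. The `Module` type, the `plan_batches` helper, the module names, and the token counts are all hypothetical; only the 300k budget figure comes from this thread. It uses greedy first-fit-decreasing packing so that no batch handed to a single subagent exceeds the budget.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    tokens: int  # estimated token count (hypothetical numbers below)


def plan_batches(modules, budget):
    """Pack modules into batches whose token totals stay within `budget`.

    Greedy first-fit-decreasing: largest modules first, each placed in the
    first batch that still has room. One batch = one subagent's workload.
    """
    batches = []  # list of lists of Module
    loads = []    # running token total per batch
    for m in sorted(modules, key=lambda m: m.tokens, reverse=True):
        if m.tokens > budget:
            raise ValueError(f"{m.name} alone exceeds the budget; split it first")
        for i, load in enumerate(loads):
            if load + m.tokens <= budget:
                batches[i].append(m)
                loads[i] += m.tokens
                break
        else:
            # No existing batch has room, so open a new one.
            batches.append([m])
            loads.append(m.tokens)
    return batches


modules = [
    Module("auth", 120_000),
    Module("billing", 200_000),
    Module("api", 90_000),
    Module("ui", 150_000),
]
batches = plan_batches(modules, budget=300_000)
print([[m.name for m in b] for b in batches])
# → [['billing', 'api'], ['ui', 'auth']]
```

This only covers the size constraint. It says nothing about whether the modules in a batch can actually be worked on independently, which is the harder part.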

by jmalicki 17 hours ago

That makes it not a context window.

How to organize code like you said, and how agents interact with it, to keep the actual context window small is the fundamental challenge.

by lelanthran 17 hours ago

I keep getting surprised that people who are all-in on this ("i regularly execute tasks that require hundreds of subagents") don't have any idea of what is happening even a single layer below their interface to the LLM ("in practice the context window is effectively unlimited or at least exceptionally high — 100m+ tokens").

I looked at that response by GP (rgbrenner) and refrained from replying because if someone is both running hundreds of agents at a time AND oblivious to what "context window" means, there is no possible sane discourse that would result from any engagement.

by rgbrenner 12 hours ago

ok "series of context windows spread across many agents".. sure much clearer.

Doesn't change my point: the amount of code the agent can operate on is very large, if not unlimited, as long as you put even a little bit of thought into structuring things so it can be divided along a boundary.

If you let the codebase degrade into spaghetti, then the LLM is going to have the same problem any engineer would have with that. The rules for good code didn't disappear.

by jmalicki 10 hours ago

Context windows don't necessarily divide cleanly. Getting each agent to be able to do its task within a context window is a hard problem.

It's not like if your context window with one agent is n, your context window with 10 agents is n/10. It takes some skill, but that is also where a lot of the advances are coming in.
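
One way to picture why the split is not simply n/10, as a rough sketch under an assumed model: every agent has to carry some shared context (task description, interfaces, conventions) on top of its own slice. The `per_agent_context` helper, the shared-overhead model, and the 60k figure are illustrative assumptions, not measurements.

```python
import math


def per_agent_context(n, shared, k):
    """Tokens each of k agents needs if `shared` tokens must be seen by all.

    Assumed model: the remaining n - shared tokens divide evenly across
    agents, while the shared part is duplicated into every agent's window.
    """
    if k < 1 or not 0 <= shared <= n:
        raise ValueError("need k >= 1 and 0 <= shared <= n")
    return shared + math.ceil((n - shared) / k)


n = 300_000       # context one agent would need for the whole task
shared = 60_000   # hypothetical overhead every agent must load
k = 10

naive = n // k                                # 30_000
modeled = per_agent_context(n, shared, k)     # 60_000 + 24_000 = 84_000
print(naive, modeled)
```

Under this model the per-agent requirement never drops below the shared part, no matter how many agents are added.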

by rgbrenner 7 hours ago

300k tokens--the useable context window of a single agent--is about 40k lines of code and you can't figure out a natural breakpoint within that code to divide up the task?
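
For scale, the arithmetic behind those figures, using only the numbers quoted in this thread (300k usable tokens, roughly 40k lines, and the 100m+ token figure from earlier). The variable names are just for illustration, and the tokens-per-line ratio is whatever those two numbers imply, not a measured value.

```python
import math

usable_tokens = 300_000       # usable window of a single agent (from the thread)
lines_of_code = 40_000        # approximate lines that fit in it (from the thread)
total_tokens = 100_000_000    # the "100m+ tokens" figure (from the thread)

tokens_per_line = usable_tokens / lines_of_code        # 7.5
windows_needed = math.ceil(total_tokens / usable_tokens)  # 334
total_lines = round(total_tokens / tokens_per_line)    # 13_333_333

print(tokens_per_line, windows_needed, total_lines)
# → 7.5 334 13333333
```

So 100m tokens corresponds to a few hundred full windows, assuming zero overlap between them.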

by forcedtolinux 4 hours ago

[dead]

by leptons 18 hours ago

I wish I got to hallucinate at work, and just get a pat on the head for constantly doing the wrong thing.

by pllbnk 17 hours ago

Maybe I am unlucky but I have worked with too many developers who couldn't make a good decision if their life depended on it. LLMs at least know how to convince you of their decisions with strong arguments.

by nothinkjustai 15 hours ago

Mmm, I feel it’s more common for them to just blindly agree with whatever you say.

Assistant: “I propose A”

User: “Actually B is better”

Assistant: “you’re absolutely right”

User: “actually let’s go with C”

Assistant: “Good choice, reasons”

User: “wait A is better”

Assistant: “Great decision!”

by oompydoompy74 16 hours ago