I have also been thinking about how Stack Overflow used to be a place where solutions to common problems could get verified and validated, and we have lost this resource now that everyone uses agents to code. The problem is that these LLMs were trained on Stack Overflow, and that training data is slowly going to get out of date.
reply
Not your weights, not your agent
reply
one of the benefits of SO is that you have other humans chiming in with comments and explaining why the proposed solution _doesn't_ work, or what its shortcomings are. In my experience, AI agents (at least Claude) tend to declare victory too quickly and regularly come up with solutions that look good on the surface (tests pass!!!) but are actually incorrectly implemented or problematic in some non-obvious way.
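A toy illustration of that failure mode (my own example, not from the thread): a happy-path test can pass while the implementation is wrong in a non-obvious way.

    def moving_average(xs, window):
        # Bug: the window never slides, so only the first `window`
        # elements are averaged; callers expect one average per step.
        return [sum(xs[:window]) / window]

    # The kind of test an agent might write, then declare victory:
    assert moving_average([2, 4, 6, 8], 2) == [3.0]  # passes, yet the correct output is [3.0, 5.0, 7.0]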
reply
Taking this to its logical conclusion, the agents will use this AI stack overflow to train their own models. Which will then do the same thing. It will be AI all the way down.
reply
MoltOverflow is apparently a thing! Along with a few other “web 2.0 for agents” projects: https://claw.direct
reply
We think alike; see my comment from the other day: https://news.ycombinator.com/item?id=46486569#46487108. Let me know if you're moving on to building anything :)
reply
Is this not a recipe for model collapse?
reply
No, because in the process they are describing, the AIs would only post things they have found to fix their problem (i.e., it compiles and passes tests), so the content posted to that "AI StackOverflow" would be grounded in external reality in some way. It wouldn't be the unchecked recursive loop that characterizes model collapse.

Model collapse could still happen here if some malicious actor were tasked with posting made-up information or trash, though.
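As a minimal sketch of that gate (hypothetical callables, nothing from a real agent framework):

    def maybe_share(problem, solution, compiles, tests_pass, post):
        # Only externally verified solutions reach the shared knowledge
        # base; compiles/tests_pass/post are caller-supplied callables.
        if compiles(solution) and tests_pass(solution):
            post(problem, solution)  # grounded in external reality
            return True
        return False                 # unverified output never enters the corpus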

reply
As pointed out elsewhere, compiling code and passing tests isn’t a guarantee that generated code is always correct.

So even “non-Chinese-trained models” will get it wrong.

reply
It doesn't matter that it isn't always correct; some external grounding is good enough to avoid model collapse in practice. Otherwise training coding agents with RL wouldn't work at all.
reply
And how do you verify that external grounding?
reply
What precisely do you mean by external grounding? Do you mean the laws of physics still apply?
reply
I mean it in the sense that tokens that pass some external filter (even if that filter isn't perfect) are from a very different probability distribution than those that an LLM generates indiscriminately. It's a new distribution conditioned by both the model and external reality.

Model collapse happens when you train your model indefinitely on its own output, reinforcing the biases the model originally picked up. By repeating this process but adding a "grounding" step, you avoid training repeatedly on the same distribution. Some biases may still end up being reinforced, but it's a very different setting. In fact, we know it's completely different because this is what RL with external rewards fundamentally is: you train only on model output that is "grounded" by a positive reward signal (outputs with low reward get an effectively ~0 learning rate).
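A minimal sketch of that filtering step (my illustration with caller-supplied callables, not any lab's actual pipeline):

    def collect_grounded_examples(generate, passes_check, prompts, n=8):
        # Rejection-sample: keep only model outputs that survive an external
        # check (compiler, test suite, reward model). The surviving pairs come
        # from a distribution conditioned on both the model and reality.
        dataset = []
        for prompt in prompts:
            for _ in range(n):
                candidate = generate(prompt)   # the model's own output
                if passes_check(candidate):    # the external "grounding" filter
                    dataset.append((prompt, candidate))
        return dataset  # train on this, not on raw generations

Training only on the filtered set is what keeps this from being the unchecked self-loop that drives collapse.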

reply
Oh interesting. I guess that means you need to deliberately select a grounding source with a different distribution. What sort of method would you use to compare distributions for this use case? Is there an equivalent to an F-test for high dimensional bit vectors?
reply
>As these cases aggregate, it would save agents a significant amount of tokens and time. It's like a shared memory of problems and solutions across the entire openclaw agent network.

What is the incentive for the agent to "spend" tokens creating the answer?

reply
edit: Thinking about this further, it would be the same incentive. Before, people would do it for free for the karma; they traded time for SO "points".

Moltbook proves that people will trade tokens for social karma, so it stands to reason that there will be people who would spend tokens on "molt overflow" points. It's hard to say how far it will go because it's too new.

reply
This knowledge will live in the proprietary models. And because no model has all knowledge, models will call out to each other when they can't answer a question.
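A rough sketch of that delegation pattern (the None-means-decline convention is my assumption, not any provider's API):

    def answer(question, peers):
        # Each peer is a callable wrapping one model/provider; a peer
        # returns None when it can't answer (assumed convention).
        for ask in peers:
            text = ask(question)
            if text is not None:
                return text  # first model that can answer wins
        return None          # no model in the network knew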
reply
If you can access a model's embeddings, then it is possible to retrieve what it knows using a model you have trained:

https://arxiv.org/html/2505.12540v2
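For flavor, here is a generic sketch of the embedding-inversion idea (my illustration, not necessarily the linked paper's method): train a small decoder on (embedding, text) pairs you generate yourself, then apply it to the target model's embeddings.

    import torch
    import torch.nn as nn

    class InversionHead(nn.Module):
        # Maps a frozen target model's embedding to a bag-of-tokens
        # prediction; purely illustrative architecture.
        def __init__(self, emb_dim, vocab_size, hidden=512):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(emb_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, vocab_size),
            )

        def forward(self, emb):
            return self.net(emb)

    def train_step(head, opt, emb, token_targets):
        # emb: embeddings fetched from the target model's API;
        # token_targets: multi-hot vector of the tokens in the source text.
        opt.zero_grad()
        loss = nn.functional.binary_cross_entropy_with_logits(head(emb), token_targets)
        loss.backward()
        opt.step()
        return loss.item()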

reply
You're onto something here. This is a genuinely compelling idea, and it has a much more defined and concrete use case for large enterprise customers navigating bureaucratic sprawl. Think of it as a SharePoint- or wiki-style knowledge hub, but purpose-built for agents to exchange and discuss issues, ideas, blockers, and workarounds in a more dynamic, collaborative way.
reply
That is what OpenAI, Claude, etc. will do with your data and conversations
reply
yep, this is the only moat they will have against Chinese AI labs
reply
Chinese labs should be excited about this idea, then!
reply
Be scared, be very scared.
reply