by rapatel0 6 hours ago

by skerit 4 hours ago

This is kind of what LoopLM is doing, no? https://arxiv.org/abs/2510.25741

by dnhkng 4 hours ago

Yes, I've tried duplicating individual layers, but it's not useful.

I think this hasn't been tried before because it's totally unintuitive that feeding the output from later layers into previous ones would actually do anything. And in fact, it usually is detrimental. I guess it takes really bored hobbyists with too much compute to check this stuff.

I have done some interesting work on applying multiple layer duplications in different regions of the model too, going so far as to train a meta-model (actually just XGBoost) to predict the merges. Seems to work, but that's a whole other blog post.

This works with MoE, and yes, I would be interested in looking into this in more detail. But my wife might disagree with this time sink...
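To make the idea of duplicating a group of layers concrete, here is a minimal sketch of how the execution order could be built. It is only an illustration: the function name `duplicated_layer_order` and its parameters are hypothetical, and this is not the code used for the experiments described in this thread.

```python
from typing import List


def duplicated_layer_order(n_layers: int, start: int, end: int, repeats: int = 2) -> List[int]:
    """Return the order in which layer indices run when the contiguous
    block [start, end) is executed `repeats` times in a row.

    Hypothetical helper for illustration only; indices are 0-based.
    """
    if not (0 <= start < end <= n_layers):
        raise ValueError("need 0 <= start < end <= n_layers")
    if repeats < 1:
        raise ValueError("repeats must be >= 1")
    prefix = list(range(0, start))      # layers before the duplicated block
    block = list(range(start, end))     # the block that gets re-run
    suffix = list(range(end, n_layers)) # layers after the duplicated block
    return prefix + block * repeats + suffix


# A 4-layer model with the middle two layers run twice.
order = duplicated_layer_order(n_layers=4, start=1, end=3, repeats=2)
print(order)
```

With `repeats=1` the function returns the ordinary order `0 .. n_layers - 1`, so the unmodified model is just a special case.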

by rapatel0 1 hour ago

Clarification. Duplicating multiple groups of layers in a "reasoning" loop

Normal:

  L1 -> L2 -> L3 -> L4 -> out

Unrolled (current framing):

  L1 -> [L2->L3] -> [L2->L3] -> L4 -> out

Looped (proposed):

       --<--loop----
       |           |
  L1 -> [L2->L3] x N --> L4 -> out

"reasoning loop"

Note: ASCII rendering on HN is not trivial
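The two framings in the diagrams above compute the same thing when the repeated block shares its weights. Here is a rough sketch using toy scalar "layers"; the functions `l1` to `l4`, `run_unrolled`, and `run_looped` are made up for illustration and are not taken from any real model or library.

```python
from typing import Callable, Sequence

Layer = Callable[[float], float]


def run_unrolled(layers: Sequence[Layer], x: float) -> float:
    """Apply every layer once, in the order given."""
    for layer in layers:
        x = layer(x)
    return x


def run_looped(head: Sequence[Layer], block: Sequence[Layer],
               tail: Sequence[Layer], x: float, n: int) -> float:
    """Apply head, then the shared block n times, then tail."""
    x = run_unrolled(head, x)
    for _ in range(n):
        x = run_unrolled(block, x)  # same layer objects on every pass
    return run_unrolled(tail, x)


# Toy stand-ins for L1..L4 (assumed for the example only).
l1 = lambda x: x + 1.0
l2 = lambda x: x * 2.0
l3 = lambda x: x - 3.0
l4 = lambda x: x * 0.5

unrolled = run_unrolled([l1, l2, l3, l2, l3, l4], 1.0)
looped = run_looped([l1], [l2, l3], [l4], 1.0, n=2)
print(unrolled, looped)
```

The difference is mostly in bookkeeping: the unrolled form fixes the repeat count in the layer list, while the looped form leaves `n` as a runtime parameter.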

by gavinray 1 hour ago

The commenter "Skerit" below linked to a recent implementation of this:

https://ouro-llm.github.io/

See the left-hand side of the diagram here, which is your exact proposal:

https://ouro-llm.github.io/static/images/ouro_main.png