In his first post, the author claimed that the models he modified with this layer-repetition method topped the Hugging Face Open LLM Leaderboard:
https://dnhkng.github.io/posts/rys/

Do you remember the names of the previous experiments done on this? Would love to take a look.