This seems really weird to me. Isn't that just using LLMs in a specific way? Why come up with a new name "RLM" instead of saying "LLM"? Nothing changes about the model.
New architecture to building agent, but not the model itself. You still have LLMs, but you kinda give this new agentic loop with a REPL environment where the LLM can try to solve the problem more programmatically.