
Could you explain what you mean? That feels like a waste of processing to me. Yes, the model will correct itself once it eventually runs a compiler/linter, but that's still wasted time and compute.

by sometimelurker55 minutes ago|


Ehh, you're right, but there's a lot of nuance here. If you have a system that doesn't hallucinate a ton and is still very "creative", that's great, and probably much better than a hallucinating system regardless of its creativity. I'm reminded of theorem-proving LLMs working in Lean, producing millions of slop proofs until one works. But if you have something like that, simple RLVR (reinforcement learning with verifiable rewards) should fix it: the external oracle can be the judge for the RL.
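
To make the "oracle as judge" idea concrete, here's a toy sketch, not how any real theorem-proving setup works. Everything in it is an assumption for illustration: `verify` is a hypothetical stand-in for a proof checker or compiler, the "policy" is just a weight per candidate, and the update is a crude bandit-style bump rather than an actual RL algorithm.

```python
import random

def verify(candidate: int) -> bool:
    # Hypothetical oracle: stands in for a proof checker or compiler.
    # Here it simply accepts one fixed "correct" answer.
    return candidate == 7

def train(num_choices=10, steps=200, lr=0.5, seed=0):
    """Sample candidates and reward only the ones the oracle accepts."""
    rng = random.Random(seed)
    weights = [1.0] * num_choices  # toy policy: one weight per candidate
    for _ in range(steps):
        # Sample a candidate in proportion to its current weight.
        candidate = rng.choices(range(num_choices), weights=weights)[0]
        # Verifiable reward: 1 if the oracle accepts, 0 otherwise.
        reward = 1.0 if verify(candidate) else 0.0
        # Rejected samples get no update; accepted ones become more likely.
        weights[candidate] += lr * reward
    return weights

weights = train()
best = max(range(len(weights)), key=weights.__getitem__)
```

At the start this is the "slop until one works" regime, since every candidate is equally likely. Because only verified samples get reinforced, the policy gradually spends fewer samples on things the oracle would reject, which is the wasted-compute complaint from upthread.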