upvote
> If you have to wait all night for a response and then possibly discover that it took the wrong direction, misunderstood your intent, or your prompt was missing some key information then you have to start over.

If you have to wait overnight because the model is offloading to disk, that's a model you wouldn't have been able to run otherwise without very expensive hardware. You haven't really lost anything. If anything, it's even easier to check on what a model is doing during a partial inference or agentic workload if the inference process is slower.

reply
"If you have to wait all night for a response and then possibly discover that it took the wrong direction, misunderstood your intent, or your prompt was missing some key information then you have to start over."

This exact problem exist for rendering, when you realize that after a long render an object was missing in the background and the costly frame is now useless. To counter that you make multiple "draft" renders first to make sure everything is in the frame and your parameters are properly tuned.

reply