It's pretty great that you're providing the undistilled model on day 0. Here's a pro tip: with Flux.2 Klein, someone created a turbo slider LoRA, basically a diff of the turbo 9B model vs. the undistilled 9B model. What's great about this LoRA is that you can weight the undistilled model more heavily during the early sampling steps and then finish sampling with mostly the distilled weights. The result is a better "finish" (taking advantage of the distilled model's refinement for image quality) without sacrificing the undistilled model's greater ability to adhere to the prompt, since the undistilled model hasn't had to devote as much of its capacity to looking good.
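To make the idea concrete, here's a rough sketch of what a per-step strength schedule could look like. Everything in it is an assumption on my part: the names `turbo_lora_schedule` and `blend` are made up, the ramp is linear purely for simplicity, and I'm assuming the LoRA encodes (turbo minus undistilled), so strength 0 means pure undistilled and strength 1 means pure turbo. How you actually feed a per-step strength into your sampler depends on the tool you use.

```python
def turbo_lora_schedule(num_steps: int, start: float = 0.2, end: float = 1.0) -> list[float]:
    """Hypothetical helper: ramp the LoRA strength linearly from `start` to `end`.

    Assumes strength 0.0 = undistilled weights only, 1.0 = fully distilled (turbo).
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    if num_steps == 1:
        return [end]
    return [start + (end - start) * i / (num_steps - 1) for i in range(num_steps)]


def blend(undistilled_w: float, turbo_w: float, strength: float) -> float:
    """Effective weight when the LoRA is the diff (turbo - undistilled) applied at `strength`."""
    return undistilled_w + strength * (turbo_w - undistilled_w)


if __name__ == "__main__":
    # Early steps lean on the undistilled model, later steps on the turbo model.
    for step, s in enumerate(turbo_lora_schedule(8)):
        print(f"step {step}: LoRA strength {s:.2f}")
```

The start and end values here are placeholders; the right ramp (and whether linear is even the best shape) is something you'd have to find by experimenting.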