There have been simplistic attempts at this, e.g. instead of performing 100 tests, just keep going as long as coverage increases.
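To make the "keep going as long as coverage increases" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `run_until_coverage_stalls`, `coverage_of` and `patience` are hypothetical names, and `coverage_of` stands in for whatever instrumentation reports which branches an input hit.

```python
import random

def run_until_coverage_stalls(gen, prop, coverage_of, patience=50,
                              max_tests=10_000, seed=0):
    """Test `prop` on inputs from `gen` until `patience` consecutive
    inputs add no new coverage (or `max_tests` is reached)."""
    rng = random.Random(seed)
    seen = set()   # every coverage item observed so far
    stale = 0      # consecutive tests that covered nothing new
    tests = 0
    while stale < patience and tests < max_tests:
        x = gen(rng)
        tests += 1
        if not prop(x):
            return ("failed", x, tests)   # counterexample found
        new = coverage_of(x) - seen
        if new:
            seen |= new
            stale = 0                     # progress, so reset the counter
        else:
            stale += 1
    return ("passed", len(seen), tests)

# Toy stand-in for real instrumentation: pretend these are branch ids.
def toy_coverage(x):
    return {"neg" if x < 0 else "nonneg", "even" if x % 2 == 0 else "odd"}

status, covered, tests = run_until_coverage_stalls(
    gen=lambda rng: rng.randint(-10, 10),
    prop=lambda x: x * x >= 0,
    coverage_of=toy_coverage,
    patience=20,
)
print(status, covered, tests)
```

The fixed budget of 100 tests is replaced by a `patience` window; `max_tests` is only there so the loop cannot run forever.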
The Choice Gradient Sampling algorithm from https://arxiv.org/pdf/2203.00652 feels like a nice way to steer generators with more nuance. That paper uses it to avoid discards when rejection sampling, but I have a feeling it could be repurposed to "reward" new coverage instead of, or as well as, validity.
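Roughly what I have in mind, as a heavily simplified sketch and not the paper's algorithm as written: the paper works on generators with a derivative operation, whereas this assumes a fixed-length sequence of choices and approximates "take the derivative" by fixing a prefix. The names `choice_gradient_sample`, `generate` and `reward` are hypothetical, and `reward` must be non-negative. Using validity as the reward gives the discard-avoiding flavour; adding a count of newly covered branches would be the repurposing.

```python
import random

def choice_gradient_sample(generate, choices, reward, depth,
                           n_samples=20, rng=None):
    """Build a choice sequence one step at a time. For each candidate
    next choice, sample random completions, total their rewards, and
    pick the next choice with probability proportional to that total."""
    rng = rng or random.Random(0)
    prefix = []
    while len(prefix) < depth:
        remaining = depth - len(prefix) - 1
        fitness = []
        for c in choices:
            total = 0.0
            for _ in range(n_samples):
                rest = [rng.choice(choices) for _ in range(remaining)]
                total += reward(generate(prefix + [c] + rest))
            fitness.append(total)
        if sum(fitness) == 0:
            nxt = rng.choice(choices)   # no signal, fall back to uniform
        else:
            nxt = rng.choices(choices, weights=fitness)[0]
        prefix.append(nxt)
    return generate(prefix)

# Toy setup: values are tuples of digits, "valid" means non-decreasing.
def is_sorted(xs):
    return all(a <= b for a, b in zip(xs, xs[1:]))

seen = set()  # coverage so far; here just the set of digits already used

def validity_plus_coverage(xs):
    # 1 point for validity, plus 1 per digit not yet seen (assumed metric).
    return (1 if is_sorted(xs) else 0) + len(set(xs) - seen)

value = choice_gradient_sample(tuple, [0, 1, 2, 3],
                               validity_plus_coverage, depth=4)
seen |= set(value)   # only update coverage with the value actually kept
print(value)
```

Whether the two rewards should be summed, weighted, or used in separate phases is an open question; summing is just the simplest thing to try.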