I'm sorry, I'm not following this at all. When you say "better candidates are exponentially more likely to pass the filter", we're still talking about a metric, yes? A metric that can be optimized? Why would switching from a hard cutoff to some sort of stochastic filter weighted by this metric discourage optimization?
1. Optimizing for generally applicable skills that the metric is trying to measure.
2. Optimizing adversarially to hill-climb the metric.
You want candidates to do (1) and not (2). You can make them agnostic to the second by setting
d(expected gain)/d(opportunity cost) = 0
==>
expected gain \propto opportunity cost
It is the case that most metrics are logarithmic: it takes just as much effort to decrease one bit of error as the next bit. So log(score) \propto (opportunity cost) \propto expected gain
Thus, for them to be agnostic, you should filter candidates proportional to their log-score on the metric (where 0 is a perfect score). Because generally applicable skills are generally applicable, they will still benefit from improving those, they just no longer benefit from adversarial optimization, unless your score function looks very similar to others who have not adopted this filtering process.
The issue with a hard cutoff is that people near the boundary are extremely incentivized to adversarially optimize, as it is usually cheaper than working on generally applicable skills and actually pays off for them. You see this phenomenon on AoPS where (esp. Californian) students talk about grinding for MATHCOUNTS instead of learning calculus.
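To make the contrast concrete, here is a rough Python sketch of a hard cutoff next to a stochastic filter weighted by log-score. It is only one possible reading of the proposal: it assumes the score is an error in (0, 1] with 0 being perfect, and it takes the pass weight to be -log2(error), i.e. the number of bits of error a candidate has removed. The names `pass_weights`, `stochastic_filter` and `hard_cutoff` are made up for illustration.

```python
import math
import random


def pass_weights(errors, floor=1e-12):
    """Weight each candidate by -log2(error): bits of error removed.

    Assumption: errors lie in (0, 1]; values are clamped to [floor, 1]
    so the weight is finite and never negative.
    """
    return [-math.log2(min(1.0, max(e, floor))) for e in errors]


def stochastic_filter(errors, k, rng=None):
    """Draw up to k distinct candidate indices, weighted by -log2(error)."""
    if rng is None:
        rng = random.Random()
    weights = pass_weights(errors)
    remaining = list(range(len(errors)))
    chosen = []
    while remaining and len(chosen) < k:
        w = [weights[i] for i in remaining]
        if sum(w) <= 0:
            break  # nobody left with any positive weight
        pick = rng.choices(remaining, weights=w, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def hard_cutoff(errors, threshold):
    """Deterministic baseline: pass everyone at or below the threshold."""
    return [i for i, e in enumerate(errors) if e <= threshold]


if __name__ == "__main__":
    errors = [0.5, 0.25, 0.125, 0.26]
    print(pass_weights(errors))
    print(hard_cutoff(errors, 0.25))
    print(stochastic_filter(errors, 2, random.Random(0)))
```

Under the hard cutoff, the candidate at 0.26 gains everything by shaving off 0.01 of error. Under the weighted draw, that same improvement barely moves their weight, while each halving of the error adds the same one unit of weight no matter where on the scale the candidate sits.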
That suggests determinism though.
I mean I agree with you overall. Either human decision-making is a system so complex it appears non-deterministic, or it is deterministic. Practically speaking, we are non-deterministic.
Let's not conflate non-deterministic with inaccurate though. Non-deterministic systems can be 100% accurate. https://en.wikipedia.org/wiki/Las_Vegas_algorithm
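As a small illustration of "non-deterministic but always correct", here is a sketch of randomized quicksort in Python, a textbook Las Vegas algorithm. The pivot choice (and therefore the running time) varies from run to run, but the output is always the correctly sorted list. The function name and the optional `rng` parameter are just illustrative choices.

```python
import random


def randomized_quicksort(xs, rng=None):
    """Sort xs using a uniformly random pivot at every level of recursion.

    The randomness only affects how much work is done, never the result.
    """
    if rng is None:
        rng = random.Random()
    if len(xs) <= 1:
        return list(xs)
    pivot = xs[rng.randrange(len(xs))]
    less = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    return (randomized_quicksort(less, rng)
            + equal
            + randomized_quicksort(greater, rng))


if __name__ == "__main__":
    data = [5, 3, 8, 1, 9, 2, 8]
    # Different seeds take different paths but agree on the answer.
    for seed in range(3):
        print(randomized_quicksort(data, random.Random(seed)))
```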
Implicit bias theory sparked a massive number of studies suggesting that everything influences you, from the color of the room to what someone said to you before you entered.
It’s been really hard to replicate and the conclusions that have been drawn are contradictory.
To contextualize this insight in your post and basically just repeat what you are saying: The mistake is not the use of a non-deterministic system. The mistake could be, in some sense, using it too little. Re-evaluating the same resume 5 times and seeing a high variance in scores is a more useful signal than evaluating it once.
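A toy Python sketch of that idea, under made-up assumptions: `noisy_review` is a hypothetical stand-in for a reviewer whose score is some underlying quality plus Gaussian noise, and `repeated_evaluation` simply calls it several times and reports the mean and the spread. None of this models a real hiring process; it only shows what the extra evaluations buy you.

```python
import random
import statistics


def repeated_evaluation(evaluate, item, n=5):
    """Score the same item n times (n >= 2); return (mean, stdev, scores)."""
    scores = [evaluate(item) for _ in range(n)]
    return statistics.mean(scores), statistics.stdev(scores), scores


def make_noisy_review(noise_sd, rng):
    """Hypothetical reviewer: underlying quality plus Gaussian noise."""
    def noisy_review(quality):
        return quality + rng.gauss(0.0, noise_sd)
    return noisy_review


if __name__ == "__main__":
    rng = random.Random(42)
    review = make_noisy_review(noise_sd=1.5, rng=rng)
    mean, spread, scores = repeated_evaluation(review, 7.0, n=5)
    print(f"scores={scores}")
    print(f"mean={mean:.2f} spread={spread:.2f}")
```

A single evaluation gives you one number and no idea how far to trust it. Five evaluations give you an estimate plus a spread, and a large spread tells you the process is mostly measuring noise for that resume.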