Nondeterminism is also a feature, not a bug. If you don't want people to optimize against your filtering process, you have to make it somewhat nondeterministic. For example, make better candidates exponentially more likely to pass the filter instead of imposing a hard cutoff at the top 100. Then it is no longer worthwhile to Goodhart the filtering process, because doing so barely increases your chances and there are so many better uses of your time.
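To make the contrast concrete, here is a minimal sketch of a hard top-k cutoff next to a stochastic filter. The function names, the `temperature` parameter, and the choice to normalize against the best score are all assumptions made for illustration, not something specified above.

```python
import math
import random

def hard_cutoff(scores, k):
    """Deterministic filter: keep the indices of the k highest scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])

def pass_probability(score, best, temperature=1.0):
    """Pass probability decays exponentially with the gap to the best score.

    temperature is a hypothetical knob: larger values flatten the filter.
    """
    return math.exp((score - best) / temperature)

def soft_filter(scores, temperature=1.0, rng=random):
    """Stochastic filter: each candidate passes independently.

    With this normalization the top scorer always passes (probability 1);
    everyone else passes with probability below 1.
    """
    best = max(scores)
    return {
        i
        for i, s in enumerate(scores)
        if rng.random() < pass_probability(s, best, temperature)
    }
```

Under the hard cutoff, the candidate ranked k+1 gains everything by overtaking rank k. Under the soft filter, the same small score bump only nudges a probability.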

by 12_throw_away, 1 day ago

> If you don't want people to optimize against your filtering process, you have to make it somewhat nondeterministic.

I'm sorry, I'm not following this at all. When you say "better candidates are exponentially more likely to pass the filter", we're still talking about a metric, yes? A metric that can be optimized? Why would switching from a hard cutoff to some sort of stochastic filter weighted by this metric discourage optimization?

by programjames, 23 hours ago

Optimizing for the metric involves:

1. Optimizing for generally applicable skills that the metric is trying to measure.

2. Optimizing adversarially to hill-climb the metric.

You want candidates to do (1) and not (2). You can make them agnostic to the second by setting

    d(expected gain)/d(opportunity cost) = 0
      ==>
    expected gain \propto opportunity cost

It is the case that most metrics are logarithmic: it takes just as much effort to decrease one bit of error as the next bit. So

    log(score) \propto (opportunity cost) \propto expected gain

Thus, for them to be agnostic, you should pass candidates with probability proportional to their log-score on the metric (where 0 is a perfect score). Because generally applicable skills are generally applicable, candidates will still benefit from improving those; they just no longer benefit from adversarial optimization, unless your score function looks very similar to those of others who have not adopted this filtering process.

The issue with a hard cutoff is that people near the boundary are extremely incentivized to adversarially optimize, as it is usually cheaper than working on generally applicable skills and actually pays off for them. You see this phenomenon on AoPS where (esp. Californian) students talk about grinding for MATHCOUNTS instead of learning calculus.
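A toy sketch of the boundary effect, under loud assumptions: one unit of effort halves a candidate's error (one more "bit"), and the soft filter passes candidates with probability proportional to their bits, capped at 1. The function names and the `per_bit` and `max_error` constants are hypothetical.

```python
import math

def bits_of_accuracy(error):
    """Log-score: bits by which the error has been reduced (error in (0, 1])."""
    return -math.log2(error)

def pass_prob_hard(error, max_error):
    """Hard cutoff: pass if and only if the error is at or below max_error."""
    return 1.0 if error <= max_error else 0.0

def pass_prob_log(error, per_bit=0.1):
    """Soft filter: pass probability proportional to the log-score, capped at 1."""
    return min(1.0, per_bit * bits_of_accuracy(error))

def gain_from_one_more_bit(pass_prob, error):
    """Change in pass probability from spending one unit of effort (halving the error)."""
    return pass_prob(error / 2) - pass_prob(error)

if __name__ == "__main__":
    hard = lambda e: pass_prob_hard(e, max_error=0.125)
    for error in (0.5, 0.25, 0.0625):
        print(
            error,
            gain_from_one_more_bit(hard, error),
            gain_from_one_more_bit(pass_prob_log, error),
        )
```

In this model the hard cutoff pays off only for the candidate sitting just above the threshold, while the log-proportional filter pays the same small amount per bit to everyone, so there is no special reason to grind at the boundary.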

by RugnirViking, 1 day ago

This. Human judges and examiners are famously not deterministic, even though we might wish they were. We've probably all heard the claim about harsher sentences being handed down in the hour before lunch.

by nonethewiser, 1 day ago

>we've probably all heard the thing of harsher sentences being given in the hour before lunch

That suggests determinism though.

I mean, I agree with you overall. Either human decision-making is a system so complex it appears non-deterministic, or it is deterministic. Practically speaking, we are non-deterministic.

Let's not conflate non-deterministic with inaccurate though. Non-deterministic systems can be 100% accurate. https://en.wikipedia.org/wiki/Las_Vegas_algorithm

by groundzeros2015, 1 day ago

> harsher sentences being given in the hour before lunch.

Implicit bias theory sparked a massive number of studies suggesting that everything influences you, from the color of the room to what someone said to you before you entered.

These results have been really hard to replicate, and the conclusions that have been drawn are contradictory.

by nonethewiser, 1 day ago

I made a similar comment on a different post. Non-determinism does not necessarily mean a system cannot reliably reach the correct output (although sometimes it does mean that). Las Vegas algorithms are non-deterministic and 100% accurate. The tradeoff is that the time it takes to reach the correct answer is highly variable.
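A minimal sketch of a Las Vegas algorithm, using the textbook example of probing random positions until a 1 turns up. The function name and the returned probe count are illustrative choices.

```python
import random

def find_one(bits, rng=random):
    """Las Vegas search: probe random positions until a 1 is found.

    The returned index is always correct; only the number of probes is random.
    Assumes bits contains at least one 1, otherwise this loops forever.
    """
    probes = 0
    while True:
        probes += 1
        i = rng.randrange(len(bits))
        if bits[i] == 1:
            return i, probes
```

Every run returns a valid index, but the probe count differs from run to run, which is exactly the accuracy-for-time-variance trade.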

To contextualize this insight in your post and basically just repeat what you are saying: The mistake is not using a non-deterministic system. The mistake could be, in some sense, using it too little. Re-evaluating the same resume 5 times and seeing a high variance in scores is a more useful signal than evaluating it once.
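A rough sketch of the "evaluate it 5 times" idea. `noisy_score` is a hypothetical stand-in for whatever nondeterministic evaluator is in use, and the Gaussian noise model is an assumption made only so the example runs.

```python
import random
import statistics

def noisy_score(true_quality, noise, rng):
    """Hypothetical nondeterministic evaluator: true quality plus Gaussian noise."""
    return true_quality + rng.gauss(0.0, noise)

def evaluate_repeatedly(score_once, runs=5):
    """Call the evaluator several times; return the mean and the sample standard deviation."""
    scores = [score_once() for _ in range(runs)]
    return statistics.mean(scores), statistics.stdev(scores)

if __name__ == "__main__":
    rng = random.Random()
    mean, spread = evaluate_repeatedly(lambda: noisy_score(7.0, 2.0, rng))
    print(f"mean={mean:.2f} spread={spread:.2f}")
```

A large spread relative to the mean says the evaluator is unsure about this resume, which a single evaluation would have hidden.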