Then there's the question of trust. You probably have friends you know not to tell certain secrets to, because they believe they get to pass your secrets on to people they trust. The further removed someone is from you, the less respect they will show for your secret. Researchers have been lending the dataset in good faith to people they trust, but who probably didn't take the whole secrecy thing as seriously.
With 20k researchers this was inevitable. Factors like these need to be taken into account when deciding on what terms such a dataset is released.
Reckless harm prevention is the root of many evils.