upvote
Two parts here.

First, well-calibrated systems for detecting API compromise is a good thing (or good intent at least). Credential malware is exploding.

Second, the challenge is that significant amount of genuine work — such as evals — seems practically impossible to distinguish from generating RLAIF outputs.

reply