upvote
>what's the false positive / negative rate of both approaches

the false positive rate is 100%. they just say everything is phishing:

"When we ran the full dataset through the deep scan, it caught every single confirmed phishing site with zero false negatives. The tradeoff is that it flagged all 9 of the legitimate sites in our dataset as suspicious, which is worth it when you're actively investigating a link you don't trust."

reply
it's 100% for what they call "deep scan", it's 66.7% for the "automatic scan". Practically unusable anyway
reply
Probably could have been a bit more descriptive around the dataset. Our tooling pulls in a lot more than 250 URLs but since we are manually confirming them that means a smaller dataset. In other words, out of the urls we pulled in these 250 were confirmed (by a human) as phishing. We did not do any selection beyond that. As for the article LLMs were used to help with the graphs and grammatical checks but that's it. This was our first month of going through this exercise and we definitely want to have larger datasets going forward as we expand capacity for review.

As for Safe Browsing catching more than 16% it depends on the timeline at the time these attacks are launched it's likely Safe Browsing catches closer to 0% but as the time goes on that number definitely climbs.

reply