Better models help on the day the spam mutates, before you have fresh labels for the new scam and before spammers can infer from a few test runs which phrasing still slips through. If you need labels for each pivot you're letting them experiment on your users.
replyIn my experience the contents of the message are all but totally irrelevant to the classification, and it is the behavior of the mailing peer that gives all the relevant features.
reply