Also, while I do agree that Anthropic's internal controls are unlikely to be on the level of AWS's or Azure's, I'm pretty confident they're good enough that random PMs aren't going to get access to things like that, especially for use in formal projects. All the more so since "safety" is Anthropic's other obsession, which means "safety" data are going to be watched.
But anyway, we seem to be agreed that retaining stuff that used to get flushed early is a risk, and every copy is a risk, and sending it to more companies is a risk, regardless of the fine points of how things might go wrong.
By over-eager PM, I didn't mean someone being malicious, just moving too fast to think about how some logging they set up might have negative effects for my clients way down the line.
Then, months later, some other person finds a store of novel data and is like... that looks nice... not gonna ask any questions/look a gift horse in the mouth... woohoo AGI!
On the far darker side: while I am a fan of the team at Anthropic, good intentions and all, they had to pay a $1.5B settlement for knowingly ingesting copyrighted books. That was just the cost of doing business. They did that, and now they are a trillion-dollar company.