I don't include mobile proxies since they're heavily shared, so knowing that an IP address was used as a proxy at some point is basically useless.
Regarding your remark: indeed, there are many shared residential IPs, including IPs of legitimate users who may have a shady app that routes traffic through their device. That's why I don't recommend blocking on IP addresses alone. It's meant to be more of a datapoint/signal to enrich your anti-fraud/anti-bot system. For the block list, however, I analyze the IPs over longer time frames, look at the percentage of IPs in each range that were used as proxies, and generate a confidence score to indicate whether or not the range is safe to block.
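To make the idea concrete, here is a rough sketch of how a per-range confidence score could be computed. This is not my actual formula: the `RangeStats` fields, the 30-day `min_days` default, and the "ratio times window weight" scoring are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
import ipaddress


@dataclass
class RangeStats:
    """Hypothetical per-range observations (field names are illustrative)."""
    cidr: str             # e.g. "203.0.113.0/24"
    proxy_ips_seen: int   # distinct IPs in the range observed as proxies
    days_observed: int    # length of the observation window, in days


def block_confidence(stats: RangeStats, min_days: int = 30) -> float:
    """Return a score in [0, 1]; higher means safer to block the whole range.

    Assumed scoring: the share of addresses in the range seen as proxies,
    scaled down when the observation window is shorter than min_days.
    """
    network = ipaddress.ip_network(stats.cidr)
    proxy_ratio = min(stats.proxy_ips_seen / network.num_addresses, 1.0)
    window_weight = min(stats.days_observed / min_days, 1.0)
    return proxy_ratio * window_weight


# Half of a /24 seen as proxies over a full window.
score = block_confidence(RangeStats("203.0.113.0/24", 128, 30))
```

With a score like this, a range would only go on the block list above some threshold, while individual IPs below it stay a signal rather than a verdict.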
I’m working on a scraping project at the moment, so I'm looking at this too, but from the other end. Super low volume though, so pretty tame: the emphasis is on success rate more than throughput.
I bought a 4G dongle to use as a last resort if nothing else gets through. I'm also investigating IPv6.
Currently planning on a layered approach: cloud IPs first, etc.
Interesting challenge, but I'm also trying to be somewhat respectful about it, since nobody likes aggressive bots.
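Roughly what I mean by layered, as a minimal sketch: try each route in order and stop at the first one that works, pausing between attempts to stay polite. The `fetch_layered` helper, the `Fetcher` type, and the layer names are made up for illustration; each fetcher is assumed to return the page body, or `None` (or raise) when that route is blocked.

```python
import time
from typing import Callable, Optional, Sequence, Tuple

# A fetcher takes a URL and returns the body, or None if the route failed.
Fetcher = Callable[[str], Optional[str]]


def fetch_layered(
    url: str,
    layers: Sequence[Tuple[str, Fetcher]],
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Tuple[str, str]]:
    """Try each (name, fetcher) layer in order; return (name, body) or None."""
    for index, (name, fetch) in enumerate(layers):
        if index > 0:
            sleep(delay_s)  # pause before falling back, to avoid hammering the site
        try:
            body = fetch(url)
        except Exception:
            # Sketch only: any error is treated as "this layer failed".
            body = None
        if body is not None:
            return name, body
    return None


# Stub fetchers standing in for real routes (cheapest first, 4G last).
layers = [
    ("cloud", lambda u: None),            # pretend the cloud IP got blocked
    ("4g-dongle", lambda u: "<html>ok</html>"),
]
result = fetch_layered("https://example.com/", layers, delay_s=0)
```

Ordering the layers from cheapest to most precious means the 4G connection only gets used when everything else has failed, which fits the low-volume, success-rate-first goal.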