Every commit. Every diff between two different commits. Every diff with different query parameters. Git blame for each line of each commit.

Imagine a task to enumerate every possible read-only command you could make against a Git repo, and then imagine a farm of scrapers running exactly one of them per IP address.

Ugh.

reply
Ugh Ugh Ugh ... and endless ughs, when all they needed was "git clone" to get the whole thing and spend as much time and energy as they wanted analyzing it.
reply
Yuk…

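    http {
        # ... other http settings
        # Key on client IP, keep state in a 10 MB shared zone,
        # and allow an average of 10 requests per second per IP.
        limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;
        # ...
    }

    # The server block lives inside the http block above.
    server {
        # ... other server settings
        location / {
            # Let up to 20 excess requests through without delay;
            # anything past that is rejected (503 by default).
            limit_req zone=mylimit burst=20 nodelay;
            # ... proxy_pass or other location-specific settings
        }
    }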

Rate limit read-only access at the very least. I know this is a hard problem for open source projects that have relied on web access like this for a while. Anubis?
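One caveat, sketched rather than proven: a limit keyed on $binary_remote_addr only bites when a client reuses an address. If the scrapers really do send about one request per IP, as described upthread, each request looks like a new, well-behaved client. The Python below is a simplified model of a per-key leaky bucket with a burst allowance. It is an illustration, not nginx's actual implementation, and the `LeakyBucketLimiter` class and the example addresses are made up for the demo.

```python
from dataclasses import dataclass, field


@dataclass
class LeakyBucketLimiter:
    """Simplified per-key leaky bucket (hypothetical model, not nginx source)."""

    rate: float = 10.0  # requests drained per second, like rate=10r/s
    burst: int = 20     # excess requests tolerated, like burst=20 nodelay
    _state: dict = field(default_factory=dict)  # key -> (excess, last_seen)

    def allow(self, key: str, now: float) -> bool:
        # A key we have never seen starts with an empty bucket.
        if key not in self._state:
            self._state[key] = (0.0, now)
            return True
        excess, last_seen = self._state[key]
        # Drain the bucket for the elapsed time, then add this request.
        excess = max(0.0, excess - (now - last_seen) * self.rate + 1.0)
        if excess > self.burst:
            return False  # over the burst allowance: reject, keep old state
        self._state[key] = (excess, now)
        return True


def count_allowed(limiter: LeakyBucketLimiter, keys, now: float = 0.0) -> int:
    """Send one request per entry in `keys` at time `now`; count the accepted ones."""
    return sum(limiter.allow(key, now) for key in keys)


# One address hammering the server: only the first request plus the burst get in.
single_ip = count_allowed(LeakyBucketLimiter(), ["203.0.113.7"] * 100)

# One hundred addresses, one request each: every request gets in.
many_ips = count_allowed(
    LeakyBucketLimiter(), [f"198.51.100.{i}" for i in range(100)]
)

print(single_ip, many_ips)  # prints "21 100"
```

So the config above should stop a single noisy client, but under these assumptions it does little against a farm that spreads the same load across many addresses.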
reply
We used fail2ban to do rate limiting first. It wasn't adequate.
reply
Ooof, maybe a write-up is in order? An opinionated blog post? I'd love to know more.
reply