They generate a URL for every version of every file on every commit, every branch, and every tag, and if that weren't enough, n(n+1)/2 git diffs for every file across the commits it has existed on. Even a relatively small git repo with a few hundred files and commits explodes into millions of URLs in the crawl frontier. Many of these pages are also expensive to generate server-side, so a crawler hitting a git host is really not a fantastic interaction for anyone.
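A quick back-of-envelope calculation shows the blow-up; the file and commit counts here are illustrative assumptions, not measurements of any particular repo:

```python
# Rough estimate of the crawl frontier a git web UI exposes for one repo.
# Numbers are illustrative assumptions for a "modest" repo.
files = 300     # files in the repo (assumed)
commits = 500   # commits in the history (assumed)

# One URL per file per commit (blob/raw views across history):
file_version_urls = files * commits

# Pairwise diffs: n(n+1)/2 commit pairs, each potentially diffable per file:
commit_pairs = commits * (commits + 1) // 2
diff_urls = commit_pairs * files

total = file_version_urls + diff_urls
print(total)  # 37725000 -- tens of millions of URLs for a small repo
```

That's before counting branches, tags, log pages, and blame views, each of which multiplies the frontier further.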
If you run a web crawler, you need to add git host detection to actively avoid walking into them.
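A cheap way to do that detection is a URL-shape heuristic before enqueueing links. This is a minimal sketch; the path patterns are assumptions based on common git web UIs (cgit, gitweb, Gitea, GitLab), not an exhaustive or authoritative list:

```python
import re

# Heuristic: flag URLs that look like git web-UI deep links so the
# crawler can skip (or rate-limit) them instead of walking the whole
# commit/diff space. Patterns below are assumed conventions, not a spec.
GIT_HOST_PATTERNS = re.compile(
    r"/(commit|commits|blob|raw|blame|tree|diff|log)/"  # path-style UIs
    r"|[?&;]a=(commitdiff|blobdiff|history|blame)"      # gitweb-style query params
)

def looks_like_git_host_url(url: str) -> bool:
    return bool(GIT_HOST_PATTERNS.search(url))

print(looks_like_git_host_url("https://git.example.com/repo/blob/abc123/src/main.c"))  # True
print(looks_like_git_host_url("https://example.com/blog/post-1"))                      # False
```

In practice you'd probably combine this with per-host frontier caps, so a host that keeps yielding matching URLs gets deprioritized wholesale.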
From the shape of the traffic it just looks like a poorly implemented web crawler. By default, a crawler that doesn't actively avoid git hosts will get stuck in them, spending days trying to exhaust the links of even a single repo.
I do think they care about repos, and not just the code but also how it evolves over time; I can see some use, if marginal, in those traits. But if they really wanted that, I'd rather they just clone my repos, and I'd be totally fine with that. I guess they'd have to deal with state, though, and they likely don't want to. Rather just increase my energy bill ;)