They simply don't have (or didn't have) the skills to scale. They were talking about using Ceph to run things (which gives you an idea of how green their infra team was).
It's slow, large, excessively complex and not that resilient to failure.
You either want a bunch of NFS machines backed by ZFS on NVMe, with a central jumping-off point that allows sharding (this is critical so that one or more NFS servers can fuck up without killing access to everything else).
Or pay the money and use GPFS.
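To make the sharding point concrete, here's a rough sketch of what the "central jumping-off point" could look like: a lookup that hashes a top-level key (say, a project name) to one NFS export. The server names, the `shard_for` and `resolve` helpers, and the hashing scheme are all made up for illustration; a real setup might use an automounter map or a database table instead.

```python
import hashlib

# Hypothetical NFS exports (assumption, not from any real deployment).
NFS_SERVERS = ["nfs01:/tank", "nfs02:/tank", "nfs03:/tank", "nfs04:/tank"]


def shard_for(key: str, servers=NFS_SERVERS) -> str:
    """Deterministically map a top-level key to one NFS export."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def resolve(key: str, down=frozenset()):
    """Return the export path for a key, or None if its server is down.

    Only keys that live on a failed server become unavailable;
    everything on the other servers still resolves.
    """
    server = shard_for(key)
    if server in down:
        return None
    return f"{server}/{key}"
```

If `nfs02` dies, `resolve` returns `None` only for the keys that hash to it, and the rest keep working. Note that a plain modulo hash reshuffles most keys when you add a server, so you'd probably want a static map or consistent hashing in practice.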
Done correctly, Ceph is extremely reliable, resilient, and fast. Once you get over the initial learning curve it is, dare I say, even a joy to work with.
https://docs.github.com/en/enterprise-cloud@latest/admin/dat...