upvote
Pauses are a problem with heap size and structure, not allocation rate, because the pause is caused by GC code that is O(heap size). Making garbage slower reduces the frequency but not severity. This is an issue with most GCs to some degree, there are phases of collection where the GC stops execution and the duration is relative to how much work it has to do which is based on how many objects and how much memory needs to be checked. "Concurrent" garbage collection is the approach of trying to reduce the pauses by doing more of the work while program execution continues. It's complicated and hard to get right, so Go's original GC was IIRC fully stop-the-world.

There are some fine points to the O(heap size), for example it's clearly unnecessary for the GC to scan objects that do not themselves contain pointers, and work is somewhat proportional to the total number of objects. Combining numerous small objects into manually managed slices, coming up with ways to make the most numerous items pointer-free, etc.

I learned a bit about this when an analytics workload I had ended up with unacceptable pauses (I think over 1 second), Go's GC is more sophisticated now but I think in any GC runtime you have the same considerations to some degree. Some of the best writing at the time was by Gil Tene, one of the principal authors of the C4 concurrent collector at Azul Systems, starting point here:

https://groups.google.com/g/golang-dev/c/GvA0DaCI2BU/m/SmEel...

reply
With backend serving many clients with widely varying performance profile of individual requests when latency spikes happen there is no particular hot loop. Just many go routines each doing reasonable thing but with a particular request pattern hitting pathological case of GC.
reply
Extreme variance in usage patterns will always be challenging. But pre allocating some reasonably sized buffers can go a long way.
reply
Removing enough allocations to avoid fragmentation can be maddenly difficult/tedious.
reply