People get excited about custom allocators until they hit subtle fragmentation bugs or botch thread safety under load. Explicit heaps are nice for metrics and debugging, but once a busy codebase has objects crossing module boundaries, the lifetime rules get ugly fast, and the malloc overhead you were trying to dodge often looks cheap next to the cost of chasing those bugs.
Most teams misjudge that tradeoff. If you rely on Valgrind, ASan, and heap dumps, rolling your own allocator means giving up a lot of mature tooling to fix something that usually isn't where the latency or memory footprint went in the first place.
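To make the tooling point concrete, here is a minimal sketch of a bump arena in C. Everything in it (`Arena`, `arena_init`, `arena_alloc`, `arena_destroy`, the 16-byte alignment) is made up for illustration and isn't taken from any particular codebase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal bump arena; all names here are illustrative. */
typedef struct {
    unsigned char *base; /* one slab obtained from malloc */
    size_t cap;          /* slab size in bytes */
    size_t used;         /* bytes handed out so far */
} Arena;

/* Returns 1 on success, 0 if the underlying malloc failed. */
static int arena_init(Arena *a, size_t cap) {
    a->base = malloc(cap);
    a->cap = cap;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out the next chunk of the slab, or NULL when it does not fit.
 * Assumption: 16-byte alignment is enough for every object stored here. */
static void *arena_alloc(Arena *a, size_t n) {
    assert(a != NULL && a->base != NULL);
    size_t aligned = (n + 15u) & ~(size_t)15u; /* round up to 16 bytes */
    if (aligned < n || aligned > a->cap - a->used) {
        return NULL; /* size overflowed or arena is exhausted */
    }
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* Frees the whole slab at once; individual objects are never freed. */
static void arena_destroy(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->cap = 0;
    a->used = 0;
}
```

With this layout, two back-to-back 16-byte allocations sit next to each other in the same slab, so writing past the end of the first one lands in the second. As far as ASan or Valgrind can tell, that write is inside one valid `malloc` block, so by default nothing gets reported. Both tools do offer manual annotation hooks (for example `ASAN_POISON_MEMORY_REGION` and Valgrind's mempool client requests), but wiring them in is extra work you now own.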