This model has always been a trade-off. It has downsides, but it also has upsides, including an immense boost in flexibility; decoupling from any particular userspace is useful.
> This way, you can just LD_PRELOAD in front of the VDSO and system call interception in userspace Just Works.
Can you LD_PRELOAD in front of the vDSO? I was under the (possibly mistaken) impression that the kernel injects it directly.
The kernel puts the vDSO in memory and tells ld.so where it is, but where ld.so places it in its symbol search order, if anywhere, is ld.so’s own concern. (TBH I don’t know whether ld.so actually allows LD_PRELOAD to override the vDSO, but there’s no reason for it not to, except I guess for the syscalls needed to perform the dynamic linking itself.)
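For what it's worth, the "tells ld.so where it is" part goes through the ELF auxiliary vector: the kernel passes an `AT_SYSINFO_EHDR` entry holding the address of the vDSO's ELF header. Here is a rough Python sketch that reads it back from `/proc/self/auxv` on Linux. The helper names `parse_auxv` and `vdso_base` are made up for illustration, and this only shows where the vDSO is mapped; it says nothing about how ld.so orders it relative to LD_PRELOAD.

```python
import struct

AT_NULL = 0            # terminates the auxiliary vector
AT_PAGESZ = 6          # system page size, shown only for comparison
AT_SYSINFO_EHDR = 33   # address of the vDSO's ELF header (value from <elf.h>)


def parse_auxv(raw, word_size=struct.calcsize("P")):
    """Decode raw auxv bytes into {type: value}, stopping at AT_NULL.

    Each entry is a (type, value) pair of native-endian machine words.
    """
    fmt = "=QQ" if word_size == 8 else "=LL"
    entry_size = struct.calcsize(fmt)
    # Ignore a trailing partial entry instead of raising struct.error.
    usable = len(raw) - len(raw) % entry_size
    entries = {}
    for a_type, a_val in struct.iter_unpack(fmt, raw[:usable]):
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries


def vdso_base(auxv_path="/proc/self/auxv"):
    """Return the vDSO load address, or None if it cannot be determined."""
    try:
        with open(auxv_path, "rb") as f:
            raw = f.read()
    except OSError:
        # Not Linux, or procfs is not mounted.
        return None
    return parse_auxv(raw).get(AT_SYSINFO_EHDR)


if __name__ == "__main__":
    base = vdso_base()
    if base is None:
        print("no vDSO entry found in the auxiliary vector")
    else:
        print(f"vDSO ELF header mapped at {base:#x}")
```

The printed address should fall inside the `[vdso]` line of `/proc/self/maps` for the same process, and it changes between runs when ASLR is enabled.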
The syscall overhead isn't as large as you suppose; for workloads where it actually makes a difference, there are robust low-syscall paths for I/O- and latency-sensitive operations, with DPDK, io_uring, and futexes being a few examples.
And there are robust, performant methods on Linux for syscall interception and tracing: see seccomp user notification (unotify), BPF tracepoints, and ftrace.
Which makes containers crap on Windows and *BSD, as they have to run the correct libc or equivalent. Thus you need to build a different container per OS version, which sucks compared to Linux.
Or, you know, just propose your idea to him.