This is an oft-overlooked point. An obvious way to improve on fork+execve is to see whether posix_spawn can be built on more efficient kernel mechanisms.
And of course that has already been done. On NetBSD, posix_spawn() is a fully-fledged system call and much of the work is done in kernel mode.
posix_spawn addresses the need from userspace. Under the hood, in many implementations it's still doing more or less a fork/exec, with the baggage that comes with it. A real syscall would be nicer.
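
For anyone who hasn't used it, here is a minimal sketch of what the userspace side looks like in C. `spawn_and_wait` is a hypothetical helper name made up for illustration, and passing NULL for the file actions and attributes is just the simplest choice. Whether the call ends up as a fork/exec-style sequence or a single system call depends on the libc and kernel; the calling code is the same either way.

```c
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper (not part of any library): start argv[0] via a PATH
 * lookup, wait for it, and return its exit status, or -1 on any failure. */
static int spawn_and_wait(char *const argv[])
{
    pid_t pid;

    /* NULL file actions and NULL attributes: the child inherits the
     * parent's descriptors and signal state with no extra setup. */
    int rc = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (rc != 0) {
        /* posix_spawnp returns an error number instead of setting errno. */
        fprintf(stderr, "posix_spawnp: %s\n", strerror(rc));
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }

    /* Report a normal exit code; treat death by signal as failure. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it looks like `char *const args[] = {"ls", "-l", NULL}; int code = spawn_and_wait(args);`. Note there is no point at which the caller's address space gets duplicated from the program's point of view, which is what leaves the implementation free to do something cheaper than a plain fork.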