I would guess it would be a small difference in measurable performance between zygote and a direct clean spawn, but it's one less trick an application needs to do, and it would be very helpful for libraries that spawn things. Spawning inside a library isn't always a great thing to do, but some things would really benefit from process level isolation.
[1] In case one isn't aware, the zygote pattern involves forking a 'zygote' process during application startup, and having that process do any forks that need to happen during application runtime. This reduces the cost of forking in large applications, because the zygote will have few fds open and use little memory. This lets your large application spawn new processes without delaying the application or the startup of the new processes. Some applications will spawn many zygotes to allow parallelism for spawning at runtime.
In all uses of zygotes that I have seen, here's what's really happening:
- `fork` is being used to reduce the cost of starting a process that has a high start-up cost. So, you start one process, run it through the expensive initialization, and then fork it from there to start new processes.
- To make this even faster, you have a pool of pre-forked processes sit around.
- Having pre-forked processes sitting around ready to be used is not expensive because of the CoW property and the fact that a process that forks and then immediately pauses will not have triggered any significant CoW yet.
So, the zygote optimization you speak of is in practice only meaningful on top of systems that are using an optimization uniquely enabled by `fork` (avoiding process initialization costs by cloning a process), and that zygote optimization is further optimized by another property of `fork` (memory sharing of forked processes that haven't done anything else yet).
> A zygote process is one that listens for spawn requests from a main process and forks itself in response. Generally they are used because forking a process after some expensive setup has been performed can save time and share extra memory pages.
I think reading the first sentance and stopping covers my zygote, but adding the second sentance covers yours. So I think we're both right!
I think both paths are useful. If your children need time to startup and become ready, spawn one that does start up work, and then it (pre)forks at the ready state to have processes ready to handle requests (your zygote). This does require a traditional fork() to avoid duplication of work.
But if forking is expensive at runtime because you have a million FDs open and a whole lot of memory allocations, spawn spawners before you start doing work (my zygote). This could be unnecessary with a inexpensive way to spawn a new process from an process that has lots of resources in use.
Of course, you can also use my zygotes to spawn your zygotes. Zygoteception.
[1] https://chromium.googlesource.com/chromium/src/+/HEAD/docs/l...
While I’ve not bothered to profile it, but it seems that process that have lot of mapped pages is the issue (firefox, emacs,…). In the emacs case, the issue is when the main process trying to fork-exec, if I start a shell session (with shell-mode or term-mode), it works fine.
It's called clone(2)
Yes, zygote pattern makes it easy to make fork() into bottleneck - it requires a lot more discipline and low level tricks (linker scripts, compiler-specific extensions, custom sections, low level dependencies on pagesize that get "fun" on ARM servers).
If you don't, you might wake up with fork() causing latency issues.