upvote
But you generally want to communicate with that process, so you do need to setup e.g. file descriptors and stuff, which needs information from the parent process to be passed.
reply
Yes, you do want to pass in some stuff. But by default you get every single open file descriptor and a copy of every single stack that any threads use for execution.

It shares way too much, and have huge use cases where it is really, really bad.

reply
Most programming languages abstract this out to be able to connect or drop the 3 standard pipes. Typically this is the only thing that can be shared anyway unless the other program is specifically shared and expects other file handles to be available, in which case fork might be the right system call anyway.
reply
Keep in mind that this is the only way to start any process. Even if you just want to launch some throwaway utility program.
reply
What do you mean by "a completely new process"?
reply
A process that shares nothing with the process that spawned it.
reply
A thing that makes that complicated is that while you want that conceptually, you don't want that in reality. For instance, if the spawning process is in a container of some sort and it spawned a process that "shares nothing with the process that spawned it", the spawned process would no longer be in that container, because the state of "being in the container" is one of the things it shares with the parent process.

This is just an example of I don't even know how many things a modern-day process will share from its parent.

By "complicated" I do not even remotely mean "unsolvable". I just mean that if you really dig down into what it means to "share nothing" in a modern operating system, it's a lot richer than it was back when fork+exec was a practical solution. There's a lot of fuzzy things that could go either way when you say "shares nothing".

reply
It’s such a bad idea that every OS except Linux implements it? On macOS it’s posix_spawn, on Windows it’s NtCreateProcess.
reply
Who said anything about it being a "bad idea"?

I also explicitly said this wasn't unsolvable. My point isn't about technical implementations or code, my point is that the casual "I want to share nothing about the parent process" thought in sanderj's mind, and presumably a lot others, is much more ill-defined than they realize. There's a lot more state that a process has than what file descriptors are open in a modern system.

Moreover, as things like "in which container is this running" demonstrate, those are also not "create a process that has nothing to do with this process", because, again, there's a lot more to "having to do with this process" than "what file descriptors are open".

Also, as the name might have been a clue, Linux has posix_spawn: https://linux.die.net/man/3/posix_spawn. It also has a thing called "clone": https://www.man7.org/linux/man-pages/man2/clone.2.html Nor do I claim this paragraph is an entire overview of all the ways of starting a process in Linux. If you want to understand what I mean by "lots of details in a modern OS", your assignment is to carefully read the entire "clone" man page, and you'll start to see what I mean, though I'm not sure even that is all the state associated with a process nowadays.

reply
Linux posix_spawn is a wrapper around clone and exec. There is no primitive on Linux to create an entirely blank process. This is adequately discussed in the linked LWN post.

Other operating systems either have parallel APIs to fork (e.g. the posix_spawn syscall on macOS) or do not provide fork at all (Windows).

reply
That’s how you get zombie processes and memory leaks.
reply
Isn't that covered by O_CLOEXEC?
reply
There's a bunch of nastiness around that too. If you have e.g. library state that assumes the fd still works you can get her very confusing bugs once another file is opened into that fd number...
reply
You may be mixing up fork and exec. Library data state isn't retained over execve(), and O_CLOEXEC does not take effect at fork().
reply
Indeed. Not enough coffee, apparently.
reply
>It feels crazy that we don't have a way to directly express the latter thing

Isn't that what posix_spawn is for?

reply
And how do you think posix_spawn is implemented?
reply
This is an oft-overlooked point. An obvious place to look for improving fork+execve is to see whether posix_spawn can be given more efficient kernel mechanisms to be based upon.

And of course that has already been done. On NetBSD, posix_spawn() is a fully-fledged system call and much of the work is done in kernel mode.

* https://blog.netbsd.org/tnf/entry/posix_spawn_syscall_added

reply
This is literally discussed in the article this post links to.
reply
Not really. They didn't get anywhere near as far as noticing the prior art of NetBSD, not even on the mailing list discussion behind that article.
reply
posix_spawn addresses the need from userspace. Under the hood, it's still doing more or less a fork/exec, with the baggage that comes with it. A syscall would be nicer.
reply