> No, I mean, if you're WhatsApp - across all nodes - then somehow maybe yes?

I mean, we had one process per client connection (which is 100% the way to go) and depending on the era, hundreds of thousands or millions of connections per chat node. I don't think we ever really summed the number of processes over a cluster.

Other than client processes, there weren't that many processes per node; like you say, it doesn't make sense to spread too thin.

There are a lot of client connections, and so a lot of client processes, but they end up being pretty simple to work with. They all do the same thing: wait for a message, process the message, wait some more. Some of the messages are tricky to process (like "the user just logged in again over here, so please transfer the state").
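A minimal Elixir sketch of that "wait, process, wait some more" loop, with one process per client connection. The module name, message shapes, and state fields are all hypothetical, made up for illustration; this is not how any real chat server is written.

```elixir
defmodule ClientSketch do
  # Hypothetical per-connection process. Names and message shapes are
  # illustrative assumptions, not taken from a real system.

  # Start one process for one client connection.
  def start(client_id) do
    spawn(fn -> loop(%{client_id: client_id, handled: 0}) end)
  end

  # Wait for a message, process it, then wait again.
  defp loop(state) do
    receive do
      # The common case: handle a message and go back to waiting.
      {:message, _body} ->
        loop(%{state | handled: state.handled + 1})

      # A "tricky" case: the same user logged in elsewhere, so hand the
      # state to the new process and let this one finish.
      {:transfer_state, new_pid} ->
        send(new_pid, {:state, state})

      # Lets a caller inspect how many messages were handled so far.
      {:get_handled, from} ->
        send(from, {:handled, state.handled})
        loop(state)
    end
  end
end
```

Every client process runs the same loop, which is why having a very large number of them stays manageable.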

reply
I spent almost a full year learning it by trying to build a live chat app. I went through Elixir in Action and the official guides, and yet those questions were never really answered. I never said I wanted hundreds of thousands of processes, but that's definitely a thing you need to account for. Errors are often simply swallowed.
reply
> Errors are often simply swallowed.

That's a bit of a misrepresentation. Error handling on the BEAM has a few more layers than in other environments; specifically, the supervision tree can be used to "let things fail". That's not the layer where you should log or handle failures - that's a safety net that ensures your whole system won't go down if your error handling in a single process doesn't work.

For error handling, there are roughly these layers:

    - functions can return {:ok, value} or {:error, error}
    - functions can raise errors (similar to exceptions) that can be caught
    - processes can be monitored from the outside, you get notified when they die
    - processes can be linked and exits can be trapped, also notifying you on failure
    - supervisors can handle process deaths in a configurable manner
    - higher-level behaviours often expose their own error handling callbacks

So there's a bit more to error handling on the BEAM, and I get that becoming familiar with all of these layers and using them properly can be a challenge. The defaults skew towards high availability, which is not always what you want in development - sometimes, failing fast and completely (up to stopping the app or the BEAM as a whole) is more convenient. You can have that; you just need to ask for it specifically in your code.
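To make the first four layers from the list above concrete, here is a rough Elixir sketch. The module and function names are hypothetical; supervisors and behaviour callbacks are left out to keep it short.

```elixir
defmodule LayersSketch do
  # Layer 1: return a tagged tuple instead of raising.
  def parse(str) do
    case Integer.parse(str) do
      {n, ""} -> {:ok, n}
      _ -> {:error, :not_an_integer}
    end
  end

  # Layer 2: let a function raise, and rescue where a fallback makes sense.
  def parse_or_default(str, default) do
    String.to_integer(str)
  rescue
    ArgumentError -> default
  end

  # Layer 3: monitor a process from the outside; its death arrives as a
  # :DOWN message.
  def monitored_crash do
    {pid, ref} = spawn_monitor(fn -> exit(:boom) end)

    receive do
      {:DOWN, ^ref, :process, ^pid, reason} -> {:down, reason}
    after
      1_000 -> :timeout
    end
  end

  # Layer 4: link to a process and trap exits; its death arrives as an
  # :EXIT message instead of taking the caller down too.
  def linked_crash do
    previous = Process.flag(:trap_exit, true)
    pid = spawn_link(fn -> exit(:boom) end)

    result =
      receive do
        {:EXIT, ^pid, reason} -> {:exit, reason}
      after
        1_000 -> :timeout
      end

    # Restore the caller's original trap_exit setting.
    Process.flag(:trap_exit, previous)
    result
  end
end
```

In each case the failure is visible to the caller as a value or a message, so nothing is swallowed unless the calling code chooses to ignore it.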
reply
> Errors are often simply swallowed.

That's a choice, but it's not idiomatic.

You're expected to write things like...

    ok = thing_that_might_not_work().

(Well, that's what it looks like in Erlang anyway). If there's an error, it doesn't match, so it crashes. You don't have to check for success, but it's easy to, and 'let it crash' is the mantra, so yeah. Then you watch for crashes, and fix them with hot loading, and pretty soon you have a reliable system.
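For Elixir readers, roughly the same assertive match might look like the sketch below. `thing_that_might_not_work/1` is a hypothetical stand-in that takes a flag so both outcomes can be shown.

```elixir
defmodule AssertiveSketch do
  # Hypothetical stand-in for a call that may fail.
  def thing_that_might_not_work(true), do: :ok
  def thing_that_might_not_work(false), do: {:error, :nope}

  def run(succeed?) do
    # Assert success. Anything other than :ok raises a MatchError here,
    # which crashes the calling process unless something rescues it.
    :ok = thing_that_might_not_work(succeed?)
    :done
  end
end
```

The `MatchError` carries the value that failed to match, so the crash report shows what actually came back.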

'Let it crash' ends up not quite working, so you end up catching a lot of errors, but you should be logging them, not swallowing them...
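One way "catch it, but log it" could look in Elixir, as a sketch only. `LogSketch.safely/1` is a hypothetical helper, and it only handles raised exceptions, not exits or throws.

```elixir
defmodule LogSketch do
  require Logger

  # Hypothetical wrapper: run `fun`; if it raises, log the failure and
  # return an error tuple instead of silently discarding it.
  def safely(fun) when is_function(fun, 0) do
    {:ok, fun.()}
  rescue
    error ->
      Logger.error("operation failed: " <> Exception.message(error))
      {:error, error}
  end
end
```

The caller still gets to decide what to do with `{:error, error}`, but the failure has already left a trace in the logs.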
