The problematic part is mostly in nomenclature. They’re called “threads” but don’t really behave the way you’d expect threads to.

They’re heavy, they don’t share the entire process memory space (i.e. a worker can’t reference the parent’s functions), and I believe their imports are separate from each other (i.e. re-parsed for each worker into its own memory space).

In many ways they’re closer to subprocesses in other languages, with limited shared memory.

It’s not “clean” to spin up thousands of threads, but it does work, and sometimes it’s easier to write and reason about than a whole pipeline distributing work to worker threads. I probably wouldn’t do it in a server, but in a CLI I would totally do something like spawn a thread for each file a user wants to analyze and let the OS do the task scheduling for me. If they give me a thousand files, they get a thousand threads. That overhead is pretty minimal with OS threads on Linux (Windows is a different beast).

reply
No, they're threads as far as the OS is concerned (they map to OS threads), and they actually _do_ share the same process and physical memory (that's how SharedArrayBuffer works).

However, apart from "plain" memory accessed atomically, no objects are directly shared (for Node/V8 each worker lives in a so-called Isolate, IIRC), so from a logical standpoint they're kind of like a process.

The underlying reason is that JavaScript objects are open to modification by default, e.g.:

  const t = {x:1,y:2};
  t.z = 3;
  console.log(t); // => { x: 1, y: 2, z: 3 }
To get sane performance out of JS, the runtime does a ton of tricks under the hood. The bad news is that those tricks are all either slow (think Python's GIL) or heavily exploitable in a multithreaded scenario.

If you've done multithreaded C/C++ work and touched upon Erlang, the JS Worker design is the logical conclusion: message passing works well for small packets (work orders, via structured cloning), whilst shipping large data can be problematic because of the cloning.
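To illustrate the copy semantics: postMessage uses the structured clone algorithm, and structuredClone exposes the same mechanism directly, so you can see that the receiver gets an independent copy:

```javascript
// Structured cloning copies the whole object graph, so mutations on one
// side never leak across the boundary -- fine for small work orders,
// costly when you're shipping large data.
const order = { task: 'resize', size: 128 };
const copy = structuredClone(order); // what postMessage does under the hood
copy.size = 256;
console.log(order.size, copy.size); // → 128 256
```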

This is why SharedArrayBuffers allow no-copy sharing: the plain memory arrays they expose don't offer any security surprises in terms of code execution (Spectre-style attacks are another story) and also allow for work subdivision if needed.

reply
If it’s any comfort, I don’t hear many JS/TS/Node/etc developers calling them threads or really thinking of them that way. Usually just Workers or Web Workers — "worker threads" mostly slips in from Node. Even then, "worker" dominates.

In terms of tradeoffs, if you’re coming from the single event loop model, they’re pretty consistent with the rest of JS. Isolation-first, explicit sharing, fewer footguns. So I think the tradeoffs are the right tradeoffs.

FWIW, traditional threads have their own tradeoffs (especially around IO). In JS that’s mostly a non-issue, so the "I need 1000s of threads" case just doesn’t come up very often.

reply
I think the isolation and memory safety guarantees that worker threads (or Web Workers) provide are very welcome. The friction mainly comes from ergonomics, as pointed out in the article. So there’s definitely room for improvement there (even within the current constraints).

A worker thread or Web Worker runs in its own isolate, so it needs to initialise itself by parsing and executing its entry point. I'm not quite sure whether this already happens, but you could imagine optimising it by caching or snapshotting the initial state of an isolate when multiple workers share the same entry point, so new workers can start faster.

That cannot be done with the original main thread isolate because usually the worker environment has both different capabilities than the main isolate and a different entry point.

If I had to handle 1000 files in a small CLI, I would probably just use Node.js’s asynchronous IO on a single thread and let it handle the platform specifics for me. You’ll get very good throughput without having to manage threads yourself.

reply