So, there are a few reasons why forkrun might work better than this, depending on the situation:

1. What you want to run is built to be called from a shell (including multi-step shell functions) rather than from Go. This is the main appeal of forkrun in my opinion: extreme performance without needing to rewrite anything.
2. You are running on NUMA hardware. Forkrun deals with NUMA hardware remarkably well: it distributes work between nodes almost perfectly, with almost zero cross-node traffic.

by cbsmith2 hours ago|

AFAIK, the Go runtime is pretty NUMA-oblivious. The mcache helps a bit with locality of small allocations, but otherwise you aren't going to get the same benefits (though I absolutely hear you about avoiding execve overhead).

by jkool7021 hours ago|

So...yes, the execve overhead is real. BUT there's still a lot you can accomplish with pure bash builtins (which don't have the execve overhead). And, if you're open to rewriting things (which would probably be required to some extent if you were to make something intended for shell to run in Go) you can port whatever you need to run into a bash builtin and bypass the execve overhead that way. In fact, doing that is EXACTLY what forkrun does, and is a big part of why it is so fast.
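To make the builtin-vs-execve point concrete, here is a minimal sketch (not forkrun's actual code) that does the same small task two ways: once through an external command, and once with bash parameter expansion only. The function names and the example path are made up for illustration.

```shell
#!/usr/bin/env bash
# Two ways to strip the directory part from a path.
# All names here (strip_dir_*, bench, the example path) are hypothetical.

# External command: every call forks and then execs the basename binary.
strip_dir_external() {
  basename -- "$1"
}

# Builtins only: parameter expansion runs inside the current bash process,
# so there is no execve at all.
strip_dir_builtin() {
  printf '%s\n' "${1##*/}"
}

# Call the given function n times on the same path, discarding the output.
bench() {
  local fn=$1 n=$2 i
  for (( i = 0; i < n; i++ )); do
    "$fn" /tmp/example/file.txt >/dev/null
  done
}

# Usage (timings depend on the machine, so none are quoted here):
#   time bench strip_dir_external 2000
#   time bench strip_dir_builtin  2000
```

Both functions print the same result; the only difference is whether a new process image gets loaded per call, which is the overhead being discussed above.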

by beanjuiceII5 hours ago|

dang and u did all that without a 10 year journey

by jkool7021 hours ago|

10 years represents going from

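    maxJobs=$(nproc)                     # number of available CPUs
    while read -r nn; do
      code_to_parallelize "$nn" &        # run each input in the background
      # at the job limit: block until any one job finishes
      (( $(jobs -rp | wc -l) >= maxJobs )) && wait -n
    done < inputs
    wait                                 # wait for the remaining jobs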

to a NUMA-Aware Contention-Free Dynamically-Auto-Tuning Bash-Native Streaming Parallelization Engine. I dare say 10 years is about the norm for going from "beginner" to "PhD-level" work.