1. If what you want to run is built to be called from a shell (including multi-step shell functions) and not from Go. In my opinion this is the main appeal of forkrun: extreme performance without needing to rewrite anything.
2. If you are running on NUMA hardware. Forkrun handles NUMA hardware remarkably well: it distributes work across nodes almost perfectly, with almost zero cross-node traffic.
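maxJobs=$(nproc)    # number of available CPU cores
while read -r nn; do
    code_to_parallelize "$nn" &    # run each input as a background job
    # once maxJobs jobs are running, wait for any one of them to finish
    (( $(jobs -p | wc -l) >= maxJobs )) && wait -n
done < inputs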
to a NUMA-Aware Contention-Free Dynamically-Auto-Tuning Bash-Native Streaming Parallelization Engine. I dare say 10 years is about the norm for going from "beginner" to "PhD-level" work.
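To make point 1 concrete, here is a minimal sketch of that starting-point loop wrapped into a reusable helper that runs an ordinary multi-step shell function as-is. This is not forkrun's code or interface: `run_throttled` and `process_item` are hypothetical names made up for illustration, and it assumes bash 4.3 or newer (for `wait -n`).

```bash
#!/usr/bin/env bash
# Hypothetical multi-step worker; stands in for code_to_parallelize.
process_item() {
    local item=$1
    local upper=${item^^}                   # step 1: uppercase the input
    printf '%s:%d\n' "$upper" "${#item}"    # step 2: print it with its length
}

# run_throttled FUNC: read lines from stdin and run FUNC on each one in the
# background, keeping at most max_jobs jobs running at a time.
run_throttled() {
    local fn=$1 max_jobs line
    max_jobs=$(nproc 2>/dev/null || echo 4)   # fall back to 4 if nproc is missing
    while IFS= read -r line; do
        "$fn" "$line" </dev/null &            # the worker must not eat our stdin
        # jobs -pr lists the PIDs of running jobs only
        while (( $(jobs -pr | wc -l) >= max_jobs )); do
            wait -n                           # block until any one job exits
        done
    done
    wait                                      # drain whatever is still running
}
```

Usage would look like `run_throttled process_item < inputs`. Output order is not guaranteed, since jobs finish whenever they finish. The function is called directly with no rewrite, but every input still costs a fork plus a `jobs | wc` pipeline, which is the kind of per-item overhead a dedicated engine sets out to remove.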