I have a bash loop that looks like:
```
for i in $(seq 0 $max); do
    my_command $i
done
```

and I would like to run this in parallel on n cores. I know that I could do
```
j=0
while [[ "$j" -le "$max" ]]; do
    for i in $(seq 1 $ncores); do
        [[ "$j" -gt "$max" ]] && break
        my_command $j &
        j=$((j + 1))
    done
    wait   # blocks until the whole batch of $ncores jobs finishes
done
```

but if my_command's runtime is linear in its argument, then I am wasting CPU cycles by waiting on the longest-running job in each batch. How can I continually dispatch new jobs so that $ncores jobs are running at any given time? Do I need to run an actual job scheduler like TORQUE locally on my machine to accomplish this, or can I do it with a simple bash script?
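For reference, a rolling dispatcher is possible in plain bash with no external tools, assuming bash 4.3+ for `wait -n`. This is a sketch; `max`, `ncores`, and the `my_command` function below are placeholder stand-ins for the question's:

```shell
#!/usr/bin/env bash
# Placeholder values standing in for the question's variables.
max=8
ncores=3
results=$(mktemp)
my_command() { sleep 0.1; echo "job $1 done" >> "$results"; }

for j in $(seq 0 "$max"); do
    # If $ncores jobs are already running, block until any one of them
    # exits, then immediately dispatch the next job in its place.
    while (( $(jobs -rp | wc -l) >= ncores )); do
        wait -n
    done
    my_command "$j" &
done
wait   # let the last few jobs finish
cat "$results"
```

Unlike the batched loop above, no slot ever sits idle waiting for the slowest job in a batch: `wait -n` returns as soon as any background job exits.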
2 Answers
Use GNU Parallel:

```
seq 0 $max | parallel my_command {}
```

Or use xargs:
```
seq 0 $max | xargs -n1 -P$ncores -I% my_command %
```

To see how it works:
```
seq 1 9 | shuf | xargs -n1 -P3 -I% sh -c 'echo start %; sleep %; echo stop %'
```