GNU Parallel

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.

https://www.gnu.org/software/parallel/

To install on your own macOS desktop or laptop, use either MacPorts or Homebrew (I know MacPorts works since I use it on my MacBook):

sudo port install parallel
brew install parallel

Usage

Add module

module load parallel/20170322

Example Usage

parallel echo ::: 1 2 3 ::: 4 5 6
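Each `:::` introduces a separate input source. With two sources, GNU Parallel runs the command once for every combination of values and, because the command contains no replacement string, appends the values as arguments. The sketch below adds `-k` (`--keep-order`), which is not in the example above, so that the output appears in input order rather than in the order the jobs happen to finish.

```shell
# Nine jobs: every value from the first source paired with every value
# from the second. -k prints the results in input order.
parallel -k echo ::: 1 2 3 ::: 4 5 6
# Prints nine lines, from "1 4" through "3 6".
```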

Purpose

GNU Parallel is a tool for running a series of jobs, mostly serial jobs, in parallel. It is best suited to running many serial jobs whose run times may differ. For example, the simplest way to run a series of serial jobs in parallel on n CPUs within one submit script is as follows:

./job_1 &
./job_2 &
...
./job_n &
wait
./job_(n+1) &
./job_(n+2) &
...
./job_2n &
wait
./job_(2n+1) &
./job_(2n+2) &
...
./job_3n &
wait

This works most efficiently when all jobs have the same or almost the same run time. If run times are unequal, each batch of n jobs takes as long as its slowest job: CPUs whose jobs finish early sit idle until the whole batch completes and the next n jobs can start. The result is idle time and inefficient use of the allocated CPUs.
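The idle time can be seen in a small, self-contained sketch of the batch-and-wait pattern. Everything in it is made up for illustration: `run_job` is a hypothetical stand-in that sleeps instead of doing real work, and the run times (3, 1, 1 and 1 seconds) and `n=2` are arbitrary.

```shell
#!/bin/sh
# Hypothetical stand-in for ./job_i: each "job" just sleeps for its run time.
run_job() { sleep "$1"; }

n=2        # jobs per batch, i.e. CPUs available (assumed)
count=0
start=$(date +%s)

# Made-up, unequal run times in seconds.
for d in 3 1 1 1; do
    run_job "$d" &
    count=$((count + 1))
    # After every n launches, block until the whole batch has finished.
    if [ $((count % n)) -eq 0 ]; then
        wait
    fi
done
wait       # catch a final, partially filled batch

end=$(date +%s)
batch_elapsed=$((end - start))
echo "batched run took about ${batch_elapsed}s"
```

The first batch lasts as long as its 3 s job, so the second CPU is idle for 2 s before the next batch may start. The whole run takes about 4 s, although the 6 s of total work would fit into 3 s on two CPUs.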

GNU Parallel solves this issue by first launching n jobs. As soon as one job completes, the next job in the sequence is started. This makes efficient use of CPU time by reducing the wait time and letting several short jobs run while other CPUs work on longer jobs.

parallel -j n ./job_{1} ::: $(seq 1 3n)
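The line above is a template in which n stands for a number. A concrete, runnable sketch might look as follows. The `sleep` commands are stand-ins for real jobs and the run times are made up. `-j` sets the maximum number of simultaneous jobs; without it, GNU Parallel defaults to one job per CPU core.

```shell
n=2    # assumed number of CPUs to keep busy

# Stand-in jobs (assumption): each "job" sleeps for the given number of seconds.
# {1} is replaced by the value from the first (here the only) input source.
parallel -j "$n" 'sleep {1}; echo "finished {1}s job"' ::: 3 1 1 1
```

With these run times, the three 1 s jobs run back to back on one CPU while the other works on the 3 s job, so the run takes roughly 3 s. Launching the same jobs in two batches of two with `wait` would take about 4 s (3 s for the first batch plus 1 s for the second).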

Single Node example using a LAMMPS benchmark run

The following example is run in an interactive session. However, you should be able to run it via a SLURM script as well.

  [2018-03-12 09:19.54] ~
  [alp514.sol](1002): interact -p test -n 36
  [2018-03-12 09:19.57] ~
  [alp514.sol-e601](1001): module load lammps
  [2018-03-12 09:20.01] ~
  [alp514.sol-e601](1002): module load parallel
  [2018-03-12 09:20.07] ~
  [alp514.sol-e601](1003): cd /share/Apps/examples/parallel/
  [2018-03-12 09:20.13] /share/Apps/examples/parallel
  [alp514.sol-e601](1004): time parallel 'srun -n 1 $(which lammps) -in in.lj -var n {1} -var x {2} -log log.lammps-{1}-{2} -sc none' ::: $(seq 5 5 100) ::: $(seq 1 6)
  
  real    4m8.378s
  user    0m1.391s
  sys     0m1.787s
  [2018-03-12 09:24.51] /share/Apps/examples/parallel
  [alp514.sol-e601](1005): time parallel 'srun -n 1 $(which lammps) -in in.lj -var n {1} -var x {2} -log log.lammps-{1}-{2} -sc none' ::: $(seq 100 -5 5) ::: $(seq 6 -1 1)
  
  real    3m47.091s
  user    0m1.391s
  sys     0m1.830s
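
The interactive session above could be turned into a SLURM batch script along the following lines. This is an untested sketch, not a verified submit script: the partition and task count are copied from the `interact -p test -n 36` line, while the time limit, the job name and the explicit `-j "$SLURM_NTASKS"` are assumptions added here.

```shell
#!/bin/bash
# Same partition and task count as the interactive session.
#SBATCH --partition=test
#SBATCH --ntasks=36
# Assumed time limit and hypothetical job name; adjust for your runs.
#SBATCH --time=00:30:00
#SBATCH --job-name=parallel-lammps

module load lammps
module load parallel

cd /share/Apps/examples/parallel/

# -j "$SLURM_NTASKS" (an addition, not used in the session above) caps the
# number of simultaneous jobs at the number of allocated tasks.
parallel -j "$SLURM_NTASKS" \
    'srun -n 1 $(which lammps) -in in.lj -var n {1} -var x {2} -log log.lammps-{1}-{2} -sc none' \
    ::: $(seq 5 5 100) ::: $(seq 1 6)
```

Submit the script with `sbatch`. Depending on the SLURM version and site configuration, `srun` may need additional options to keep job steps from sharing CPUs, so check the local documentation before relying on this.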