Issue
I am trying to use Slurm to run multiple commands in parallel on my cluster (single node). This is my situation: I have N commands to run and M physical cores available, with M < N.
Since every command requires one physical core, I would like at most M commands to run simultaneously.
The problem is that all N commands start at once when I run the sbatch command. I tried using the --ntasks parameter, but with no success. I am probably using the wrong Slurm parameters.
This is the file I am using:
############# file name: ./run_parallel_commands.sh #############
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=1G
./command-1 &
./command-2 &
# ...
./command-N &
wait
I submit it with:
$ sbatch ./run_parallel_commands.sh
Any suggestions? Thank you in advance.
Solution
You are almost there; the parameters you have should be fine. It's just the execution of the commands that needs work. You must launch each command with srun, so that Slurm runs it as a job step inside the allocation and can hold back steps when all the tasks are busy.
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=1G
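# --ntasks=1 runs a single copy per step; otherwise srun inherits --ntasks=4 from the job.
# --exclusive gives each step dedicated CPUs, so further steps wait until one is free.
srun --ntasks=1 --exclusive ./command-1 &
srun --ntasks=1 --exclusive ./command-2 &
# ...
srun --ntasks=1 --exclusive ./command-N &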
wait
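When N is large, writing one srun line per command gets tedious, and a loop can do the same job. The following is only a minimal sketch, not part of the original answer. It assumes the commands really are named ./command-1 through ./command-N as in the question, and N=10 is a placeholder value. The launch helper and its dry-run fallback are hypothetical names added for illustration; the fallback only prints what would be run when srun is not installed, for example when checking the script on a machine without Slurm.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=1G

# Placeholder: the question does not say how many commands there are.
N=${N:-10}

# Hypothetical helper. With srun available, start one single-task job step;
# Slurm holds the step back until one of the 4 tasks in the allocation is free.
# Without srun (assumed dry run outside the cluster), just print the step.
if command -v srun >/dev/null 2>&1; then
    launch() { srun --ntasks=1 --exclusive "$@"; }
else
    launch() { echo "would run: srun --ntasks=1 --exclusive $*"; }
fi

launched=0
i=1
while [ "$i" -le "$N" ]; do
    launch "./command-$i" &   # background the step so the loop keeps going
    launched=$((launched + 1))
    i=$((i + 1))
done

# Block until every background step has finished.
wait
```

All N steps are submitted immediately, but the intent is that only as many run at once as the allocation has tasks (4 here), which is the M < N behaviour the question asks for. Setting --ntasks to M in the header changes the cap.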
Answered By - tomgalpin Answer Checked By - Dawn Plyler (WPSolving Volunteer)