Issue
I have a list of configuration files:
cfg1.cfg
cfg2.cfg
cfg3.cfg
cfg4.cfg
cfg5.cfg
cfg6.cfg
cfg7.cfg
...
that serve as input for two scripts:
script1.sh
script2.sh
which I run sequentially as follows:
script1.sh cfgX.cfg && script2.sh cfgX.cfg
where X=1, 2, 3, ...
These scripts are not parallelised and take a long time to run. How can I launch them in parallel, let's say 4 at a time, so I do not kill the server where I run them?
For just one script I tried a brute force approach similar to:
export COUNTER_LIMIT=4
export COUNTER=1
for each in $(ls *.cfg)
do
    INSTRUCTION="./script1.sh $each "
    if (($COUNTER >= $COUNTER_LIMIT)) ;
    then
        $INSTRUCTION &&
        export COUNTER=$(($COUNTER-$COUNTER_LIMIT));
        echo
        sleep 600s
    else
        $INSTRUCTION &
        sleep 5s
    fi
    echo $COUNTER
    export COUNTER=$(($COUNTER+1));
done
(the sleeps are because for some reason the scripts cannot be initiated at the same time...)
So, how can I make sure that the double ampersands in
script1.sh cfgX.cfg && script2.sh cfgX.cfg
don't block the brute force parallelisation?
I also accept better and simpler approaches ;)
Cheers jorge
UPDATE
I should have mentioned that the config files are not necessarily sequentially named and can have any name, I just made them like this to make the example as simple as possible.
Solution
parallel --jobs 4 \
--load 50% \
--bar \
--eta "( echo 1st-for-{}; echo 2nd-for-{} )" < aListOfAdHocArguments.txt
0% 0:5=0s
1st-for-Abraca
2nd-for-Abraca
20% 1:4=0s
1st-for-Dabra
2nd-for-Dabra
40% 2:3=0s
1st-for-Hergot
2nd-for-Hergot
60% 3:2=0s
1st-for-Fagot
2nd-for-Fagot
80% 4:1=0s
100% 5:0=0s
Q: How can I launch them in parallel, let's say 4 at a time, so I do not kill the server where I run them?
A lovely task for GNU parallel.
First, let's check the localhost ecosystem (exosystems, i.e. executing parallel jobs on ssh-connected remote hosts, are possible, yet exceed the scope of this post):
parallel --number-of-cpus
parallel --number-of-cores
parallel --show-limits
For more configuration details beyond --jobs 4 (potentially --memfree or --noswap, --load <max-load> or --keep-order, and --results <aFile> or --output-as-files), see:
man parallel
parallel --jobs 4 \
--bar \
--eta "( script1.sh cfg{}.cfg && script2.sh cfg{}.cfg )" ::: {1..123}
Here, the two scripts are emulated by just a pair of tandem echo commands over down-counted indexes, so the progress bars are invisible and the Estimated-Time-of-Arrival (--eta) indications are almost instant:
parallel --jobs 4 \
--load 50% \
--bar \
--eta "( echo 1st-for-cfg-{}; echo 2nd-for-cfg-{} )" ::: {10..0}
0% 0:11=0s 7
1st-for-cfg-10
2nd-for-cfg-10
9% 1:10=0s 6
1st-for-cfg-9
2nd-for-cfg-9
18% 2:9=0s 5
1st-for-cfg-8
2nd-for-cfg-8
27% 3:8=0s 4
1st-for-cfg-7
2nd-for-cfg-7
36% 4:7=0s 3
1st-for-cfg-6
2nd-for-cfg-6
45% 5:6=0s 2
1st-for-cfg-5
2nd-for-cfg-5
54% 6:5=0s 1
1st-for-cfg-4
2nd-for-cfg-4
63% 7:4=0s 0
1st-for-cfg-3
2nd-for-cfg-3
72% 8:3=0s 0
1st-for-cfg-2
2nd-for-cfg-2
81% 9:2=0s 0
1st-for-cfg-1
2nd-for-cfg-1
90% 10:1=0s 0
1st-for-cfg-0
2nd-for-cfg-0
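One detail worth checking before swapping the echo stand-ins for the real scripts: the question chains the two scripts with &&, so script2.sh should run only when script1.sh succeeded, whereas a ; separator runs the second command regardless. GNU parallel hands the command line to a shell, so the difference can be tried in plain bash first. This is a minimal sketch; step1 and step2 are hypothetical stand-ins for script1.sh and script2.sh, with step1 made to fail on purpose.

```shell
#!/usr/bin/env bash
# step1/step2 are hypothetical stand-ins for script1.sh / script2.sh.
step1() { return 1; }            # simulate a failing script1.sh
step2() { echo "step2 ran"; }    # stands in for script2.sh

# ';' runs step2 even though step1 failed.
with_semicolon=$(step1; step2)

# '&&' skips step2 because step1 returned a non-zero status.
with_and=$(step1 && step2)

echo "with ';' : ${with_semicolon:-<nothing>}"   # with ';' : step2 ran
echo "with '&&': ${with_and:-<nothing>}"         # with '&&': <nothing>
```

The same holds inside the quoted command given to parallel: "script1.sh {} && script2.sh {}" keeps the question's run-only-on-success behaviour per config file, while the jobs for different config files still run side by side.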
Update
You added:
I should have mentioned that the config files are not necessarily sequentially named and can have any name, I just made them like this to make the example as simple as possible.
Feeding the argument list on standard input (< list_of_arguments) handles this after-the-fact change to the problem definition:
parallel [options] [command [arguments]] < list_of_arguments
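With arbitrarily named config files, the list can be generated from a glob and fed on standard input. The following is a minimal sketch rather than the exact command from this answer: the stub script1.sh / script2.sh, the sample file names, and the run_pairs helper are all hypothetical stand-ins so that it runs on its own, and the xargs -P branch is an assumed fallback for hosts where GNU parallel is not installed. It chains the two scripts with && as in the question.

```shell
#!/usr/bin/env bash
# Sketch only: stub scripts, sample names and run_pairs are hypothetical.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-ins for the real scripts: each appends one line to <config>.log.
cat > script1.sh <<'EOF'
#!/bin/sh
echo "script1 $1" >> "$1.log"
EOF
cat > script2.sh <<'EOF'
#!/bin/sh
echo "script2 $1" >> "$1.log"
EOF
chmod +x script1.sh script2.sh

# Config files with arbitrary, non-sequential names (one contains a space).
touch alpha.cfg "my settings.cfg" zz-42.cfg

# One file name per line; assumes no newlines inside file names.
printf '%s\n' *.cfg > list_of_arguments

# Read names on stdin and run "script1 && script2" for each, 4 pairs at a time.
run_pairs() {
  if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # GNU parallel shell-quotes {} itself, so spaces in names are safe.
    parallel --jobs 4 './script1.sh {} && ./script2.sh {}'
  else
    # Assumed fallback (not from the answer): xargs with 4 worker slots.
    tr '\n' '\0' | xargs -0 -P 4 -I{} sh -c './script1.sh "$1" && ./script2.sh "$1"' _ {}
  fi
}

run_pairs < list_of_arguments
cat -- *.log
```

Each .log file should show the script1 line before the script2 line for its own config file, while the order in which the config files are processed may vary between runs.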
Answered By - user3666197 Answer Checked By - Mary Flores (WPSolving Volunteer)