Issue
While my original problem was solved in a different manner (see the comment thread under this question, as well as the edits to this question), I was able to create a queue (FIFO) for GNU Parallel in Bash. So I have edited my background/question to reflect a situation where it could be needed.
Background
I am using GNU Parallel to process files with a Bash script. As the files are processed, more files are created and new commands need to be added to parallel's list. I am not able to give parallel a complete list of commands, as information is generated as the initial files are processed.
I need a way to add the lines to parallel's list while it is running.
Parallel will also need to wait for a new line if nothing is in the queue and exit once the queue is finished.
Solution
First I created a fifo:
mkfifo /tmp/fifo
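If the setup is run more than once, a plain mkfifo fails because the path already exists. A minimal sketch that creates the FIFO only when it is missing (the FIFO variable is a hypothetical name; /tmp/fifo is the path used above):

```shell
#!/bin/bash
# FIFO is a hypothetical variable name; it defaults to the path used in this post.
FIFO="${FIFO:-/tmp/fifo}"

# -p is true only if the path exists and is a named pipe,
# so mkfifo runs only when the FIFO is not there yet.
[ -p "$FIFO" ] || mkfifo "$FIFO"
```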
Next I created a Bash script that cats the FIFO in a loop and pipes the output to parallel, which stops reading when it sees the end_of_file line. (I wrote this with help from the accepted answer as well as from here)
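#!/bin/bash
# cat exits whenever a writer closes the FIFO, so reopen it in a loop
# to keep a continuous stream of lines flowing into parallel.
while true
do
    cat /tmp/fifo
done | parallel --ungroup --gnu --eof "end_of_file" "{}"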
Then I write to the pipe with this command, adding lines to parallel's queue:
echo "command here" > /tmp/fifo
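A producer script might wrap that write in small helper functions. This is only a sketch: the names enqueue, close_queue and QUEUE_FIFO are hypothetical, and only /tmp/fifo and the end_of_file marker come from the setup above. Note that a write to a FIFO blocks until some reader has it open, so the reader script should already be running.

```shell
#!/bin/bash
# QUEUE_FIFO is a hypothetical variable; it defaults to the FIFO used in this post.
QUEUE_FIFO="${QUEUE_FIFO:-/tmp/fifo}"

# Add one command line to parallel's queue.
# Blocks until a reader (the cat loop) has the FIFO open.
enqueue() {
    printf '%s\n' "$*" > "$QUEUE_FIFO"
}

# Send the marker that parallel was started with (--eof "end_of_file"),
# telling it that no more jobs are coming.
close_queue() {
    printf '%s\n' "end_of_file" > "$QUEUE_FIFO"
}
```

A job that creates new files could then call enqueue "process_file new_output.dat" for each one (process_file being a placeholder command), and whichever producer knows the work is complete would call close_queue.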
With this setup, every new command is added to the queue. Parallel does not begin processing until the queue holds as many jobs as there are job slots. This means that if you have slots for 32 jobs (32 processors), you need to add 32 jobs before any of them start.
If all of parallel's job slots are busy, it holds a new job until a slot becomes available.
With the --ungroup argument, parallel prints the output of jobs immediately as they run, once the queue has filled and jobs have started.
Without the --ungroup argument, parallel holds back a job's output until more jobs have been started. From the accepted answer:
Output from the running or completed jobs is held back and will only be printed when JobSlots more jobs have been started (unless you use --ungroup or -u, in which case the output from the jobs is printed immediately). E.g. if you have 10 jobslots then the output from the first completed job will only be printed when job 11 has started, and the output of the second completed job will only be printed when job 12 has started.
Solution
There is a small issue when using GNU parallel as a queue system/batch manager: you have to submit JobSlot number of jobs before they will start; after that you can submit one at a time, and a job will start immediately if free slots are available. Output from the running or completed jobs is held back and will only be printed when JobSlots more jobs have been started (unless you use --ungroup or -u, in which case the output from the jobs is printed immediately). E.g. if you have 10 jobslots then the output from the first completed job will only be printed when job 11 has started, and the output of the second completed job will only be printed when job 12 has started.
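The GNU Parallel documentation illustrates this queue pattern with a regular file followed by tail -f rather than a named pipe. A rough sketch of that variant follows; JOBQUEUE, start_consumer and submit are hypothetical names, and --ungroup is included only to match the behavior described above.

```shell
#!/bin/bash
# JOBQUEUE is a hypothetical path for the queue file.
JOBQUEUE="${JOBQUEUE:-/tmp/jobqueue}"

# Start with an empty queue file.
true > "$JOBQUEUE"

# Consumer: follow the file from its first line and hand every
# line to parallel as a command. Runs until interrupted.
start_consumer() {
    tail -n+0 -f "$JOBQUEUE" | parallel --ungroup
}

# Producer: append one command line to the queue file.
# Unlike a write to a FIFO, this never blocks waiting for a reader.
submit() {
    printf '%s\n' "$*" >> "$JOBQUEUE"
}
```

The consumer would be started once with start_consumer, after which any process can call submit "my_command my_arg" at any time (my_command being a placeholder).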
Answered By - Ole Tange