Issue
I am trying to create a small job scheduler for submitting Python jobs, so I am using Bash to record the job's PID and run a while loop that tests whether the process has completed, returning a completed status once it has indeed finished.
I created a dummy Python script called dummy_python.py:
import time
for i in range(10):
    print("hello world")
    time.sleep(2)
And the Bash script to submit it, record its PID, and report success after it completes:
#! /bin/bash
job=`python dummy_python.py | ps aux | grep dummy_python.py | awk '{print $2}'`
echo "$job"
echo "starting"
while true
do
sleep 1
top | grep $job > /dev/null
if [[ $? -ne 0 ]];then
echo "job completed"
break;
fi
done
So I am starting the Python code, taking its PID and storing it in the variable job, and then checking whether the job has completed by running top | grep ... The Python job does get submitted, but I get:
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
14025
14027
starting
grep: 14027: No such file or directory
job completed
I don't see hello world being printed, and I get two PIDs in the terminal after it completes. What is happening and how can it be fixed?
Solution
I am starting the Python code, taking its PID and storing it in the variable job, and then checking whether the job has completed by performing top | grep ...
That may be what you intend, but it's not what you're doing.
job=`python dummy_python.py | ps aux | grep dummy_python.py | awk '{print $2}'`
The command inside the backticks (` characters) is a pipeline; it connects the output of one program to the input of the next. Python's output goes to ps aux, ps's output goes to grep, and grep's output goes to awk.
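A toy pipeline, just as an illustration (not part of your script):
printf 'banana\napple\n' | sort | head -n 1    # prints "apple": printf feeds sort, sort feeds head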
Backticks `...`, or alternatively the more universal and preferred form $( ... ), are called command substitution: the command inside will be executed, and its position in the script will be replaced by that command's output. This implies that the command must run to completion before the substitution can be made.
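For example, a minimal sketch of command substitution:
now=$(date)               # date must finish before the assignment happens
echo "It is now $now"     # $now holds whatever date printed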
So what you're doing in that line is executing your Python script, sending its output to ps aux, sending ps's output to grep, sending grep's output to awk, and then, once the pipeline finishes executing, storing awk's output in job.
ps aux | grep dummy_python.py | awk '{print $2}'
is a logical construction: list processes, grep for the process matching the regex dummy_python.py, and then print the second column. But ...
I don't see hello world being printed
python dummy_python.py | ps aux
doesn't really make sense; ps does not read from standard input, which is why Python complains BrokenPipeError: [Errno 32] Broken pipe. Python is trying to write hello world to its standard output, which is connected to ps's standard input, but ps doesn't read from it. When ps ends, the pipe is broken.
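You can reproduce the same warning with a timing-dependent sketch (assuming python on your PATH is Python 3): true exits immediately, so by the time Python flushes its buffered output the pipe has no reader left.
python -c 'import time; time.sleep(1); print("hello")' | true
This typically prints an "Exception ignored ... BrokenPipeError: [Errno 32] Broken pipe" warning, just like your script did.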
As described in the comments, top is good only for interactively monitoring the most active processes on the computer. ps is the correct choice when you want to use the list of running processes in a script.
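If you did want to poll from a script, a sketch could use ps -p, which selects by process ID and exits nonzero when no matching process exists (this assumes $job holds a single valid PID):
while ps -p "$job" > /dev/null; do
    sleep 1
done
echo "job completed"
Now, back to your check: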
top | grep $job > /dev/null
Remember, job is the output of ps aux | grep dummy_python.py | awk '{print $2}'. Even though your Python script's output is discarded, that pipeline otherwise does its job: it lists processes, selects the lines containing dummy_python.py, and prints the process ID. So $job becomes the process ID of the Python script as it was running. Recall that the script must end before the command substitution's result can be assigned to job.
I get t[w]o PIDs in the terminal after its completing.
Because you're running two commands that contain the string dummy_python.py: python dummy_python.py and grep dummy_python.py.
top | grep $job > /dev/null
$job is replaced by 14025 14027. It is not in quotes, so it becomes two arguments to grep. If you look at the way grep works in its man page:
grep [OPTION...] PATTERNS [FILE...]
14025 becomes the PATTERNS it searches for, and 14027 is interpreted as a FILE. That's why you get
grep: 14027: No such file or directory
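You can reproduce the effect in isolation: the unquoted expansion is split on whitespace into two separate arguments (the PIDs here are just illustrative):
pids="14025
14027"
grep $pids /dev/null    # unquoted: pattern 14025, files /dev/null and 14027 -> grep: 14027: No such file or directory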
how can it be fixed?
First, you should understand that shell scripting is not easy. It's actually quite difficult to do correctly, and there are many pitfalls. python can do everything bash can do; stick to higher-level languages if you don't want to learn shell. If you want to script in bash, you really must learn how it works; its syntax is comparatively arcane and you cannot guess your way through it.
Unlike Python, bash will not automatically abort the script on error; it will continue to execute even after a command ends with a nonzero exit status. You need to invoke bash -e or use set -e if you want errors to abort the script.
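A minimal sketch of the difference:
#!/bin/bash
set -e                  # abort as soon as any command exits nonzero
false                   # exits with status 1 ...
echo "never reached"    # ... so this line is never executed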
I get t[w]o PIDs in the terminal
Because the grep pattern matches itself.
The easiest solution for this particular problem is to make the grep command not match its own pattern; one way is to turn the pattern into an equivalent pattern that doesn't match itself (e.g. [d]ummy_python.py matches dummy_python.py but does not match itself).
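Applied to the lookup, that might look like this (just the ps pipeline; the Python script should not be piped into it):
job=$(ps aux | grep '[d]ummy_python.py' | awk '{print $2}')    # matches dummy_python.py but not the grep command itself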
But you don't need to do any of the stuff you are doing in the script. You need to:
- run the script in the background
- wait until the script completes
- print job completed.
Bash has built-ins to do all of this. You don't need to search for the PID; $! provides it. You don't have to poll for the script to finish; the bash built-in wait handles that more succinctly and efficiently. You do need to run python in the background if you want the script to proceed before it ends; & instead of ; or a newline achieves that through bash's built-in job control.
#!/bin/bash -e
python -c 'from time import sleep; print("hello world"); sleep(2); print("goodbye world")' &   # & runs the job in the background
job=$!                          # $! is the PID of the most recent background job
echo "job started, pid $job"
wait $job                       # block until that job exits
echo "job completed"
Answered By - erik258