Issue
I'm new to working on Linux, so I apologize if this is a dumb question. Despite searching for more than a week, I have not been able to find a clear answer.
I'm running a very long Python program on Nvidia GPUs. The output is several CSV files. It takes a long time to compute the output, so I use nohup so that I can disconnect from the session while the process keeps running.
Let's say the main.py file is this:
import numpy as np
import pandas as pd

if __name__ == '__main__':
    a = np.arange(1, 1000)
    data = a * 2
    filename = 'results.csv'
    output = pd.DataFrame(data, columns=["Output"])
    output.to_csv(filename)
The calculation of data is more complicated than this, of course. I build a Docker container and run this program inside it. When I run python main.py directly for a smaller example, there is no problem: it writes the CSV files.
My question is this: when I run
nohup python main.py &
and then check what is going on with tail -f nohup.out inside the Docker container, I can see what the program is doing at that moment, but I cannot exit the tail and let the execution run its course; it just sits there. How can I safely exit the screen that tail -f nohup.out brings up?
I also tried not checking on the code at all and letting it run for two days before returning. The output of tail -f nohup.out indicated that the execution had finished, but the CSV files were nowhere to be seen. Are they somehow bundled up inside nohup.out, or does this indicate something else is wrong?
Solution
If you're going to run this setup in a Docker container, two things matter:
A Docker container runs only one process, as a foreground process; when that process exits, the container exits with it. That process is almost always the script or server you're trying to run, not an interactive shell.
You can, however, use Docker's own constructs to run the container itself in the background and collect its logs while it's running or after it completes.
A typical Dockerfile for a Python program like this might look like:
FROM python:3.10
# Create and use some directory; it can be anything, but do
# create _some_ directory.
WORKDIR /app
# Install Python dependencies as a separate step. Doing this first
# saves time if you repeat `docker build` without changing the
# requirements list.
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy in the rest of the application.
COPY . .
# Set the main container command to be the script.
CMD ["./main.py"]
The script should be executable (chmod +x main.py on your host) and begin with a "shebang" line (#!/usr/bin/env python3) so the system knows where to find the interpreter.
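With that in place, the top of main.py would look something like:
#!/usr/bin/env python3
import numpy as np
import pandas as pd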
You will hear recommendations to use both CMD and ENTRYPOINT for the final line. It doesn't matter much to your immediate question. I prefer CMD for two reasons: it's easier to launch an alternate command to debug your container (docker run --rm your-image ls -l vs. docker run --rm --entrypoint ls your-image -l), and there's a very useful pattern of using ENTRYPOINT to do some initial setup (creating environment variables dynamically, running database migrations, ...) and then launching the CMD.
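You don't need that wrapper pattern here, but to illustrate it, a minimal (hypothetical) entrypoint script just does its setup and then hands control to whatever command was passed in:
#!/bin/sh
# docker-entrypoint.sh (hypothetical name): do any one-time setup here,
# for example exporting variables or waiting for another service.
set -e

# Replace this shell with the container's main command (the CMD).
exec "$@"
In the Dockerfile you would then add ENTRYPOINT ["./docker-entrypoint.sh"] and keep CMD ["./main.py"]; Docker passes the CMD as arguments to the ENTRYPOINT.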
Having built the image, you can use the docker run -d option to launch it in the background, and then run docker logs to see what comes out of it.
# Build the image.
docker build -t long-python-program .
# Run it, in the background.
docker run -d --name run1 long-python-program
# Review its logs.
docker logs run1
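If you want the same "follow along" behavior you were getting from tail -f nohup.out, docker logs can stream the output as it is produced; pressing Ctrl+C there only detaches from the log stream, and the container keeps running:
# Follow the logs; Ctrl+C stops the log stream, not the container.
docker logs -f run1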
If you're running this to produce files that need to be read back from the host, you need to mount a host directory into your container at the time you start it. You need to make a couple of changes to do this successfully.
In your code, you need to write the results somewhere separate from your application code. You can't mount a host directory over the /app directory, since that would hide the code you're actually trying to run.
import os

# Let the output directory be overridden at run time; default to ./data.
data_dir = os.getenv('DATA_DIR', 'data')
filename = os.path.join(data_dir, 'results.csv')
Optionally, in your Dockerfile, create this directory and set a pointer to it. Since my sample code gets its location from an environment variable, you can again use any path you want.
# Create the data directory.
RUN mkdir /data
ENV DATA_DIR=/data
When you launch the container, the docker run -v option mounts filesystems into the container. For this sort of output file you're looking for a bind mount, which directly attaches a host directory to the container.
docker run -d --name run2 \
-v "$PWD/results:/data" \
long-python-program
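When this run finishes (you can check on it with docker logs run2 as before), the output file should appear in the results directory on the host:
# The host-side copy of the file the program wrote to /data/results.csv.
ls results/results.csv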
In this example so far we haven't set the USER for the program, so it will run as root. You can change the Dockerfile to set up an alternate USER (which is good practice); you do not need to chown anything except the /data directory to be owned by that user (leaving your code owned by root and not world-writable is also good practice). If you do this, then when you launch the container (on native Linux) you need to provide the numeric host user ID that can write to the host directory; you do not need to make any other changes in the Dockerfile.
docker run -d --name run2 \
-u $(id -u) \
-v "$PWD/results:/data" \
long-python-program
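The Dockerfile side of that might look like the following sketch; the user name appuser is arbitrary, and only the ownership of /data really matters here:
# Create a non-root user to run the program; the name is arbitrary.
RUN useradd --system --no-create-home appuser

# Create the data directory and let that user write to it.
RUN mkdir /data && chown appuser /data
ENV DATA_DIR=/data

# Switch to the non-root user for the main container command.
USER appuser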
Answered By - David Maze
Answer Checked By - Dawn Plyler (WPSolving Volunteer)