Issue
I am pretty new to building containers with Docker. I commonly use conda environments for my day-to-day work, but this time I need to work with a computation server that only allows running Docker containers. I want to build an image that will let me run my PyTorch code. What I prepared is the following, a fairly common Dockerfile for deep learning applications:
FROM nvidia/cuda:12.2.0-devel-ubuntu20.04
CMD ["bash"]
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV SHELL=/bin/bash
RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
&& apt-get -y install --no-install-recommends \
git \
wget \
cmake \
ninja-build \
build-essential \
python3 \
python3-dev \
python3-pip \
python3-venv \
python-is-python3 \
&& apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/*
RUN apt-get install sqlite3
ENV VIRTUAL_ENV=/opt/python3/venv/base
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
RUN python3 -m pip install --upgrade pip
RUN pip install jupyterlab
RUN python3 -m pip install pandas
RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
COPY entry_point.sh /entry_point.sh
RUN chmod +x /entry_point.sh
# Set entrypoint to bash
ENTRYPOINT ["/entry_point.sh"]
When building the container, I get the following error:
E: Unable to locate package sqlite3
When I remove the sqlite3 installation line, the image builds, but when I run the corresponding container and try to install sqlite3 again with the same command from the CLI, I get the same error. I am using "nvidia/cuda:12.2.0-devel-ubuntu20.04" as the base image, which is supposed to provide an Ubuntu 20.04 environment along with CUDA, yet the apt package manager cannot seem to find a very common tool like SQLite. I also make an apt-get update call right at the beginning of the Dockerfile. I am not seeing what is missing here, unfortunately. Should I use another base image, perhaps?
Solution
In your Dockerfile, you have:
RUN apt-get update \
&& ...
&& apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/*
RUN apt-get install sqlite3
The end of the first RUN command cleans up all of APT's state, so when the second RUN apt-get install happens, APT has forgotten that any packages exist.
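You can see this for yourself by running a container from the image (built without the sqlite3 line) and poking at APT's state; the image tag my-pytorch-image below is just a placeholder for whatever you tagged your build as:
docker run --rm -it my-pytorch-image bash
# inside the container:
ls /var/lib/apt/lists/          # empty - the first RUN deleted the package index
apt-get install -y sqlite3      # fails: "E: Unable to locate package sqlite3"
apt-get update                  # re-downloads the package lists
apt-get install -y sqlite3      # now works, but only in this container, not in the image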
In the first RUN command, you have a list of packages you install. Add sqlite3 to that list.
RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
&& apt-get -y install --no-install-recommends \
...
python-is-python3 \
# <-- add sqlite3 into this list
sqlite3 \
&& apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/*
An incremental rebuild will be slower, since changing the first RUN invalidates its cached layer, but if you (or your CI system) build the image from scratch, it will be somewhat faster to invoke the APT/dpkg machinery only once.
If you do need it as a separate RUN line for whatever reason, you'll need to repeat the apt-get update inside the second RUN command. Copying the overall structure of the first RUN command would make sense, just using a different package list.
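For example, a minimal sketch of that separate line, mirroring the structure of the first one, could look like this:
# each RUN needs its own apt-get update, because the previous RUN wiped the lists
RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get -y install --no-install-recommends \
    sqlite3 \
 && apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/*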
Answered By - David Maze
Answer Checked By - Katrina (WPSolving Volunteer)