Issue
I'm trying to reproduce the results of an older research paper and need to run a Singularity container with NVIDIA CUDA 9.0 and PyTorch 1.2.0.
Locally I have an Ubuntu 20.04 VM where I run singularity build. I followed the guide for installing older CUDA versions.
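Concretely, the build command I run is along these lines (the recipe file name image.def is a placeholder):

sudo singularity build image.simg image.def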
This is the recipe file:
#header
Bootstrap: docker
From: nvidia/cuda:9.0-runtime-ubuntu16.04

#Sections
%files
    /home/timaie/rkn_tcml/cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
    /home/timaie/rkn_tcml/RKN/*

%post
    # necessary dependencies
    pip install numpy scipy scikit-learn biopython pandas
    dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
    apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
    apt-get autoclean
    apt-get autoremove
    apt-get update
    export CUDA_HOME="/usr/local/cuda-9.0"
    export TORCH_EXTENSIONS_DIR="$PWD/tmp"
    export PYTHONPATH=$PWD:$PYTHONPATH

%runscript
    cd experiments
    python train_scop.py --pooling max --embedding blosum62 --kmer-size 14 --alternating --sigma 0.4 --tfid 0
The build runs fine and gets me an image.simg file. Then I try installing CUDA through sudo singularity exec image.simg apt-get install cuda, which produces the following error:
0 upgraded, 823 newly installed, 0 to remove and 1 not upgraded.
Need to get 2661 MB of archives.
After this operation, 6822 MB of additional disk space will be used.
W: Not using locking for read only lock file /var/lib/dpkg/lock-frontend
W: Not using locking for read only lock file /var/lib/dpkg/lock
W: chown to _apt:root of directory /var/cache/apt/archives/partial failed - SetupAPTPartialDirectory (30: Read-only file system)
W: chmod 0700 of directory /var/cache/apt/archives/partial failed - SetupAPTPartialDirectory (30: Read-only file system)
W: Not using locking for read only lock file /var/cache/apt/archives/lock
E: You don't have enough free space in /var/cache/apt/archives/.
I read about a similar issue in Docker here, but I don't know of anything similar to docker system prune for Singularity.
I also tried freeing space through apt autoremove and apt autoclean, without success.
There should be enough space left on disk, as running df -H gives:
Filesystem Size Used Avail Use% Mounted on
udev 2,1G 0 2,1G 0% /dev
tmpfs 412M 1,4M 411M 1% /run
/dev/sda5 54G 19G 33G 36% /
tmpfs 2,1G 0 2,1G 0% /dev/shm
tmpfs 5,3M 4,1k 5,3M 1% /run/lock
tmpfs 2,1G 0 2,1G 0% /sys/fs/cgroup
/dev/loop0 132k 132k 0 100% /snap/bare/5
/dev/loop1 66M 66M 0 100% /snap/core20/1328
/dev/loop2 261M 261M 0 100% /snap/gnome-3-38-2004/99
/dev/loop3 66M 66M 0 100% /snap/core20/1405
/dev/loop4 69M 69M 0 100% /snap/gtk-common-themes/1519
/dev/loop5 46M 46M 0 100% /snap/snapd/15177
/dev/loop6 57M 57M 0 100% /snap/snap-store/558
/dev/loop7 46M 46M 0 100% /snap/snapd/14978
/dev/sda1 536M 4,1k 536M 1% /boot/efi
tmpfs 412M 25k 412M 1% /run/user/1000
Does anyone know whether the problem lies with my local Ubuntu setup or with the NVIDIA Docker base image?
Thanks for any clarification.
Solution
As described in the overview section of the singularity build documentation, build can produce containers in two different formats, which can be specified as follows:
- a compressed read-only Singularity Image File (SIF) format suitable for production (the default)
- a writable (ch)root directory called a sandbox, for interactive development (the --sandbox option)
Adding --sandbox should make the system files writable, which should resolve your issue.
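For example, a minimal sketch (the recipe and sandbox directory names are placeholders):

# build a writable sandbox directory instead of a read-only SIF image
sudo singularity build --sandbox image_sandbox/ image.def
# run apt-get against the sandbox with a writable filesystem
sudo singularity exec --writable image_sandbox/ apt-get install cuda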
Ideally, though, I'd suggest adding any apt-get install commands to the %post section of your recipe file, so that they run at build time.
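A rough sketch of that change, using the same package you tried to install interactively:

%post
    # runs at build time, while the container filesystem is still writable
    apt-get update
    apt-get install -y cuda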
Answered by Emil Vatai