This page is written by Eric Verner.
Docker
Docker is the most popular containerization technology in use today. Docker offers OS-level virtualization, allowing developers to package libraries, code, and the operating system together in one image, which can be programmatically created on the fly. This means that software will run the same on any platform that can run Docker, enabling users to easily share code with others. Containerization has been impactful in the neuroimaging community, as neuroimaging pipelines often require many libraries that can be difficult to install on one’s computer. BIDS apps and Neurodocker are good examples of Docker images made for neuroimaging. Docker is also commonly used with container orchestration technologies such as docker-compose and Kubernetes.
Caveats
- Docker is only allowed on development nodes (currently trendscn017.rs.gsu.edu and trendsgndev101.rs.gsu.edu, but check Cluster_queue_information for updates). Because operations run as root inside a container, Docker is considered a security vulnerability and is often not allowed on high-performance computing (HPC) systems. To run a Docker container using Slurm, you must convert it to a Singularity image first. See the section below on building Singularity images to learn how to do this.
- Docker runs on Linux, Mac, and Windows Professional but requires additional work to install on Windows.
How to run a Docker image on the cluster
Docker images must be either pulled from another source, such as Docker Hub, or built locally (see next section). After the image has been built, you can find it by typing

docker image ls
To run a command inside a container, use a docker run command:

docker run -it myimage bash

This command opens an interactive bash session inside the container. Note that docker run creates a container, which can be thought of as a live instance of an image. The terms “container” and “image” are often used interchangeably, though.
If a Docker container has been started with a persistent process, such as a server, it can be accessed with a docker exec command.
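For example, assuming a container was started in the background with a name (the names mydb and postgres here are purely illustrative), a sketch of attaching to it might look like:

```shell
# Start a container in the background (-d) with a persistent process,
# giving it a name so we can refer to it later.
docker run -d --name mydb postgres

# Open an interactive bash session inside the running container.
docker exec -it mydb bash

# Or run a single command inside it without opening a shell.
docker exec mydb ls /var/lib/postgresql
```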
How to build a Docker image on the cluster
Docker images are built using Dockerfiles, which can be thought of as recipes. Please see this page for more information about how to use Dockerfiles. An example build command is

docker build . -t myimage:latest

This builds a Docker image from the Dockerfile in the current working directory and tags it with the name myimage:latest.
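As a concrete illustration, a minimal Dockerfile might look like the sketch below; the base image tag, package versions, and script name are assumptions chosen for the example, not requirements:

```dockerfile
# Start from a pinned base image so the build is reproducible.
FROM python:3.10-slim

# Install pinned dependencies (versions here are illustrative).
RUN pip install numpy==1.24.4 nibabel==5.1.0

# Copy the analysis code into the image.
COPY run_analysis.py /app/run_analysis.py
WORKDIR /app

# Default command when the container starts.
CMD ["python", "run_analysis.py"]
```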
How to share data with a Docker container
Docker uses volumes to share data between the host system and the container. To share a folder with a container, use a command like this (note that flags such as -it and -v must come before the image name):

docker run -it -v /folder/on/host:/folder/in/container myimage bash

To share multiple folders, use multiple -v flags:

docker run -it \
-v /first/folder/on/host:/first/folder/in/container \
-v /second/folder/on/host:/second/folder/in/container \
myimage bash
GPU usage
It is possible to use a host system’s GPU inside a Docker container. The Docker image must be built using an nvidia/cuda image as its base image. Note that the version of CUDA on the host should match that inside the image. Actually, different versions may be compatible, but the relationship is complicated, so it is easier to use the same versions when possible. After you have built your image, you can run a command like so
docker run --gpus all nvidia/cuda:10.2-base nvidia-smi
This command runs nvidia-smi in the nvidia/cuda:10.2-base image. The --gpus flag must be used to expose the host system’s GPUs inside the container, and --gpus all allows the container to use all the GPUs on the host system. For more information, see the official NVIDIA Docker GitHub page.
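A minimal sketch of a GPU-ready Dockerfile, assuming CUDA 10.2 on the host to match the base image (the tag and the pinned package below are illustrative):

```dockerfile
# Base image provides the CUDA 10.2 runtime; its CUDA version
# should match the driver/CUDA setup on the host.
FROM nvidia/cuda:10.2-base

# Install a framework built against the same CUDA version, pinned.
RUN apt-get update && apt-get install -y python3-pip && \
    pip3 install torch==1.8.1
```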
Dockerfile recipe examples
If you want to build from Dockerfile recipes on the cluster, you will need to be added to the docker group permissions via a Hydra ticket. Dockerfile recipes can be created in many different ways. However, one thing to consider when building a Docker recipe is whether or not it will still build successfully a year from now, which largely depends on whether the Dockerfile developer pins dependency version numbers in the build scripts wherever it is expedient to do so.
In other words, try to never use this:

pip install torch

Instead, at a minimum, pin the version:

pip install torch==version

However, you will quickly find that this is tedious. Luckily, if you are using a conda environment, an environment.yml file can pin everything for you.
One may export an environment file with
conda env export > environment.yml
One may in general create a conda environment from a yml file as follows:
conda env create -f environment.yml
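A pinned environment.yml produced this way looks roughly like the sketch below (the environment name, channels, and versions are illustrative):

```yaml
name: myenv
channels:
  - conda-forge
dependencies:
  - python=3.10.12
  - numpy=1.24.4
  - pip:
      - torch==2.0.1
```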
You can take advantage of this functionality inside a Dockerfile. The following website shows how to do this: https://pythonspeed.com/articles/activate-conda-dockerfile/ However, the above link will not give you GPU support. An example Dockerfile that is GPU-ready is available from neuroneural’s extension of TopoFit:
https://github.com/neuroneural/topofit/tree/main/docker
Inside this directory are a Dockerfile, an environment.yml file, and a Readme.md file with more information on how to build.
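The pattern from the linked article can be sketched roughly as follows; the environment name myenv and the script name run.py are assumptions for illustration (see the TopoFit docker directory above for a complete, GPU-ready version):

```dockerfile
FROM continuumio/miniconda3

# Recreate the pinned environment inside the image.
COPY environment.yml .
RUN conda env create -f environment.yml

# Make subsequent RUN commands execute inside the new environment.
SHELL ["conda", "run", "-n", "myenv", "/bin/bash", "-c"]

# Run the application inside the environment.
COPY run.py .
ENTRYPOINT ["conda", "run", "-n", "myenv", "python", "run.py"]
```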
For the basics of Docker recipes, the following readme is fairly good: https://biocorecrg.github.io/CoursesCRG_Containers_Nextflow_May_2021/docker-recipes.html
Additional in-house examples of Docker and Singularity files may be available at
https://github.com/neuroneural/trdops
and
https://github.com/trendscenter/coinstac-enigma-sans/blob/master/Dockerfile
Singularity
Singularity is a containerization technology that is friendly for HPC systems. Commands can be run inside Singularity as a non-root user. This has made it a popular choice at universities and within the neuroimaging community.
How to run a Singularity container on the cluster
You can interact with a Singularity image using the exec, run, or shell commands.
To open up a shell into a container, use singularity shell.
To run a Singularity container using its run script, use singularity run.
To run any other command inside a Singularity container, use singularity exec.
See this page for more details.
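For example, assuming an image file named myimage.sif (the file name and the commands run inside it are illustrative):

```shell
# Open an interactive shell inside the container.
singularity shell myimage.sif

# Execute the image's built-in run script.
singularity run myimage.sif

# Run an arbitrary command inside the container.
singularity exec myimage.sif python3 --version
```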
How to build a Singularity image on the cluster
Singularity images can be built from scratch using a definition file or can be built from existing Docker or Singularity images from a local or remote library, such as Singularity Hub or Docker Hub. They can also be built from locally cached Docker images. For more information, see this page.
For example, to pull the lolcow image from the Singularity Library, use this command:

singularity pull lolcow.sif library://sylabs-jms/testing/lolcow
To build from Docker Hub, use this command:
singularity build lolcow.sif docker://godlovedc/lolcow
If pulling from a private Docker Hub repository, you must enter your Docker credentials, either interactively or with environment variables (see here).
To build from a locally cached Docker image, use this command:
singularity build lolcow.sif docker-daemon://godlovedc/lolcow
Note that Docker and Singularity must be available on the same system for this to work.
Caveats
- Several users have had problems building Singularity images on the cluster. The current recommended approach is to build the image on the dev node (trendscn017) inside of the /data/singtmp/ directory. Also, use a local cache to build the container, where username is your username, as shown below. This will build the container inside of /data/singtmp/username.

ssh trendscn017.rs.gsu.edu
cd /data/singtmp
mkdir -p username/cache
cd username
export SINGULARITY_CACHEDIR=/data/singtmp/username/cache
module load SysTools/Singularity3.5.2
singularity pull ...
- Singularity only runs natively on Linux. Singularity Desktop can run on Mac OS, but it lacks some features such as building from Docker Hub. Singularity can be run using a Linux VM on Mac or Windows, though (see here).
How to share data with a Singularity container
Singularity lets you share folders from the host system with the container using the --bind flag. In the example below, the /opt folder on the host is bound to the same folder inside the container, and the /data folder on the host system is bound to the /mnt folder in the container called my_container.sif.

singularity shell --bind /opt,/data:/mnt my_container.sif
This also works for the run and exec commands. Any changes made to bound folders from inside the container will persist on the host system after the command completes, so be careful.
Singularity also mounts your home folder by default, which may lead to surprising results. Use the --containall flag to stop this from happening. For example:

singularity shell --bind /opt,/data:/mnt --containall my_container.sif

See this page for more details.
GPU usage
Singularity containers can utilize GPUs on the host system. The --nv flag must be used with a container that has the NVIDIA runtime installed, as shown below.

singularity pull docker://tensorflow/tensorflow:latest-gpu
singularity run --nv tensorflow_latest-gpu.sif

See this page for more details on GPU support.
Singularity recipes
Currently, TReNDS doesn’t support local builds from Singularity recipe files. You will get a fatal permission error if you try:

$ singularity build toposing.sif Singularity
FATAL: You must be the root user, however you can use --remote or --fakeroot to build from a Singularity recipe file

Remote builds show promise, but they may have some limitations, so this section is incomplete. A working alternative is to write recipes as Dockerfiles, build them with Docker, and then build Singularity images from the resulting Docker images, as described in the sections above.
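For reference, a Singularity definition file has the following general shape. This sketch is purely illustrative (the base image and commands are assumptions), and building it locally on the cluster will hit the permission error above unless --remote or --fakeroot is available:

```singularity
Bootstrap: docker
From: ubuntu:20.04

%post
    # Commands run at build time, inside the image.
    apt-get update && apt-get install -y python3

%environment
    # Environment variables set at container runtime.
    export LC_ALL=C

%runscript
    # Executed by "singularity run".
    python3 --version
```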