Singularity
Singularity is a container platform. It allows you to create and run containers that package up pieces of software in a way that is portable and reproducible. You can build a container using Singularity on your laptop, and then run it on many of the largest HPC clusters in the world, local university or company clusters, a single server, in the cloud, or on a workstation down the hall.
Singularity Image Format file
Singularity uses Singularity Image Format (.sif
) files to run containers. sif
files can be built through two processes:
- Downloading pre-built images
- Building images from scratch
Building images from scratch require root access and since we do not have root access at NINA we need to rely on downloading pre-built images to build our .sif
files.
Downloading pre-built images as .sif
files
It is possible to download images from the public image registries using Singularity using singularity pull
. For instance, you can pull the latest official python image from docker hub on saga using the command:
singularity pull docker://python
The docker://
uri is used to reference Docker images served from a registry. In this case pull
does not just download an image file. Docker images are stored in layers, so pull
combines those layers into a usable Singularity file.
I make sure that the images has been pulled:
[bencretois@login-1.SAGA ~]$ ls
python_latest.sif
Downloading your own custom image
In most case you will want to use an image that you built so that the depencies required to run your custom software are already specified and installed. Since we do not have root access at NINA we follow the following workflow:
- Specify a
Dockerfile
- Build the docker image
- Push the docker image in a registry
- Pull the image as a
.sif
file from a HPC cluster
Below we describe and provide an example for each step.
1. Specify a Dockerfile
We first write a Dockerfile
containing all we need to run our software. In this case, we want to train a cat and dog picture classifier and we need to write the Dockerfile
accordingly.
FROM python:3.8
ARG DEBIAN_FRONTEND=noninteractive
RUN pip3 install poetry
WORKDIR /app
COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.create false
RUN poetry install --no-root
COPY . ./
ENV PYTHONPATH "${PYTHONPATH}:/app/"
In this Dockerfile
we first use the official image for python.3.8.
Then we install poetry (which is the python package manager) using pip
. We recommand poetry over anaconda as we ran into some problems running anaconda with Docker.
We set the working directory of the container as /app
We copy both pyproject.toml
and poetry.lock
in the container so that poetry
knows which packages to install
We install the packages necessary to run our machine learning experiment
And finally we specify the PYTHONPATH
so that scripts that are in certain folders can read the scripts which are stored in other folders.
2. Build the docker image
Building the docker image implies that docker
is installed on your system. At NINA it is possible to install Docker Desktop to use Docker on the remote server. Please contact Datahjelp which can assist you setting up either Docker Desktop or the VDI .
Once you have access to docker
you can build your custom image using the command:
docker build -t ml_image -f Dockerfile .
-t
stands for "tag", which is the name you want to give to the image.
-f
stands for "file" and takes the Dockerfile
as input.
3. Push the docker image in a registry
It is possible to push your custom image directly in the GitLab or GitHub registry.
Pushing the image on GitLab registry
Pushing the image on the Gitlab registry requires less manual configuration and we give an example on how to do it below.
Provided that you have a GitLab account and a GitLab project (in our example the project is called ml_image
) for your specific task, the image should be rename as:
registry.gitlab.com/nina-data/ml_image:latest
We rename the image to provide an url to the registry where the image should be stored. Once the image has been pushed, you can check GitLab project -> Container registry
Pushing the image on GitHub registry
In your project repository create a folder .github/workflows
and create a file publish_image.yml
containing the following code:
name: Create and publish a Docker image
on:
push:
branches: main
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
build-and-push-image:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
steps:
- name: Checkout repository
uses: actions/checkout@v3
- name: Log in to the Container registry
uses: docker/login-action@f054a8b539a109f9f41c372932f1ae047eff08c9
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata (tags, labels) for Docker
id: meta
uses: docker/metadata-action@98669ae865ea3cffbcbaa878cf57c20bbf1c6c38
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
- name: Build and push Docker image
uses: docker/build-push-action@ad44023a93711e3deb337508980b4b5e9bcdc5dc
with:
context: .
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
Once the folder and file have been created simply push the .github
to the project's GitHub repository and GitHub Actions will take care of building and hosting the image.
Note that you will find the image in Packages
(right sidebar of GitHub) and that you will need to make your image public before being able to pull it on Sigma2.
4. Pull the image as a .sif
file from a HPC cluster
To pull the image that is stored on Gitlab, Docker hub or GitHub registry simply run:
singularity pull docker://registry/name_of_your_image
For instance, to pull the image from this repository (which is hosted on GitHub):
singularity pull docker://ghcr.io/ninanor/91126800_ml_and_associated_tech:main