Setting up the environment
Virtual environments versus containers
Two choices are offered to set up the environment: virtual environments and containers. We will demonstrate how we can train the cat / dog classifier using both methods.
Conda setup
WORK IN PROGRESS
Container setup
Alternatively, we can set up a container. Because we do not have root access on SIGMA2's HPC systems, we have to build a Singularity container. We do this by first creating a Docker image, which is specified in a Dockerfile. Let's analyse the Dockerfile of this case study line by line:
- First we base our container on the official python:3.8 image, which provides Python 3.8 and pip so we can use up-to-date Python libraries
FROM python:3.8
- Avoid the system asking questions / opening dialogs during apt-get install
ARG DEBIAN_FRONTEND=noninteractive
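- Refresh the apt-get package index, then delete the downloaded package lists to keep the image small. Note that this step does not install or upgrade any packages by itself.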
RUN \
apt-get update && \
rm -rf /var/lib/apt/lists/*
- Install the package manager poetry. Note that you can install your preferred package manager, such as conda, in this step.
RUN pip3 install poetry
- Set the working directory of the container
WORKDIR /app
- The next three lines are specific to poetry. We copy both pyproject.toml (the file listing the packages we use for our analysis) and poetry.lock (the file pinning the exact versions of all dependencies). Then we disable the creation of a virtual environment, so that python in our container can use all our packages without activating one. Finally we install our packages (the --no-root flag skips installing the project itself).
COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.create false
RUN poetry install --no-root
- Copy all the files from the folder where we build the image (the build context) into the working directory of the container
COPY . ./
- Add the working directory to the Python path. This way the main_scripts can access the scripts in the other folders (for instance the utils scripts).
ENV PYTHONPATH "${PYTHONPATH}:/app/"
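To see why the PYTHONPATH line matters, the following shell sketch recreates a miniature version of the layout outside of Docker. The folder names main_scripts and utils come from the text above; the file names train.py and helpers.py, the function greet and the temporary directory are hypothetical, and python3 is assumed to be available on the PATH.

```shell
# Recreate a tiny project layout in a temporary folder.
# train.py, helpers.py and greet() are hypothetical names used for illustration.
app_dir="$(mktemp -d)"
mkdir -p "$app_dir/main_scripts" "$app_dir/utils"
touch "$app_dir/utils/__init__.py"
printf 'def greet():\n    return "hello from utils"\n' > "$app_dir/utils/helpers.py"
printf 'from utils.helpers import greet\nprint(greet())\n' > "$app_dir/main_scripts/train.py"

# Without PYTHONPATH, Python puts only the script's own folder (main_scripts)
# at the front of its search path, so the import of utils fails.
if PYTHONPATH= python3 "$app_dir/main_scripts/train.py" >/dev/null 2>&1; then
    without_path="ok"
else
    without_path="import failed"
fi

# With the project root on PYTHONPATH (the role /app plays in the container),
# the script can import from the sibling utils folder.
with_path="$(PYTHONPATH="$app_dir" python3 "$app_dir/main_scripts/train.py")"

echo "without PYTHONPATH: $without_path"
echo "with PYTHONPATH:    $with_path"

# Clean up the temporary folder.
rm -rf "$app_dir"
```

The first run is expected to fail on the import, while the second should print the greeting returned by the hypothetical helper. Inside the container, /app takes the place of the temporary folder.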
2 - Creating the docker image
With the Dockerfile defined, we can now create our image (i.e. the "environment" in which we will run the training script). To do so, open a terminal, move to the folder where your Dockerfile is located and run the following command (change case_study_1 to the name of your folder):
docker build -t case_study_1:latest -f Dockerfile .
The command should produce output similar to the following (the image IDs will differ on your machine):
Sending build context to Docker daemon 244.7kB
Step 1/10 : FROM python:3.8
---> 271c1bcd4489
Step 2/10 : ARG DEBIAN_FRONTEND=noninteractive
---> Using cache
---> 0965e91032c6
Step 3/10 : RUN apt-get update && rm -rf /var/lib/apt/lists/*
---> Using cache
---> 02fa21122354
Step 4/10 : RUN pip3 install poetry
---> Using cache
---> 33bbd2c53863
Step 5/10 : WORKDIR /app
---> Using cache
---> 7720e687da9c
Step 6/10 : COPY pyproject.toml poetry.lock ./
---> Using cache
---> 345244b7ba43
Step 7/10 : RUN poetry config virtualenvs.create false
---> Using cache
---> 47271847855f
Step 8/10 : RUN poetry install --no-root
---> Using cache
---> 67e487ef8ae6
Step 9/10 : COPY . ./
---> c7656f1447fa
Step 10/10 : ENV PYTHONPATH "${PYTHONPATH}:/app/"
---> Running in 9f82e9b70fc5
Removing intermediate container 9f82e9b70fc5
---> b08815df7070
Successfully built b08815df7070
Successfully tagged case_study_1:latest
The last two lines indicate that the image has been successfully built and tagged.
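Once the build has finished, a few commands can confirm that the image is usable and prepare it for the HPC system. The sketch below is one possible route, not necessarily the one used in this case study. It assumes the tag case_study_1:latest from the build command; the archive and .sif file names are hypothetical, and the last step assumes Singularity 3.x is installed on the machine where it is run. The Docker commands are skipped if docker or the image cannot be found.

```shell
IMAGE="case_study_1:latest"
NAME="${IMAGE%%:*}"      # image name without the tag
ARCHIVE="${NAME}.tar"    # hypothetical file name for the exported image
SIF="${NAME}.sif"        # hypothetical file name for the Singularity image

if command -v docker >/dev/null 2>&1 && docker image inspect "$IMAGE" >/dev/null 2>&1; then
    # Check the interpreter version and the search path inside the image.
    docker run --rm "$IMAGE" python --version
    docker run --rm "$IMAGE" python -c "import sys; print(sys.path)"
    # Export the image to a tar archive that can be copied to the cluster.
    docker save -o "$ARCHIVE" "$IMAGE"
else
    echo "Image $IMAGE not found locally; build it first."
fi

# On a machine with Singularity installed (no Docker daemon needed),
# convert the exported archive into a Singularity image file.
if command -v singularity >/dev/null 2>&1 && [ -f "$ARCHIVE" ]; then
    singularity build "$SIF" "docker-archive://$ARCHIVE"
fi
```

The printed search path should contain /app/, which confirms that the ENV PYTHONPATH line took effect.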