Thursday, June 2, 2022

Docker 101 - Dockerfile - FROM, RUN, ADD, COPY and WORKDIR

   


What is a Dockerfile?



Docker can build images automatically by reading the instructions from a Dockerfile. A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. (Source: Dockerfile reference, Docker documentation.)


Format




Ex:
# Comment
INSTRUCTION arguments

* Comment lines are removed before the Dockerfile instructions are executed.
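To make the comment rule concrete, the sketch below writes a small, hypothetical Dockerfile to a temporary directory and prints it with the comment lines removed. The grep filter only approximates what the builder does: it would also drop parser directives such as `# syntax=...`, which the builder interprets rather than discards. The file contents and the `tmp` variable are assumptions made for illustration.

```shell
# Hypothetical sample Dockerfile, written to a temporary directory.
tmp=$(mktemp -d)
cat > "$tmp/Dockerfile" <<'EOF'
# Base image
FROM python:3.9.13-alpine
  # Leading whitespace before a comment is ignored
RUN echo "a # inside an argument is not a comment"
EOF

# Approximation of comment removal: drop every line whose first
# non-blank character is '#', and keep everything else untouched.
instructions=$(grep -v '^[[:space:]]*#' "$tmp/Dockerfile")
printf '%s\n' "$instructions"
```

Only the FROM and RUN lines are printed; the `#` inside the RUN argument survives because it does not start the line.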


Dockerfile Instructions - FROM



The FROM instruction initializes a new build stage and sets the Base Image for subsequent instructions. As such, a valid Dockerfile must start with a FROM instruction (ARG is the only instruction that may precede FROM in the Dockerfile). (Source: Dockerfile reference, Docker documentation.)

Rules for picking an image:

* Prefer official images over non-official ones for security reasons.

* Pin a specific version instead of pulling the latest one
   (the app may break when a new version introduces breaking changes).

    For example, it is better to use python:3.9.13 than python (which means python:latest).

* Pick the smallest image that works for you.

    For example, it is better to use python:3.9.13-alpine than python:3.9.13.

Ex:
$ docker image ls

Result:
REPOSITORY   TAG             IMAGE ID       CREATED      SIZE
python       3.9.13          9fa3494cf8c7   2 days ago   915MB
python       latest          675cf548c64d   2 days ago   920MB
python       3.9.13-alpine   7dd61a9a1c2d   5 days ago   47.4MB
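The version-pinning rule can be checked mechanically. The helper below, `is_pinned`, is a hypothetical name and a deliberately naive check: it only looks for an explicit tag other than `latest`, and it does not handle digests (`@sha256:...`) or registry hosts that include a port.

```shell
# is_pinned IMAGE: succeeds when IMAGE carries an explicit tag other than "latest".
# Naive on purpose: digests and registry hosts with a port are not handled.
is_pinned() {
  case "$1" in
    *:latest) return 1 ;;   # explicit but floating tag
    *:*)      return 0 ;;   # some other explicit tag
    *)        return 1 ;;   # no tag at all means "latest"
  esac
}

for image in python python:latest python:3.9.13 python:3.9.13-alpine; do
  if is_pinned "$image"; then
    echo "$image: pinned"
  else
    echo "$image: floating"
  fi
done
```

With the image names used in this post, only python:3.9.13 and python:3.9.13-alpine are reported as pinned.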


Dockerfile Instructions - RUN



The RUN instruction will execute any commands in a new layer on top of the current image and commit the results. The resulting committed image will be used for the next step in the Dockerfile.

RUN has two forms (source: Dockerfile reference, Docker documentation):

* shell form (the command is run in a shell, which by default is /bin/sh -c on Linux):
    RUN <command>

* exec form:
    RUN ["executable", "param1", "param2"]
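The practical difference between the two forms is whether a shell processes the command. The plain-shell sketch below mimics this outside Docker. `GREETING` is a made-up variable, and `env echo` merely stands in for running an executable directly, without a shell in between.

```shell
export GREETING=hello

# Shell form analogue: RUN echo $GREETING
# The command string is handed to /bin/sh -c, so the variable is expanded.
shell_form=$(/bin/sh -c 'echo $GREETING')

# Exec form analogue: RUN ["echo", "$GREETING"]
# The executable receives its arguments as-is; no shell expands the variable.
exec_form=$(env echo '$GREETING')

echo "shell form: $shell_form"
echo "exec form:  $exec_form"
```

The shell form prints hello, while the exec form prints the literal text $GREETING. Use the shell form (or call a shell explicitly in the exec form) when you rely on variable expansion, pipes or &&.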


Try to reduce the number of layers in your image by minimizing the number of separate RUN commands in your Dockerfile.

For example, consider a Dockerfile that contains multiple RUN commands.

Dockerfile:
FROM ubuntu:22.04
RUN apt update
RUN apt install -y net-tools
RUN apt install -y wget
RUN apt install -y git
RUN apt install -y vim
RUN apt install -y openssh-server

Ex:
$ docker image build -t multiple-run-exp .

Result:
Successfully built 52817e1362a5
Successfully tagged multiple-run-exp:latest

Now consider another Dockerfile that has only one RUN command.

Dockerfile:
FROM ubuntu:22.04
RUN apt update && \
    apt install -y net-tools && \
    apt install -y wget && \
    apt install -y git && \
    apt install -y vim && \
    apt install -y openssh-server

Ex:
$ docker image build -t single-run-exp .

Result:
Successfully built 60a1a813ad21
Successfully tagged single-run-exp:latest

Let's check the image sizes.

Ex:
$ docker image ls

Result:
REPOSITORY         TAG      IMAGE ID       CREATED          SIZE
multiple-run-exp   latest   52817e1362a5   42 seconds ago   341MB
single-run-exp     latest   60a1a813ad21   6 minutes ago    336MB
ubuntu             22.04    d2e4e1f51132   4 weeks ago      77.8MB

The image built with a single RUN command (336MB) is slightly smaller than the one built with multiple RUN commands (341MB).

Next, let's check the history of each image.

Ex:
$ docker image history multiple-run-exp

Result:
IMAGE          CREATED          CREATED BY                                 SIZE     COMMENT
52817e1362a5   26 minutes ago   /bin/sh -c apt install -y openssh-server   85.8MB
78fcced312db   27 minutes ago   /bin/sh -c apt install -y vim              59.1MB
59110cdfa972   27 minutes ago   /bin/sh -c apt install -y git              72.8MB
42b7d50f9786   28 minutes ago   /bin/sh -c apt install -y wget             9.67MB
d79c69e868b5   28 minutes ago   /bin/sh -c apt install -y net-tools        1.4MB
a7397c58e603   28 minutes ago   /bin/sh -c apt update                      34MB
d2e4e1f51132   4 weeks ago      /bin/sh -c #(nop) CMD ["bash"]             0B
<missing>      4 weeks ago      /bin/sh -c #(nop) ADD file:37744639…       77.8MB

Six layers were added on top of the base image, one per RUN command.

Ex:
$ docker image history single-run-exp

Result:
IMAGE          CREATED          CREATED BY                                  SIZE     COMMENT
60a1a813ad21   31 minutes ago   /bin/sh -c apt update && apt install -y …   258MB
d2e4e1f51132   4 weeks ago      /bin/sh -c #(nop) CMD ["bash"]              0B
<missing>      4 weeks ago      /bin/sh -c #(nop) ADD file:377446398c8…     77.8MB

Only one layer was added on top of the base image.
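As a quick sanity check, the per-layer sizes reported by `docker image history` can be added up and compared with the totals from `docker image ls`. The sketch below does this with awk. The numbers are copied from the listings in this post, and `add_mb` is a hypothetical helper name.

```shell
# Sizes in MB, copied from the two `docker image history` listings.
base="77.8"                                  # ubuntu:22.04 base layer
multi_layers="85.8 59.1 72.8 9.67 1.4 34"    # six separate RUN layers
single_layer="258"                           # one combined RUN layer

# add_mb "N N ...": print the sum of the space-separated numbers, two decimals.
add_mb() {
  echo "$*" | awk '{ for (i = 1; i <= NF; i++) s += $i } END { printf "%.2f\n", s }'
}

multi_added=$(add_mb "$multi_layers")
multi_total=$(add_mb "$base $multi_layers")
single_total=$(add_mb "$base $single_layer")

echo "added by six RUN layers: ${multi_added}MB"
echo "multiple-run-exp total:  ${multi_total}MB"
echo "single-run-exp total:    ${single_total}MB"
```

The totals come to 340.57MB and 335.80MB, which round to the 341MB and 336MB shown by `docker image ls`. The six separate layers add 262.77MB in total, about 4.8MB more than the single combined layer.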


Dockerfile Instructions - ADD vs. COPY



The ADD instruction copies new files, directories or remote file URLs from <src> and adds them to the filesystem of the image at the path <dest>.

The COPY instruction copies new files or directories from <src> and adds them to the filesystem of the container at the path <dest>.

They are almost the same, but ADD also accepts remote URLs and automatically unpacks local tar archives.

* If <src> is a local tar archive in a recognized compression format (identity, gzip, bzip2 or xz) then it is unpacked as a directory.
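This post does not show how hello-world.tar.gz was created. One way to produce a comparable archive for the examples that follow is sketched here. The script contents and the temporary directory are assumptions; only the two file names come from the post.

```shell
# Work in a temporary directory (it stands in for the build context).
ctx=$(mktemp -d)

# Hypothetical one-line script; the post does not show its contents.
printf 'print("hello world")\n' > "$ctx/hello-world.py"

# Pack the script into a gzip-compressed tar archive next to it.
tar -czf "$ctx/hello-world.tar.gz" -C "$ctx" hello-world.py

# List the archive members: this is what ADD would unpack into <dest>.
tar -tzf "$ctx/hello-world.tar.gz"
```

The listing shows a single member, hello-world.py. To follow along with the builds, place the archive next to your Dockerfile.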


Let's use the ADD instruction with a tar archive as follows.


Dockerfile:
FROM python:alpine3.16
ADD hello-world.tar.gz /app/

Ex:
$ docker image build -t python-add-gz .

Result:
Sending build context to Docker daemon  3.072kB
Step 1/2 : FROM python:alpine3.16
---> b22cfbf3bfa6
Step 2/2 : ADD hello-world.tar.gz /app/
---> dce2fa976707
Successfully built dce2fa976707
Successfully tagged python-add-gz:latest

Ex:
$ docker container run -it python-add-gz sh
/ # ls -l /app/

Result:
-rw-rw-r--    1 1000     1000            21 Jun  7 10:27 hello-world.py

From the shell, we can see that hello-world.tar.gz was extracted into /app/.


Now let's use the COPY instruction with the same tar archive.


Dockerfile:
FROM python:alpine3.16
COPY hello-world.tar.gz /app/

Ex:
$ docker image build -t python-copy-gz .

Result:
Sending build context to Docker daemon  3.072kB
Step 1/2 : FROM python:alpine3.16
---> b22cfbf3bfa6
Step 2/2 : COPY hello-world.tar.gz /app/
---> 6c29dc03dbe2
Successfully built 6c29dc03dbe2
Successfully tagged python-copy-gz:latest

Ex:
$ docker container run -it python-copy-gz sh
/ # ls -l /app/

Result:
-rw-rw-r--    1 root     root         150 Jun  7 10:35 hello-world.tar.gz

From the shell, we can see that the COPY instruction copied hello-world.tar.gz as-is, without extracting it.


Dockerfile Instructions - WORKDIR



The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it in the Dockerfile. If the WORKDIR doesn’t exist, it will be created.

If not specified, the default working directory is /. In practice, if you aren’t building a Dockerfile from scratch (FROM scratch), the WORKDIR may likely be set by the base image you’re using. Therefore, to avoid unintended operations in unknown directories, it is best practice to set your WORKDIR explicitly.

Dockerfile:
FROM python:alpine3.16
WORKDIR /app
ADD hello-world.py hello-world.py

Ex:
$ docker image build -t workdir-exp .

Result:
Sending build context to Docker daemon  3.072kB
Step 1/3 : FROM python:alpine3.16
---> b22cfbf3bfa6
Step 2/3 : WORKDIR /app
---> Running in 39846f21a051
Removing intermediate container 39846f21a051
---> c9326323ce79
Step 3/3 : ADD hello-world.py hello-world.py
---> f92ce9b09b4c
Successfully built f92ce9b09b4c
Successfully tagged workdir-exp:latest

Ex:
$ docker container run -it workdir-exp sh
/app # ls -l

Result:
-rw-rw-r--    1 root     root            21 Jun  7 11:09 hello-world.py

As we can see, the /app directory was created and set as the current working directory, and the relative destination hello-world.py was resolved against it.
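As a rough mental model, WORKDIR behaves like `mkdir -p` followed by `cd`, and a relative WORKDIR is resolved against the previous one. The plain-shell sketch below imitates that inside a temporary directory that stands in for the image's root filesystem. It is only an analogy, not how the builder is implemented; the `src` directory and the script contents are hypothetical.

```shell
root=$(mktemp -d)   # stands in for "/" inside the image

final_dir=$(
  cd "$root" || exit 1
  mkdir -p app && cd app                              # WORKDIR /app (created if missing)
  printf 'print("hello world")\n' > hello-world.py    # ADD hello-world.py hello-world.py
  mkdir -p src && cd src                              # WORKDIR src (relative to /app)
  pwd
)

# Strip the temporary prefix to get the path as it would look inside the image.
echo "final working directory: ${final_dir#"$root"}"
ls "$root/app"
```

The final working directory is /app/src, and hello-world.py ends up in /app because the relative destination was resolved while /app was the working directory.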