Understanding the Dockerfile


If you ask me what revolutionized world trade, I would say containers. Surprised, right? But it is true. Containers changed the way we move goods from one place to another and made the task far easier: build boxes of standard sizes that can be loaded from a train onto a ship, back onto a train at the destination, and finally onto trucks, all with great ease. You may be amazed to know that the world is currently facing a shortage of such containers 😮 due to the rise in demand after COVID.

Why am I talking about all this? Today I am going to discuss Docker, which brings a similar revolution to how we ship our code to production. We will walk through a basic Dockerfile and the best practices behind it. The example I will be using for this post is here. I use this file to host the API for the vscode extension todolist on Digital Ocean. Download the extension too, you will like it 😉.

What are Dockerfiles, Docker images and Docker containers

Dockerfile

Everything starts with the Dockerfile. It is the source code of your image. A Dockerfile starts with the FROM instruction because we always build our application on top of something; in our case we need Node.js to run the application, and on top of that base we build our own stuff. After that we write instructions describing what to do with the base image, what additional files need to be added from our file system, and which commands need to run to start the application.

Docker Image

Using the Dockerfile we can build an image with the docker build command. It will pull whatever base image we specified from the Docker registry. You can search for existing images on Docker Hub, where you can find an image for almost everything you need - Node.js, Python, databases, etc. You can host your own images there too.
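For example, assuming the Dockerfile sits in the current directory, building and tagging an image looks like this (the tag name is just an illustration):

docker build -t my-api:1.0.0 .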

Docker Containers

The image serves as a template of your application. Now you can create containers from this image using the docker run command. You can create any number of containers from a single image, and you can start and stop each container independently of the others.
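For instance, two containers can be started from the same image and then managed independently; the image tag and container names here are only illustrative:

# Start two containers from the same image, detached in the background
docker run -d --name api-1 my-api:1.0.0
docker run -d --name api-2 my-api:1.0.0

# List running containers, then stop just one of them
docker ps
docker stop api-1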

The key benefit here is that the application can run on any system with Docker installed, so there is no need to configure the app separately on each system. Configure it once in the Dockerfile, build the image, and run it anywhere. You can test your application inside a container and ship that same container, which means the environment you test in is identical to the one the app will run on in production.

With Docker, a developer can package all of the software's dependencies into the container, and Docker ensures that it works the same on every system.

With these concepts in mind, let's move to the code.

Walking through the Dockerfile

As already mentioned, the Dockerfile we will be using for this post belongs to an API built with Node.js and Express. Here is the entire file.

# Layer 1
FROM node:14

# Layer 2
WORKDIR /usr/src/app

# Layer 3
COPY package.json ./

# Layer 4
COPY yarn.lock ./

# Layer 5
RUN yarn

# Layer 6
COPY . .

# Layer 7
COPY .env.production .env

# Layer 8
RUN yarn build

# Layer 9
ENV NODE_ENV production

# Layer 10
EXPOSE 3002

# Layer 11
CMD [ "node", "dist/index.js" ]

The application requires Node.js to run, and since the project is built with TypeScript we compile the code and then start the server. I tagged each instruction with a layer number because that is how Docker treats it: while building the image, Docker creates a layer out of each instruction. If you run the docker build command with this file, you will notice that Docker proceeds with step 1/11, step 2/11, and so on. We will later see how we can use this to our benefit.

Layer 1

# Layer 1
FROM node:14

The first line, as already mentioned, specifies the base image required for our application. It can be anything, such as ubuntu or python. In our case we need Node.js, so we specified node along with its version. You can find base images on Docker Hub and decide which version and variant to use.
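As a side note, Docker Hub also offers slimmer variants of the official node image. Swapping in the Alpine variant, for example, usually shrinks the image considerably, provided your dependencies build fine on Alpine Linux (this is an optional variation, not part of the file above):

# Layer 1 with a smaller base image variant
FROM node:14-alpine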

Layer 2

# Layer 2
WORKDIR /usr/src/app

The WORKDIR instruction sets the working directory inside the image, and you can use it again later to change directories at any point. Any instruction that follows it, such as COPY, ADD, or RUN, is executed relative to this directory. You can specify any directory here, and Docker will create it if it does not exist.
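To illustrate (a simplified sketch, not the exact file above), every relative path after WORKDIR resolves against it:

WORKDIR /usr/src/app
# Both of these operate inside /usr/src/app because of the WORKDIR above
COPY package.json ./
RUN yarn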

Layer 3 and 4

# Layer 3
COPY package.json ./

# Layer 4
COPY yarn.lock ./

Since we are working with a Node.js application, we will be using some npm packages. All of them are listed in the package.json file, and before doing anything else we need our dependencies installed. So we first copy just the package.json and the lock file into the working directory.

Layer 5

# Layer 5
RUN yarn

After copying the package.json file, we run the yarn command to install all the dependencies. You can run npm install instead if you are using npm rather than yarn.
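For reference, an npm-based version of layers 3 to 5 might look like this, assuming the project tracks a package-lock.json instead of a yarn.lock:

# Layers 3-5 with npm instead of yarn
COPY package.json package-lock.json ./
RUN npm ci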

Layer 6

# Layer 6
COPY . .

After the dependencies are installed, we copy all of our project files. Files and directories that should never be copied into the image can be listed in a .dockerignore file.
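A typical .dockerignore for a Node.js project (an illustrative example, not the exact file from this repo) could contain:

node_modules
dist
.git
*.log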

Here comes the interesting part. First we copied the package.json file and only then the entire project. But why? Couldn't we simply copy the whole project in the first place and then run yarn install?

Good question. Remember I said earlier that Docker creates a layer for each instruction in the Dockerfile. When you run the docker build command a second time, you will notice that after some of the initial steps Docker prints the line ---> Using cache. Docker caches each layer: if nothing has changed in a layer, Docker reuses the cache, but if anything has changed, Docker invalidates the cache of that layer and of every subsequent layer as well.

Generally we don't change the package.json file unless we add a dependency or a new script, whereas the rest of the source files change all the time. We don't want to reinstall the dependencies when package.json has not changed, so we copy that file first, run yarn install, and only then copy the entire project.
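To make the difference concrete, here is a sketch of both orderings: the first re-runs yarn on every source change, while the second re-runs it only when the dependency files themselves change:

# Slow: any source change invalidates the COPY layer, so yarn runs every build
COPY . .
RUN yarn

# Fast: yarn re-runs only when package.json or yarn.lock change
COPY package.json yarn.lock ./
RUN yarn
COPY . .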

Layer 7 and 8

# Layer 7
COPY .env.production .env

# Layer 8
RUN yarn build

In the 7th and 8th layers, we copy the env file and run the yarn build command to compile our TypeScript code.
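I am not showing the project's exact scripts here, but for a TypeScript project the build and start scripts in package.json typically boil down to something like this (which matches the dist/index.js path used in the final layer):

{
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js"
  }
}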

Layer 9 and 10

# Layer 9
ENV NODE_ENV production

# Layer 10
EXPOSE 3002

The 9th layer sets the NODE_ENV environment variable to production, and since our application runs on port 3002, the 10th layer exposes that port. The EXPOSE instruction informs Docker that the container listens on the specified network ports at runtime; it does not actually publish the port. It serves as documentation between the person who builds the image and the person who runs the container about which ports are intended to be published. Whoever runs this container can then use the -p flag and, knowing that port 3002 is exposed, map it to some port on the host.
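For example, the exposed container port can be mapped to any free port on the host; 8080 here is an arbitrary choice:

docker run -p 8080:3002 hardikmodi/vstodo:1.0.0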

Layer 11

# Layer 11
CMD [ "node", "dist/index.js" ]

This is the final layer: to run our application, the container executes the node dist/index.js command, which starts the server.

Running the container

After building the image, run the container with docker run -it -p 3002:3002 hardikmodi/vstodo:1.0.0

The -it flag runs the container interactively, and the -p flag publishes a container's port (or a range of ports) to the host. As we have already seen, the image exposes port 3002, so I map it to port 3002 on my machine; the format is <hostPort>:<containerPort>. hardikmodi/vstodo:1.0.0 follows the image naming convention <USERNAME_ON_DOCKERHUB>/<PROJECT_NAME>:<VERSION>. This starts the application on port 3002 on localhost too 🎉.
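For completeness, the image itself would have been built and tagged with that same name beforehand, along these lines:

docker build -t hardikmodi/vstodo:1.0.0 .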

I hope this post helps you understand some basic concepts of Docker such as images, containers, layers, and the build cache. I will also be writing about how I took this image and deployed it on Digital Ocean, so stay tuned and subscribe to the newsletter to never miss an update.
