
Introduction

Hello everyone, and welcome back to my blog. In this crash course we will be looking at one of the most popular and well-known containerization technologies on the market: Docker. Now, this is going to be a longer, in-depth post, so if that isn’t your thing, you might want to check out my Docker CLI Cheatsheet. However, if you’d like to get a better understanding of Docker and how it works, then you’ve come to the right place. All this post assumes is that you have Docker installed on your PC. With that said, let’s begin.

What is Docker?

Docker is a tool for running applications in isolated environments, otherwise known as a container.

Now if you have been in tech for a while, you might be thinking, “Wait, isn’t that just a VM?” and ordinarily you would be correct. However, Docker and other container technologies are different from VMs as they:

  1. Start in seconds instead of minutes
  2. Require fewer resources, resulting in less disk space usage
  3. Use less memory
  4. Do not need the overhead of a full OS

This means that spinning up a Docker container is extremely quick, as it doesn’t load anything that isn’t needed. Another added benefit of Docker is that since the application always runs in the same environment, it will work on any machine that can run Docker. If the application runs on your Mac or Windows PC, it will run on your Linux server just the same. It is because of this that Docker and other containerization technologies are gaining widespread support in businesses across the board.

Containers vs. Virtual Machines

To go a bit deeper, let’s examine containers and virtual machines a little more closely, beginning with a definition of each.

  • Containers are an abstraction at the app layer that packages code and dependencies together. Multiple containers can run on the same machine and share the OS kernel with other containers, each running as isolated processes in user space.

  • Virtual machines (VMs) are an abstraction of physical hardware turning one server into many servers. The hypervisor allows multiple VMs to run on a single machine. Each VM includes a full copy of an operating system, the application, necessary binaries and libraries - taking up tens of GBs. VMs can also be slow to boot.

To make that a little easier to understand, let’s look at Figure 1.

Figure 1

Here you can see exactly why containers are such a useful alternative to VMs. On the left-hand side, you see a system running containerized applications; the only overhead of that system is Docker. On the right-hand side, you see a system running VMs; that system’s overhead is not only the hypervisor, but also each VM’s guest OS. As noted above, this can be extremely wasteful in terms of system resources, as each VM contains possibly unneeded components. The way Docker solves this is by running applications from pre-built templates known as images.

Getting Our Hands Dirty

Now that we have some basic information down, let’s start actually using Docker. To start, let’s mess with some images, specifically the nginx image.

Docker Images

An image is a template for creating a sandboxed environment solely for running your application.

To see an image in action, run this command:

docker pull nginx && docker run -d -p 8080:80 nginx:latest 

Then, if you go to localhost:8080 in your browser, you will see this:

Nginx Welcome Page

If you were curious and looked at my Docker Cheatsheet, then you would know that the first command pulls the nginx image from Docker Hub. The second command runs the latest version of an nginx container in the background and maps port 80 FROM the nginx container TO port 8080 on your machine.
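If you prefer the terminal, you can also verify that the container is serving traffic with curl (a quick check, assuming you have curl installed):

curl -I http://localhost:8080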

You can check the installed images by running:

docker images

Output:

REPOSITORY        TAG                  IMAGE ID       CREATED        SIZE
nginx             latest               b692a91e4e15   6 days ago     142MB

Docker Process and Format

Next, copy this alias into your .bashrc if you run bash, or your .zshrc for zsh:

alias dps='docker ps --format="ID\t{{.ID}}\nNAME\t{{.Names}}\nIMAGE\t{{.Image}}\nPORTS\t{{.Ports}}\nCOMMAND\t{{.Command}}\nCREATED\t{{.CreatedAt}}\nSTATUS\t{{.Status}}\n"'

What this does is format the output of the docker ps command so that it’s more readable. Go ahead and run these commands in your terminal to see the difference:

docker ps
dps

Output:

CONTAINER ID   IMAGE          COMMAND                  CREATED       STATUS         PORTS                                   NAMES
0cf13de873ae   nginx:latest   "/docker-entrypoint.…"   2 hours ago   Up 9 seconds   0.0.0.0:8080->80/tcp, :::8080->80/tcp   thirsty_montalcini
---
ID	0cf13de873ae
NAME	thirsty_montalcini
IMAGE	nginx:latest
PORTS	0.0.0.0:8080->80/tcp, :::8080->80/tcp
COMMAND	"/docker-entrypoint.…"
CREATED	2022-08-08 15:11:45 -0500 CDT
STATUS	Up 20 seconds
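As an aside, if you would rather not maintain a shell alias, docker ps also has a built-in table directive that prints just the columns you name:

docker ps --format "table {{.Names}}\t{{.Image}}\t{{.Status}}"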

Docker Names

As you can see from either of the previous commands, our container can be identified by either its ID or its name. This is very useful for starting or stopping that container, or even checking its logs. However, its name is kind of hard to remember since it’s randomly generated. Thankfully, we can give our containers easier-to-remember names. To do this, first let’s stop the currently running container:

docker stop thirsty_montalcini
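Incidentally, that same name (or its ID) works with any other container command, such as checking its logs:

docker logs thirsty_montalcini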

Next, let’s create a new container with the name mysite:

docker run --name mysite -d -p 8080:80 nginx:latest

Finally, run dps to check the name:

dps
---
ID	9436589ac4c4
NAME	mysite
IMAGE	nginx:latest
PORTS	0.0.0.0:8080->80/tcp, :::8080->80/tcp
COMMAND	"/docker-entrypoint.…"
CREATED	2022-08-08 17:48:09 -0500 CDT
STATUS	Up 2 seconds
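With a memorable name, managing the container becomes much more intuitive:

docker stop mysite
docker start mysite
docker logs mysite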

Docker Ports

Even though we brushed over ports earlier, let’s go over some commands that involve them. To recap, the -p flag of the docker run command takes a “destination” (host) port and a “source” (container) port in the form DESTINATION:SOURCE. You can also publish more than one port for a single container. To test this, let’s run nginx on ports 3000 and 4000:

docker run --name multiport -d -p 3000:80 -p 4000:80 nginx:latest

Next, run dps to check the ports on the container:

ID	31875b1e2d0f
NAME	multiport
IMAGE	nginx:latest
PORTS	0.0.0.0:3000->80/tcp, 0.0.0.0:4000->80/tcp, :::3000->80/tcp, :::4000->80/tcp
COMMAND	"/docker-entrypoint.…"
CREATED	2022-08-08 21:32:38 -0500 CDT
STATUS	Up 4 seconds

Finally, check localhost:3000 and localhost:4000 in your browser. You should see the nginx welcome page in both tabs.
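You can also verify both ports from the terminal; each request should return the same nginx welcome page:

curl http://localhost:3000
curl http://localhost:4000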

Docker Management

There are a few commands that make managing Docker images and containers easier, such as listing containers, getting an interactive terminal, and removing containers; a combined example follows the individual commands below.

Listing containers

docker container ls

Interactive Terminal

docker exec -it multiport sh

Remove a Container

docker rm multiport

Remove a Running Container

docker rm -f multiport
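To tie these together, here is a sketch of a typical management session, reusing the multiport container from the previous section:

# list running containers
docker container ls

# open a shell inside the running container, look around, then exit
docker exec -it multiport sh
ls /usr/share/nginx/html
exit

# force-remove the container even though it is still running
docker rm -f multiport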

Docker Volumes

With Docker volumes, you can share files and folders from your host machine to a Docker container, or from one Docker container to another.

Between host and container

docker run --name website -v $(pwd):/usr/share/nginx/html:ro -d -p 8080:80 nginx

In this command, the -v flag takes the files and folders from the current directory and mounts them inside the nginx container at /usr/share/nginx/html in a read-only (ro) state.
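Because the mount is live, edits on the host show up in the container immediately. As a quick sketch (assuming the website container from above is still running):

echo '<h1>Hello from the host</h1>' > index.html
curl http://localhost:8080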

Between container and container

docker run --name website-copy --volumes-from website -d -p 8081:80 nginx

In this command, the --volumes-from flag mounts the volumes of the specified container, in this case website, into the container being created, i.e. website-copy.
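Since website-copy shares website’s volumes, both containers serve the exact same files:

curl http://localhost:8080
curl http://localhost:8081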

Advanced Docker

Now that we’ve gotten past the basics of Docker, let’s move on to the more advanced subject of image creation. This will include information about:

  • Dockerfiles, or how to write your own images
  • Caching and layering, or how the way you write your Dockerfile can affect your image
  • Alpine Linux, a way to slim down your image to an extremely small size

Dockerfile

With a Dockerfile you can create your own template, defining exactly how you would like your environment to be set up for your application to run. A Dockerfile can also be used as a snapshot tool, so if an error is made in a newly created image, you can roll back to the previous version with little issue. Please note that this post is not all-encompassing and will not show every instruction you can use in a Dockerfile. For that, I recommend you check out the Dockerfile Reference.

To get started, let’s create a directory and add a basic HTML template file inside:

mkdir dockerfile-test
cd dockerfile-test
touch index.html

index.html

<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>

<h1>This is a Heading</h1>
<p>This is a paragraph.</p>

</body>
</html>

Next, let’s create a Dockerfile. With your favorite text editor, add this information inside of it and save it as Dockerfile. Make sure there is no file extension at the end.

Dockerfile

FROM nginx:alpine
ADD . /usr/share/nginx/html

To break this down, a Dockerfile is written as a series of instructions, which are in all caps, each followed by a value, which is mostly lowercase. In this Dockerfile, the FROM instruction states which Docker image (nginx) and tag (alpine) to use as the base when building the template. The next line uses the ADD instruction to add the files from the current working directory to the /usr/share/nginx/html directory inside the template.
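As an aside, for plain local files and directories the COPY instruction does the same job as ADD (ADD can additionally fetch URLs and auto-extract archives) and is generally preferred for simple copies. An equivalent Dockerfile would be:

FROM nginx:alpine
COPY . /usr/share/nginx/html

Either version builds an identical image here.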

To turn this Dockerfile into a Docker image, you need to build it. You do so by running:

docker build -t myimage:alpine .

This will build the image myimage and tag it as alpine.
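You can then run a container from your new image exactly like you did with the stock nginx image (stop any container still bound to port 8080 first). The name mysite-custom below is just an example; pick whatever you like:

docker run --name mysite-custom -d -p 8080:80 myimage:alpine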

If there are any files or folders that you do not want inside the image, simply create a .dockerignore file in the same directory and add the names of the files and folders you do not want copied. For example, let’s say you initialize a git repo inside the dockerfile-test directory. You probably don’t want a copy of the .git directory or the .gitignore file inside of your image. Simply create the .dockerignore file and add:

.dockerignore

.git
.gitignore

Docker Caching/Layering

Docker caching is the process of storing the result of each build step so that it can be reused in future builds.

Docker layering is the technique of using the cache to create small, fast Docker images.

For example, let’s say you are building a Node.js project and need to build an image. You write your Dockerfile like so:

FROM node:latest
WORKDIR /app
ADD . .
RUN npm install
CMD node index.js

However, since you have a lot of dependencies, your image takes a long time to build, every single time. This is because this Dockerfile is not taking full advantage of caching. The ADD instruction on line 3 copies your entire source tree, so any change to any file invalidates the cache from that point on, which means line 4 always has to re-download all of the dependencies your application needs to run. This is very inefficient and time consuming.

To fix this, you should know where Node stores information regarding dependencies, in this case the package.json files. Let’s go ahead and copy those files, and run the install, before adding the rest of the source:

FROM node:latest
WORKDIR /app
ADD package*.json ./
RUN npm install
ADD . .
CMD node index.js

Now rerun the build command:

docker build .

You should see an increase in speed the next time you build your image! As long as the package*.json files do not change, Docker reuses the cached npm install layer instead of re-downloading everything.
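One more caching tip: if you have ever run npm install on the host, make sure node_modules is excluded from the build context, otherwise ADD . . will copy the host’s dependencies into the image and bloat it. A minimal .dockerignore for a project like this might look like:

node_modules
.git
.gitignore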

Alpine Linux

Docker images based on Alpine Linux are usually much smaller than their vanilla counterparts. To use Alpine for your Docker image, first check whether the alpine tag is supported by your specific image. Take nginx, for example. If you search for nginx on hub.docker.com, you will see on the nginx image page a bunch of tags for different nginx versions as well as Linux distros. Look for a tag with alpine in the name. To download it onto your local machine, run:

docker pull nginx:[tag you want to use]

I usually just go with the plain alpine tag:

docker pull nginx:alpine

Now compare the image sizes by running:

docker images
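You can also narrow the list to a single repository by passing its name:

docker images nginx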

Wrapping it up

Wow, that was a lot of information in one post. You went from just starting and stopping a container to building a fully functioning Dockerfile and image of your own. Congrats 🎉!! Now go outside and touch some grass, I know I need to 😄. If you liked this style of content, please feel free to look at other posts in the Explained category.