What is Docker?
Before diving into Docker, it helps to first understand what a container is. A container is an OS-level construct (on Linux, typically implemented with namespaces and cgroups). An image is what a containerization platform like Docker runs inside a container. More on images below.
But Docker is more than just images. It is a set of tools for building and managing them, from their catalogue (the registry) to the containers that Docker creates from them.
Docker is a containerization platform that spans three components, which are often spread out over two or three separate machines or services. Technically, all three could be deployed onto a single machine, but this is often not the case (the registry is most often located elsewhere). The three components are:
- the client (where the host, containers, images, and registry are orchestrated from; for example, the CLI that talks to the host)
- the host (this is where the container runs an image)
- the registry (library of images that can be used to run containers)
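To make the division of labour concrete, here is a toy model of the three components in Python. It is only a sketch: the class names (Registry, Host, Client), the pull_count field, and the container IDs are all hypothetical and do not correspond to any Docker API. It assumes the host caches an image after the first pull.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Toy registry: maps image names to (fake) image contents."""
    images: dict = field(default_factory=dict)
    pull_count: int = 0  # hypothetical counter, only here to show caching

    def pull(self, name: str) -> str:
        self.pull_count += 1
        return self.images[name]


@dataclass
class Host:
    """Toy host: caches pulled images and starts containers from them."""
    registry: Registry
    cache: dict = field(default_factory=dict)
    containers: list = field(default_factory=list)

    def run(self, name: str) -> str:
        # Pull from the registry only if the image is not cached locally.
        if name not in self.cache:
            self.cache[name] = self.registry.pull(name)
        container_id = f"container-{len(self.containers) + 1}"
        self.containers.append(container_id)
        return container_id


@dataclass
class Client:
    """Toy client: forwards the user's request to the host."""
    host: Host

    def run(self, name: str) -> str:
        return self.host.run(name)


registry = Registry(images={"ubuntu:22.04": "root filesystem"})
host = Host(registry=registry)
client = Client(host=host)
print(client.run("ubuntu:22.04"))
print(client.run("ubuntu:22.04"))
print(registry.pull_count)
```

The second run reuses the cached image, so the registry is contacted only once. This is why the registry can live on a different machine without slowing down every container start.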
The most important component of Docker is its images. An image is what a container runs: it packages the application together with the user-space part of an operating system (its filesystem and OS-level packages), while the kernel is shared with the host. Docker uses Dockerfiles to define how these images are built. Each instruction in a Dockerfile can add a layer on top of the previous ones, and a Dockerfile can itself build on an image produced by another Dockerfile. For example, this is a Dockerfile for a base image:
FROM scratch
ADD root_filesystem.tar /
However, building a base image from scratch like this is a rare practice. Usually, you build on top of an existing base image, for example:
FROM ubuntu:22.04
COPY . /app
RUN make /app
CMD python /app/app.py
The above Dockerfile copies the files in the build context (typically the directory that contains the Dockerfile) into /app in the image's filesystem, on top of the base image ubuntu:22.04. It then runs make /app at build time and sets the default command for containers started from the image. This blog post explains how to build a Docker image with Dockerfile.
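A Dockerfile like the one above is typically built and run with the docker build and docker run commands. The sketch below only assembles those command lines in Python. The tag my-app:dev and the two helper function names are hypothetical. Actually executing the commands assumes Docker is installed and a Dockerfile exists in the context directory.

```python
import shlex


def docker_build_command(tag: str, context_dir: str = ".") -> list:
    """Command that builds an image from the Dockerfile in context_dir."""
    return ["docker", "build", "-t", tag, context_dir]


def docker_run_command(tag: str) -> list:
    """Command that starts a container from the image and removes it on exit."""
    return ["docker", "run", "--rm", tag]


# "my-app:dev" is a made-up tag used only for illustration.
build = docker_build_command("my-app:dev")
run = docker_run_command("my-app:dev")
print(shlex.join(build))
print(shlex.join(run))

# To really run them (requires Docker):
#   import subprocess
#   subprocess.run(build, check=True)
#   subprocess.run(run, check=True)
```

Keeping the commands as argument lists rather than strings avoids shell-quoting problems if they are later passed to subprocess.run.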
A couple of important notes:
- Tools like Kubernetes do not build the image from the Dockerfile for you. You build the image and push it to a registry, then reference it in the container definition of a Kubernetes Deployment resource to deploy it.
- The instructions in a Dockerfile do not map 1-to-1 to the layers (.tar files) that are downloaded from the registry and unpacked on the host. Some Dockerfile instructions (RUN, COPY, ADD) create new filesystem layers, while others only change the image's metadata.
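The point about layers can be illustrated with a small Python helper that lists which instructions in a Dockerfile add a filesystem layer. This is a simplified sketch, not how Docker itself parses Dockerfiles: the function name is hypothetical, and it assumes one instruction per line (no backslash continuations, heredocs, or multi-stage builds).

```python
# Instructions that create a new filesystem layer; the rest only set metadata.
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}


def layer_creating_instructions(dockerfile_text: str) -> list:
    """Return the Dockerfile lines that add a filesystem layer."""
    result = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        instruction = stripped.split(maxsplit=1)[0].upper()
        if instruction in LAYER_INSTRUCTIONS:
            result.append(stripped)
    return result


example = """\
FROM ubuntu:22.04
COPY . /app
RUN make /app
CMD python /app/app.py
"""

for line in layer_creating_instructions(example):
    print(line)
```

For this example, only the COPY and RUN lines add layers on top of whatever layers the base image already has. FROM and CMD contribute no layers of their own.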
Docker adheres to standards set by the Open Container Initiative (OCI), which specify the image format and the container runtime. This means that the components mentioned above are not tightly coupled: an image built with Docker can be run by another OCI-compliant runtime, and a different client can drive the host. For example, Kubernetes does not need the Docker CLI or dockerd. Its kubelet talks to a container runtime such as containerd through the Container Runtime Interface (CRI), a standardized API between the kubelet and the runtime.
Docker is one of the most popular container management platforms. Other container management solutions are typically considered only when some limitation in Docker is encountered. Images built with Docker are often run by other container management systems such as Kubernetes. Kubernetes is intended for large-scale clusters of services, while plain Docker is typically used for smaller-scale applications.
This deeper dive will include Kubernetes given that it’s an important part of the container ecosystem.
The client is whatever tool you use to tell the host what to run. If Docker is used end to end, the client is the Docker CLI, which talks to dockerd on the host over the Docker Engine API. One alternative is kubectl, which is more than just an alternative to the Docker CLI. kubectl is the CLI for the Kubernetes API, which then communicates with kubelets on Kubernetes nodes, which in turn control the containers in Kubernetes pods through the Container Runtime Interface (CRI).
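What kubectl sends to the Kubernetes API is a resource description. As a minimal sketch, the Python below builds a Deployment manifest as a plain dictionary and prints it as JSON, which kubectl apply -f accepts alongside YAML. The helper name, the deployment name web, and the image reference registry.example.com/web:1.0 are hypothetical placeholders.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 1) -> dict:
    """Build a minimal apps/v1 Deployment that runs one container per pod."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Placeholder name and image, for illustration only.
manifest = deployment_manifest("web", "registry.example.com/web:1.0", replicas=2)
print(json.dumps(manifest, indent=2))
```

Saved to a file, this output could be submitted with kubectl apply -f against a cluster. The image itself must already exist in a registry that the cluster's nodes can reach.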
The Host is the physical or virtual machine where containers are created. Each container runs one image.
In Docker, the host is composed of:
- dockerd: Manages all the components listed below.
- containerd: Manages the entire lifecycle of a container, as well as images in containers. This is done via a containerd-shim process per container.
- Docker Networking: iptables, routing tables, Docker network drivers, and namespaces.
- Docker Storage: The actual management of storage at a lower level is handled by storage drivers and, in some cases, by
In Kubernetes, the host is composed of:
- Kubelet: Manages containers in pods.
- Container runtime: containerd or similar (supported through Container Runtime Interface plugin).
- Kube-Proxy: Manages network communication to pods across the Kubernetes cluster.
See more info on Pods and Nodes in Kubernetes.
Images are stored in the registry and pulled by the host when a container that needs them is started. This is orchestrated by The Client.
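Which registry an image is pulled from is encoded in the image reference itself. The Python sketch below approximates the defaulting rules Docker applies to a reference such as ubuntu:22.04: a missing registry becomes docker.io, a single-name repository on Docker Hub gets the library/ prefix, and a missing tag becomes latest. The ImageRef type and the function name are hypothetical, and digest references (name@sha256:...) are deliberately not handled.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split an image reference into registry, repository, and tag."""
    first, slash, rest = ref.partition("/")
    # The first path component is a registry host only if it looks like one.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", ref

    # A tag is whatever follows the last colon, unless that colon is absent
    # or the text after it still contains a path separator.
    name, colon, tag = remainder.rpartition(":")
    if not colon or "/" in tag:
        name, tag = remainder, "latest"

    # Official images on Docker Hub live under the "library/" namespace.
    if registry == "docker.io" and "/" not in name:
        name = "library/" + name
    return ImageRef(registry, name, tag)


print(parse_image_ref("ubuntu:22.04"))
print(parse_image_ref("localhost:5000/app"))
```

So the base image ubuntu:22.04 is shorthand for the repository library/ubuntu on docker.io with the tag 22.04, while a reference that starts with a host name points at a different registry.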
Docker revolutionized containerization, which opened the door for microservices architectures (learn more at Microservices vs Monolithic). It can be viewed as a tree trunk whose many branches have themselves revolutionized the software industry. Another important branch is serverless architecture.
Sam Malayek works in Vancouver, using this space to fill in a few gaps. Opinions are his own.