What is Docker?
January 27, 2024
Background
Before diving into the concept of Docker, it's important to first understand what a container is (see What is a Container?). Containers are an OS-level construct (on Linux, typically implemented via namespaces and cgroups). Images are what containerization platforms like Docker run as containers. More on images below.
But Docker is more than just the images: it is a set of tools to build and manage them, from the catalogue (or registry) that stores them to the containers that Docker creates from them.
Overview
Docker is a containerization platform made up of three components that are often spread across two or three machines or services. Technically, all three could be deployed onto a single machine, but this is often not the case (the registry is most often located elsewhere). The three components are:
- the client (this is where orchestration of the host, containers, images, and registry takes place -- the CLI that talks to the host, for example)
- the host (this is where the container runs an image)
- the registry (library of images that can be used to run containers)
This architecture is described in the official Docker Overview. Another helpful resource is AWS's "What is Docker?" article.
Images
The most important component of Docker is its images. An image is a read-only template from which containers are started. It packages a root filesystem (the userland of an OS distribution, other OS-level packages, and the application), while the kernel is shared with the host. Docker uses Dockerfiles to define how these images are built.
Dockerfiles can build on top of one another: each one starts from an existing image and adds new tools and actions as further layers. For example, this is a Dockerfile for a base image:
FROM scratch
ADD root_filesystem.tar /
CMD ["/bin/bash"]
However, building a base image from scratch like this is a rare practice. Usually, you build on top of an existing base image, for example:
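FROM ubuntu:22.04
# The ubuntu base image ships without make or Python, so install them first.
RUN apt-get update && apt-get install -y --no-install-recommends make python3
COPY . /app
# Assumes the copied files include a Makefile.
RUN make -C /app
CMD ["python3", "/app/app.py"]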
The above Dockerfile copies the files in the build context (typically the directory containing this Dockerfile) into the /app directory of the image's filesystem, on top of the base image it is built from, ubuntu:22.04. This blog post explains how to build a Docker image with a Dockerfile.
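A short name like ubuntu:22.04 is shorthand for a fuller reference that also names a registry. The sketch below shows how such a reference is conventionally expanded, assuming the usual Docker defaults (Docker Hub as the registry, the library namespace for official images, and latest as the tag). The ImageReference class and parse_image_reference function are hypothetical names for illustration, not part of any Docker library, and digest references are deliberately left out.

```python
from dataclasses import dataclass

# Conventional Docker defaults, stated here as assumptions of this sketch.
DEFAULT_REGISTRY = "docker.io"   # Docker Hub, used when no registry host is given
DEFAULT_NAMESPACE = "library"    # namespace of Docker Hub's official images
DEFAULT_TAG = "latest"           # tag assumed when none is given


@dataclass(frozen=True)
class ImageReference:
    registry: str
    repository: str
    tag: str


def parse_image_reference(reference: str) -> ImageReference:
    """Split a reference such as 'ubuntu:22.04' into registry, repository, and tag.

    Simplified: digest references ('name@sha256:...') are not handled.
    """
    name, tag = reference, DEFAULT_TAG
    # A colon after the last slash separates the tag; an earlier colon is a registry port.
    last_slash = reference.rfind("/")
    last_colon = reference.rfind(":")
    if last_colon > last_slash:
        name, tag = reference[:last_colon], reference[last_colon + 1:]

    first, _, rest = name.partition("/")
    # The first path component counts as a registry host only if it looks like one.
    if rest and ("." in first or ":" in first or first == "localhost"):
        registry, repository = first, rest
    else:
        registry, repository = DEFAULT_REGISTRY, name
        if "/" not in repository:
            repository = f"{DEFAULT_NAMESPACE}/{repository}"
    return ImageReference(registry, repository, tag)


if __name__ == "__main__":
    print(parse_image_reference("ubuntu:22.04"))
```

Under these assumptions, ubuntu:22.04 expands to the registry docker.io, the repository library/ubuntu, and the tag 22.04, which is why the same Dockerfile works on any machine that can reach Docker Hub.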
A couple of important notes:
- Kubernetes does not build images from Dockerfiles. The image has to be built and pushed to a registry beforehand (often by a CI pipeline), and you then reference it in a Kubernetes Deployment resource to deploy it.
- The instructions in a Dockerfile do not map 1-to-1 to the layers (.tar files) that are downloaded from the registry to the host. Some Dockerfile instructions, such as RUN, COPY, and ADD, result in new layers, while others only change image metadata.
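To make the point about layers concrete, here is a minimal sketch that walks through a Dockerfile similar to the one above and marks which instructions add a filesystem layer. It follows Docker's general guidance that RUN, COPY, and ADD are the instructions that create layers. The classify_instructions function is a hypothetical helper, and the parsing is deliberately naive, so treat the result as an approximation of real builder behavior.

```python
# Instructions that change the filesystem and therefore add a layer.
# Assumption of this sketch: everything else only records image metadata.
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}


def classify_instructions(dockerfile_text: str) -> list[tuple[str, bool]]:
    """Return (instruction, creates_layer) pairs, one per instruction line.

    Simplified: ignores line continuations, heredocs, and parser directives.
    """
    result = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        instruction = stripped.split(maxsplit=1)[0].upper()
        result.append((instruction, instruction in LAYER_INSTRUCTIONS))
    return result


EXAMPLE = """\
FROM ubuntu:22.04
COPY . /app
RUN make -C /app
CMD ["python3", "/app/app.py"]
"""

if __name__ == "__main__":
    for instruction, creates_layer in classify_instructions(EXAMPLE):
        note = "adds a layer" if creates_layer else "no new layer of its own"
        print(f"{instruction:<5} {note}")
```

Note that FROM adds no layer of its own but brings along all the layers of the base image, so the final image holds more layers than this Dockerfile alone would suggest.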
Interoperability
Docker adheres to standards set by the Open Container Initiative (OCI), which cover the image format, the container runtime, and image distribution. This means that the components mentioned above are not tied to one another: any OCI-compliant tool can stand in for one of them. For example, Kubernetes does not need the Docker client or daemon. Its kubelet talks directly to a container runtime such as containerd or CRI-O through the Container Runtime Interface (CRI), a Kubernetes-defined API between the kubelet and the runtime. That runtime can run images built with Docker because they follow the OCI image format.
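Part of what makes this interoperability possible is that an image is described by a plain JSON manifest that any compliant tool can read. The following is a rough sketch of the shape of an OCI image manifest, with the field names and media types as I understand them from the OCI image specification. The descriptor and build_manifest helpers are hypothetical, and the blobs are placeholder bytes rather than a real image config or layer archive.

```python
import hashlib
import json


def descriptor(media_type: str, blob: bytes) -> dict:
    """Describe a blob by its media type, SHA-256 digest, and size in bytes."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


def build_manifest(config_blob: bytes, layer_blobs: list[bytes]) -> dict:
    """Assemble a manifest that points at one config blob and an ordered list of layers."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": descriptor("application/vnd.oci.image.config.v1+json", config_blob),
        "layers": [
            descriptor("application/vnd.oci.image.layer.v1.tar", blob)
            for blob in layer_blobs
        ],
    }


if __name__ == "__main__":
    # Placeholder content only: a real config is a JSON document and real layers are tar archives.
    manifest = build_manifest(b"{}", [b"placeholder layer 1", b"placeholder layer 2"])
    print(json.dumps(manifest, indent=2))
```

Because every blob is referenced by its content digest, a registry, a client, and a runtime from different vendors can agree on exactly which bytes make up an image.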
Usage
Docker is one of the most widely used container management platforms. Other container management solutions are typically considered only when some limitation in Docker is encountered. Images built with Docker are also commonly run by other container management systems such as Kubernetes. Kubernetes is intended for large-scale clusters of services, while plain Docker is typically used for smaller-scale applications.
High-Level Components
This deeper dive will include Kubernetes given that it's an important part of the container ecosystem.
The Client
The client can be any tool that speaks the API exposed by the container management software on The Host. If Docker is used end to end, the client-side tool is the Docker CLI, which sends requests to the Docker daemon (dockerd) through the Docker Engine API. One alternative is kubectl, which is more than just an alternative to the Docker CLI. kubectl is the CLI for the Kubernetes API, which in turn communicates with Kubelets on Kubernetes Nodes, which then control containers in Kubernetes Pods through the Container Runtime Interface (CRI).
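To illustrate what kubectl actually hands to the Kubernetes API, here is a minimal sketch that builds a Deployment manifest as JSON. The deployment_manifest helper, the demo-app name, and the registry.example.com image reference are all made up for illustration, and a real Deployment would usually carry more settings (ports, resources, probes).

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 1) -> dict:
    """Build a minimal apps/v1 Deployment that runs `replicas` pods of one container."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image reference, used only for illustration.
    manifest = deployment_manifest("demo-app", "registry.example.com/demo-app:1.0", replicas=2)
    print(json.dumps(manifest, indent=2))
```

Saving the output to a file such as deployment.json and running kubectl apply -f deployment.json would submit it to the Kubernetes API. From there the Kubelets on the chosen Nodes ask their container runtime to pull the image and start the containers.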
The Host
The Host is the physical or virtual machine where containers are created. Each container runs one image.
In Docker, the host is composed of:
- dockerd: The daemon, which manages all the components listed below.
- containerd: Manages the entire lifecycle of a container, as well as the images that containers are created from. This is done via a containerd-shim process per container.
- Docker Networking: iptables, routing tables, Docker network drivers, and namespaces.
- Docker Storage: The actual management of storage at a lower level is handled by storage drivers and, in some cases, by containerd.
In Kubernetes, the host is composed of:
- Kubelet: Manages containers in pods.
- Container runtime: containerd or similar (supported through a Container Runtime Interface plugin).
- Kube-Proxy: Manages network communication to pods across the Kubernetes cluster.
See more info on Pods and Nodes in Kubernetes.
The Registry
Images are stored in the registry and pulled by the host when a container is created from an image that is not already available locally. This is initiated by The Client.
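The interplay between the three components can be sketched as a toy model. This is not how Docker is implemented: the Registry, Host, and Client classes below are hypothetical stand-ins that only show the order of events, namely that the client asks the host to run an image and the host pulls from the registry only when it has no local copy.

```python
class Registry:
    """Toy registry: maps image references to opaque image content."""

    def __init__(self, images: dict[str, bytes]):
        self._images = images
        self.pull_count = 0

    def pull(self, reference: str) -> bytes:
        self.pull_count += 1
        return self._images[reference]


class Host:
    """Toy host: keeps a local image cache and a list of started containers."""

    def __init__(self, registry: Registry):
        self._registry = registry
        self._cache: dict[str, bytes] = {}
        self.containers: list[tuple[str, str]] = []

    def run(self, reference: str) -> str:
        # Pull only when the image is not already available locally.
        if reference not in self._cache:
            self._cache[reference] = self._registry.pull(reference)
        container_id = f"container-{len(self.containers) + 1}"
        self.containers.append((container_id, reference))
        return container_id


class Client:
    """Toy client: forwards user requests to the host."""

    def __init__(self, host: Host):
        self._host = host

    def run(self, reference: str) -> str:
        return self._host.run(reference)


if __name__ == "__main__":
    registry = Registry({"ubuntu:22.04": b"placeholder image content"})
    host = Host(registry)
    client = Client(host)
    print(client.run("ubuntu:22.04"), client.run("ubuntu:22.04"))
    print("pulls from registry:", registry.pull_count)
```

Running the same image twice starts two containers but contacts the registry only once, which mirrors why a second container from an already pulled image starts faster than the first.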
Conclusion
Docker revolutionized containerization, which opened the door for microservices architectures (learn more at Microservices vs Monolithic). It can be viewed as a tree trunk that spawned many branches, which have themselves revolutionized the software industry. Another important branch it spawned is serverless architecture.
Updated: 2024-03-09