What is a Container?
Container (generic context)
In a generic computer science context, a container can refer to any construct that holds data in memory. Examples are endless, but some include:
- An array or list in any language.
- A struct in any language that supports them.
- An object created from a class in any language that supports classes.
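The three kinds of generic container listed above can be sketched in a few lines of Python. This is only an illustration: the names `ports`, `Point`, and `Counter` are made up for the example, and a dataclass stands in for a struct because Python has no dedicated struct keyword.

```python
from dataclasses import dataclass

# A list holds an ordered collection of values.
ports = [80, 443, 8080]


# A dataclass plays the role of a struct: named fields grouped together.
@dataclass
class Point:
    x: float
    y: float


# An object created from an ordinary class holds data alongside behaviour.
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


origin = Point(0.0, 0.0)
counter = Counter()
counter.increment()
```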
Container (common context)
More commonly, container refers to a containerized operating system environment, and the engine most often used to drive this containerization is Docker. A container is like a separate instance of your operating system that is isolated from the base operating system (as well as other containers). It’s important to note that Docker containers are not 100% isolated. They are better described as lightweight isolation environments: on Linux systems, Docker relies on kernel features such as namespaces and control groups (cgroups) rather than on a separate kernel. Processes running in a container are visible from the host (although typically with a different process ID), but by default they are not visible to other containers. Files saved in one container are likewise not visible to another container; however, it is possible to mount a storage volume into multiple Docker containers to share files between them. This lightweight level of isolation provides good performance and has many use cases:
- Cleaner way to develop multiple applications at once on a developer’s own machine.
- Portability of an application developed within a container. This portability makes it easy to move the application from development to production. It also reduces the risk of compatibility issues.
- Scalability is achieved because a container is a lightweight isolation environment rather than a virtual machine with its own fixed hardware allocation. Limits can be applied to its resources, and those limits can also be adjusted while the container is running.
- Faster deployments are also achieved through the fact that the container shares the host machine’s kernel, so fewer components need to be started in order to start the container (compared to a virtual machine).
- Improved security of containerized applications is achieved through isolation. For example, if a container is compromised through an exposed network, gaining access to the host from within the container is another hurdle for the attacker.
- Improved resource utilization compared with entire virtual machines because of the shared kernel.
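The Linux namespaces mentioned above can be observed directly. The following is a minimal sketch, assuming a Linux system where `/proc/self/ns` exists; the function name `current_namespaces` is made up for this example, and on other systems it simply returns an empty dictionary.

```python
import os


def current_namespaces(proc_ns_dir="/proc/self/ns"):
    """Return {namespace_type: identifier} for the calling process.

    On Linux, each entry in /proc/self/ns is a symlink whose target
    looks like 'pid:[4026531836]'. If the directory does not exist
    (for example on macOS or Windows), an empty dict is returned.
    """
    namespaces = {}
    if not os.path.isdir(proc_ns_dir):
        return namespaces
    for name in sorted(os.listdir(proc_ns_dir)):
        try:
            namespaces[name] = os.readlink(os.path.join(proc_ns_dir, name))
        except OSError:
            # Entry is not readable or not a symlink; skip it.
            continue
    return namespaces


if __name__ == "__main__":
    for name, identifier in current_namespaces().items():
        print(f"{name:20s} {identifier}")
```

Running the script once on the host and once inside a container will typically print different identifiers for entries such as `pid`, `mnt`, and `net`, which is the isolation boundary in action.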
Following the release of Docker, the Open Container Initiative (OCI) was established to define open standards, so that an image which adheres to the OCI specification can be run on any container runtime that supports it. Other container runtimes include:
- Kata Containers: These containers each run inside a lightweight virtual machine for improved isolation, rather than relying only on Linux namespaces as Docker does.
- gVisor: Released by Google, it runs containers in a sandbox that intercepts system calls with a user-space kernel (halfway between Linux namespacing and full-on machine virtualization).
- Firecracker: Released by AWS, this virtual machine monitor runs lightweight "microVMs" and powers AWS Lambda and AWS Fargate. It’s specifically designed for the serverless use case.
Below are the most important components of Docker that should be noted by a novice who asks ‘what is a container’:
- The Image contains instructions for creating a container. An image is typically tailored to a single platform (CPU architecture), but multi-platform images can also be built. Images include system configurations, system libraries, and other dependencies needed to run an application, given the application’s requirements. Images can be layered on top of one another via Dockerfiles.
- The Container is a runnable instance of an image, in which an application runs with the system configuration and filesystem structure that the image defines.
- The Daemon is a background process that manages Docker objects, such as images, containers, networks, and volumes.
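The relationship between images and containers described above can be illustrated with a toy model. This is not how Docker is implemented; the `Image` and `Container` classes below are hypothetical and only mimic the idea of read-only image layers shared by several containers, each with its own writable layer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Toy image: an ordered tuple of read-only layers (path -> content)."""
    layers: tuple = ()

    def extend(self, new_layer):
        # Comparable to a Dockerfile instruction adding a layer on top of a base image.
        return Image(self.layers + (dict(new_layer),))


@dataclass
class Container:
    """Toy container: an image plus a private writable layer."""
    image: Image
    writable: dict = field(default_factory=dict)

    def write(self, path, content):
        self.writable[path] = content

    def read(self, path):
        # Check the writable layer first, then image layers from top to bottom.
        if path in self.writable:
            return self.writable[path]
        for layer in reversed(self.image.layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)


base = Image().extend({"/etc/os-release": "base-os"})
app_image = base.extend({"/app/main.py": "print('hi')"})

# Two containers created from the same image share its layers
# but do not see each other's writes.
first = Container(app_image)
second = Container(app_image)
first.write("/app/data.txt", "only in first")
```

In this sketch, `first.read("/app/data.txt")` succeeds while the same call on `second` raises `FileNotFoundError`, mirroring the point that files saved in one container are not visible to another.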
Other important components:
- Storage volumes were mentioned earlier in this post.
- Docker's networking plays a big role in its magic of routing traffic from within a container to a host's external port and vice versa.
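Storage volumes and port publishing are both configured when a container is started. The sketch below only builds `docker run` argument lists and does not execute them; the image names `example/writer` and `example/reader`, the volume name `shared-data`, and the helper `docker_run_command` are assumptions made for illustration.

```python
import shlex


def docker_run_command(image, name, volume=None, mount_path=None, publish=None):
    """Build (but do not execute) a `docker run` argument list."""
    args = ["docker", "run", "--detach", "--name", name]
    if volume and mount_path:
        # --volume NAME:PATH mounts a named volume at PATH inside the container.
        args += ["--volume", f"{volume}:{mount_path}"]
    if publish:
        host_port, container_port = publish
        # --publish HOST:CONTAINER forwards a host port to a container port.
        args += ["--publish", f"{host_port}:{container_port}"]
    args.append(image)
    return args


# Both containers mount the same named volume, so files written under
# /data by one are visible to the other.
writer = docker_run_command("example/writer", "writer", "shared-data", "/data")
reader = docker_run_command(
    "example/reader", "reader", "shared-data", "/data", publish=(8080, 80)
)

print(shlex.join(writer))
print(shlex.join(reader))
```

The resulting lists could be passed to `subprocess.run` on a machine with Docker installed; they are printed here so the example runs without a daemon.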
For a look under the hood of Docker, see this CodeMentor blog post.
Container vs Virtual Machine
Virtual machines have been mentioned throughout this post. They are often compared with containers, and in some use cases, virtual machines are required to run a container. Atlassian has already explained the difference between a virtual machine and a container. To sum it up, virtual machines require a hypervisor to run. The hypervisor is responsible for dividing hardware resources such as memory, CPU, power, and network bandwidth among the virtual machines. Containers, on the other hand, share these resources. A container’s resources can be limited, but those limits can be adjusted at runtime (after starting). Most virtual machines do not allow their allocated resources to be changed after starting. Virtual machines offer a much deeper level of isolation, making them more secure, but also slower to start.
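As a rough sketch of adjusting a container's limits after it has started, the helpers below build a `docker run` command with initial limits and a `docker update` command that changes them. The commands are only constructed, not executed, and the function names, container name, and image name are hypothetical.

```python
def run_with_limits(image, name, memory, cpus):
    """Build a `docker run` command that starts a container with resource limits."""
    return [
        "docker", "run", "--detach", "--name", name,
        "--memory", memory,   # e.g. "512m"
        "--cpus", str(cpus),  # e.g. 1.0
        image,
    ]


def update_limits(name, memory=None, cpus=None):
    """Build a `docker update` command that changes limits on an existing container."""
    args = ["docker", "update"]
    if memory is not None:
        args += ["--memory", memory]
    if cpus is not None:
        args += ["--cpus", str(cpus)]
    args.append(name)
    return args


start = run_with_limits("example/api", "api", memory="512m", cpus=1.0)
grow = update_limits("api", memory="1g", cpus=2.0)
```

Depending on the swap limit already set on the container, the daemon may also require `--memory-swap` to be raised when increasing `--memory`.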
Nowadays, Docker Desktop installations require virtualization on every platform, as noted in the requirements section of the installation notes (even Docker Desktop for Linux). However, not all container instantiations of Docker images require virtualization. This Stackoverflow post explains the situation regarding Docker’s ability to run natively on a platform. To sum it up, Docker containers offer various levels of isolation with the docker run command. Some of these isolation levels provide machine virtualization, so it’s listed as an installation requirement for all operating systems.
| | Advantages | Disadvantages |
|---|---|---|
| Virtual Machines | Isolation: VMs provide complete hardware and software isolation, ensuring that one VM's issues won't affect others. | Overhead: VMs require a full operating system, leading to increased disk space and memory usage, as well as slower startup times. |
| | Compatibility: VMs can run any operating system, making it possible to run legacy applications on newer hardware. | Performance: VMs can have lower performance compared to containers, due to the overhead of the virtualized hardware. |
| | Resources: VMs can be assigned a specific amount of resources, such as RAM, CPU, and disk space, which are isolated from the host and other VMs. | Complexity: VMs can be more complex to manage, as they require more resources and specialized skills compared to containers. |
| Containers | Portability: Containers can be easily moved from one host to another, making it easy to deploy and scale applications. | Isolation: Containers provide process-level isolation, but the host operating system is shared, so a security breach in one container can potentially affect others. |
| | Lightweight: Containers are lightweight, as they share the host operating system and resources, leading to reduced disk space and memory usage compared to VMs. | Compatibility: Containers rely on the host operating system's kernel and can only run on compatible systems, making it difficult to run legacy applications that depend on a different operating system. |
| | Efficiency: Containers can start up faster and have better resource utilization compared to VMs, as they do not have the overhead of a full operating system. | Resources: Containers are limited by the resources of the host and, unless limits are configured, compete for them rather than receiving a dedicated allocation as VMs do. |
| | Simplicity: Containers are easier to manage, as they have a smaller footprint and can be easily orchestrated using tools like Docker and Kubernetes. | |
Relation to Kubernetes
Kubernetes is a container orchestration system that automates the deployment, scaling, and management of containerized applications. Kubernetes provides several features such as automatic scaling, self-healing, and rolling updates that can be applied to Docker containers, making it easier for developers to manage their applications in a production environment. Docker provides the packaging and runtime for containers, while Kubernetes provides the orchestration and management of those containers across a cluster of machines.
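To make the orchestration idea concrete, here is a minimal sketch that generates a Kubernetes Deployment manifest as JSON. The application name `web`, the image `example/web:1.0`, and the helper `deployment_manifest` are assumptions for illustration; the replica count and rolling-update settings are arbitrary example values.

```python
import json


def deployment_manifest(name, image, replicas=3, container_port=80):
    """Return a Deployment manifest (apps/v1) as a plain dictionary."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Kubernetes keeps this many pods running, replacing failed ones.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Replace pods gradually when the image changes.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxUnavailable": 1, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": container_port}],
                        }
                    ]
                },
            },
        },
    }


manifest = deployment_manifest("web", "example/web:1.0")
print(json.dumps(manifest, indent=2))
```

Saved to a file, output like this could be applied with `kubectl apply -f`. Automatic scaling is configured separately (for example with a HorizontalPodAutoscaler) and is not shown here.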
Use in Cloud Platforms like AWS
The portability of containers is utilized in Cloud Platform services, such as AWS’s Elastic Container Service, where containers can be launched and scaled with greater ease and performance, with features such as:
- Automatic scaling of container resources.
- Built-in load balancing.
- High availability, where failed containers are automatically replaced.
Larger Cloud Platforms like AWS also tend to offer container registries, like Elastic Container Registry. They also tend to offer managed Kubernetes services, like AWS’s Elastic Kubernetes Service.
Sam Malayek works in Vancouver for Amazon Web Services, and uses this space to fill in a few gaps. Opinions are his own.