In the pre-virtualization era, applications were typically executed directly on the OS system, often referred to as “bare metal”. You are likely already familiar with this concept as it is what you do when you install and run programs on your personal computer or laptop. There is no “virtualized” layer, everything is running in the same environment.
Bare metal systems involve installing and running the operating system, binaries/libraries, and applications directly on the physical hardware. While it provides direct hardware access and customization options, it can lead significant challenges in production systems:
Virtual machines (VMs) are an abstraction layer that enables the creation and operation of multiple isolated instances of virtual hardware within a single physical host. This is made possible through a software component called a hypervisor or “virtual machine monitor” (VMM), which partitions the host resources and allows each VM to have its own virtualized hardware configuration, including an operating system, binaries/libraries, and applications. VMs provide a way to consolidate multiple independent environments on a single physical server, offering flexibility, isolation, and the ability to run different operating systems and software stacks concurrently.
VMs offer several advantages:
A cluster of physical hardware can support the operation of thousands of virtual machines (VMs), enabling efficient resource utilization, scalability, and to recover from failure in minutes.
Containers offer a similar level of isolation for installing and configuring binaries/libraries as virtual machines, but they differ in their approach. Instead of virtualizing at the hardware layer, containers utilize native Linux features like cgroups and namespaces to provide isolation while sharing the same kernel. This lightweight approach allows for faster startup times, improved resource efficiency, and easier scalability compared to traditional virtual machines.
Containers are an abstraction at the app layer that packages code and dependencies together. Multiple containers can run on the same machine and share the OS kernel with other containers, each running as isolated processes in user space. Containers take up less space than VMs (container images are typically tens of MBs in size), can handle more applications and require fewer VMs and Operating systems.
Virtual machines (VMs) are an abstraction of physical hardware turning one server into many servers. The hypervisor allows multiple VMs to run on a single machine. Each VM includes a full copy of an operating system, the application, necessary binaries and libraries – taking up tens of GBs. VMs can also be slow to boot.
The technology of containers plays an important role in modern software development and deployment, for several reasons:
A “container image” (or shortly, image) is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings. You can transfer images from one machine to the other. Every machine is able to “run the image” without the need to install the application dependencies, define environment variables and networking settings.
A container is a single running instance of an image. You can create, start, stop, move, or delete a container, and they can be run easily and reliably from one computing environment to another. The computer that runs the container is frequently referred to as a host machine, because it “hosts” containers.
By default, a container is relatively well isolated from other containers and its host machine. You can control how isolated a container’s network, storage, or other underlying subsystems are from other containers or from the host machine.
A container, as we mentioned, is defined by its image as well as any configuration options you provide to it when you create or start it. When a container is removed, any changes to its state that are not stored in persistent storage disappear.
Under the hoods, containers are merely a linux process.
But they are unique processes that “live” in an isolated environment. By this means, the process “believes” that he is the only process in the system, he is containerized. Containers are a technology that leverages the Linux kernel’s features to provide lightweight and isolated environments for running applications.
Linux containers utilize several key components:
Docker is an open platform for developing, building and shipping images, and running containers.
Docker container technology was launched in 2013 as an open source Docker Engine.
Docker uses a client-server architecture. The Docker client talks to the Docker daemon, which does the heavy lifting of building, running, and distributing your Docker containers. The Docker client and daemon can run on the same system, or you can connect a Docker client to a remote Docker daemon. The Docker client and daemon communicate using a REST API, over UNIX sockets or a network interface. Another Docker client is Docker Compose, which lets you work with applications consisting of a set of containers.
dockerd
) listens for Docker API requests and manages Docker objects such as images, containers, networks, and volumes.docker
) is the primary way that many Docker users interact with Docker. When you use commands such as docker run, the client sends these commands to dockerd, which carries them out.Please install Docker if you haven’t done it before.
Tip: you can add your user to the docker
group, so you could use the docker
command without sudo
:
sudo usermod -aG docker $USER
Upon up and running docker
installation, the docker version
command output may look like:
Client: Docker Engine - Community
Version: 20.10.22
API version: 1.41
Go version: go1.18.9
Git commit: 3a2c30b
Built: Thu Dec 15 22:28:02 2022
OS/Arch: linux/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.22
API version: 1.41 (minimum version 1.12)
Go version: go1.18.9
Git commit: 42c8b31
Built: Thu Dec 15 22:25:51 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.14
GitCommit: 9ba4b250366a5ddde94bb7c9d1def331423aa323
runc:
Version: 1.1.4
GitCommit: v1.1.4-0-g5fd4c4d
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Note that docker is running as a service on your system, hence can be controlled by systemctl
:
$ sudo systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2023-05-07 09:56:45 IDT; 5min ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 261600 (dockerd)
Tasks: 123
Memory: 209.5M
CGroup: /system.slice/docker.service
└─261600 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
May 07 09:56:39 hostname dockerd[261600]: time="2023-05-07T09:56:39.509330916+03:00" level=warning msg="Your kernel does not support CPU realtime scheduler"
May 07 09:56:39 hostname dockerd[261600]: time="2023-05-07T09:56:39.509350949+03:00" level=warning msg="Your kernel does not support cgroup blkio weight"
May 07 09:56:39 hostname dockerd[261600]: time="2023-05-07T09:56:39.509365744+03:00" level=warning msg="Your kernel does not support cgroup blkio weight_device"
May 07 09:56:39 hostname dockerd[261600]: time="2023-05-07T09:56:39.533460844+03:00" level=info msg="Loading containers: start."
May 07 09:56:42 hostname dockerd[261600]: time="2023-05-07T09:56:42.942065068+03:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
May 07 09:56:43 hostname dockerd[261600]: time="2023-05-07T09:56:43.248511892+03:00" level=info msg="Loading containers: done."
May 07 09:56:44 hostname dockerd[261600]: time="2023-05-07T09:56:44.943477277+03:00" level=info msg="Docker daemon" commit=42c8b31 graphdriver(s)=overlay2 version=20.10.22
May 07 09:56:44 hostname dockerd[261600]: time="2023-05-07T09:56:44.972157071+03:00" level=info msg="Daemon has completed initialization"
May 07 09:56:45 hostname dockerd[261600]: time="2023-05-07T09:56:45.533037228+03:00" level=info msg="API listen on /var/run/docker.sock"
May 07 09:56:45 hostname systemd[1]: Started Docker Application Container Engine.
From docker’s service status output we can learn a few important properties of the docker client and daemon.
When the Docker client (docker
) and daemon (dockerd
) are on the same machine (usually the case),
they communicate using a UNIX socket located in /var/run/docker.socket
, typically via RESTful API endpoints.
When the client and daemon are not on the same machine, they communicate over the internet via HTTPS protocol.
What else can we learn about the docker daemon? that it does not run containers itself! Docker relies on the containerd
service to manage containers lifecycle.
Containerd is an open-source container runtime that provides a high-level interface for managing container lifecycle and execution. It is serving as the underlying runtime for various container platforms, including Docker.
Containerd, in turn, uses runc
as the default OCI-compliant runtime for actually running containers. Containerd utilizes runc to execute the container processes, manage resource isolation, and handle low-level interface container operations according to the OCI specification.
To summarize, containers are not exclusive to Docker, they are a broader technology and concept that existed before Docker’s popularity. Docker popularized and simplified the adoption of containers by providing a user-friendly interface and tooling, but there are alternative container runtimes and platforms available, such as Podman, that leverage containers for application deployment and management.
Under the hood, runc
does the dirty job of running containers:
In a moment, you’ll run you first container, this is the execution order of the different components that responsible for the container execution:
The Docker “Hello World” container is a simple and lightweight container that is often used to verify if Docker is properly installed and functioning on a system. It is based on the official Docker image called hello-world.
docker run hello-world
The hello-world image is an example of minimal containerization with Docker.
It has a single hello.c
file responsible for printing out the message you’re seeing on your terminal.
Enter the interactive self-check page
As you may know, Docker is designed in client-server architecture, where both sides are not necessarily running on the same machine. Your goal is to run the docker daemon (the server) of a different machine, and communicate with it from your local machine.
Feel free to find useful tutorials either in Docker’s official docs or any other resource. You can use some EC2 instances as the remote machine.