“I learned very early the difference between knowing the name of something and knowing something.”
― Richard P. Feynman
This blog series is built around Feynman’s quote and the distinction it draws between superficial knowledge (knowing the name of something) and deep understanding (knowing something). If you want to learn more than just the name of Docker, read on 😊
Let me paint you a picture of our day-to-day life building and deploying software:
- You start building an application using Python.
- First things first, you install Python. Bam! The installation fails with an error 😐
- You troubleshoot the error and find that a missing package caused it.
- You install the missing package and re-run the installer.
- This time you get another error, about an incorrect version of some dependency 😐
- You find out that multiple versions of this dependency are installed on your system. After spending an enormous amount of time on Google and Stack Overflow, you figure out how to get past the issue 😑
- Finally, you install Python and successfully build your application. Now you want your friend to try out the app, so you share the code and list the steps to run it on their system.
- Your friend follows exactly the steps you listed and still gets errors installing Python.
- You find out that the fixes that worked on your Mac do not work on your friend’s Windows machine 😑
- Building and deploying software is often this frustrating, and much of our time goes into resolving environment-specific issues and dependency conflicts.
What are the problems with building & sharing software this way?
- Every piece of software has dependencies: the language runtime, a database, a cache, some configuration, etc. Installing these dependencies is not straightforward, and the steps vary with the environment.
- People often leave out details when they explain how to run their app. For example, in the scenario above you ask your friend to install Python and run the application. Your friend runs it and hits an error, only to discover that the code requires Python 3 while they have been working with Python 2.
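One way to make such an unstated requirement explicit is to have the app check it at startup. The snippet below is only a minimal sketch: the function name check_python_version and the "major version 3" threshold are assumptions made for this illustration, taken from the Python 2 vs Python 3 example above.

```python
import sys

# Assumption for this sketch: the app needs Python 3, as in the example above.
REQUIRED_MAJOR = 3


def check_python_version(version_info=sys.version_info, required_major=REQUIRED_MAJOR):
    """Return an error message if the interpreter is too old, otherwise None."""
    major, minor = version_info[0], version_info[1]
    if major < required_major:
        return "This app needs Python {}+, but found Python {}.{}".format(
            required_major, major, minor
        )
    return None


problem = check_python_version()
if problem:
    # Fail early with a clear message instead of a confusing error later on.
    sys.exit(problem)
print("Python version OK")
```

A check like this only covers one dependency, which is exactly why packaging the whole environment is attractive.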
The Solution: Containerization
Containerization packages application code together with its configuration and dependencies into a single unit that runs the same way on any environment with a compatible container runtime. Its most basic purpose is to avoid software installation issues by working in a standardized environment.
What is Docker?
Docker is a platform that provides containerization. According to the official Docker documentation:
Docker is an open platform for developing, shipping and running applications.
Essentially, Docker is a platform that provides a set of tools to create and run applications as containers.
Benefits of Docker
1. Docker provides a standard environment
2. Docker provides portability as it can run on a developer’s local laptop, on physical or virtual machines in a data center, on cloud providers, or in a mixture of environments.
The Docker platform is made up of several components. Let's look at some of them in detail.
A Docker image is a read-only template that contains everything needed to run a specific program: the filesystem contents, built up in layers, plus metadata such as the default command to run. For example, there are Docker images for most popular open-source software like Nginx, MySQL, Redis, etc. Docker Hub lists many of the available images.
A Docker Container is a running instance of an image — like a running program.
Containers are lightweight and include everything needed to run the application: the code, its dependencies, the runtime environment, and an isolated view of resources such as CPU, network, and memory.
By default, a container is isolated from other containers on the system. We can create, start, stop, run, or delete a container using the Docker client.
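To make the image/container relationship concrete, here is a toy model in Python. It is not how Docker is implemented: ToyImage and ToyContainer are hypothetical names invented for this sketch, and a plain dictionary stands in for a filesystem. It only illustrates that an image is read-only while each container keeps its own changes.

```python
from types import MappingProxyType


class ToyImage:
    """A read-only template: a name plus an immutable view of its files."""

    def __init__(self, name, files):
        self.name = name
        self.files = MappingProxyType(dict(files))  # read-only view


class ToyContainer:
    """A (possibly running) instance of an image with its own writable layer."""

    def __init__(self, image):
        self.image = image
        self.writable_layer = {}  # changes made inside this container only
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def write(self, path, content):
        self.writable_layer[path] = content

    def read(self, path):
        # Look in the container's own layer first, then fall back to the image.
        if path in self.writable_layer:
            return self.writable_layer[path]
        return self.image.files[path]


image = ToyImage("toy-web", {"/index.html": "hello"})
first = ToyContainer(image)
second = ToyContainer(image)
first.start()
first.write("/index.html", "changed in first")

print(first.read("/index.html"))   # changed in first
print(second.read("/index.html"))  # hello
print(image.files["/index.html"])  # hello
```

Both containers start from the same image, yet a write in one is invisible to the other and never touches the image itself.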
Docker Daemon / Docker Server
The Docker daemon (dockerd) handles the requests sent by the Docker client. It is responsible for managing images, containers, etc.
Docker Client / Docker CLI
Users interact with the Docker daemon (dockerd) through the Docker client, the docker command-line tool.
A Docker registry stores images so that they can be shared and downloaded. Docker Hub is a public registry and the default registry for Docker.
Hello World with Docker
Let's run Docker's own Hello World to understand these components better.
Follow https://docs.docker.com/get-docker/ to install docker.
Run the command below to print a hello message:
docker run hello-world
Here is the series of steps that happens behind the scenes:
- You issue the docker run command to the Docker client, which forwards the request to the Docker daemon.
- The Docker daemon checks the image cache to see whether the hello-world image is already present on your system.
- On the first run, the hello-world image is not found in the image cache.
- The Docker daemon goes to the Docker registry to fetch the hello-world image.
- The hello-world image is stored in your system’s image cache for future use. The next time you run docker run hello-world, the daemon gets the image from the cache on the host and skips the download from the registry.
- The Docker daemon then creates a container from the image. The container runs a program that prints Hello from Docker! on your console.
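The steps above can be sketched as a small Python simulation. This is a toy model, not Docker's actual code: ToyRegistry, ToyDaemon, and toy_client are hypothetical names for this illustration, and an "image" here is just a Python function that returns the text to print.

```python
class ToyRegistry:
    """Stands in for a registry such as Docker Hub."""

    def __init__(self, images):
        self.images = dict(images)
        self.pulls = 0  # how many downloads were requested

    def pull(self, name):
        self.pulls += 1
        return self.images[name]


class ToyDaemon:
    """Stands in for dockerd: owns the image cache and creates containers."""

    def __init__(self, registry):
        self.registry = registry
        self.image_cache = {}

    def run(self, name):
        steps = []
        if name in self.image_cache:
            steps.append("cache hit")
        else:
            steps.append("cache miss")
            self.image_cache[name] = self.registry.pull(name)
            steps.append("pulled from registry")
        program = self.image_cache[name]
        steps.append("container created")
        return steps, program()  # "running the container" = calling the program


def toy_client(daemon, command):
    """Stands in for the docker CLI: parses the command and asks the daemon."""
    verb, image_name = command.split()
    if verb != "run":
        raise ValueError("this sketch only understands 'run'")
    return daemon.run(image_name)


registry = ToyRegistry({"hello-world": lambda: "Hello from Docker!"})
daemon = ToyDaemon(registry)

first_steps, first_output = toy_client(daemon, "run hello-world")
second_steps, second_output = toy_client(daemon, "run hello-world")

print(first_steps)   # ['cache miss', 'pulled from registry', 'container created']
print(second_steps)  # ['cache hit', 'container created']
print(first_output)  # Hello from Docker!
```

The second run skips the registry entirely, which is the time saving described in the cache step.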
In the same way, you can download and run most everyday software such as MySQL, Nginx, Redis, Postgres, etc.
Food for Thought 💡
A computer system from a very high level looks something like this:
- A computer system contains the following components: Hardware, Operating system, Application programs, and the users.
- The hardware devices provide basic computation resources for the system. Ex: CPU, IO Devices, memory, etc.
- The Application Program defines the ways in which we can use the hardware to solve users' problems. Ex: Web browser, Terminal, Spreadsheet, etc.
- The Operating system (kernel) controls the hardware and coordinates the use of these among different applications.
- Namespaces and Control groups are Linux kernel features. Let's see what they do.
Wikipedia defines namespace as:
Namespaces are a feature of the Linux kernel that partitions kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources.
Namespaces isolate what a process can see. A process inside a namespace gets its own view of a resource, such as process IDs, mount points (the filesystem), network interfaces, or the hostname, separate from the view that processes in other namespaces get. For example, we can namespace a process to restrict which part of the filesystem is visible to it, or which other processes it can see and communicate with.
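If you are on Linux, you can peek at the namespaces your own process belongs to. The sketch below assumes the kernel exposes them as symbolic links under /proc/self/ns, which is the usual Linux layout; the helper name list_namespaces is made up for this example, and on macOS or Windows it simply returns an empty result.

```python
import os


def list_namespaces(pid="self"):
    """Map namespace type to its identifier for a process, or {} if unavailable."""
    ns_dir = "/proc/{}/ns".format(pid)
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # not Linux, no /proc, or no such process
    for name in names:
        try:
            # Each entry is a symlink whose target identifies the namespace.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # skip entries we are not allowed to read
    return result


for ns_type, ident in list_namespaces().items():
    print(ns_type, "->", ident)
```

Two processes that show the same identifier for a given type share that namespace. A process inside a container normally shows different identifiers from a process on the host.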
Control groups (cgroups) are another Linux kernel feature. They limit the amount of resources (CPU, memory, disk I/O, etc.) available to a process or group of processes.
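On Linux you can also see which control groups a process belongs to. The sketch below assumes the usual /proc/self/cgroup file, where each line has the form hierarchy-ID:controller-list:cgroup-path. The function names are made up for this example, and on systems without that file the reader returns an empty list.

```python
def parse_cgroup_line(line):
    """Split one 'hierarchy-ID:controller-list:cgroup-path' line into its parts."""
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    return {
        "hierarchy_id": int(hierarchy_id),
        # The controller list is empty on cgroup v2 (a line like '0::/some/path').
        "controllers": [c for c in controllers.split(",") if c],
        "path": path,
    }


def read_own_cgroups(proc_file="/proc/self/cgroup"):
    """Return the parsed cgroup memberships of this process, or [] if unavailable."""
    try:
        with open(proc_file) as handle:
            return [parse_cgroup_line(line) for line in handle if line.strip()]
    except OSError:
        return []  # not Linux, or the file is not readable


for entry in read_own_cgroups():
    print(entry)
```

The cgroup path is where the kernel keeps the limits for that group. The actual limits live in files under the cgroup filesystem, which this sketch does not read.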
Namespaces and control groups are the underlying technologies behind Docker containers. Under the hood, Docker creates a set of namespaces for each container, which is what keeps containers isolated from one another. To limit the amount of resources a container can use, Docker relies on control groups.