docker
containers
Why Docker works the way it does
This is not a tutorial, or even remotely close to an article that explains how things work or how to do a particular task. It is the answer to a question that sat in a limbo state of ‘So I know how this is supposed to work and why it’d work so well…but, then, why does the evidence say the opposite?’. Let me explain.
In the sequence of lessons in Docker 101 (or, more precisely, Container 101), the first lesson is the difference between a Virtual Machine and a Container. The entire point of a container is to one-up VMs by being lightweight, quick, and developer-friendly. And a key part of that difference is that containers use the Host OS instead of shipping their own OS.
I have heard this countless times. This is how it is supposed to work. And apparently this is how it does work, or else why would companies use it and engineers vouch for it? But if the containers running on a system necessarily use the Host’s OS, then why is it that (1) we still need to define a base image with an OS, and (2) we can define a base OS different from the Host OS?
Before I understood the actual reasons, I had a vague idea that it had something to do with the Linux kernel being shared between distributions, but the idea still never sat right with me. How does it work so seamlessly? How is the difference between OS versions or distributions reflected in the container? And how do Mac/Windows handle it, when they don’t share the Linux kernel at all?
I know I said that this is not a tutorial, and this is definitely not a ‘what-is’ post, but I think it would do anyone new to Docker some good to have certain basics listed out here. Skip ahead if you’re already familiar with the basics of Docker.
- Containers solve the problem of dependency hell. Dependency hell is when a program runs on Machine A but fails to run (or run identically) on Machine B, because of a difference in dependency versions or some other configuration.
- They solve it by packaging up the application and all of its dependencies and configurations into a single box, or container (see the minimal Dockerfile sketch after this list).
- Docker virtualises the OS, i.e., only our application and its dependencies are packaged in the box, while the OS is consumed from outside the container. Docker has direct access to the Host OS’s resources. This is also why it is more lightweight than a Virtual Machine.
- Quick terms to look up: docker daemon, docker engine, docker image, docker container, docker registry, dockerfile.
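To make the ‘single box’ idea concrete, here is a minimal sketch. The image name, app.py, and the python3 dependency are all hypothetical; the point is that the Dockerfile pins a base image (the userland we will get to below) and bakes the dependencies in alongside the application.

```sh
# Hypothetical example: myapp, app.py and the python3 dependency are made up.
cat > Dockerfile <<'EOF'
# Base image: Ubuntu's userland (filesystem, libraries, package manager)
FROM ubuntu:22.04
# Dependencies live inside the image, not on the host
RUN apt-get update && apt-get install -y --no-install-recommends python3
# The application itself
COPY app.py /app/app.py
CMD ["python3", "/app/app.py"]
EOF

# Bake the app and its dependencies into an image, then run it as a container.
docker build -t myapp .
docker run --rm myapp
```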
Now, back to the question at hand. To reiterate - why do we use an OS base image when Docker has no concept of a Guest OS (and other related questions)?
Before we move forward, note that this section strictly talks about running one Linux distro on a host running another. Unless the context explicitly switches to a non-Linux-backed OS (MacOS/Windows), consider Linux as the default.
To understand this, we need to understand what makes up an Operating System. Before I knew better, I would have said the kernel is the OS. This is mostly true: the Linux kernel makes up most of the OS that we know and use today, and it takes care of the major operations like networking, managing IO, time-slicing the CPU, and so on. However, when we talk of the OS as a whole, there are other components that sit on top of the kernel: besides the kernel, the OS is also made up of packages and the userland. This also answers the question of how distros differ if they all use the same kernel. They differ because a good part of a distro’s identity comes from its userland.
Back to how this composition of kernel plus userland makes Docker possible. Docker makes virtualising the OS easy because all Linux distributions share (almost) the same kernel and differ mostly in their userland software. So if we run an Ubuntu container on a Mint host, the container gets its userland from Ubuntu while running on the kernel provided by the Host (Mint) OS. Running the Ubuntu userland on a Mint kernel gives us something that is a very close simulation of an Ubuntu system rather than a Mint system.
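A quick way to see this split for yourself (a sketch, assuming a Linux host with Docker installed; the image tag is just an example):

```sh
uname -r                                          # the host's kernel version
docker run --rm ubuntu:22.04 uname -r             # the same kernel version, reported from inside the container
docker run --rm ubuntu:22.04 cat /etc/os-release  # but the userland identifies itself as Ubuntu
```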
Okay, this helped clear up a lot of things in my head. But I still had further questions:
(1) Operating Systems are so complex! Even when sharing the same kernel, there have to be instances of incompatibility or things going wrong!
And that is completely true! If an application or program inside the container does not work with the Host OS’s kernel, then the program just won’t run. An example I found online: take a CentOS:5 or a CentOS:6 container on a Linux kernel version 4.19 or greater. That setup segfaults when running /bin/bash, because the kernel and the userland programs are not compatible. Thus, if your program or application attempts to use facilities that are not supported by the kernel, it will simply fail.
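Reproducing that example would look something like this (a sketch; I haven’t verified it myself, and the old centos:6 tag may no longer be pullable from Docker Hub):

```sh
# On a host kernel >= 4.19, the old CentOS userland reportedly segfaults:
docker run --rm centos:6 /bin/bash -c 'echo hello from an old userland'
# A newer userland on the same kernel is expected to work fine:
docker run --rm centos:7 /bin/bash -c 'echo hello from a newer userland'
```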
(2) How does this explain Linux based containers running on MacOS or Windows?
I was aware that running Linux-based OS images on Windows was a hassle, or at least appeared to be one. But even after the ‘hassle’, how did it work? I was also aware that running the same containers on MacOS apparently wasn’t as much of a hassle and things were relatively smoother. So why did it work without the hassle there?
For Windows and MacOS, there used to be an extra interim layer, called boot2docker, that took care of spinning up a Linux virtual environment. I don’t know the specifics of it, so we will roll with assuming that this layer took care of introducing some form of Linux virtualisation to these OSes, where the Linux kernel was unavailable.
Bottom line: no, Windows and Mac cannot run Linux containers directly. When you install Docker on these systems and use Linux-based containers, it generally installs a Linux VM on which the containers run. “Docker Desktop for Mac” takes care of a lot of that abstraction, so the kernel specifics are not exposed to the user, and it runs Linux containers smoothly without much hassle.
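You can see that hidden VM peeking through with something like this (a sketch for Docker Desktop on a Mac; exact version strings will vary):

```sh
uname -sr                          # host reports "Darwin <version>", not a Linux kernel
docker run --rm alpine uname -sr   # container reports "Linux <version>", the kernel of Docker Desktop's Linux VM
```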