Docker is great. It fixed the whole “it works on my machine” complaint, mostly by making the counter, “We’ll ship your machine then”, actually realistic. Reasonably, we started building docker images with a minimal file system that includes only the things you actually need. So no full-blown Ubuntu install with creature comforts like ping, curl, or vi. This is a reasonable approach, considering that a docker image is meant to run your software, not to serve as an operating system a human can use to do anything meaningful.
This approach, however, makes it harder to troubleshoot issues that come up while your application is running. You can of course run docker exec -it some_name /bin/sh, which might give you a shell in the container, but then you start encountering things like this:
$ sudo docker run -it --rm node /bin/sh
# ping
/bin/sh: 10: ping: not found
# vi
/bin/sh: 11: vi: not found
# vim
/bin/sh: 12: vim: not found
# nano
/bin/sh: 14: nano: not found
For docker images that run as the root user, this is mostly an inconvenience. Trying yum, apk, and apt in turn until you’ve figured out the underlying package manager lets you install a binary, and as long as you get rid of the container afterwards, this works well enough. But ideally, docker images run as non-root users when possible. Combined with the lack of sudo, this makes it impossible to install any packages you might need to troubleshoot a certain issue.
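The trial-and-error step for root containers can be shortened with a small probe. The sketch below is only an illustration: the function name detect_pkg_manager and the list of candidates are assumptions, not something the images themselves provide.

```shell
# Print the first package manager found on PATH, or "none".
# The candidate list is an assumption; extend it for other distributions.
detect_pkg_manager() {
  for candidate in apt-get apk dnf yum; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  echo "none"
  return 1
}
```

Pasted into a shell inside the container, running detect_pkg_manager prints the name of the tool to use. It does not help with the non-root case, where installing packages is off the table anyway.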
Enter lsns and nsenter
If we’re being slightly reductive, a docker container isn’t much more than a set of kernel namespaces and a chroot into the docker image. Docker is not virtualization; it uses kernel namespaces to isolate containers from each other. This is also why a sudo ps aux on the host will show you the nginx process running in your hypothetical nginx docker container. The nginx process is still managed by your kernel, it’s just isolated in different namespaces. If you could switch to the namespaces the nginx process is running in, you could bring all the binaries of your host system along, and poke away to your heart’s content. That’s where lsns and nsenter come in. lsns, as the name implies, allows you to list all currently active namespaces. You can then use nsenter to switch to a different namespace.
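The kernel also exposes each process's namespaces as symbolic links under /proc/<pid>/ns, and two processes share a namespace when the links resolve to the same identifier. A minimal sketch of that comparison follows. The function name same_ns and the PROC_ROOT override are assumptions made for this example (the override only exists so the function can be pointed at a test directory).

```shell
# Succeed when two PIDs share the namespace of the given type
# (net, mnt, uts, ipc, pid, ...). Returns 2 if a link cannot be read.
# PROC_ROOT is a hypothetical override for this sketch; it defaults to /proc.
same_ns() {
  pid_a=$1
  pid_b=$2
  ns_type=$3
  proc_root=${PROC_ROOT:-/proc}
  # Each link resolves to something like "net:[4026531992]".
  ns_a=$(readlink "$proc_root/$pid_a/ns/$ns_type") || return 2
  ns_b=$(readlink "$proc_root/$pid_b/ns/$ns_type") || return 2
  [ "$ns_a" = "$ns_b" ]
}
```

Run as root (reading another user's namespace links generally needs privileges), same_ns 1 <container_pid> net should fail for a container that has its own network namespace, while same_ns 1 <container_pid> user should succeed when no user namespace remapping is in play.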
Assume that you’re running nginx as follows (note the lack of forwarded ports):
sudo docker run -it --rm nginx:alpine
If we then run lsns:
NS TYPE NPROCS PID USER COMMAND
4026531835 cgroup 48 1 root /init
4026531837 user 48 1 root /init
4026531992 net 39 1 root /init
4026532189 mnt 39 1 root /init
4026532190 uts 39 1 root /init
4026532191 ipc 39 1 root /init
4026532192 pid 39 1 root /init
4026532211 mnt 9 4433 root nginx: master process nginx -g daemon off;
4026532212 uts 9 4433 root nginx: master process nginx -g daemon off;
4026532213 ipc 9 4433 root nginx: master process nginx -g daemon off;
4026532214 pid 9 4433 root nginx: master process nginx -g daemon off;
4026532216 net 9 4433 root nginx: master process nginx -g daemon off;
Here we see two sets of namespaces. The first set belongs to PID 1 and is where the host system is running. The second set belongs to PID 4433, and these are the namespaces of the nginx container we started earlier. Also note that there are multiple types of namespaces; we can see that nginx has its own net namespace, for example.
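Picking the PID out of the listing by eye works fine for one container. With more of them running, a small filter can help. The sketch below assumes the six-column layout shown above (NS, TYPE, NPROCS, PID, USER, COMMAND); the function name ns_pid is made up for this example.

```shell
# Read lsns-formatted text on stdin and print the PID of the first
# namespace of the given type whose COMMAND column matches a pattern.
ns_pid() {
  ns_type=$1
  pattern=$2
  awk -v want="$ns_type" -v pat="$pattern" '
    $2 == want {
      # COMMAND may contain spaces, so rebuild it from field 6 onward.
      cmd = $6
      for (i = 7; i <= NF; i++) cmd = cmd " " $i
      if (cmd ~ pat) { print $4; exit }
    }'
}
```

Something like sudo lsns | ns_pid net nginx would then print the PID to hand to nsenter. The pattern is an awk regular expression, so a loose pattern may match more than intended; only the first match is printed.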
With this info, we can use nsenter as an alternative to docker exec:
sudo nsenter -t 4433 -m -u -i -n -p
root@4f3c58da395d:/#
root@4f3c58da395d:/# nginx -v
nginx version: nginx/1.23.1
Using the flags above, we switched to the nginx namespace for every namespace type listed. This is essentially a docker exec, and doesn’t bring us much closer to the goal of bringing our own system into the container. We passed -m to nsenter, which also changes our mounts to the mounts of the docker container. If we omit the -m flag, something interesting happens:
sudo nsenter -t 4433 -u -i -n -p
[root@4f3c58da395d tmp]#
[root@4f3c58da395d tmp]# nginx -v
-bash: nginx: command not found
A few things are of note here. We can see that we are in fact in the container namespace based on the hostname (@4f3c58da395d). But we’re also in whatever folder we ran sudo nsenter from. Additionally, this time around, nginx isn’t a known command. This is because we’re still using our host filesystem as the root filesystem, NOT the container’s root filesystem. Everything else is the same though, so if we curl localhost, nginx still answers. We didn’t bind any ports from the container to the host, so we shouldn’t be able to reach nginx from the host, yet:
[root@4f3c58da395d tmp]# curl localhost
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
....
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
So now, anything we have installed on our host system is available in our container, and we can talk to the processes, ping, traceroute, etc. from within the container’s context. One ability we’re still lacking is inspecting files in the container itself. This we can do by figuring out where the container’s rootfs is mounted from:
[root@4f3c58da395d tmp]# df -hT
Filesystem Type Size Used Avail Use% Mounted on
....
overlay overlay 251G 22G 217G 10% /var/lib/docker/overlay2/ec60940bc9acc98384d68fa0961f55998d1504178fb3df9af1b3dbbe0aba726d/merged
In this case, we have one overlay visible. This is the overlay that is used by the docker container as a rootfs, and we can simply cd into that:
[root@4f3c58da395d tmp]# cd /var/lib/docker/overlay2/ec60940bc9acc98384d68fa0961f55998d1504178fb3df9af1b3dbbe0aba726d/merged/etc/nginx/conf.d/
[root@4f3c58da395d tmp]# ls
default.conf
Now you can edit the nginx conf for this container, without ever installing vi/nano or any other text editor.
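Finding the overlay line in the df -hT output can also be scripted. This is a rough sketch under two assumptions: the Type column is the second field, as in the output above, and the mount point contains no spaces. The function name overlay_mounts is hypothetical.

```shell
# Read `df -hT` output on stdin and print the mount point of every
# filesystem whose Type column is "overlay".
overlay_mounts() {
  awk '$2 == "overlay" { print $NF }'
}
```

Piping df -hT into overlay_mounts prints one path per overlay. With several containers running there will be several lines, so you still need to work out which one belongs to the container you are debugging.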
Putting it all together, we can run ping from the node container we started with:
❯ sudo lsns
NS TYPE NPROCS PID USER COMMAND
...
4026532324 mnt 1 5357 root node
4026532325 uts 1 5357 root node
4026532326 ipc 1 5357 root node
4026532327 pid 1 5357 root node
4026532329 net 1 5357 root node
❯ sudo nsenter -t 5357 -u -i -n -p
[root@2e9acb5cd07a tmp]#
[root@2e9acb5cd07a tmp]# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=112 time=53.6 ms
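If you do this often, the nsenter call without -m can be wrapped in a helper. The sketch below is one possible shape, not a definitive tool: the function name host_tool_in_container and the NSENTER_DRY_RUN switch are both invented for this example.

```shell
# Run a host command inside a container's uts, ipc, net and pid namespaces
# while staying in the host's mount namespace (no -m), so host binaries
# such as ping or curl remain available.
# NSENTER_DRY_RUN is a hypothetical switch for this sketch: when set to 1,
# the command line is printed instead of executed.
host_tool_in_container() {
  target_pid=$1
  case "$target_pid" in
    ''|*[!0-9]*)
      echo "usage: host_tool_in_container PID command [args...]" >&2
      return 2
      ;;
  esac
  shift
  if [ "${NSENTER_DRY_RUN:-0}" = "1" ]; then
    echo "sudo nsenter -t $target_pid -u -i -n -p -- $*"
  else
    sudo nsenter -t "$target_pid" -u -i -n -p -- "$@"
  fi
}
```

With the PID from the lsns output above, host_tool_in_container 5357 ping -c 1 8.8.8.8 sends a single ping from the container's network namespace. Setting NSENTER_DRY_RUN=1 first shows the exact command without running it.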