Single File Containers

It has been a little while since I’ve done anything interesting with Kubernetes. Curious about what information a Pod has available to it, I decided to take a stab at inspecting a Pod’s contents, especially what a Pod knows about itself and the cluster.

Most container images have a lot of files and may declare, or set, their own environment variables. This would be too much to sift through just to find out the difference between what a container image’s author created and what was added by Kubernetes at runtime.

While there is documentation on what information is available from within a Kubernetes Pod, there is also a lot of potential complexity and the answer may not always be straightforward.

Essentially, I wanted an easy, minimal container that simply runs an executable file or script. My goal was to log the filesystem tree and environment variables. To make things a bit more complicated, I wanted to do this from my macOS laptop.

At first, I decided to look at Buildah and the scratch image. Buildah is an interesting tool I experimented with while working at Red Hat on Quay. Unfortunately, it (understandably) does not run on anything besides Linux without some workarounds.

My next approach was the well-known Packer application. While I have never used Packer, I’ve heard many great things about it, and it’s one of those tools that has seemed to follow me through my career. Packer appears to run on macOS, but it does not support the scratch image I need to produce a single-file container. That’s okay: the docs looked great and I’ll find some other purpose for it down the road.

After realizing that I just needed the scratch “image”, I was curious whether I could simply define a Dockerfile that uses it. Not only is it possible, it is a documented Docker feature! The answer was right in front of my face. So off I went to see if this would even work.

All containers, which are the meat of a Pod, keep one or more processes contained; hence the name. When you define a container image, you need to specify some sort of entry point or otherwise tell your runtime (e.g. Docker) what to execute. Typically, I could just throw together a quick shell script and set that as my entrypoint; it would make the subsequent calls to list the environment variables. I would also be able to copy or install the tree utility to print the contents of the filesystem as seen from the container’s perspective.

One important thing to note about Linux containers is that a process has no access to anything outside of the container unless it is explicitly provided (e.g. mounted volumes and environment variables); this includes dynamically linked libraries. In short, if you want to echo or printf() within a container, you typically have to bundle a lot of userland. This is why most container images are large and have a lot of files.

There is a fairly popular language called Go which has a great feature: it can easily produce statically linked executables, so no additional files are required to run them. It also provides its own standard library which is packed full of useful features including logging, reading environment variables and working with the filesystem. Go can also compile cross-platform executables, which would be pretty useful here since I am compiling Linux software on my Mac. Quick side note: Docker on a Mac really just runs Linux inside a virtual machine in the background.

Out of habit, I decided to skip Go’s cross-platform compilation and instead take advantage of a Docker feature I really appreciate called multi-stage builds. For the uninitiated, it’s a great way to use one image for building and another image to run or distribute an application. In this case, I am going to use the community golang image (the name is a combination of “go” and “language”) to compile my program, throw away everything besides the executable, and then run it within an otherwise empty container.

If this is confusing, don’t worry. I am running through this information quickly and touching on topics that I’ve spent years learning. Perhaps one day it could make for a good presentation or talk. Regardless, I believe the following two snippets will make a lot of sense if you play around with them.

Dockerfile

# This is a multi-stage build.
#
# Useful Docs:
# - https://docs.docker.com/build/building/multi-stage/
# - https://docs.docker.com/build/building/base-images/#create-a-simple-parent-image-using-scratch 
#
# The first stage, which uses the "golang" image,
# simply compiles the application and stores it in /src/kurtismullins/.
#
FROM golang
WORKDIR /src/kurtismullins/
COPY myapp.go .
RUN go build -o myapp myapp.go

#
# The second stage, based on "scratch",
# copies the executable from the previous container
# and declares that it should be run by default.
#
FROM scratch
COPY --from=0 /src/kurtismullins/myapp /usr/local/bin/myapp
CMD ["/usr/local/bin/myapp"]

myapp.go

package main

import (
	"fmt"
	"os"
)

func main() {
	// Print all Environment Variables
	fmt.Println(os.Environ())
}

With those two files in the same directory, I used Docker to build and run the container. The test was a success!

$ docker build . -t myapp
$ docker run myapp
[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=af31fb5e860c HOME=/]
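The output above is a single Go slice printed on one line, which gets hard to read once Kubernetes adds its own variables. As a rough sketch of one way to tidy it up (not what I ran above), the variant below splits each entry into a key and a value and prints them sorted, one per line. The helper name formatEnv is made up for this example, and strings.Cut assumes Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// formatEnv is a hypothetical helper. It takes entries in "KEY=value" form,
// as returned by os.Environ, and returns one `KEY = "value"` line per entry,
// sorted alphabetically. Entries without an "=" are skipped.
func formatEnv(env []string) []string {
	lines := make([]string, 0, len(env))
	for _, entry := range env {
		key, value, found := strings.Cut(entry, "=")
		if !found {
			continue
		}
		// %q quotes the value so empty values and stray whitespace stay visible.
		lines = append(lines, fmt.Sprintf("%s = %q", key, value))
	}
	sort.Strings(lines)
	return lines
}

func main() {
	for _, line := range formatEnv(os.Environ()) {
		fmt.Println(line)
	}
}
```

Sorted output also makes it easier to diff two runs, for example one under plain Docker and one inside a Pod.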

My next steps would be to add logging of the filesystem and run this within Kubernetes. I’ll save that for another day.
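For the filesystem part, a minimal sketch might look like the following. I have not run this inside Kubernetes yet, so treat it as a starting point under a few assumptions: the helper name listTree is hypothetical, the walk starts at / unless a path is passed as the first argument, and unreadable entries are noted in the output rather than stopping the walk.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// listTree is a hypothetical helper that walks the tree under root and
// returns one line per path. Entries that cannot be read are recorded with
// their error and the walk continues, so a directory whose contents are
// unreadable appears twice: once as a path and once with the error.
func listTree(root string) ([]string, error) {
	var lines []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			lines = append(lines, fmt.Sprintf("%s (unreadable: %v)", path, walkErr))
			return nil // keep walking the rest of the tree
		}
		lines = append(lines, path)
		return nil
	})
	return lines, err
}

func main() {
	root := "/"
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	lines, err := listTree(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, "walk failed:", err)
		os.Exit(1)
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

Even in a FROM scratch container the runtime mounts things like /proc, /sys and /dev, so walking from / can produce a lot of output; skipping those directories may be worth adding.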

Food for thought: This general approach may be useful when debugging or learning more about other environments where containers are first-class components such as AWS Lambda, Google’s Cloud Run, or even the latest generation of hosting services such as fly.io. Those platforms have their own requirements which may, or may not, be compatible with a minimal FROM scratch image.

Happy Hacking!