title: Containers: has the pendulum swung too far?
source: https://itnext.io/containers-has-the-pendulum-swung-too-far-208ad02a6b42
author: Niels Cautaerts
published: 2024-08-15
created: 2024-10-29
description: Once upon a time, not so long ago, the techno-sphere was divided into two hemispheres: development (Dev) and operations (Ops). The job of the people in Dev was to hack together an application. When…
tags: clippings

Containerization has revolutionized the software industry, but using containers blindly for everything without considering their drawbacks or alternatives leads to poor outcomes.

[Niels Cautaerts](https://medium.com/@cautaerts?source=post_page---byline--208ad02a6b42--------------------------------)

[ITNEXT](https://itnext.io/?source=post_page---byline--208ad02a6b42--------------------------------)

Why we use containers

Once upon a time, not so long ago, the techno-sphere was divided into two hemispheres: development (Dev) and operations (Ops). The job of the people in Dev was to hack together an application. When they were done, they threw it over the wall to Ops who had to figure out how to deploy it and keep it running in production. This way of working was a nightmare on many levels, which is well described in the fiction book The Phoenix Project by Kim, Behr and Spafford. The technical reasons boil down to:

  • Inconsistent environments: Dev runs and tests their application in some environment which does not match the production environment where Ops deploys it. Breakage happens. No one knows how to fix it. Dev deflects with the iconic “it works on my machine”.
  • Deployment complexity: Deploying a complex application may require a lot of steps. These are unique for each application and difficult to automate.

The result is that a deployment is fragile and complex, making a new release to production risky. Combine this with all sorts of process and organizational issues, and you get an unacceptably long release cycle.

Containerization is arguably the key technological breakthrough that bridged the Dev and Ops divide, thereby laying the foundations for the DevOps field. Containers solve the problems above and more by promising the following:

  • Build once, run anywhere: a container image bundles an application and all of its dependencies. The only dependency required on a host system is a container runtime. The container image is static, and the container process is isolated from the host OS. This means that whether Dev spins up the container on their laptop or Ops runs it in production, the behavior should be identical. With containers, “it works on my machine” should mean “it works on any machine”.
  • Standardization of deployment process: the container build process is encoded in a definition file (e.g. Dockerfile), which is custom for each application. All deployments involve building a container image, pushing it to a registry, pulling it in the production environment, and spinning up a container from it.
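The four steps in the last bullet can be written down as a short sketch. It only assembles the `docker` CLI invocations as argument lists and executes nothing; the registry host, image name and version are hypothetical placeholders, and a real pipeline would add authentication, tagging policy and error handling.

```python
from typing import List


def deployment_commands(registry: str, name: str, version: str) -> List[List[str]]:
    """Return the standard build/push/pull/run steps as docker CLI argument lists.

    Nothing is executed here; the caller decides where each step runs.
    """
    image = f"{registry}/{name}:{version}"
    return [
        ["docker", "build", "-t", image, "."],  # Dev or CI: build from the Dockerfile in the current directory
        ["docker", "push", image],              # Dev or CI: publish the image to the registry
        ["docker", "pull", image],              # production host: fetch the image
        ["docker", "run", "--detach", image],   # production host: start a container from it
    ]


if __name__ == "__main__":
    # Hypothetical registry and application name, for illustration only.
    for step in deployment_commands("registry.example.com", "myapp", "1.0.0"):
        print(" ".join(step))
```

The point of the sketch is that only the image reference changes between applications; the sequence of steps stays the same.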

Containers substantially de-risk, simplify and standardize the deployment process of most applications. The hard Ops work in deploying an application is reduced to writing a correct definition file, which is where Ops meets Dev. All this enables much faster iteration and shorter release cycles.

The attractive features of containers have led to their widespread adoption for all types of use cases, so much so that the container has become the de facto standard for any “runnable unit”. But like any technology, containers also come with drawbacks, which make them less fit for some purposes where people still stubbornly use them. In the rest of this article, I will highlight some of these issues in no particular order, and discuss some alternatives.

Challenge 1: Bloat

With popular mantras like “storage is cheap” and “we can always scale vertically in the cloud”, it seems that very few in the software industry care about efficient utilization of resources. With this attitude, naive container use can quickly lead to a spiral in disk and bandwidth consumption.

The core concept behind a container is that it packages ALL dependencies with the application into a single artifact. The only thing that is shared between the host and the container is the (Linux) kernel, which is a key distinction that separates it from a virtual machine. That means container images tend to take up a disproportionate amount of disk space, as well as bandwidth to push/pull them from A to B. This is further exacerbated by the philosophy that each application should live in its own container, and thus have its own image.

For a sizeable and complex application, shipping an artifact that is larger than strictly necessary may be a worthwhile trade for ease of deployment and consistency guarantees.

However, is building, storing and shipping a 12 GB container image, containing a mini-OS, a Python interpreter, and a bunch of packages, really the best way to run a simple Python script in production? I do ponder this question when I need to push this type of image from my asymmetric home internet connection that is capped at 10 Mbps upload. This situation is common, and feels a lot like sending a cargo container with a single banana to China.
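To put a rough number on that, here is a back-of-the-envelope calculation. It assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second), a link that sustains its full rated speed, and no layer reuse or compression, so it is an idealized estimate rather than a measurement.

```python
def upload_seconds(size_gb: float, uplink_mbps: float) -> float:
    """Idealized transfer time: image size in bits divided by link speed in bits/s."""
    bits = size_gb * 1e9 * 8
    return bits / (uplink_mbps * 1e6)


seconds = upload_seconds(size_gb=12, uplink_mbps=10)
print(f"{seconds / 60:.0f} minutes")  # 160 minutes
```

Under those assumptions, a single full push of the image occupies the uplink for well over two hours.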

There are a few strategies to deal with the pains induced by container bloat: bloating infrastructure to match, image optimization, and effectively using cached layers.

Infrastructure bloating is basically ignoring the problem and instead scaling up the supporting infrastructure to whatever containers demand. A start is getting the most specced-out MacBook. Even better is building your containers in the cloud where you have a more suitable network connection to push and pull images, preferably as part of a CI/CD flow. The optimal configuration seems to be using the specced-out MacBook to run a browser in order to monitor CI/CD runs.

Image optimization, i.e. stripping down a container image to the bare minimum, is an art that is mastered by few developers. Usually this is done by those who provide popular base images. The effort tends to go to waste once the dev dumps in their app and its dependencies.

Cached layers are a feature of some containerization technologies, such as Docker. Instructions in a Dockerfile are run from top to bottom, and each instruction produces a “layer” that is cached. Layers can be pushed and pulled independently. When an instruction in the Dockerfile is changed, the build can start from the cached layer from the instruction before. Layers are cumulative, so a change early in the definition file requires a rebuild of all subsequent layers. By smartly organizing the instructions in the Dockerfile, devs can minimize container build and push times. You can do even better with multi-stage builds.

Still, these strategies are no panacea. A key problem is that containers are popular for deploying applications written in languages that severely exacerbate container bloat: Python and JavaScript/TypeScript. The container build definitions usually adhere to the following recipe:

  1. Start from an optimized base image containing the interpreter
  2. Install system packages
  3. Install app dependencies
  4. Copy in app source code and install the app
  5. Define a default entry point

The problem is situated mostly in step 3. Modern Python and JS apps have a lot of fat dependencies that all need to be stored in the container. Even if the app only needs a tiny subset of the functionality provided by some library, you need to ship the entire library with your app. Thus, layer 3 tends to be beefy.

This is a problem, because adding new dependencies happens quite frequently. If you build an environment containing your dependencies on your dev machine, incrementally updating your environment is no big deal. However, in container land there is no built-in intra-layer caching mechanism. This means you have to rebuild the container from layer 3 onward each time you add a new dependency. That means downloading all other dependencies again, recreating your entire environment, and pushing all this data again on deploy. If we revisit our cargo container analogy, it's like scrapping our entire container and manufacturing a new one when we want to put a sticker on the banana that we're sending to China.
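The invalidation behavior can be illustrated with a toy model of the five-step recipe. This is a simplification and not Docker's actual cache-key computation: each layer ID here is just a hash of the instruction text chained with the parent layer's ID, and the image tag, package names and versions are illustrative.

```python
import hashlib
from typing import List


def layer_ids(instructions: List[str]) -> List[str]:
    """Each layer ID depends on its own instruction and on the parent layer's ID."""
    ids, parent = [], ""
    for instruction in instructions:
        parent = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        ids.append(parent)
    return ids


def layers_to_rebuild(old: List[str], new: List[str]) -> List[int]:
    """Return the 1-based step numbers whose cached layer can no longer be reused."""
    old_ids, new_ids = layer_ids(old), layer_ids(new)
    return [
        i + 1
        for i, layer in enumerate(new_ids)
        if i >= len(old_ids) or layer != old_ids[i]
    ]


before = [
    "FROM python:3.12-slim",                  # step 1: base image with the interpreter
    "RUN apt-get install -y libpq5",          # step 2: system packages
    "RUN pip install requests==2.32.3",       # step 3: app dependencies
    "COPY . /app",                            # step 4: app source code
    'ENTRYPOINT ["python", "/app/main.py"]',  # step 5: default entry point
]
after = list(before)
after[2] = "RUN pip install requests==2.32.3 numpy==2.1.0"  # one dependency added

print(layers_to_rebuild(before, after))  # [3, 4, 5]
```

Adding a single package to step 3 changes that layer's ID, and because every later ID is chained to it, steps 4 and 5 are rebuilt as well, even though their instructions did not change.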

So far, the industry has mostly responded to container bloat with system bloat and tolerating slightly annoyed devs. However, in the current age of “AI”, the industry will be forced to deal with smarter ways to manage basic applications built on top of CUDA, PyTorch and transformer models, where container images easily exceed 10 GB.

Challenge 2: Reproducible at run time, not at build time

Once a container image exists, it is an immutable artifact that will always produce identical behavior when run as a container. However, it is not necessarily true that rebuilding an image from a definition file produces an identical image.

Non-reproducible builds can be problematic for many reasons. For example, suppose you work on an application that needs to be deployed as a container. Your build works just fine. A new colleague joins the project and their build breaks production. You can spend many hours debugging their code before realizing the problem was actually a non-reproducible build step, which you couldn't reproduce because you had a working cached layer.

We encountered this situation in the past, where a build step in the Dockerfile installed setuptools with an unpinned version. A new collaborator built and deployed the app and everything broke. It took quite some time to figure out the build had installed a borked version of setuptools.

Container images are also not stored forever (for reasons described in the bloat section). Often it is assumed there is a one-to-one relationship between a container image and the definition file in a git commit. But that is only true if the build process is reproducible, which is rarely the case.

While pinning of app dependencies with lock files is common and mitigates the worst issues, many definition files still contain apt-get update && apt-get install ... as part of their build steps; this is not reproducible. Base image versions are also not always precisely pinned. It is often assumed system packages and base images are stable enough to avoid breakage. But why take the chance?
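A crude check for the two problems named here could look like the sketch below. It is a heuristic under stated assumptions, not a real Dockerfile parser: it treats a base image as floating when it has no tag or uses `:latest`, treats an `apt-get install` as unpinned when any package lacks `=version`, and ignores backslash line continuations and registry hosts with a port. The sample Dockerfile and the `libpq5` version string are made up for illustration.

```python
import re
from typing import List


def unpinned_lines(dockerfile: str) -> List[str]:
    """Return Dockerfile lines that use a floating base image or unpinned apt packages."""
    findings = []
    for line in dockerfile.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("FROM "):
            # First token after FROM that is not a flag such as --platform=...
            image = next((t for t in stripped.split()[1:] if not t.startswith("--")), "")
            pinned_by_digest = "@sha256:" in image
            floating_tag = ":" not in image or image.endswith(":latest")
            if not pinned_by_digest and floating_tag:
                findings.append(stripped)
        elif "apt-get install" in stripped:
            match = re.search(r"apt-get install\s+([^&;]*)", stripped)
            if match:
                packages = [
                    t for t in match.group(1).split()
                    if not t.startswith("-") and t != "\\"
                ]
                if any("=" not in p for p in packages):
                    findings.append(stripped)
    return findings


sample = """\
FROM python:latest
RUN apt-get update && apt-get install -y curl
RUN apt-get install -y libpq5=15.8-0+deb12u1
"""

for finding in unpinned_lines(sample):
    print("not reproducible:", finding)
```

With this sample, the first two lines are reported and the third, which pins an explicit package version, is not.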

Challenge 3: Awkward Dev experience

Devs like to minimize friction in their development process. But if the deployment target is a container, developing for it always introduces what I can best describe as “awkwardness”.

The ideal scenario is developing and testing the code directly in the container that will be deployed. However, a container image is immutable, and any change you make to files in a running container is not persisted. For de-bloating purposes, it may also not be wise to install all your favorite dev tools into every container.

A common pattern is creating a container where you employ a volume mount to mount your project directory (and potentially your virtual environment) on your host system into your container. With this approach, you can simultaneously develop your app on your host system using your favorite dev tools, whilst being able to see the changes reflected in a running container without rebuilding.

Unfortunately this approach also requires you to maintain multiple definition files: at least one for the container you use for development, and one for the real container that will be deployed. There's a good chance these will drift over time. Non-reproducible builds can become an aggravating factor. Sometimes a rebuild of both containers will be necessary, for example when you need a new system level package. This can be annoying.

As a consequence, many devs simply develop in a local environment created on their machine, and then package up what is needed in a container. This way of working sustains a gap between production and the environment in which the code is developed, which can result in issues down the line. For example, the container may be missing a necessary system package that was installed on the development host. This issue will likely not be detected until deployment in a test environment, because unit tests are typically run outside the container on the dev machine.

On dev containers

A relatively recent development is the concept of a “dev container”, whereby a complete development environment, including IDE and dev tools, is created inside a container. They may also be referred to as “cloud IDEs”.

The proposed benefits are that all devs work in identical environments, that this environment is not polluted by the host configuration, and that development can be performed through the browser on a remote machine. Potentially the biggest advantage of dev containers is that they can be run directly in the test or even production environment, which allows devs to test their code against external dependencies that are not accessible from their machine e.g. a database.

At the same time this paradigm introduces significant systemic bloat: effectively we create a separate development host for each project!

Additionally, when employing dev containers, you face a dilemma: should the dev container persist data or not? If, for example, you persist the virtual environment and project directory in a volume, the environments inside the dev containers of different developers will drift over time. However, if you do not persist any data the developer experience is negatively impacted: the dev container will have to be rebuilt on every change to the environment, and code changes can only be persisted by committing and pushing.

Finally, dev containers do not entirely bridge the gap between the development environment and production, since the prod container will have to be built from inside the dev container. Depending on who you ask, Docker-in-Docker represents the greatest thing since sliced bread or the worst thing since the Scaled Agile Framework (SAFe).

Challenge 4: Limits to portability & consistency

Containers are often compared to cargo containers, but the comparison is not entirely accurate.

The cargo container is at the core of international logistics because they are standardized: they have identical dimensions, ways to open them, ways to stack them, and ways to connect them to means of transportation. All of logistics, from freight trains, to harbor infrastructure, to ships, is tailored to this standard. And so, you can ship anything across the globe as long as you can fit it into a shipping container.

Analogously, the software container allows you to deploy any software to anywhere a container runtime is installed, and global cloud infrastructure has somehow converged on OCI standards and adjusted accordingly.

However, the claim that containers run everywhere is not entirely true. Yes, you can run the same container on Linux and macOS, but that is only because the container runtime on macOS has a built-in Linux VM. Most containers are built to run on the Linux kernel, and in the past most machines had an x86/amd64 CPU architecture, so containers could pretend to be portable to any machine. With the increased popularity of the ARM64 architecture, containers have become a lot less portable. This problem is “solved” using additional levels of virtualization and/or cross-platform image building. However, these solutions can become very tricky, since all compiled artifacts inside the container also need to be compatible with the target CPU architecture.
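A minimal sketch of the architecture half of this problem: whether an image built for one CPU architecture can run on a host without emulation. The alias table is an assumption covering only the two architectures discussed here, and the sketch deliberately ignores the operating system dimension (the Linux kernel requirement).

```python
import platform

# Common spellings of the same architecture, as reported by different tools.
_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(machine: str) -> str:
    """Map an architecture name to a canonical spelling; unknown names pass through."""
    name = machine.lower()
    return _ALIASES.get(name, name)


def runs_natively(image_arch: str, host_machine: str) -> bool:
    """True when the image's CPU architecture matches the host's."""
    return normalize_arch(image_arch) == normalize_arch(host_machine)


if __name__ == "__main__":
    host = platform.machine()
    for image_arch in ("amd64", "arm64"):
        verdict = "native" if runs_natively(image_arch, host) else "needs emulation or a rebuild"
        print(f"{image_arch} image on {host} host: {verdict}")
```

On an ARM64 laptop, an image built only for amd64 falls into the second category, which is where the extra virtualization or cross-platform builds come in.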

Additionally, a critical difference between a cargo container and a software container is that the former is a closed system, whereas the latter is not.

While a container runtime is the only dependency required to start the container, it is far from the only thing necessary to run the application successfully. To name a few, an application may depend on:

  • other applications, e.g. a database
  • external state, e.g. files on the file system or in remote storage
  • network access

Containers are ideal for stateless applications. Unfortunately, most practical applications rely on some form of external state or means of storing state. That means you need separate systems to manage state, and this implies practical limitations to the portability and consistency of a container. For example, an application may need access to a database, which is not accessible from a developer's machine. There are tricks to mock external services, like using a local PostgreSQL container populated with dummy data, but again this is not the same as running the container in production.

Many applications get complex enough that they consist of several communicating containers. In fact, with the hype around micro-services, this often happens by design. Thus, very quickly you need additional tooling to organize deployments. On the complexity spectrum, these range from docker-compose to Kubernetes. Thus containers invite a new kind of systemic complexity to sustain them; this is discussed in the next section.

Of course, limited portability and consistency are usually a lot worse without containers. The point is that containers can only deal with a subset of the issues that differentiate environments; you cannot put the entire world in your container.

Challenge 5: Complexity shifts from application deployments to systems and platforms

This one may be less relevant to developers, but it is certainly relevant to operations people and organizations as a whole.

The Ops work of yore, involving juggling complex app deployments, scheduling releases, and tracking server configurations, is mostly a thing of the past thanks to containers and infrastructure as code (IaC). However, it has been replaced with immense system and platform complexity to support containerized workloads. Someone has to manage this complexity.

The industry standard for container orchestration and deployment is Kubernetes, a massive open source project initially developed at Google. At this point, the complexity of Kubernetes is a meme. The Kubernetes iceberg goes deep, and once you have a decent grasp on the basic concepts there is still the insurmountably large ecosystem of tooling built around and on top of this technology. To deploy relatively simple containerized applications, the supporting infrastructure has become a Frankenstein monster.

This opens up lucrative business opportunities. Few organizations have the personnel and resources to manage a sprawling “cloud native” platform. Thus vendors step in to convince you that you need this type of platform, but that they will take over the heavy lifting of integrating and maintaining components. Thus, paying for their “managed” services is a necessity, and certainly a bargain compared to the DIY approach.

Cloud providers are the biggest winners, offering their own flavors of Kubernetes and proprietary “serverless” services. Additionally, countless startups have sprung up to offer services of integrated open source components that hide complexity behind a simpler interface. In theory most of the technology is open source, and you can migrate to a competing vendor or embark on the DIY approach. In practice, there's always just enough proprietary magic woven through the products to make a lift and shift impractical. The service might be cheap at first, but since the vendor is controlling the ship's wheel of the Kubernetes vessel, they will eventually drive up the price to the highest point you will tolerate.

It is almost undeniable that in most cases Cloud and vendor managed products offer better value for money than building and maintaining your own cloud native platform. The question is: does every organization really need this type of platform? Or is this demand induced by technology choices we don't dare to question anymore, like containers?

… and seeps back into deployments

In parallel to platform sprawl, container deployments themselves have become mind-bogglingly complex. While the container is presented as a standard runnable unit, whatever application is inside cannot be abstracted away. Most applications still need to be configured, which is done by injecting environment variables (or by using ConfigMaps in Kubernetes). This configuration is application specific.
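From the application's side, configuration by environment variables might look like the following sketch. The variable names (`DATABASE_URL`, `LOG_LEVEL`, `WORKERS`) and their defaults are hypothetical; the point is that every application defines its own set, which whoever deploys the container has to learn.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppConfig:
    database_url: str
    log_level: str
    workers: int


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Build the app's configuration from injected environment variables."""
    if "DATABASE_URL" not in env:
        # Required setting: fail at startup rather than at the first query.
        raise KeyError("DATABASE_URL must be set")
    return AppConfig(
        database_url=env["DATABASE_URL"],
        log_level=env.get("LOG_LEVEL", "INFO"),  # optional, with a default
        workers=int(env.get("WORKERS", "2")),    # env values are always strings
    )


# Simulate what a container runtime or orchestrator would inject.
config = load_config({"DATABASE_URL": "postgresql://db.internal/app", "WORKERS": "4"})
print(config)
```

Passing a plain mapping instead of the real environment keeps the sketch testable without touching the process environment.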

These days, complex multi-container app deployments are defined in Helm charts, which combine YAML config with Go templating, because writing out all the raw Kubernetes YAML manifests would be too cumbersome. The format of the configuration YAML is specific to each Helm chart and can easily span thousands of lines. Hence, customizing a deployment requires quite a bit of study; a process that must be repeated for each application you deploy.

At some point one might wonder whether containers are still achieving the aims of simple, standardized and reproducible deployments, and whether the platforms that host them simplify operations. It is ironic when technologies that aim to solve old woes come full circle.

Alternative 1: Statically linked binaries

Those who are high on the cloud-native Kool-Aid will find this argument absurd, but please allow me to cook and save your judgement for the end.

With containers we aim to produce a portable self-contained runnable unit that does not rely on external dependencies, save for the container runtime.

Operating systems have had a feature like this for decades without the need for a container runtime: statically linked executables. Golang and Rust applications tend to compile down to a single binary that contains all of its runtime dependencies. Get this file onto any system and just run it; nothing but the OS kernel required.

Statically linked binaries are more bloated than dynamically linked binaries, but this added bloat pales in comparison to the bloat of a container with a simple Python app. Additionally, since we prefer to avoid dependency hell, dynamically linked binaries would these days be deployed as a container with all runtime dependencies included anyway. So why not skip the container wrapper altogether and aim for statically linked binaries?

Not only do statically linked binaries drastically reduce bloat compared to containers, they make reproducible builds more straightforward, simplify the dev experience, have the same limits to portability as containers, and don't require complex supporting infrastructure.

As long as application dependency versions are well managed (i.e. with requirement and lock files in version control), the compiler version is pinned, and compiler flags are fixed, a build should be reproducible and should map directly onto a git commit. No subtle trickiness in container definition files.

With statically linked binaries, there is no gap between development and production. As part of the development process, a dev must compile and run the app on their machine. Going to production simply means getting that same binary onto the production system. While repeated compilation can be annoying, Golang compilation times are short, and it sure as hell beats repeated building of containers.

A binary has the same limitations on portability as a container: it works on a specific CPU architecture + OS combination, and may depend on external state. Cross compilation of a binary is typically simpler than multi-platform container builds, since cross compilation is fully managed by the compiler. In modern languages like Golang, this feature is well supported.

Containers invite complex systems. Since they aim to be completely isolated from the host on which they run, a lot of additional tooling and abstractions are required to undo some of this isolation when it is required. For example, when a container needs to interact with other processes, interact with data on the host or external systems, coordinate communication among containers, inject application configuration, etc. we need to make “holes” in our container using abstractions like volumes.

Thus, behemoths like Kubernetes are born to serve as a pseudo OS for containers. Complexity invites misconfiguration and security problems; just Google Docker/Kubernetes privilege escalation.

By contrast, a process spawned from a binary has a decent default level of isolation provided to it by the OS, and yet does not require complex machinery to interact with other processes or data on the host. Depending on the application, this reduced level of isolation can be a good or a bad thing, but often it seems we jump through hoops to do things with containers that a regular process on the host could have easily done.

Of course the comparison between a container and a binary is not entirely fair. Containers have isolation features that are difficult or impossible to replicate with a regular process. Containers have an isolated filesystem, isolated process namespace, isolated networking, decoupled user and group IDs, and more. Regular processes all see the same filesystem, other processes running on the host, the same network, and the same users and groups. Some scenarios may call for the isolation level of containers. But is this always the case? And does this absolute isolation really simplify deployments? Since we often want to connect and exchange data between the isolated systems inside containers, I would argue no. Especially in the age of cloud, where everything is running on a VM anyway, do we gain much by inserting this additional virtualization layer?

Instead, it seems that containers are primarily used today as a bloated pseudo-binary for applications written in languages that don't have real binaries. While this democratizes production to a wider range of applications, which can be great for legacy applications, it does not incentivize writing new applications in more performant, leaner languages. Is it a good thing that we can dump a 10 GB box of Python spaghetti into production without a second thought about resource utilization? The “deliver business value fast” crowd will say yes; the devs who have to deal with the fallout later down the line, and the planet, may disagree.

Alternative 2: Nix and NixOS

Nix is a unique package manager for Unix systems, with very attractive features that show promise in closing the Dev and Ops gap. It behaves in a very different way compared to typical package managers like apt or homebrew. NixOS is a Linux distribution built on top of the Nix package manager.

For those unfamiliar with Nix and NixOS: a not so quick debrief.

What is Nix(OS)?

Nix aims to guarantee fully reproducible builds, environments and configuration. These are all expressed declaratively in the Nix language, a domain-specific functional language that looks a bit like JSON but behaves more like Haskell.

Nix expressions can be used to produce “derivations”, which correspond to software packages in most cases, but can also correspond to user environments, system configuration, or even an arbitrary collection of files. Nix ensures that outputs are always fully reproducible and do not depend on any system level software, by building derivations in isolated environments and enforcing that all inputs (files, dependencies, other derivations) are explicitly declared.

Nix makes dependency management a breeze. All outputs get stored in the “Nix store”, with a file path that contains a hash calculated from all inputs and the declared build process. This ensures that any change in inputs or configuration results in a new “version” of the package that is stored on a different path. Therefore, multiple versions of packages can perfectly coexist with Nix. No package depends on system level software; all dependencies, down to different glibc versions, exist in the store.
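The idea of input-addressed paths can be sketched in a few lines. This is a conceptual illustration only: real Nix uses its own hashing scheme and encoding, and the `store_path` helper, the package names and the `"make install"` build string are all hypothetical.

```python
import hashlib
import json
from typing import List


def store_path(name: str, version: str, build_script: str, inputs: List[str]) -> str:
    """Derive an output path from everything that influences the build."""
    payload = json.dumps(
        {"name": name, "version": version, "build": build_script, "inputs": sorted(inputs)},
        sort_keys=True,
    )
    digest = hashlib.sha256(payload.encode()).hexdigest()[:32]
    return f"/nix/store/{digest}-{name}-{version}"


# Two glibc versions get two different paths and can coexist.
glibc_old = store_path("glibc", "2.38", "make install", [])
glibc_new = store_path("glibc", "2.39", "make install", [])

# The same app built against each glibc also gets two different paths,
# because the dependency's path is itself one of the hashed inputs.
app_old = store_path("myapp", "1.0", "make install", [glibc_old])
app_new = store_path("myapp", "1.0", "make install", [glibc_new])

print(app_old != app_new)  # True
```

Because a dependency's path feeds into the hash of everything built on top of it, a change anywhere in the inputs propagates to a new path for every dependent output, while identical inputs always map to the same path.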

An environment or shell configuration consists of a collection of packages that exist in the Nix store plus potentially some configuration and hooks. This offers a much more comprehensive alternative to things like Python virtual environments. A Python venv only stores Python packages and puts its own bin directory at the front of the PATH, but it cannot manage the Python interpreter itself or any other system package. Nix can create a complete and reproducible environment that is fully isolated from any system level packages.

Most package managers permanently and irreversibly mutate the state of a system. For example, when you run apt upgrade, old versions of software are typically removed and replaced with new ones. This is not the case with Nix, as the Nix store is immutable—old versions of packages are never overwritten. Instead, they coexist in the store until they are explicitly cleaned up to save disk space. In Nix, changes in system behavior are managed by intelligently updating symlinks that point to the active versions of packages. This approach ensures that all changes are non-destructive and can be easily rolled back to any previous state, a feature that Nix supports natively.
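The non-destructive update and rollback behavior can be modeled without touching the file system. The sketch below is a toy in-memory stand-in, not how Nix is implemented: each "generation" is an immutable snapshot mapping a package name to a store path, and the active generation is just a pointer, playing the role of the symlink. The `Profile` class and the store paths are hypothetical.

```python
from typing import Dict, List


class Profile:
    """Toy model: generations are never modified; 'current' only points at one of them."""

    def __init__(self) -> None:
        self._generations: List[Dict[str, str]] = [{}]
        self._current = 0

    def active(self) -> Dict[str, str]:
        return self._generations[self._current]

    def install(self, name: str, store_path: str) -> None:
        snapshot = dict(self.active())  # copy, so the old generation stays intact
        snapshot[name] = store_path
        self._generations.append(snapshot)
        self._current = len(self._generations) - 1

    def rollback(self) -> None:
        if self._current > 0:
            self._current -= 1  # only the pointer moves; nothing is deleted


profile = Profile()
profile.install("python", "/nix/store/aaaa-python-3.11")
profile.install("python", "/nix/store/bbbb-python-3.12")  # "upgrade"
print(profile.active()["python"])  # /nix/store/bbbb-python-3.12

profile.rollback()
print(profile.active()["python"])  # /nix/store/aaaa-python-3.11
```

The upgrade did not overwrite the older store path, which is why rolling back is a cheap pointer change rather than a reinstall.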

Finally, NixOS is a Linux distribution built on top of the Nix package manager. All system configuration is expressed declaratively in a configuration.nix file. This means that the state of your entire OS can be managed in version control. All packages, services, and all system settings, become fully reproducible.

How do Nix and NixOS offer an alternative to containers?

The unique nature of Nix allows us to run different packages in isolated environments on the same machine, just like containers allow us to do. Each application can have its own set of dependencies (some of which may be shared and exist once in the Nix store), as well as configuration and environment variables. We can run them simply as native processes on the host. If the host runs NixOS, we can declare all these services and their configuration in the system's configuration.nix file. Therefore the OS acts a bit like a very lightweight, single machine Kubernetes cluster.

Nix(OS) can address some of the challenges created by containers:

  • Less bloat

Nix is more bloated than the average system package manager, but it is only as bloated as it needs to be to guarantee reproducibility.

On a developer's machine, one can draw an analogy between Nix derivation outputs and container image layers, as both act as a form of cache. However, there is a key difference: Nix organizes its cache at the package level, connecting derivations via a dependency graph, which allows for fine-grained sharing and reuse of individual components. In contrast, container image layers correspond to a series of imperative build instructions and are organized in a linear chain. This means that in Nix, a change to a single package results in only that package and the packages that depend on it being rebuilt, whereas in containers, a change typically requires rebuilding all subsequent layers in the chain.
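The difference between the two cache shapes can be made concrete with a small sketch. The package names and dependency edges are made up; `rebuild_set` models a dependency graph, where a change invalidates the changed package and anything that transitively depends on it, and `rebuild_chain` models a linear chain of layers, where everything after the changed step is invalidated.

```python
from typing import Dict, List, Set


def rebuild_set(deps: Dict[str, List[str]], changed: str) -> Set[str]:
    """Graph cache: the changed package plus everything that transitively depends on it."""
    dirty = {changed}
    grew = True
    while grew:
        grew = False
        for pkg, needs in deps.items():
            if pkg not in dirty and dirty.intersection(needs):
                dirty.add(pkg)
                grew = True
    return dirty


def rebuild_chain(layers: List[str], changed: str) -> List[str]:
    """Linear cache: the changed layer and every layer after it."""
    return layers[layers.index(changed):]


# Hypothetical dependency graph: package -> packages it depends on.
deps = {
    "openssl": [],
    "python": ["openssl"],
    "numpy": ["python"],
    "requests": ["python"],
    "app": ["numpy", "requests"],
}
# The same components installed one per layer, in order.
layers = ["openssl", "python", "numpy", "requests", "app"]

print(sorted(rebuild_set(deps, "numpy")))  # ['app', 'numpy']
print(rebuild_chain(layers, "numpy"))      # ['numpy', 'requests', 'app']
```

In the graph model `requests` is untouched by a change to `numpy` because nothing connects them; in the chain it is rebuilt simply because it comes later.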

Instead of shipping around massive monolithic artifacts like containers, we only need to ship Nix code and our application code. The production environment can recreate an identical environment by pulling the Nix code, and then Nix can do the hard work of syncing the environment with the definitions. To spare the production servers from rebuilding outputs (which could involve downloading sources and compiling), a recommended architectural pattern is using builder servers and a centralized cache, see this book.

  • Reproducible by design

This one is self explanatory. Nix was built from the ground up to guarantee reproducible builds. It is an afterthought for Docker containers.

  • Superior dev experience

Thanks to Nix, devs can develop in an isolated and reproducible environment directly on their machine; no need for container hackery or dev containers. Each project has an environment definition file (e.g. shell.nix) which is tracked with version control. It ensures that each dev working on the project works in exactly the same evolving environment and that the environment can be rolled back to any commit. It is a much less painful process than being forced to rebuild containers throughout the development process.

Additionally, we can re-use the same Nix code for deploying the application in production.

  • Similar portability characteristics

Containers need a container runtime to function; likewise, Nix expressions require Nix to be installed on each system. The best results are achieved when the host OS is NixOS, especially in the production system. This is not required for dev machines, though Linux is probably preferred.

  • Simpler systems and platforms

With Nix(OS), we can achieve many of the same characteristics that make containers desirable, like isolation and consistency, using simpler primitives. Running applications directly on the host means we can cut out an entire layer of abstraction that adds complexity which is not always necessary. Simpler systems mean cheaper and more secure systems.

Nix ♥ containers

We have so far been discussing Nix(OS) as an alternative to containers, but it need not be one or the other. If deployment infrastructure for containers is already in place, it can be hard to justify not deploying a container. In this case we can benefit from Nix's dockerTools, which allows us to build Docker images in a declarative and reproducible way, as opposed to using the imperative and non-reproducible Dockerfile. Here's an interesting talk/blogpost about it.

The drawbacks of Nix

Nix and NixOS are about 10 years older than Docker containers (2003 vs. 2013), but their level of maturity is much lower. There has been a recent rise in popularity, but it still pales in comparison to that of containers. The project was started as part of a PhD thesis and evolved into a community-led effort; it has nowhere near the level of corporate backing that Docker or Kubernetes enjoy. The community is unfortunately well recognized for its high levels of drama, especially recently, a trait it shares with the Rust community. Still, there are serious companies funding the project, and some opting to use Nix(OS) in production.

The Nix language is somewhat arcane with odd syntax, which hinders adoption. In order for devs to collaborate on a project, all of them must know the basics of the Nix language and adopt Nix tooling to manage their local environment. It can be tricky to assemble such a team, since the Nix documentation is scattered, patchy and often outdated. Often, the best way to figure out how things work is by sifting through the Nixpkgs code.

A major downside of Nix is that in order to benefit from it, everything in the project must be done “the Nix way”. Nix should be the only package manager that connects all packages, otherwise reproducibility is lost. “The Nix way” is often very distinct from the default way projects and their dependencies are managed in most programming languages, and thus requires a culture shift. Additionally, all outputs, like binaries, should be produced through Nix derivations, which can be tricky when all you have is a precompiled binary from a proprietary application. The Nix philosophy necessitates a break with the Unix FHS, which means that dynamically linked binaries that expect an FHS-compliant system don't work out of the box. There are some workarounds like nix-ld, patchELF and buildFHSUserEnv, but it can be fiddly to get these types of applications to run. Statically linked binaries are of course always simple to work with.

Summary

Containers have become ubiquitous in the modern IT landscape. Alongside them, a maze of infrastructure and tooling has sprung up to support them. Containers have a lot of benefits, but also a number of drawbacks which I discussed in this article, namely:

  • they tend to be very bloated for the small amount of functionality that is often shipped
  • making image builds reproducible was an afterthought
  • they make for an awkward dev experience
  • they are not always as portable as is claimed
  • they have given rise to extremely complex systems to run and manage them

These drawbacks can eventually have real consequences on the security, cost, and sustainability of our systems.

I explored two completely different approaches to developing and delivering software, and compared them to containers: deploying statically linked binaries and developing with Nix(OS). These can mitigate some of the aforementioned drawbacks of containers, but are of course not a substitute in every situation. These approaches are not mutually exclusive with containers either: statically linked binaries can make containers much leaner, and Nix can help make container builds reproducible whilst improving the dev experience.

Containers and Kubernetes may be the answer for some situations; the main question I raise in this piece is whether they should be the only answer for everything.

Thank you for coming to my TED talk, feel free to share your opinion in the comments!

Opinions expressed in these pieces are my own. Check out my personal blog where I occasionally write about random things that interest me, and feel free to connect with me.