C++ build hell

2021 October 14

When I sit down to write C++, I want to think in the land of C++. In this land, there are functions, classes, and namespaces. I don't want to be distracted by the differences among operating systems, compilers, package managers, and build systems, or by the details of paths, compiler flags, and linkages. I just want to write my C++, with the option to use someone else's C++ (especially if that someone else is past me).

That said, I want my C++ to enjoy the widest possible audience by meeting people where they are. I want to make it easy for them to use my C++ regardless of their choice of platform, compiler, package manager, build system, or linkage, but I don't want to be mired in the details of how to support them. I want a set of easy-to-follow build system patterns that bridge this gap, letting me spend most of my time in the land of C++, while still serving the overwhelming majority of real world use cases.

This happy experience is the one I enjoy in the package ecosystems of Java, Python, JavaScript, Haskell, and Rust. I want it for C++ too.

Over the next several posts, I will walk through the conceptual framework I've built to help me think about this problem. This first post introduces what I am calling C++ build hell.

Module Models vs Package Models

Let me start by defining and contrasting module models and package models.

A module model defines how code is shared within a language. A module is a collection of exported symbols that can be imported by other modules. In this abstract definition, I am calling those collections "modules" regardless of what they are called in the language itself. Java calls them packages. Python calls them modules or packages. C++ calls them header files or modules. Modules are just vessels for carrying reusable code, and the module model defines how they are constructed and connected.

A package model defines how code is shared between different projects. A package is a collection of exported libraries (and executables and data) that can be linked by other libraries and executables. Packages are just vessels for carrying reusable code, and the package model defines how they are constructed and connected.

These ideas are language-agnostic. Each language specification defines its own module model, but its surrounding ecosystem defines the package model. Some languages (JavaScript, Python, C++) have multiple competing package models, which often leads to headaches.

C++ has many different package models. There are Debian and Red Hat and pkg-config packages. There are Conan and vcpkg packages. There are CMake packages found with Package Configuration Files, there are CMake packages found with Find Modules, there are CMake packages found with ExternalProject, and there are CMake packages found with FetchContent and/or add_subdirectory.

Build Model

All C++ artifacts (i.e. libraries and executables) are built in the same general way:

1. Compile one or more translation units into objects.
2. Link those objects into an artifact.

cc ${acflags} -o a.o -c a.cpp
cc ${bcflags} -o b.o -c b.cpp
cc ${ccflags} -o c.o -c c.cpp
...
ld ${ldflags} -o ${artifact} a.o b.o c.o ...

Any variation in this pattern produces a different build of the artifact. A variation is a difference in any of the following build parameters:

compiler
preprocessor definitions
include paths
compiler flags
set of translation units
linker
linker flags
linked libraries

An option represents a set of mutually exclusive variations, or choices. Every build corresponds to a different set of choices for the options available.
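To make the vocabulary concrete, here is a minimal sketch that models options as enumerations and a build as one choice per option. The three options and the names `Build` and `same_build` are assumptions chosen for illustration; they are not taken from any real build system.

```cpp
// Each option is a set of mutually exclusive choices. These option
// names are illustrative assumptions only.
enum class Optimization { Debug, Release };
enum class Linkage { Static, Shared };
enum class Symbols { Kept, Stripped };

// A build is one choice for every available option.
struct Build {
    Optimization optimization;
    Linkage linkage;
    Symbols symbols;
};

// Two builds are the same build only if they agree on every option.
constexpr bool same_build(Build a, Build b) {
    return a.optimization == b.optimization
        && a.linkage == b.linkage
        && a.symbols == b.symbols;
}

// With three binary options there are 2 * 2 * 2 distinct builds.
constexpr int build_count = 2 * 2 * 2;
```

Even this toy library has eight builds, and a dependent that asks for `Build{Optimization::Release, Linkage::Static, Symbols::Stripped}` is asking for exactly one of them.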

Differences in builds of a library can affect how dependents of that library are compiled or linked because they can change its application binary interface (ABI). Preprocessor definitions can conditionally compile classes to different sizes. Compiler flags can change the calling convention. Static and shared libraries must be linked differently. Header-only libraries that aren't "built" or "linked" still have this problem because the ABI they manifest in dependents can be affected by preprocessor definitions and compiler flags.
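As an illustration of a preprocessor definition changing a class's size, consider the sketch below. The `Widget` type and the `WIDGET_DEBUG` macro are hypothetical names invented for this example.

```cpp
#include <cstddef>

// widget.hpp (hypothetical header shared by a library and its dependents)
struct Widget {
    int id;
#ifdef WIDGET_DEBUG
    // Present only in translation units compiled with -DWIDGET_DEBUG.
    const char* debug_name;
#endif
};

// The size of Widget as seen by the translation unit that includes
// this header. It depends on whether WIDGET_DEBUG was defined.
constexpr std::size_t widget_size() { return sizeof(Widget); }
```

If the library is compiled with `-DWIDGET_DEBUG` and a dependent is compiled without it, the two disagree about `sizeof(Widget)` and about its layout. Nothing in the link step is required to notice that mismatch, so passing a `Widget` across the boundary would then be undefined behavior.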

Different builds of a library may be binary compatible, i.e. they may have the same ABI where it matters. Predicting that, however, requires intimate knowledge of the specific compiler and linker being used (and thus the meaning of every flag) as well as of the library source code (and thus how it responds to preprocessing, compiling, and linking). That is unreasonable to expect from a general build system today. The conservative assumption is that every change in build parameters changes a library's ABI.

Even if two builds of a library are binary compatible, it does not mean they are interchangeable in the eyes of dependents. Their differences may satisfy materially different demands. For example, one dependent may want a build with no symbol table to optimize for size in a storage-constrained environment, while another may want a build with a symbol table to aid debugging. These might even be the same dependent at two different times. Thus, it is important for dependents to distinguish dependencies by build and not just ABI.

Build Hell

The build model of C++ creates an incongruity between modules in the language space and libraries in the package space. Within a package, a collection of different builds of the same library is indistinguishable from a collection of different libraries. A package may therefore contain multiple libraries that correspond to the same module in the language. They differ in how the module was compiled or linked or both, and thus in how its dependents must be compiled or linked or both.

Dependents of a library want to choose which build they link against. They may want to link against most dependencies' release build, but against one dependency's debug build, to help when stepping through that dependency's code. Or they may want to link against most dependencies statically, but against one dependency dynamically, to enable updating that dependency after installation while inlining the others.

There are a few complications, however. The first is diamond dependencies. If the dependency graph of an artifact has a library L with more than one dependent, then the builder must ensure that every dependent of L is linked against the same build of L. This is a special case of dependency hell, where common dependencies must be not only the same version, but the same build of that version.
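The diamond constraint can be checked mechanically. The sketch below assumes a hypothetical `Link` record describing one edge of the dependency graph and identifies each build by an opaque string label; a real build system would compare the full set of build parameters instead.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One edge of the dependency graph (hypothetical representation):
// `dependent` links against the given `build` of `library`.
struct Link {
    std::string dependent;
    std::string library;
    std::string build;  // opaque label, e.g. "release-static"
};

// Returns the libraries that are requested in more than one build.
// An empty result means the diamond constraint is satisfied.
std::set<std::string> conflicting_libraries(const std::vector<Link>& links) {
    std::map<std::string, std::string> chosen;  // library -> first build seen
    std::set<std::string> conflicts;
    for (const Link& link : links) {
        auto result = chosen.emplace(link.library, link.build);
        const bool inserted = result.second;
        if (!inserted && result.first->second != link.build) {
            conflicts.insert(link.library);
        }
    }
    return conflicts;
}
```

If two dependents both link against the "release" build of L, the result is empty. If one asks for "release" and the other for "debug", the result contains L, and the builder has to reconcile the two requests before linking.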

The next complication is the difference between packages that build the library before or after knowing which build a dependent wants, a.k.a. the difference between binary packages and source packages. With binary packages, the build chosen when the package was installed might not be the one the dependent wants. The package may have been installed long before the dependent came along. For binary package authors to provide the same experience to dependents as source packages, they need to publish one or more binary packages, each installing one or more builds, that together span the set of all builds. This set grows combinatorially with the number of options and dependencies a library has because it is the cross-product of all of its options (which are sets of choices) and the sets of all of its dependencies' builds (which are transitively cross-products of still more sets).
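To get a feel for how fast that set grows, here is a rough sketch that counts builds as the cross-product described above. The `Library` struct and `count_builds` function are hypothetical. The sketch treats every dependency as independent, so it overcounts when a dependency is shared and must be the same build everywhere.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of a library for counting purposes.
struct Library {
    // Number of choices for each option, e.g. {2, 2} for two binary options.
    std::vector<unsigned> choices_per_option;
    // Direct dependencies, each with its own options and dependencies.
    std::vector<const Library*> dependencies;
};

// Size of the cross-product of the library's own options and the
// (transitively computed) build sets of its dependencies.
std::uint64_t count_builds(const Library& lib) {
    std::uint64_t total = 1;
    for (unsigned choices : lib.choices_per_option) {
        total *= choices;
    }
    for (const Library* dep : lib.dependencies) {
        total *= count_builds(*dep);
    }
    return total;
}
```

A leaf library with two binary options has 4 builds. A library with one binary option of its own that depends on that leaf has 2 * 4 = 8, and each further option or dependency multiplies the total again.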

The last complication is the fact that some ABI-changing compiler flags must be shared by all translation units, whether they depend on the library that introduced the flag or not. Examples for GCC include -freg-struct-return, which changes the calling convention, and -fshort-enums, which changes the size of enumerations. It is possible that no build installed by a binary package will be binary compatible with flags chosen by a dependent.
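The effect of -fshort-enums can be seen in a few lines. In the sketch below, the enumeration names are made up for illustration. The size of the plain enumeration is left to the implementation, so it typically changes when GCC is given -fshort-enums. The second enumeration fixes its underlying type in the source, which pins its size regardless of that flag.

```cpp
#include <cstddef>
#include <cstdint>

// Underlying type chosen by the implementation. With GCC this is
// typically 4 bytes by default and 1 byte under -fshort-enums.
enum PlainColor { Red, Green, Blue };

// Underlying type fixed in the source, so the size does not depend
// on -fshort-enums.
enum class PinnedColor : std::uint8_t { Red, Green, Blue };

constexpr std::size_t plain_size = sizeof(PlainColor);
constexpr std::size_t pinned_size = sizeof(PinnedColor);
```

Any struct or function signature that mentions `PlainColor` has a layout that depends on the flag, so every translation unit that touches it has to agree on the flag. Fixing the underlying type sidesteps this one flag for this one type, but it is not a general answer to ABI-changing flags.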

In the future, we may see every artifact building its entire dependency graph from source. It's the only way to guarantee binary compatibility for any arbitrary set of build parameters among dependencies.

In the next post, I expand the build model with rules for connecting dependencies, and explain how it challenges build systems.