Trying Conan with Modern CMake: Packaging

2019 June 27

In a previous post, I asked the question: if I'm following best practices and popular conventions for CMake in my project, can I get Conan to download, build, and install my dependencies in the right place, for free? The answer was "yes". In this post, I'm asking: if I'm following best practices and popular conventions for CMake in my project, can I get Conan to package it for free? The answer is "almost". Let's see how.

What is a Conan package?

If a package manager wanted to provide only binary packages that are ready to be installed (i.e. copied into place) and used without waiting for a build step, then it would need to store a binary package for every possible application binary interface (ABI) that a user might want.

Because it is impossible to anticipate the full set of ABIs that may ever be requested, the Conan package manager chooses instead to represent every package with a recipe that can build a binary package from source on demand. Package authors can publish binary packages through Conan in addition to the recipe, and many packages choose to do so for popular ABIs, but this is entirely optional and exists only as an optimization. Every Conan package must have a recipe, and creating a Conan package is as simple as uploading its recipe.

What ingredients do I need?

A Conan recipe is a single[1] Python module defining a special class[2] that subclasses ConanFile. When executed, it downloads sources, builds artifacts (e.g. headers, libraries, and executables), packages them in a directory, and computes some metadata. ConanFile has many attributes and methods, but we can limit our consideration to the subset most relevant to packaging:

  • source() is responsible for getting the package sources to build. By default, a Conan package is just its recipe, without any sources.

  • Between the source() and build() steps, Conan will install the package dependencies. Conan partitions dependencies into two categories:

    1. The requires attribute lists the dependencies required when using (and building) the package.

    2. The build_requires attribute lists the dependencies required only when building the package.

    Dependencies in the requires category are transitive: they propagate to every consumer of the package. Dependencies in the build_requires category are not.

  • build() does what it says on the tin: it builds the artifacts. Specifically, it builds them according to one ABI, identified by a hash computed by the recipe's package_id() method from a combination of factors. The default implementation includes these factors:

    • The recipe settings, which can be arbitrary but should almost always just match the default set: architecture, operating system, compiler, and build type (e.g. debug vs release). Generally, all of the Conan packages being linked together in a single library or application should have the same settings, and settings are generally known only by the package consumer (which means the recipe cannot give a "default" value for a setting).[3]

    • The recipe options, which are entirely arbitrary but conventionally include a Boolean shared option to decide whether to build a shared or static library. It is not expected that different packages being linked together will share the same option keys, much less their values. Besides shared, options are generally used for conditional compilation.

    The default implementation can be overridden by defining a custom package_id() method. For example, header-only libraries generally have just one "binary" package and thus need just one package ID.

  • package() copies the built artifacts to an empty "package directory" (in Conan parlance) that Conan creates. The binary package is effectively a compressed archive of that directory.

  • package_info() gives us a chance to describe the installation so that dependent packages know where to find the artifacts (headers, libraries, and executables) and how to compile and link against them (flags). Conan uses this information to generate build files for dependents.

    If we are using CMake, we should install a package configuration file (PCF) regardless, as a best practice, but if we want to let non-CMake projects depend on our Conan package, then we need to fill in cpp_info too.
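To make the idea of a package ID more concrete, here is a minimal sketch of folding settings and options into one identifier. It is not Conan's actual algorithm: the function name sketch_package_id and the serialization format are assumptions made only for illustration.

```python
import hashlib


def sketch_package_id(settings, options):
    """Hash settings and options into a stable identifier.

    Illustration only: Conan's real package_id() uses its own
    serialization and may consider other factors.
    """
    # Sort the keys so that the ID does not depend on insertion order.
    lines = ["[settings]"]
    lines += ["{}={}".format(key, settings[key]) for key in sorted(settings)]
    lines += ["[options]"]
    lines += ["{}={}".format(key, options[key]) for key in sorted(options)]
    return hashlib.sha1("\n".join(lines).encode("utf-8")).hexdigest()


release = {"os": "Linux", "arch": "x86_64", "compiler": "gcc", "build_type": "Release"}
debug = dict(release, build_type="Debug")

static_id = sketch_package_id(release, {"shared": False})
shared_id = sketch_package_id(release, {"shared": True})
debug_id = sketch_package_id(debug, {"shared": False})
```

Changing any one setting or option yields a different ID, which is why each combination needs its own binary package.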

Lastly, a few pieces of metadata go into a Conan recipe, such as its name, version, homepage, url, license, and author.

Can I get a free Conan recipe?

If we are using CMake and following best practices and popular conventions, then nearly all of this information can be found in our CMakeLists.txt. Can we write a generic Conan recipe that can package such a CMake project?

Remember that CMakeLists.txt may require other files (e.g. through include or add_subdirectory), collectively called the CMake Sources. To gather ingredients, the recipe may need to inspect all of them. Because the paths to included files can be expressions instead of just literal strings, the recipe will have to effectively evaluate the CMake Sources. Instead of re-implementing the entire evaluation engine of CMake, it is easier to just invoke CMake to evaluate the CMake Sources and print the ingredients. During evaluation, the CMake Sources may inspect other non-CMake files (e.g. with file(GLOB) or if(EXISTS)) and branch their behavior based on the results, even terminating if a file is missing. Thus, to gather ingredients, the recipe might need every source file.

The only files that the recipe can use before it enters the source() step are files that it packages with itself in its exports. There are at least two important ingredients that must be gathered for Conan before it will even call the source() step: name and version. Thus, we will need to include all of the source files in the exports. I recommend this practice anyway, because it relieves the recipe author from having to write a source() method at all, and keeps the package available even when and where its version control is not.

While many of the ingredients can be queried at configuration time, many cannot. For example, it is presently impossible to get a list of the targets to be installed. Even if we could get their names, it is impossible to query their installation destinations. Even if we could get those, they might use generator expressions, which are not evaluated until the generation step, after the configuration step in which all the CMake commands are executed.

There is one very important ingredient that is never found in the CMake Sources: dependencies as they are named in the Conan ecosystem. For this reason, I will assume that our project is following the advice in my previous post for non-intrusively integrating CMake with Conan for dependency management: we have a conanfile.txt and use the Conan generators cmake_find_package and cmake_paths.

Thus, to generate a recipe, we'll have to employ a few different techniques together:

  • Some ingredients can be written generically to work with every project that, again, follows best practices and popular conventions[4].
  • Some ingredients will come from the conanfile.txt.
  • Some ingredients can be queried at the end of the CMake configuration step.
  • The rest of the ingredients can only be queried after the package is built and installed. Our recipe will need to query these ingredients from the built artifacts in its package() step.

Universal ingredients

Let's start with the ingredients that are pretty much the same for every project:

  • As I explained, we must set exports to '*' to make every source file available to the recipe. As a coincidental benefit, this lets us use the default (no-op) implementation of source().
  • We should use the standard settings: arch, os, compiler, and build_type.
  • We should have at least the shared option with the conventional default of False.
  • Because we have a CMake project, we can take advantage of the handy CMake build helper to implement the build() and package() methods.
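Put together, the universal ingredients might look like the following sketch of a Conan 1.x recipe. The class name GenericCMakeRecipe and the helper _configure_cmake are hypothetical, the project-specific ingredients are left out, and wiring the file generated by cmake_paths into the configure step is omitted. The import fallback exists only so the sketch can be inspected without Conan installed.

```python
try:
    from conans import ConanFile, CMake  # Conan 1.x import path
except ImportError:
    # Fallback so the sketch can be imported and inspected without Conan.
    ConanFile, CMake = object, None


class GenericCMakeRecipe(ConanFile):
    # name, version, requires, and the other project-specific
    # ingredients are gathered separately and omitted here.
    exports = "*"  # ship every source file alongside the recipe
    settings = "arch", "os", "compiler", "build_type"
    options = {"shared": [True, False]}
    default_options = {"shared": False}
    generators = "cmake_find_package", "cmake_paths"

    def _configure_cmake(self):
        # Assumes CMakeLists.txt ends up at the root of the source folder.
        cmake = CMake(self)
        cmake.configure()
        return cmake

    def build(self):
        self._configure_cmake().build()

    def package(self):
        # Runs the project's own install rules into the package directory.
        self._configure_cmake().install()
```

There is no source() method here because the default no-op is enough once the sources travel with the recipe.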

conanfile.txt ingredients

A conanfile.txt is just a less-capable conanfile.py. It has only the ingredients necessary to prepare dependencies for consumption by the build system, but we still need these ingredients when building a package. At the least, we need to copy the attributes requires and build_requires from this file.

The generators should be ["cmake_find_package", "cmake_paths"], as I explained before, but if we're peeking into the conanfile.txt for other ingredients, we might as well grab this one too.
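As a rough illustration of peeking into conanfile.txt, the sketch below splits its bracketed sections into lists. Conan has its own loader for this file; the function parse_conanfile_txt and the package references in the sample are hypothetical and exist only for this example.

```python
def parse_conanfile_txt(text):
    """Split conanfile.txt content into {section name: [entries]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], [])
        elif current is not None:
            current.append(line)
    return sections


SAMPLE = """\
[requires]
somelib/1.0.0@user/channel

[build_requires]
sometool/2.0.0@user/channel

[generators]
cmake_find_package
cmake_paths
"""

ingredients = parse_conanfile_txt(SAMPLE)
requires = tuple(ingredients.get("requires", ()))
build_requires = tuple(ingredients.get("build_requires", ()))
generators = tuple(ingredients.get("generators", ()))
```

The three tuples can then be assigned to the recipe attributes of the same names.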

CMakeLists.txt ingredients

Many ingredients come from simple CMake variables, chiefly the CMAKE_PROJECT_* family.

The url is supposed to be a link to the version control repository for the package recipe, not necessarily the source code. In our case, because we are generating the recipe from the source code, it is effectively the same as our source code repository. By comparison, homepage is supposed to be a link to the package documentation. There is presently no standard CMake variable equivalent to url, but we can just define our own: CMAKE_PROJECT_REPOSITORY_URL. Because I expect most projects will set CMAKE_PROJECT_HOMEPAGE_URL to point to their source code repository where a README serves as the documentation, we can use its value as the default.

Similarly, there are no standard CMake variables for the license identifier or author. Let us use CMAKE_PROJECT_LICENSE and CMAKE_PROJECT_AUTHORS.

All of the standard CMAKE_PROJECT_* variables are set by the project command as CACHE variables. After configuring CMake in a temporary directory, we can read them with a CMake "script" that calls load_cache. To make our custom variables visible to that script, we must set them as CACHE variables too.[5]
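The approach described here reads the cache from a CMake script. As a simpler stand-in for illustration, the sketch below parses the text of a CMakeCache.txt directly in Python and maps the variables onto recipe attributes. The function names, the sample cache contents, and the exact attribute mapping are assumptions.

```python
import re

# Cache entries look like NAME:TYPE=VALUE; comment lines start with // or #.
_ENTRY = re.compile(r"^([^#/:=][^:=]*):([A-Z]+)=(.*)$")


def read_cache(text):
    """Return {name: value} for every entry in CMakeCache.txt content."""
    entries = {}
    for line in text.splitlines():
        match = _ENTRY.match(line.strip())
        if match:
            name, _type, value = match.groups()
            entries[name] = value
    return entries


def metadata_from_cache(cache):
    """Map CMAKE_PROJECT_* variables onto recipe attributes."""
    homepage = cache.get("CMAKE_PROJECT_HOMEPAGE_URL", "")
    return {
        "name": cache.get("CMAKE_PROJECT_NAME"),
        "version": cache.get("CMAKE_PROJECT_VERSION"),
        "homepage": homepage,
        # The custom repository variable falls back to the homepage.
        "url": cache.get("CMAKE_PROJECT_REPOSITORY_URL") or homepage,
        "license": cache.get("CMAKE_PROJECT_LICENSE"),
        "author": cache.get("CMAKE_PROJECT_AUTHORS"),
    }


# Hypothetical cache contents for a project named "greetings".
SAMPLE_CACHE = """\
// Value Computed by CMake
CMAKE_PROJECT_NAME:STATIC=greetings
CMAKE_PROJECT_VERSION:STATIC=0.1.0
CMAKE_PROJECT_HOMEPAGE_URL:STATIC=https://example.com/greetings
CMAKE_PROJECT_LICENSE:STRING=ISC
"""

metadata = metadata_from_cache(read_cache(SAMPLE_CACHE))
```

Custom variables that were never set as CACHE variables simply do not appear, so the corresponding attributes come back as None.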

How can we collect the options? CMake has a command for declaring Boolean options, but some projects define options as simply CACHE variables (and option itself is just a convenient shorthand for set(CACHE BOOL)). How can we distinguish between CACHE variables that are options and those that are not? There is not yet a standard or even a popular convention, to my knowledge. I will leave this as an open question for now.

Package configuration file ingredients

What's left? We need the installation destinations and compiler flags for package_info().

As of this writing (June 26, 2019), there is no way for a Conan recipe to declare highly granular targets like we can in a PCF. The cmake* family of generators produces a single mega-target representing everything in the package. For this reason, consumers should use the PCF installed by CMake instead of the Find Module (FM) installed by Conan. Still, for consumers not using CMake, and for packages that really do export just one target, we can make a best effort to fill in cpp_info.

That said, there is an effort underway to support granular targets, which Conan is calling components. We can detect whether Conan has this capability by checking whether cpp_info is a Python mapping, and fill it in differently based on that investigation. This way, we can support versions of Conan both before and after the roll-out of components.
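The detection can be sketched with the standard library alone. Everything below is illustrative: the per-target interface in the mapping branch is hypothetical (the components design was not final at the time of writing), and the two stand-in objects are not real Conan objects.

```python
from collections import defaultdict
from collections.abc import Mapping
from types import SimpleNamespace


def supports_components(cpp_info):
    """True when cpp_info behaves like a mapping of per-target entries."""
    return isinstance(cpp_info, Mapping)


def fill_libs(cpp_info, libs_by_target):
    """Record libraries per target if possible, else merged into one list."""
    if supports_components(cpp_info):
        # Hypothetical component interface: one entry per target name.
        for target, libs in libs_by_target.items():
            cpp_info[target].libs = list(libs)
    else:
        # Single mega-target: concatenate everything, preserving order.
        cpp_info.libs = [lib for libs in libs_by_target.values() for lib in libs]


# Stand-ins for the two shapes of cpp_info (neither is a real Conan object).
legacy = SimpleNamespace(libs=[])
granular = defaultdict(SimpleNamespace)

targets = {"greetings::english": ["english"], "greetings::french": ["french"]}
fill_libs(legacy, targets)
fill_libs(granular, targets)
```

The same call site serves both shapes, so a recipe written this way would not need to know which version of Conan is running it.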

The way to gather these ingredients is this: after the recipe has called CMake to install the PCF in the package() step, read it using find_package in a CMake "script", iterate over its targets, and read the relevant target properties, such as INTERFACE_COMPILE_OPTIONS and INTERFACE_LINK_OPTIONS.

One last hiccup is that there is no standard way to get the list of targets defined in a PCF. I have an open proposal to add a standard variable definition, a la <PackageName>_FOUND and <PackageName>_DIR, but until that is accepted and released, we will have to rely on convention. I have chosen to ensure my PCFs define <PackageName>_TARGETS.

Implementation

I have a proof-of-concept for this approach in a project I'm calling autorecipes. It is not 100% complete, some parts of the implementation do not (yet) match the design as written here, and it probably contains mistakes as I continue to learn more about the models of both CMake and Conan. Still, it appears to work for a sample C++ package that I was able to publish to Bintray.

It can be imported by users through the experimental python_requires feature:

from conans import python_requires

CMakeConanFile = python_requires('autorecipes/[*]@jfreeman/testing').cmake()

class Recipe(CMakeConanFile):
    name = CMakeConanFile.__dict__['name']
    version = CMakeConanFile.__dict__['version']

The name and version attributes must be explicitly copied from the parent class because, at the moment, Conan parses the recipe file (instead of evaluating it) to make sure they are explicitly defined. I hope to get that changed, so that recipe classes can inherit any attribute.

The implementation uses Python descriptors to lazily load all but the cpp_info ingredients on-demand from their respective sources.
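For readers unfamiliar with descriptors, here is a minimal sketch of the lazy-loading idea. It is not the code from autorecipes: the class lazy_attribute and the loader load_name are hypothetical names chosen for this example.

```python
class lazy_attribute:
    """Descriptor that runs its loader once, on first access, then caches."""

    _UNSET = object()

    def __init__(self, loader):
        self._loader = loader
        self._value = self._UNSET

    def __get__(self, instance, owner=None):
        # Called for access through the class as well as through an instance.
        if self._value is self._UNSET:
            self._value = self._loader()
        return self._value


calls = []


def load_name():
    # Hypothetical stand-in for an expensive query, such as running CMake.
    calls.append("name")
    return "greetings"


class Base:
    name = lazy_attribute(load_name)


class Recipe(Base):
    # Copy the descriptor itself, without triggering the load.
    name = Base.__dict__["name"]


first = Recipe.name      # triggers the load
second = Recipe().name   # served from the cache
```

Looking the attribute up through __dict__ returns the descriptor object rather than its value, which is why the recipe above copies name and version that way.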

I'm choosing to share it early to solicit feedback on the design and implementation from more knowledgeable experts. It is not yet ready for production use. If you think this is promising, or if you can help answer some of my open questions, please reach out! You can leave a comment here, open an issue on GitHub, or message me on Twitter.

Footnotes

  1. When Conan packages a project, it copies the recipe and only the recipe to a working directory and executes it there. It won't be able to find any modules or files from the source tree. There is an experimental technique to import a Python package or module from another Conan package, but it is not easy. ↩︎

  2. The module is allowed to define multiple classes, and there are no requirements on their names. Conan looks at every class and hopes to find exactly one that subclasses ConanFile. ↩︎

  3. It can be argued that build_type should be an option instead of a setting. A package can provide a reasonable default for itself (e.g. release), and it likely does not care which build type other packages use. However, build_type can decide the definition of NDEBUG, which is conventionally shared across different libraries, including the standard library. Because of this, it is not generally safe to link two packages with different build types. Without a Conan package for the standard library, its "options" must be folded into the settings. ↩︎

  4. Why do I keep harping on "best practices and popular conventions"? Part of what keeps C++ behind other language communities in the package management department is the fact that we haven't grown up yet and standardized a project directory structure or a package specification. Without these standards, every project is a snowflake. With them, we can develop general tools that work with every project. I want to have these tools so that developers like me can focus on writing software instead of fighting with build systems and package managers, and to make C++ development less daunting to newcomers. I hope that one day I'll be able to replace "follows best practices and popular conventions" with "conforms to the standard". ↩︎

  5. I have published a function for helping users correctly set the custom CMAKE_PROJECT_* variables. ↩︎

  6. Perhaps INTERFACE_COMPILE_OPTIONS should be added to cpp_info.cflags instead of cpp_info.cxxflags, perhaps it should be added to both, or perhaps it depends on the value of IMPORTED_LINK_INTERFACE_LANGUAGES. This is still an open question. ↩︎

  7. Similarly, perhaps INTERFACE_LINK_OPTIONS should be added to cpp_info.sharedlinkflags too. Another open question. ↩︎