Understanding the suite of copy functions in bazel-lib

2025 June 28

I finally got around to learning Bazel this year, and there is a very popular library of helper functions called bazel-lib. It contains multiple functions for copying files and directories, with subtle differences between them that can be hard to tease apart at a glance, especially for a newcomer. Here, I present the mental model I constructed after studying them.

DirectoryPathInfo

First, we need to understand the distinction between the built-in DefaultInfo provider and the DirectoryPathInfo provider defined by bazel-lib and used in some (but not all!) of the helper functions.

DefaultInfo has one primary field, files, which holds a depset of File objects. Files are like paths, but get special treatment from Bazel. In particular, rule implementations get easy access to them through ctx.file and ctx.files, which short-cut manual access through the DefaultInfo provider, e.g. ctx.attr.my_file[DefaultInfo].files.to_list()[0].
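To make the two access routes concrete, here is a minimal sketch of a rule implementation that reads the same File both ways. The rule name my_rule and the attribute name my_file are hypothetical; ctx.file only works for label attributes declared with allow_single_file.

```starlark
# my_rule.bzl (hypothetical file, rule, and attribute names)

def _my_rule_impl(ctx):
    # Shortcut: ctx.file.<attr> is available because the attribute
    # below is declared with allow_single_file = True.
    shortcut = ctx.file.my_file

    # Manual route through the provider; this yields the same File.
    manual = ctx.attr.my_file[DefaultInfo].files.to_list()[0]

    # Forward the File as this rule's own default output.
    # The depset deduplicates, so it ends up holding one File.
    return [DefaultInfo(files = depset([shortcut, manual]))]

my_rule = rule(
    implementation = _my_rule_impl,
    attrs = {
        "my_file": attr.label(allow_single_file = True, mandatory = True),
    },
)
```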

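Rule implementations get handles to File objects in only two ways:

  • as inputs, from the providers of their dependencies (through ctx.file, ctx.files, or DefaultInfo directly), or
  • as outputs, by declaring them with ctx.actions.declare_file or ctx.actions.declare_directory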

Thus, every File is a declared output of some action.

DirectoryPathInfo is similar, in that it represents a path, but different because (a) it is a provider, and (b) it is constructed in just one way: by combining a File that represents a directory (technically called a "tree artifact") and a string relative path. bazel-lib uses DirectoryPathInfo to refer to files and directories created by other actions that may not have been declared as outputs.

That is, when an action creates output files but only declares the output directory they were created in, DirectoryPathInfo lets us refer to those files that we know are there, without creating a File object. If we wanted a File object, we would have to create it by copying the file to a new output. DirectoryPathInfo lets us refer to specific files without copying them.
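As an illustration, a rule can consume a DirectoryPathInfo by joining the directory's path with the relative path and passing the whole directory as an action input. This is only a sketch: the rule name count_bytes and the attribute name src are hypothetical, and the load label assumes the bazel-lib 2.x repository name @aspect_bazel_lib (newer releases may use a different name).

```starlark
# count_bytes.bzl (hypothetical rule that reads one file inside a tree artifact)
load("@aspect_bazel_lib//lib:directory_path.bzl", "DirectoryPathInfo")

def _count_bytes_impl(ctx):
    info = ctx.attr.src[DirectoryPathInfo]

    # info.directory is the tree artifact File; info.path is the
    # string path relative to it. Joining them gives a usable path
    # without ever declaring a File for the inner file.
    full_path = info.directory.path + "/" + info.path

    out = ctx.actions.declare_file(ctx.label.name + ".txt")
    ctx.actions.run_shell(
        # The input is the whole directory, not the single file.
        inputs = [info.directory],
        outputs = [out],
        command = "wc -c < \"$1\" > \"$2\"",
        arguments = [full_path, out.path],
    )
    return [DefaultInfo(files = depset([out]))]

count_bytes = rule(
    implementation = _count_bytes_impl,
    attrs = {
        "src": attr.label(providers = [DirectoryPathInfo], mandatory = True),
    },
)
```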

Rules

The rest of these functions are rules that return either:

  • a DefaultInfo provider with one or more File objects, or
  • a DirectoryPathInfo provider pointing to one path under a directory File object

In this context, it is important to know that source files and directories are implicitly targets providing a DefaultInfo with exactly one File object representing that file or directory.

directory_path

directory_path returns a DirectoryPathInfo constructed from a target providing a DefaultInfo with exactly one directory, and a string relative path.
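A minimal BUILD sketch, assuming the bazel-lib 2.x load label and a hypothetical target :site whose only default output is a directory:

```starlark
load("@aspect_bazel_lib//lib:directory_path.bzl", "directory_path")

directory_path(
    name = "index_html",
    # Hypothetical target providing exactly one directory.
    directory = ":site",
    # Path relative to that directory.
    path = "index.html",
)
```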

output_files

output_files returns a DefaultInfo with a set of files selected from another target's DefaultInfo provider, or from one group in its OutputGroupInfo provider. Files are selected by matching their "short path". All of the files must exist.
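A minimal BUILD sketch, assuming the bazel-lib 2.x load label. The target :codegen, the package name pkg, and the header path are hypothetical; the entry in paths is written the way a short path would look for an output of a target in package pkg.

```starlark
load("@aspect_bazel_lib//lib:output_files.bzl", "output_files")

output_files(
    name = "generated_header",
    # Hypothetical target with several outputs.
    target = ":codegen",
    # Select just this one output by its short path.
    paths = ["pkg/gen/api.h"],
    # Optionally select from an output group instead of DefaultInfo:
    # output_group = "headers",  # hypothetical group name
)
```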

copy_file

copy_file returns a DefaultInfo with exactly one file copied from a target providing either (a) a DirectoryPathInfo pointing to a file or (b) a DefaultInfo with exactly one file (not a directory).
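Both input forms can be sketched in a BUILD file as follows, assuming the bazel-lib 2.x load label. The file names and the :index_html target (a directory_path target) are hypothetical.

```starlark
load("@aspect_bazel_lib//lib:copy_file.bzl", "copy_file")

# (b) src is a plain file: a DefaultInfo with exactly one file.
copy_file(
    name = "copy_config",
    src = "config.in.json",
    out = "config.json",
)

# (a) src provides a DirectoryPathInfo pointing to a file
# inside a directory output.
copy_file(
    name = "copy_index",
    src = ":index_html",  # hypothetical directory_path target
    out = "index.html",
)
```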

copy_directory

copy_directory returns a DefaultInfo with exactly one directory copied from a target providing the same. In this way, it is like copy_file except that it does not accept a DirectoryPathInfo. I do not know why, and this seems like a gap in capabilities.
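A minimal BUILD sketch, assuming the bazel-lib 2.x load label and a hypothetical target :site that provides exactly one directory:

```starlark
load("@aspect_bazel_lib//lib:copy_directory.bzl", "copy_directory")

copy_directory(
    name = "site_copy",
    # Must provide a DefaultInfo with exactly one directory;
    # a DirectoryPathInfo target is not accepted here.
    src = ":site",
    out = "site_copy",
)
```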

copy_to_directory

copy_to_directory returns a DefaultInfo with exactly one directory, but it is so complicated that I never finished figuring out how it works. I know that, unlike copy_directory, it accepts DirectoryPathInfo targets in addition to DefaultInfo targets, but I ended up writing my own simpler rule for copying selections of files into a directory.
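For reference, the simplest invocation looks roughly like the sketch below, assuming the bazel-lib 2.x load label. The source names are hypothetical, and the many filtering and path-rewriting attributes that make the rule complicated are left at their defaults.

```starlark
load("@aspect_bazel_lib//lib:copy_to_directory.bzl", "copy_to_directory")

copy_to_directory(
    name = "bundle",
    srcs = [
        "README.md",    # a DefaultInfo target (source file)
        ":index_html",  # hypothetical DirectoryPathInfo target
    ],
)
```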