Understanding the suite of copy functions in bazel-lib
I finally got around to learning Bazel this year, and there is a very popular library of helper functions called bazel-lib. It contains multiple functions for copying files and directories with subtle differences between them that can be hard to tease apart at a glance, especially for a newcomer. Here, I present the mental model I constructed after studying them.
DirectoryPathInfo
First, we need to understand the distinction between the built-in
DefaultInfo provider and the DirectoryPathInfo provider
defined by bazel-lib and used in some (but not all!) of the helper functions.
DefaultInfo has one primary field, files, which holds a depset
of File objects. Files are like paths, but get special treatment from
Bazel. In particular, rule implementations get easy access to them through
ctx.file and ctx.files, short-cutting manual access through the DefaultInfo
provider, e.g. by ctx.attr.my_file[DefaultInfo].files.to_list()[0].
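To make the shortcut concrete, here is a minimal sketch of a rule that reaches the same File both ways. The rule name show_file and the attribute name my_file are hypothetical and exist only for illustration; the ctx.file shortcut assumes the attribute is declared with allow_single_file.

```starlark
# show_file.bzl -- hypothetical rule, not part of bazel-lib.
def _show_file_impl(ctx):
    # Shortcut: available because the attribute sets allow_single_file.
    via_shortcut = ctx.file.my_file

    # Long form: go through the DefaultInfo provider of the attribute's target.
    via_provider = ctx.attr.my_file[DefaultInfo].files.to_list()[0]

    # Both expressions should refer to the same File.
    if via_shortcut.path != via_provider.path:
        fail("expected the same File from both access paths")

    # Forward the File unchanged in this rule's own DefaultInfo.
    return [DefaultInfo(files = depset([via_shortcut]))]

show_file = rule(
    implementation = _show_file_impl,
    attrs = {
        "my_file": attr.label(allow_single_file = True),
    },
)
```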
Rule implementations get handles to File objects in only two ways:
- Declaring them as outputs with ctx.actions.declare_file and ctx.actions.declare_directory.
- Reading them as inputs by pulling them out of the providers of parameter attributes, e.g. with ctx.file. Apart from source files, all of these Files were declared as outputs by other actions.
Thus, every File that is not a source file is a declared output of some action.
DirectoryPathInfo is similar, in that it represents a path,
but different because (a) it is a provider, and (b) it is constructed in
just one way: by combining a File that represents a directory
(technically called a "tree artifact") and a string relative path.
bazel-lib uses DirectoryPathInfo to refer to files and directories created
by other actions that may not have been declared as outputs.
That is, when an action creates output files but only declares the output
directory they were created in, DirectoryPathInfo lets us refer to those
files that we know are there, without creating a File object. If we wanted
a File object, we would have to create it by copying the file to a new
output. DirectoryPathInfo lets us refer to specific files without copying
them.
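As a sketch of what a consumer does with the provider, the hypothetical rule below reads a DirectoryPathInfo and copies the file it points to into a freshly declared output, which is the "copy it to get a File" step described above. The rule name extract and its src attribute are my own inventions. I am assuming the provider exposes the fields directory (the tree artifact File) and path (the relative string), and that the repository is named aspect_bazel_lib; adjust the load label if your setup names it differently.

```starlark
# extract.bzl -- hypothetical rule, for illustration only.
load("@aspect_bazel_lib//lib:directory_path.bzl", "DirectoryPathInfo")

def _extract_impl(ctx):
    info = ctx.attr.src[DirectoryPathInfo]

    # The file inside the tree artifact has no File object of its own,
    # so we address it as <directory path>/<relative path>.
    path_in_tree = info.directory.path + "/" + info.path

    # Declaring an output is what gives us a real File for the copy.
    out = ctx.actions.declare_file(ctx.label.name)
    ctx.actions.run_shell(
        inputs = [info.directory],  # the whole directory is the action input
        outputs = [out],
        command = 'cp "$1" "$2"',
        arguments = [path_in_tree, out.path],
    )
    return [DefaultInfo(files = depset([out]))]

extract = rule(
    implementation = _extract_impl,
    attrs = {
        "src": attr.label(providers = [DirectoryPathInfo], mandatory = True),
    },
)
```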
Rules
The rest of these functions are rules that return either:
- a DefaultInfo provider with one or more File objects, or
- a DirectoryPathInfo provider pointing to one path under a directory File object
In this context, it is important to know that source files and directories
are implicitly targets providing a DefaultInfo with exactly one File
object representing that file or directory.
directory_path
directory_path returns a DirectoryPathInfo constructed from a target
providing a DefaultInfo with exactly one directory, and a string relative path.
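A BUILD file usage might look like the sketch below. The target :generated_tree and the path docs/README.md are hypothetical, and the load label assumes the repository is named aspect_bazel_lib.

```starlark
load("@aspect_bazel_lib//lib:directory_path.bzl", "directory_path")

directory_path(
    name = "readme_in_tree",
    # Hypothetical target whose DefaultInfo holds exactly one directory.
    directory = ":generated_tree",
    # Path relative to the root of that directory (made up for this example).
    path = "docs/README.md",
)
```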
output_files
output_files returns a DefaultInfo with a set of files selected from
another target's DefaultInfo provider, or from one group in its
OutputGroupInfo provider. Files are selected by matching their
"short path". All of the files must exist.
copy_file
copy_file returns a DefaultInfo with exactly one file copied from
a target providing either (a) a DirectoryPathInfo pointing to a file or
(b) a DefaultInfo with exactly one file (not a directory).
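Both input forms can be sketched in one BUILD file. Here :readme_in_tree stands for a hypothetical directory_path target and config.json for a hypothetical source file in the same package; the output names are arbitrary, and the load label again assumes the aspect_bazel_lib repository name.

```starlark
load("@aspect_bazel_lib//lib:copy_file.bzl", "copy_file")

# Case (a): src provides a DirectoryPathInfo pointing to a file.
copy_file(
    name = "copy_readme",
    src = ":readme_in_tree",  # hypothetical directory_path target
    out = "README.copy.md",
)

# Case (b): src provides a DefaultInfo with exactly one file.
copy_file(
    name = "copy_config",
    src = "config.json",  # hypothetical source file
    out = "config.copy.json",
)
```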
copy_directory
copy_directory returns a DefaultInfo with exactly one directory copied
from a target providing the same. In this way, it is like copy_file except
that it does not accept a DirectoryPathInfo. I do not know why, and this
seems like a gap in capabilities.
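For comparison with copy_file, a minimal invocation might look like this sketch. The :generated_tree target is hypothetical, and the load label assumes the aspect_bazel_lib repository name.

```starlark
load("@aspect_bazel_lib//lib:copy_directory.bzl", "copy_directory")

copy_directory(
    name = "tree_copy",
    # Hypothetical target whose DefaultInfo holds exactly one directory.
    src = ":generated_tree",
    # Name of the output directory to create in this package.
    out = "tree_copy_out",
)
```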
copy_to_directory
copy_to_directory returns a DefaultInfo with exactly one directory,
but it is so complicated that I never finished figuring out how it works.
I know that, unlike copy_directory, it accepts DirectoryPathInfo targets
in addition to DefaultInfo targets, but I ended up writing my own simpler
rule for copying selections of files into a directory.