What is a binary software package?

2019 June 5

In this post, I elaborate my simplified mental model for understanding binary software packages.

A software package is just a collection of libraries (i.e. bits of reusable code) and/or executables (i.e. programs), and any data (e.g. configuration files or documentation) necessary or helpful to their use.

For libraries and executables written in interpreted languages, the developer does not have to perform a separate compilation step:[1] source files can be executed directly. For those written in compiled languages, source files must be compiled before they can be used. Compilation produces code in a different language, one that can be executed directly by a machine. That machine may be virtual, as with Java or C#, in which case the compiled bytecode is executed by other software. Or it may be physical, as with C and C++, in which case the compiled machine code is executed by hardware, i.e. a processor.
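As a small illustration of the interpreted case, here is a sketch that assumes CPython (other interpreters behave differently). It runs source text with no separate build step, yet the interpreter still compiles it to bytecode for its own virtual machine along the way. The function name `add` and the label `<example>` are made up for the example.

```python
import dis

source = "def add(a, b):\n    return a + b\n"

# compile() turns source text into a code object that holds bytecode for
# CPython's virtual machine. The developer never runs a separate build step.
module_code = compile(source, "<example>", "exec")

# Executing the code object defines add() in the given namespace.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# The compiled form of the function is an ordinary bytes object.
bytecode = add.__code__.co_code
print(type(bytecode).__name__, len(bytecode), "bytes of bytecode")

# dis prints a human-readable listing of the virtual machine instructions.
dis.dis(add)
```

The listing that `dis` prints varies between CPython versions, which is a reminder that bytecode is tied to the virtual machine that runs it.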

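Machine code is generated according to an application binary interface (ABI), which controls many parameters, such as the sizes and alignment of data types, calling conventions, and the format of object files.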
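Some of these parameters can be observed from a running program. The following sketch, which assumes a CPython interpreter, uses the standard `struct` module to report the sizes that the interpreter's own C compiler chose for a few types, and the padding it inserts between a `char` and an `int`. The exact numbers depend on the machine, so none are shown here.

```python
import struct

# The "@" prefix selects the native sizes and alignment of the C compiler
# that built this interpreter, i.e. values fixed by the host's ABI.
native_sizes = {
    "int": struct.calcsize("@i"),
    "long": struct.calcsize("@l"),
    "pointer": struct.calcsize("@P"),
}

# A char followed by an int: the native layout usually pads after the char
# so that the int is aligned, while "=" uses fixed sizes with no padding.
native_record = struct.calcsize("@ci")
packed_record = struct.calcsize("=ci")  # 1 byte + 4 bytes

for name, size in native_sizes.items():
    print(f"{name}: {size} bytes")
print(f"char+int record: {native_record} bytes native, {packed_record} bytes packed")
```

On 64-bit Linux and macOS a `long` is typically 8 bytes, while on 64-bit Windows it is 4, which is one reason the same source compiles to incompatible binaries on those platforms.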

A program compiled for one ABI will not run (correctly, at least) on a machine with a different ABI, because of two important factors that shape the ABI:

  • The architecture is the instruction set used by the processor. It differs between processor families. Intel and AMD processors use x86 (32-bit) and x86-64 (64-bit) architectures. Most mobile processors (like the one in your phone) use the ARM architecture. Among other things, the architecture determines the formats of integers (e.g. their endianness) and floating point numbers.

  • The platform is the operating system, e.g. Windows, macOS, Linux, iOS, or Android. The platform includes the loader, which controls how programs and shared libraries are loaded and thus chooses their file format. Because standard libraries abstract over the details of system calls, e.g. for reading and writing files, they will have different implementations on different platforms.
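Both factors leave visible traces. The sketch below is a rough illustration, not a complete tool: it reports the host's architecture, platform, and byte order, shows how one integer is laid out under each endianness, and classifies a binary by the magic bytes at the start of the file. The helper names `describe_host` and `binary_format` are invented for this example, and the list of magic numbers covers only the three most common formats.

```python
import platform
import struct
import sys


def describe_host():
    """Report the two factors discussed above, plus the host byte order."""
    return {
        "architecture": platform.machine(),  # e.g. "x86_64" or "arm64"
        "platform": sys.platform,            # e.g. "linux", "darwin", "win32"
        "byte_order": sys.byteorder,         # "little" or "big"
    }


# The same 32-bit integer stored under each byte order.
value = 0x01020304
little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

# Magic numbers for Mach-O: 32-bit and 64-bit, in both byte orders, plus the
# "fat" multi-architecture wrapper (a value Java class files also start with).
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",
    b"\xca\xfe\xba\xbe",
}


def binary_format(header: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    if header.startswith(b"\x7fELF"):
        return "ELF"     # Linux and most other Unix-like systems
    if header.startswith(b"MZ"):
        return "PE"      # Windows
    if header[:4] in MACHO_MAGICS:
        return "Mach-O"  # macOS and iOS
    return "unknown"


print(describe_host())
print(little.hex(), big.hex())
```

To try it on a real file, read the first four bytes in binary mode, e.g. `open(path, "rb").read(4)`, and pass them to `binary_format`.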

Further, two object files with different ABIs cannot be safely linked[2] because of additional factors affecting the ABI:

  • The compiler has some discretion over aspects of the ABI. Compilers try to keep the same ABI from version to version, but changes are not unheard of. For C++, GCC and Clang both use the Itanium C++ ABI by default on Linux and macOS, while MSVC uses its own.

  • The standard library implementation defines types, constants, and inlined functions that change how your code using it is compiled.

  • In the presence of conditional compilation, compile-time options can affect the ABI.
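To see why such mismatches are unsafe rather than merely inconvenient, consider a simulation of the conditional-compilation case. This is only an analogy in Python, not real linking: the two `struct.Struct` layouts stand in for the same C struct compiled with and without a hypothetical debug option, and the names `library_make_record` and `application_read_record` are invented for the example.

```python
import struct

# The same record compiled two ways. With the (hypothetical) debug option
# enabled, an extra field sits between the id and the length.
LAYOUT_PLAIN = struct.Struct("<iI")   # id, length
LAYOUT_DEBUG = struct.Struct("<iiI")  # id, debug_tag, length

DEBUG_TAG = 3563


def library_make_record(record_id, length):
    """Stands in for a library that was built with the debug option on."""
    return LAYOUT_DEBUG.pack(record_id, DEBUG_TAG, length)


def application_read_record(record):
    """Stands in for an application that was built with the option off."""
    return LAYOUT_PLAIN.unpack_from(record)


record = library_make_record(7, 64)
seen_id, seen_length = application_read_record(record)

# No error is raised. The id happens to line up, but the application reads
# the debug tag where it expects the length.
print(f"id={seen_id} length={seen_length} (library wrote length=64)")
```

Nothing fails at the point where the two sides are joined. The wrong length only causes trouble later, when something uses it.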

A binary software package is a software package already compiled according to an ABI. Its executables can be safely run on machines with the same ABI, and its libraries can be safely linked with object files sharing the same ABI[3].
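One way to hold this mental model in code is to treat an ABI as a record of the five factors listed in this post and to call two binaries compatible only when the relevant fields match. This is a deliberately conservative sketch, not how any particular package manager works. The `Abi` class, its field names, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Abi:
    """The factors from this post that shape an ABI (a simplified model)."""
    architecture: str
    platform: str
    compiler: str
    standard_library: str
    options: frozenset = frozenset()


def can_run(package: Abi, machine_architecture: str, machine_platform: str) -> bool:
    """Running depends on the two machine-level factors."""
    return (package.architecture == machine_architecture
            and package.platform == machine_platform)


def can_link(first: Abi, second: Abi) -> bool:
    """Linking is treated as safe only when every factor matches."""
    return first == second


package = Abi("x86-64", "linux", "gcc", "libstdc++")
debug_build = Abi("x86-64", "linux", "gcc", "libstdc++", frozenset({"DEBUG"}))

print(can_run(package, "x86-64", "linux"))  # same architecture and platform
print(can_link(package, debug_build))       # options differ
```

Note that `can_link` never asks about the machine doing the linking, which matches the cross-compilation case: only the ABIs of the object files being combined matter.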

Footnotes

  1. Which is not to say that compilation does not happen. It often does, just-in-time, but it is not performed by the developer.

  2. Previously, I had written that ABI-incompatible object files will not link, but that is not always the case. ABI incompatibility can go undiscovered until it manifests as a run-time error. Thanks to Howard Hinnant for the correction!

  3. Even if it is not the same ABI as the machine that is compiling and linking the code, as in cross-compilation.