Rebar proposal

by Raph Levien
30 Dec 1999

The Gnu build environment, a combination of make, autoconf, automake, and libtool, works reasonably well, but has some shortcomings. I believe it is time for a better build tool.

This document contains a proposal for rebar, a new build tool. I think this is an important project, but I've got a lot of other projects, so it's going to go slowly unless somebody picks it up and runs with it. I'd be very happy to do the handoff.

Rebar is a total replacement for the tools listed above. Briefly, your project contains a rebar file with roughly the same information as configure.in and Makefile.am in the Gnu environment, but more declarative (less procedural). The rebar tool reads this file, adds its own system-specific information, and builds the software.

An additional goal of rebar is that it keeps track of library and other dependencies. Currently, there is no standard tool for this. The Gnome community has come up with an ad-hoc solution, gnome-config, but this tool has serious limitations. In particular, maintaining different versions remains tricky.

The rebar file

The rebar file specifies declaratively how to build the targets. Thus, the rebar file has absolutely minimal dependencies on the environment. I propose an XML format for the rebar file to emphasize the fact that it can be read and written by many different tools.

The fundamental stanza in a rebar file is a description of a target file with a corresponding list of inputs. For example, a library target is specified along with a list of .c files. Each .c file is tagged as a C file. This is enough information for rebar to know to invoke the C compiler to produce the .o file.

Another type of stanza is the generation of config files. What I have in mind here is that you specify a .c file roughly like the following:

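#include <stdio.h>

/* Run at build time; its output becomes the generated .h file. */
int
main (int argc, char **argv)
{
   /* sizeof yields a size_t, so cast it to match the %d conversion. */
   printf ("#define REBAR_INT_SIZE %d\n", (int) sizeof (int));
   return 0;
}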

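The rebar file specifies that running this program produces a .h file, which is in turn a dependency of the targets that include it. Of course, this kills cross-compiling, since a program compiled for the target machine generally cannot be run on the build machine (I need to worry about this).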

Finding libraries

Another important part of rebar is the creation of a namespace for libraries. Thus, the rebar file specifies that there is a dependency on the library "zlib", and rebar takes care of finding the include files and linking with the appropriate linker options.

This namespace should also be aware of version numbers, so that it's possible to maintain multiple versions of a library on the same system.

This functionality requires that rebar maintains a database of some kind mapping the name to specific config information for the library. For libraries that are generated with rebar, it shouldn't be that hard to enter the info into the database as part of the build/install process, but there also needs to be some kind of "alien" process for identifying libraries built outside the rebar domain.
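To make the idea of a versioned library namespace concrete, here is a minimal sketch in C. Everything in it is an assumption made for illustration: the struct lib_entry layout, the lib_lookup function, the sample paths, and the matching policy (same major version, newest minor version at or above a minimum) are hypothetical and not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one installed version of a named library. */
struct lib_entry
{
  const char *name;    /* name used in the rebar file, e.g. "zlib" */
  int major, minor;    /* installed version */
  const char *cflags;  /* options for finding the include files */
  const char *ldflags; /* options for linking */
};

/* Made-up sample data standing in for the real database. */
static const struct lib_entry lib_db[] = {
  { "zlib", 1, 0, "-I/opt/zlib-1.0/include", "-L/opt/zlib-1.0/lib -lz" },
  { "zlib", 1, 1, "-I/usr/include", "-lz" },
  { "libpng", 1, 0, "-I/usr/include", "-lpng -lz" },
};

/* Return the newest entry for NAME that has the requested major
   version and a minor version of at least MIN_MINOR, or NULL if no
   installed version qualifies. */
const struct lib_entry *
lib_lookup (const char *name, int major, int min_minor)
{
  const struct lib_entry *best = NULL;
  size_t i;

  assert (name != NULL);
  for (i = 0; i < sizeof (lib_db) / sizeof (lib_db[0]); i++)
    {
      const struct lib_entry *e = &lib_db[i];

      if (strcmp (e->name, name) != 0
          || e->major != major || e->minor < min_minor)
        continue;
      if (best == NULL || e->minor > best->minor)
        best = e;
    }
  return best;
}
```

With the sample data, asking for "zlib" with major version 1 selects the 1.1 entry, while asking for major version 2 finds nothing. A real database would live on disk and would also need to cover the "alien" libraries mentioned above.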

Install/packaging

I think rebar should be able to make .deb and .rpm packages directly. In addition, rebar should be able to install libs and apps itself (i.e., the feature analogous to "make install"). The overlaps between rebar and tools like rpm need to be investigated thoroughly.

Dependencies

A trivial implementation of rebar should be able to just go through the rebar file and build everything. However, in the modern development environment, it's important to build incrementally based on what's changed. To do this really well requires some fun techniques.

The goal for rebar is to always provide the same answer as a total rebuild, but in the common case skipping most duplicated work. This is, I feel, one of the serious disadvantages of the Gnu build environment: it's just too easy to get in a state where the incremental build is broken, even though the full rebuild works (or vice versa).

First, rebar manages a database of partially built targets. Each entry records an MD5 checksum of all the inputs to that target (i.e., the source file, options, and dependencies). When building from a rebar file, rebar computes the checksum for each target, then checks whether that checksum is already in the database. If it is not, rebar rebuilds the target.
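The checksum test can be sketched in a few lines of C. MD5 is not in the C standard library, so this sketch swaps in 64-bit FNV-1a, which is not a cryptographic hash; a real implementation would use MD5 or something stronger. The names hash_bytes, target_key, and needs_rebuild are hypothetical, and the database is simplified to an in-memory array of keys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold the bytes of S into the running hash H using FNV-1a
   (a stand-in for MD5).  A trailing separator byte keeps
   ("ab", "c") distinct from ("a", "bc"). */
static uint64_t
hash_bytes (uint64_t h, const char *s)
{
  assert (s != NULL);
  for (; *s != '\0'; s++)
    {
      h ^= (unsigned char) *s;
      h *= 0x100000001b3ULL;   /* FNV-1a 64-bit prime */
    }
  h ^= 0xff;
  h *= 0x100000001b3ULL;
  return h;
}

/* Combine every input of a target into one key: the source text,
   the compiler options, and the text of each dependency. */
uint64_t
target_key (const char *source_text, const char *options,
            const char *const *dep_texts, size_t n_deps)
{
  uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */
  size_t i;

  h = hash_bytes (h, source_text);
  h = hash_bytes (h, options);
  for (i = 0; i < n_deps; i++)
    h = hash_bytes (h, dep_texts[i]);
  return h;
}

/* Return 0 if KEY is already in the database, 1 if the target
   must be rebuilt. */
int
needs_rebuild (const uint64_t *db, size_t n_db, uint64_t key)
{
  size_t i;

  assert (db != NULL || n_db == 0);
  for (i = 0; i < n_db; i++)
    if (db[i] == key)
      return 0;
  return 1;
}
```

Because the options are part of the key, an -O2 build and a -g build of the same source get different keys and can sit side by side in the database.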

It's probably best to think of all the .o files as a cache. For example, it might be desirable to build all three of -O2, -g, and -pg. Keeping all three sets around may make incremental building a lot faster. However, if you haven't built a -pg in ages, and you need the disk space, it's totally reasonable to delete that set. The size of the cache should be user-configurable - if you're tight on disk space (as I perennially am on my laptop), then setting it small makes sense, but if you just bought a big new drive, set it big.
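One possible policy for keeping the cache under its configured size is to discard the least recently used objects first. The sketch below assumes that policy; the proposal does not prescribe one, and struct cache_entry and cache_trim are hypothetical names. Entries are only marked as evicted here, whereas a real tool would also delete the files.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one cached object file. */
struct cache_entry
{
  const char *path;    /* e.g. "pg/a.o" */
  unsigned long size;  /* size in bytes */
  long last_used;      /* time of last use; larger means more recent */
  int evicted;         /* nonzero once dropped from the cache */
};

/* Mark least recently used entries as evicted until the total size
   of the remaining entries is at most LIMIT.  Return that total. */
unsigned long
cache_trim (struct cache_entry *entries, size_t n, unsigned long limit)
{
  unsigned long total = 0;
  size_t i;

  assert (entries != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (!entries[i].evicted)
      total += entries[i].size;

  while (total > limit)
    {
      struct cache_entry *oldest = NULL;

      for (i = 0; i < n; i++)
        if (!entries[i].evicted
            && (oldest == NULL || entries[i].last_used < oldest->last_used))
          oldest = &entries[i];
      if (oldest == NULL)
        break;                  /* nothing left to evict */
      oldest->evicted = 1;
      total -= oldest->size;
    }
  return total;
}
```

Since every cached object can be regenerated from its inputs, evicting the wrong entry costs only build time, never correctness.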

To simplify life for programmers, there should be implicit dependencies that require analysis of the source code to determine, in particular, the dependency on .h files that comes from #include directives. The Gnu tools handle these reasonably well (make deps), and this functionality should be preserved.
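As a rough illustration of how such implicit dependencies could be discovered, the following C function recognizes a quoted #include directive on a single line and extracts the header name. It is a deliberately naive sketch: parse_local_include is a hypothetical name, angle-bracket includes are skipped on the assumption that system headers are tracked separately, and comments, macros, and conditional compilation are ignored. Running the real preprocessor would be more accurate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* If LINE has the form  # include "name"  (with optional blanks),
   copy name into OUT and return 1.  Return 0 for any other line,
   including  #include <name>,  or if the name does not fit in OUT. */
int
parse_local_include (const char *line, char *out, size_t out_size)
{
  const char *p = line;
  const char *end;
  size_t len;

  assert (line != NULL && out != NULL && out_size > 0);
  while (*p == ' ' || *p == '\t')
    p++;
  if (*p != '#')
    return 0;
  p++;
  while (*p == ' ' || *p == '\t')
    p++;
  if (strncmp (p, "include", 7) != 0)
    return 0;
  p += 7;
  while (*p == ' ' || *p == '\t')
    p++;
  if (*p != '"')
    return 0;
  p++;
  end = strchr (p, '"');
  if (end == NULL)
    return 0;
  len = (size_t) (end - p);
  if (len == 0 || len >= out_size)
    return 0;
  memcpy (out, p, len);
  out[len] = '\0';
  return 1;
}
```

Each header found this way would be added to the target's dependency list and scanned in turn for its own #include directives.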

Finally, because rebar knows all the dependencies, it should be able to build with massive parallelism. This includes parallelism across subdirectories, which is a current failing of make (make waits for all child processes to terminate before leaving a subdirectory).
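To show how a complete dependency graph exposes parallelism, the sketch below groups targets into "waves": every target in a wave depends only on targets from earlier waves, so a whole wave can be built at once, regardless of which subdirectory each target lives in. The function schedule_waves, the MAX_TARGETS limit, and the adjacency-matrix representation are assumptions made for brevity. A real scheduler would start each target as soon as its own dependencies finish instead of waiting for a full wave.

```c
#include <assert.h>

#define MAX_TARGETS 16

/* dep[i][j] nonzero means target i needs target j built first.
   On return, wave[i] is the earliest round in which target i can be
   built; targets sharing a wave number are independent of each other
   and can be built in parallel.  Returns the number of waves, or -1
   if the dependencies contain a cycle. */
int
schedule_waves (int n, int dep[MAX_TARGETS][MAX_TARGETS],
                int wave[MAX_TARGETS])
{
  int done = 0, round = 0, i, j;

  assert (n >= 0 && n <= MAX_TARGETS);
  for (i = 0; i < n; i++)
    wave[i] = -1;               /* -1 means not yet scheduled */

  while (done < n)
    {
      int progressed = 0;

      for (i = 0; i < n; i++)
        {
          int ready = 1;

          if (wave[i] != -1)
            continue;
          /* Ready only if every dependency was placed in an
             earlier round, not this one. */
          for (j = 0; j < n; j++)
            if (dep[i][j] && (wave[j] == -1 || wave[j] == round))
              ready = 0;
          if (ready)
            {
              wave[i] = round;
              progressed = 1;
              done++;
            }
        }
      if (!progressed)
        return -1;              /* cycle: nothing could be scheduled */
      round++;
    }
  return round;
}
```

For example, with two object files, a library built from both, and an application linked against the library, the object files land in wave 0, the library in wave 1, and the application in wave 2.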

Build contexts

Rebar should handily support multiple build contexts. The example of -O2, -g, and -pg builds has already been given. There are many other cases in which this capability is useful, including cross compiling, using profiling and debugging tools, and so on.

Extensibility

Because rebar requires language-specific knowledge (for example, how to invoke the C compiler for a .c file), it needs to be extensible to handle all possible projects, including multilanguage ones. Of course, extensibility means interoperability problems, but c'est la vie.
