work on build systems writup
parent 486cbd5b9a
commit 6244a86a61

@@ -23,6 +23,7 @@ body {
  max-width: 40em;
  display: flex;
  margin: auto;
  tab-size: 2;
}

main {

@@ -35,13 +36,19 @@ main {
  font-size: 1.2em;
}

p {
  text-align: justify;
  hyphens: auto;
}

code {
  display: inline-block;
  font-family: cascadia;
  font-size: 0.9em;
  background: #1e1e22;
  color: #dddde1;
  padding: 0.1em 0.5em 0.1em;
  padding: 0.1em;
  margin: 0.1em 0em 0.1em;
}

table {

@@ -83,7 +90,7 @@ section {
  text-align: center;
}

section h2,p {
section h2,section p {
  text-align: left;
}

@@ -45,25 +45,171 @@ OBJ=hellomake.o hellofunc.o
hellomake: $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS)
```
Makefiles attempt to abstract away the complexity of the C compilation process.
Make attempts to abstract away the complexity of the C compilation process.
Variables and pattern matching of file names are particularly well suited for
managing compiler flags and object files. However, by far the most attractive
feature of Makefiles is the ability to simply type `make` to compile the entire
feature of Make is the ability to simply type `make` to compile the entire
program.

As software becomes more complex, so too does the task of building it. The
limitations of C make this problem particularly egregious, given its fragile
dependency resolution and lack of meta-programming. Makefiles have attempted
to bridge this gap, and are a Turing-Complete language in their own right.
The [Makefile which builds the Linux kernel](https://github.com/torvalds/linux/blob/master/Makefile)
dependency resolution and lack of meta-programming. Makefiles are an attempt to
bridge this gap, and are a Turing-Complete language in their own right.
The Makefile which [builds the Linux kernel](https://github.com/torvalds/linux/blob/master/Makefile)
is over 2000 lines as of writing. The massive demands placed on this
intermediary language have exposed its weak points, mainly that it is
stringly-typed and full of cryptic, unintuitive syntax. Maintaining complex
Makefiles contributes to the difficulty of building software almost as much as
it reduces it.

## cmake
When I first learned about CMake and what it does, I actually laughed out loud.
I am lucky enough to have never written a CMake file, so this section will
be brief. Just as Makefiles abstract away the complexity of building C, CMake
abstracts away the complexity of building Makefiles.
Just as Makefiles abstract away the complexity of compiling C, CMake abstracts
away the complexity of creating Makefiles. CMake is a great example of what
happens to software development when there are no adults in the room, so to
speak. Compiling a C program should be a simple task, ideally one that requires
nothing more than a C compiler. Failing that, a simple build scripting language
should be more than enough to handle even industrial use cases. When our build
system needs a build system, we have completely lost the plot and need to
reevaluate the problem from square one.
```
CMake -> Makefile -> gcc/clang -> Assembly
```

## compile targets
Imagine a world where the Makefile language were more expressive, functional, and
well-thought-out. Suddenly the idea of CMake becomes silly; clearly introducing
another language into the mix would only slow down development and introduce an
entirely new category of bugs. CMake can only exist because Make failed to
accomplish its goal. The same could be said for the GCC and Clang compilation
syntax. Rather than fix the underlying issue, we treat the failed product as a
new compile target and build a new thing to abstract away (never replace!) the
old thing.

Developers are not (generally) stupid; this pattern exists for a reason. In the
case of C, it is sometimes necessary to execute arbitrary code at build time.
The obvious solution is to create a new language to handle this need, but why
is the original language not sufficient? Make is written in C, so by definition
C can do anything Make can do. The issue is that C source files do not
contain enough information for the compiler to build the entire package. This
information must be embedded in another, nonstandard format, which itself must
be parsed and executed by a nonstandard build tool.

## shebang
Developers have become overly complacent with build systems. Look at any project
today, and in the root directory you will see a layer of congealed fat:
`package.json`, `CMakeLists.txt`, `Cargo.toml`, `build.gradle`, maybe a Python
virtual environment, along with any ignore files, linter configs, and so on. Every
new tool, language, and config file means another program to install and another
step in the build process. Every one of these dependencies makes the project
more fragile and less portable. Meanwhile, we are not making good use of the
tools we already have. We ought to be demanding more from language designers.
The build process should not be an afterthought left for developers to figure
out; it should be a core consideration when designing grammar and syntax.

If you have ever used a scripting language, you are probably familiar with the
shebang line.
```python
#!/usr/bin/env python3
print("Hello World!")
```
This wonderfully useful one-liner captures what I mean by making use of existing
tools, and treating the build process as a grammar concern. This Python file
describes how to run itself to the system which invokes it. Since the `#`
character is used as a comment in Python, the line can be safely ignored by
any other programs or tools that read the file. This system is not perfect: the
name or path of the Python executable may vary between systems, and the shebang
relies on the operating system to interpret it (technically a build system). Expanding on
this concept may help alleviate our build system woes.

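To make the mechanism concrete, here is a minimal sketch in C of how a loader might split a shebang line into an interpreter path and one optional argument. The function name `parse_shebang` and its signature are hypothetical, made up for illustration. Treating everything after the interpreter as a single argument roughly mirrors what Linux does, and other systems may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a real system API.
 * Splits "#!<interpreter> [argument]" into its two parts.
 * Returns 1 on success, 0 if the line is not a usable shebang line
 * or a part does not fit in the supplied buffer. */
int parse_shebang(const char *line, char *interp, size_t interp_len,
                  char *arg, size_t arg_len)
{
    if (interp_len == 0 || arg_len == 0)
        return 0;
    interp[0] = '\0';
    arg[0] = '\0';

    /* A shebang line must start with the two characters "#!". */
    if (strncmp(line, "#!", 2) != 0)
        return 0;

    const char *p = line + 2;
    p += strspn(p, " \t");                /* skip blanks after "#!" */

    size_t n = strcspn(p, " \t\r\n");     /* interpreter path ends at a blank */
    if (n == 0 || n >= interp_len)
        return 0;
    memcpy(interp, p, n);
    interp[n] = '\0';

    p += n;
    p += strspn(p, " \t");                /* skip blanks before the argument */

    n = strcspn(p, "\r\n");               /* rest of the line is one argument */
    while (n > 0 && (p[n - 1] == ' ' || p[n - 1] == '\t'))
        n--;                              /* trim trailing blanks */
    if (n >= arg_len)
        return 0;
    memcpy(arg, p, n);
    arg[n] = '\0';
    return 1;
}
```

For the Python file above, this yields the interpreter `/usr/bin/env` and the argument `python3`; the script's own path would then be appended when the interpreter is started.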
## doing better
Let's look at how C syntax could be changed to adopt some of these ideals,
starting with a simple example:
```c
#!/usr/bin/gcc -E
// Warn or error if specific compiler not used
#compiler gcc 12.2.0
#semver 1.0.0
#ifdef RELEASE
#opt o2
#endif
#warn all
#libs lib/
#output build/MyProgram
// Warn or error if library semver does not match
#include "mylib.h" "0.2.*"

int main() {
    /* ... */
    return 0;
}
```
Here, I have replaced command-line flags with a special `#` prefixed syntax.
Since all the compiler directives are in-line with the source code itself,
we can take advantage of the shebang line just like scripting languages do.
Using the `#ifdef` directive, we can even conditionally enable flags for
release mode. Let's see what we can do with an even more radical approach:
```c
// build.c
#include <string.h>  // strcmp
#include <compile.h>
#include <link.h>

#define RELEASE 0

int main(int argc, char* argv[]) {
    // Struct representing the invoked compiler
    compiler_t compiler = get_compiler();
    // Require gcc, major version 12 or newer
    if (strcmp("gcc", compiler.name) != 0
            || compiler.semver.major < 12) {
        // Abort build with error
        emit_error("Incompatible compiler version!\n");
        return 1;
    }

    semver_t version = {.major=1, .minor=0, .patch=0};
    int opt_level;
    // We could easily check for a flag
    // in argv here instead
    if (RELEASE) {
        opt_level = 2;
    } else {
        opt_level = 0;
    }
    // A realistic function would probably take
    // some structure containing compile directives
    artifact_t executable = compile(
        &compiler, "main.c", opt_level, version
    );
    artifact_t mylib = load_dylib("lib/mylib.so");
    link(&executable, &mylib);
    write_artifact(&executable, "bin/MyProgram");
    return 0;
}
```
In this example, we create a new file `build.c` which acts as a pseudo-Makefile.
The `compile.h` and `link.h` headers are implemented by the compiler, and so do
not need to be linked from the system's libc. All flags passed to the compiler are
handed off to the `main()` function. It is easy to imagine an
equivalent to `make clean` that erases all build artifacts, or a caching system
that only rebuilds modified files.

## going further
I am not a C developer by any stretch, and so I will spare you any more
pseudocode. I hope these examples show that replacing Makefiles with pure
C is not such an unreasonable idea. Still, we can go even further; imagine
if we split the `compile()` function into lexing, parsing, and IR (llvm/gcc
bytecode) conversion functions. This would make meta-programming simple and
straightforward, and even allow for the introduction of program-specific syntax.
Developers could create libraries for common build tasks such as cloning dependency
git repositories, running tests, or submitting binaries to package managers.

## disadvantages
Comparing my pseudocode to the Makefile example, it is obvious which is more
idiomatic and understandable. This is partially due to my lack of creativity
and skill as a C developer. However, I imagine Make will always have an advantage
here, at least when it comes to small projects. Even worse, on closer examination
we have not yet achieved the goal of in-lining build information. While our build
system is now in C, and does not rely on external tools, it is still a separate
entity from the project itself. The minimum possible C program is now two files
rather than just one. So far I have tried to conform as closely as possible to
standard C syntax and grammar, but this approach will always feel more like
a hack than a well-thought-out language feature.