diff --git a/src/css/style.css b/src/css/style.css
index b9fef16..2851750 100644
--- a/src/css/style.css
+++ b/src/css/style.css
@@ -103,8 +103,8 @@ figcaption {
     line-height: 110%;
 }
 
-img[src],iframe,code {
-    border-radius: 0.25em;
+img,iframe,code {
+    border-radius: 0.4em;
 }
 
 figure img {
@@ -112,6 +112,7 @@ figure img {
     width: 100%
 }
 
+
 form {
     border: dashed black;
 }
diff --git a/src/writings/1-build-systems.md b/src/writings/1-build-systems.md
index ad4ff5e..77f1306 100644
--- a/src/writings/1-build-systems.md
+++ b/src/writings/1-build-systems.md
@@ -1,36 +1,47 @@
 # on build systems
 Recently I have been thinking about what makes for good build system. I want to
 analyze the major pain points I have encountered building software,
-and identify where these systems go wrong. To do this I have picked several
-languages I am already familiar with to use as case studies
-## definitions
+and identify where these systems go wrong. Seeing as C is one of the most
+frustrating languages to build for in my experience, I will use the language
+as a case study
+## definition
 I think of build systems as a very broad category of software; the goal of
-which is to automate the process of packaging or executing other software.
-This typically involves several subtasks. Resolving dependencies, compiling,
-interpreting, linking, deploying, packaging, executing - software that does one
-or more of these things counts as a build system in my book
+which is to automate the process of building other software.
+This typically involves several tasks, mainly:
+* linting
+* formatting
+* interpreting
+* compiling
+* linking
+* packaging
+* deploying
+* executing
+
+Software that performs one or more of these tasks is a build tool. Any number of
+build tools can make up a build system
 
 ## c and c++
 The C family of languages has a quite complicated ecosystem of competing build
 systems. To start with, there are the compilers themselves:
 [GCC](https://gcc.gnu.org/) and [Clang](https://clang.llvm.org/). A typical
-invocation of looks like this:
+invocation of either looks like this:
 ```bash
 cc main.c foo.c bar.h -Iinclude/ -Llib/ -O2 -oProgram
 ```
-This is quite verbose as far as build commands go. The path of every source file
-must be specified, as well as separate folders for library headers and object
-files. Most software also makes use of numerous compiler flags, most of which
-have incredibly cryptic names.
+This is quite verbose as far as CLI tools go. The path of every source file
+must be specified, as well as the location of libraries to be linked. Path
+variables also play a role in the linking process, adding a layer of hidden
+complexity Most software also makes use of numerous compiler flags, all of
+which must be typed every time.
 
 To compile a C project using only the compiler requires first learning its
 structure, wrangling each of its dependencies manually, and reading
-documentation to find the appropriate build flags for your platform. For any
-non-trivial program, simply typing the build command becomes a challenge
+documentation to find the appropriate build flags for your platform. This
+is not a reasonable ask for any developer
 ## make
-The problems introduced by the C family of compilers have proven intractable,
-and so another layer of abstraction is necessary. Makefile is a rudimentary
-scripting language primarily used to build C family languages.
+Compiler developers have decided that these problems are out of scope, and
+so another layer of abstraction is necessary. Make is a rudimentary scripting
+language primarily used to build C projects.
 ```make
 # Example Makefile taken from:
 # https://www.cs.colby.edu/maxwell/courses/tutorials/maketutor/
@@ -53,36 +64,36 @@ program.
 
 As software becomes more complex, so too does the task of building it.
 The limitations of C make this problem particularly egregious, given its fragile
-dependency resolution and lack of meta-programming. Makefiles are an attempt to
-bridge this gap, and are a Turing-Complete language in their own right.
+dependency resolution and lack of meta-programming. Make is an attempt to
+bridge this gap, and is a Turing-Complete language in its own right.
 
 The Makefile which [builds the Linux kernel](https://github.com/torvalds/linux/blob/master/Makefile)
 is over 2000 lines as of writing. The massive demands placed on this
 intermediary language have exposed its weak points, mainly that it is
 stringly-typed and full of cryptic, unintuitive syntax. Maintaining complex
 Makefiles contributes to the difficulty of building software almost as much as
-it reduces
+it helps
 ## cmake
-Just as Makefiles abstract away the complexity of compiling C, CMake abstracts
-away the complexity of creating Makefiles. CMake is a great example of what
-happens to software development when there are no adults in the room, so to
-speak. Compiling a C program should be a simple task, ideally one that requires
-nothing more than a C compiler. Failing that, a simple build scripting language
-should be more than enough to handle even industrial use cases. When our build
-system needs a build system, we have completely lost the plot and need to
-reevaluate the problem from square one
+Just as Make acts as an abstraction over C compilers, CMake acts as an
+abstraction over Makefiles. CMake is a great example of what happens to software
+development when there are no adults in the room, so to speak. Compiling a C
+program should be a simple task, ideally one that requires nothing more than a
+C compiler. Failing that, a simple build scripting language should be more than
+enough to handle even industrial use cases. When our build system needs a build
+system, we have completely lost the plot and need to reevaluate the problem from
+square one
 ```
 CMake -> Makefile -> gcc/clang -> Assembly
 ```
 
 ## compile targets
-Imagine a world where the Makefile language was more expressive, functional, and
+Imagine a world where the Make language was more expressive, functional, and
 well-thought-out. Suddenly the idea of CMake becomes silly; clearly introducing
 another language into the mix would only slow down development and introduce an
-entirely new category of bugs. CMake can only exist because Makefile failed to
+entirely new category of bugs. CMake can only exist because Make failed to
 accomplish its goal. The same could be said for the GCC and Clang compilation
 syntax. Rather than fix the underlying issue, we treat the failed product as a
-new compile target and build a new thing to abstract away (never replace!) the
+new compilation target and build a new thing to abstract away (never replace!) the
 old thing.
 
 Developers are not (generally) stupid; this pattern exists for a reason. In the
@@ -90,8 +101,8 @@ case of C, it is sometimes necessary to execute arbitrary code at build-time.
 The obvious solution is to create a new language to handle this need - but why
 is the original language not sufficient? Make is written in C, so by definition
 C can do anything Make can do. The issue is that C source files do not
-contain enough information for the compiler to build the entire package. This
-information must be embedded in another, nonstandard format, which itself must
+contain enough information for the compiler to build the entire program. This
+information must be embedded in another nonstandard format, which itself must
 be parsed and executed by a nonstandard build tool
 
 ## shebang
@@ -118,8 +129,8 @@ describes how to run itself to the shell which invokes it. Since the `#`
 character is used as a comment in Python, the line can be safely ignored by any
 other programs or tools that read the file. This system is not perfect, the name
 or path of the python executable may vary between systems, and the shebang
-relies on a shell to interpret it (technically a build system). Expanding on
-this concept may help alleviate our build system woes.
+relies on a shell to interpret it (technically a build system). However,
+expanding on this concept may help alleviate our build system woes
 
 ## doing better
 Let's look at how C syntax could be changed to adopt some of these ideals,
@@ -161,7 +172,7 @@ int main(int argc, char* argv[]) {
 
     if (strcmp("gcc", compiler.name) != 0 || compiler.semver.major <= 12) {
         // Abort build with error
-        emit_error("Incompatible compiler version!\n");
+        emit_error("Incompatible compiler version!");
         return 1;
     }
 
@@ -189,27 +200,83 @@ The `compile.h` and `link.h` includes are compiler implemented, and so do not
 need to be linked from the system's libc. All flags passed to the compiler are
 handed off to the `main()` function. It is easy to imagine an equivalent to
 `make clean` that erases all build artifacts, or a caching system
-that only rebuilds modified files.
+that only rebuilds modified files
 
 ## going further
 I am not a C developer by any stretch, and so I will spare you any more
 pseudocode. I hope these examples show that replacing Makefiles with pure C
 is not such an unreasonable idea. Still, we can go even further; imagine
-if we split the `compile()` function into lexing, parsing, and IR (llvm/gcc
-bytecode) conversion functions. This would make meta-programming simple and
+if we split the `compile()` function into lexing, parsing, and IR generating
+intermediary functions. This would make meta-programming simple and
 straightforward, and even allow for the introduction of program-specific syntax.
-Developers could create libraries for common build tasks such as cloning dependency
-git repositories, running tests, or submitting binaries to package managers.
+Developers could create libraries for common build tasks such as cloning git
+repositories, running tests, or submitting binaries to package managers
 
-## disadvantages
+## setbacks
 Comparing my pseudocode to the Makefile example, it is obvious which is more
-idiomatic and understandable. This is partially due to my lack of creativity
-and skill as a C developer. However, I imagine Make will always have an advantage
-here, at least when it comes to small projects. Even worse, on closer examination
-we have not yet achieved the goal of in-lining build information. While our build
-system is now in C, and does not rely on external tools, it is still a separate
-entity from the project itself. The minimum possible C program is now two files
-rather than just one. So far I have tried to conform as closely as possible to
-standard C syntax and grammar, but this approach will always feel more like
-a hack than a well-thought-out language feature
+idiomatic and understandable. This is partially due to my lack of creativity and
+skill as a C developer. However, I imagine Make will always have an advantage
+here, at least when it comes to small projects. While our build system is
+now in C, it is still a separate entity from the project itself. The minimum
+possible C program is now two files rather than just one. So far I have tried
+to conform as closely as possible to standard C syntax and grammar, but this
+approach will always feel like a hack more than a well-thought-out language
+feature
+## the ouroboros
+Most languages draw a very strong distinction between compile-time and run-time
+code. Typically, compile-time execution may happen only within macros or
+constant functions, if it is even allowed at all. This habit can be traced back
+to assembly programmers who deemed self-modifying code a dangerous antipattern.
+This mindset is what I believe drives us to create these leaning towers of build
+systems.
+
+What would a language built around meta-programming look like? I suspect that
+a language with a truly infinite degree of self reflection is possible. Such a
+language could be far more expressive than its peers using less syntax. Imagine
+if a library could implement a new language-wide keyword, and even implement
+that keyword using the same keyword. Perhaps concepts as basic as structs,
+enumerated types, and integers could be defined within the language itself. The
+line between compilation and execution disappears. The line between language
+and program grows thin. I imagine this process as if the language were eating
+itself, like an ouroboros.
+
+
+
+
+
+
+
+If such a language existed, it follows that every other language would simply
+be a strict subset of this language (lets call it "every-lang"). For example,
+we could write an every-lang library which implements every piece of
+[Lua](https://www.lua.org/) syntax and grammar on a meta-program level. A user
+that imports this library could then simply write code in Lua, and compile the
+program using the every-lang compiler. This library would effectively be a Lua
+build system, that is also an every-lang build system, that is also an "every
+language" build system
+
+```lua
+-- A Lua program? or an every-lang program?
+require "everylang"
+
+function fact(n)
+    if n == 0 then
+        return 1
+    else
+        return n * fact(n-1)
+    end
+end
+
+print(fact(5))
+```
+
+## on crashing and burning
+This every-lang is, to put it lightly, a little far-fetched. Such a language
+would be nearly impossible to implement or reason about. A practically useful
+language must make compromises with its users, and the fundamental laws of
+computation. I believe that the next frontier for language design will be
+pushing the boundary on this front - how close can we get to every-lang without
+crashing and burning?