Motivation
HTML is a powerful tool for creating static web pages, but it
is not easily modular or scalable. Front-end frameworks like
React provide reusable components. I want to use components
without the overhead of a front-end framework. I also want the
final output to be statically generated, minified, and shipped
with inlined CSS and JavaScript, rather than generated on the fly
by the server or the client.
Approach
After looking at some of the HTML parsing libraries, I was
unsatisfied with the approach most of them take. The DOM
is represented as a tree structure, where each node owns a
separately allocated list of its children.
<@code lang="rust">
// From the html_parser crate
pub enum Node {
    Text(String),
    Element(Element),
    Comment(String),
}

pub struct Element {
    pub name: String,
    pub attributes: HashMap<String, Option<String>>,
    pub children: Vec<Node>,
    // other fields omitted
}
@code>
This scatters nodes across the heap, which is bad for cache locality,
and it effectively forces the use of recursion to traverse the DOM.
Recursion is undesirable for performance and robustness reasons: every
call adds overhead, and deeply nested input can overflow the stack.
My approach is to create a flat array of elements.
<@code lang="rust">
// My implementation (fields omitted)
pub enum HtmlElement {
    DocType,
    Comment(/* */),
    OpenTag { /* */ },
    CloseTag { /* */ },
    Text(/* */),
    Script { /* */ },
    Style { /* */ },
    Directive { /* */ },
}
@code>
This approach lends itself to linear, iterative algorithms. An
element's children can be represented as a slice of the DOM array,
running from its opening tag to its matching closing tag.
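To make the slicing idea concrete, here is a minimal sketch of finding an element's children with a single forward pass and a depth counter instead of recursion. `FlatEl` and `children_of` are hypothetical names used only for illustration, not the actual implementation, and the sketch assumes every open tag has a matching close tag (no void elements such as `<br>`).

```rust
// Hypothetical, simplified stand-in for the flat element type.
// The real enum has more variants and fields.
#[derive(Debug, Clone, PartialEq)]
pub enum FlatEl {
    OpenTag(String),
    CloseTag(String),
    Text(String),
}

/// Returns the elements strictly between the open tag at index `open`
/// and its matching close tag. Returns `None` if `open` is out of
/// bounds, is not an open tag, or the tag is never closed.
pub fn children_of(dom: &[FlatEl], open: usize) -> Option<&[FlatEl]> {
    if !matches!(dom.get(open)?, FlatEl::OpenTag(_)) {
        return None;
    }
    // Nesting depth relative to the tag we started on.
    let mut depth = 0usize;
    for (i, el) in dom.iter().enumerate().skip(open) {
        match el {
            FlatEl::OpenTag(_) => depth += 1,
            FlatEl::CloseTag(_) => {
                // depth is at least 1 here, because the scan starts on
                // an open tag and returns as soon as depth reaches 0.
                depth -= 1;
                if depth == 0 {
                    return Some(&dom[open + 1..i]);
                }
            }
            FlatEl::Text(_) => {}
        }
    }
    None
}
```

Because the result is a borrowed slice of the same array, no elements are copied, and the scan uses constant extra space regardless of how deeply the document is nested.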
Templates
I define a "template" as a user-defined element that expands into
a larger piece of HTML. Templates can pass on children and attributes
to the final output. Some special templates perform other functions,
like inlining a CSS file or syntax highlighting a code block.
<@code lang="html">
<@children/>
<@children/>
@code>
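As a rough illustration of how children might be passed through on a flat array, the sketch below splices the caller's child slice into a template body wherever a children placeholder appears. This is an assumption about the mechanism, not the actual implementation: `TemplateEl`, its `Children` variant (standing in for `<@children/>`), and `expand` are all hypothetical names, and attribute forwarding is left out.

```rust
// Hypothetical, simplified element type for this sketch only.
#[derive(Debug, Clone, PartialEq)]
pub enum TemplateEl {
    OpenTag(String),
    CloseTag(String),
    Text(String),
    /// Stand-in for the `<@children/>` directive.
    Children,
}

/// Expands a template body into a new flat array, replacing every
/// `Children` placeholder with a copy of the caller's children.
pub fn expand(template: &[TemplateEl], children: &[TemplateEl]) -> Vec<TemplateEl> {
    let mut out = Vec::with_capacity(template.len() + children.len());
    for el in template {
        match el {
            TemplateEl::Children => out.extend_from_slice(children),
            other => out.push(other.clone()),
        }
    }
    out
}
```

Since both the template body and the children are flat slices, expansion is a single linear pass that appends to an output vector, with no tree surgery required.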