language ideas started
parent 618f7a73ce
commit bf0d65f176
@@ -1,3 +1,8 @@
+@font-face {
+    font-family: 'outfit';
+    src: url(/res/Outfit.woff);
+}
+
 @font-face {
     font-family: 'atkinson';
     src: url(/res/atkinson-regular.woff2);
@@ -14,7 +19,7 @@
 }
 
 body {
-    font-family: 'merriweather';
+    font-family: 'outfit';
     background-image: url("/res/grey.png");
     background-repeat: repeat;
     height: 100%;
@@ -28,8 +33,8 @@ main {
     background-color: #ffeedd;
     margin: auto;
     max-width: 40em;
-    line-height: 1.8em;
-    font-size: 1.2em;
+    line-height: 1.4em;
+    font-size: 1.4em;
 }
 
 p {
@@ -45,7 +50,9 @@ code {
     background: #1e1e22;
     color: #dddde1;
     padding: 0.1em;
-    margin: 0.1em 0em 0.1em;
+    padding-left: 0.3em;
+    padding-right: 0.3em;
+    border-radius: 0.4em;
 }
 
 table {
@@ -104,7 +111,7 @@ figcaption {
     line-height: 110%;
 }
 
-img,iframe,code {
+img,iframe {
     border-radius: 0.4em;
 }
 
BIN src/res/Outfit.woff Normal file
Binary file not shown.
374 src/writings/2-language-ideas.md Normal file
@@ -0,0 +1,374 @@
# language ideas
Programming languages are something I think about a lot. This
is a list of design choices I think are interesting, and would
like to implement some day. Treat this page as documentation for
a hypothetical language. Ideas are stolen liberally from
[zig](https://ziglang.org/), [odin](https://odin-lang.org/), jai,
[go](https://go.dev/), and [rust](https://www.rust-lang.org/).

## guiding principles
A good language is comprehensive but not complex. It should provide a
reasonable implementation for every common algorithm and data type, as well
as many uncommon ones. Its use of concepts and keywords should be economical;
each should perform a unique function not possible in its absence. Language
features should map closely onto the real-world mathematical and theoretical
ideas they are derived from. It should be fast, simple, and succinct,
without compromise.

## primitives
Algebraic primitives are written as a single character followed by a data
size. They include wholes (unsigned), integers (signed), reals (floating point),
complex, and quaternion types. The components of complex and quaternion numbers
are reals, so a complex is twice and a quaternion four times as wide as its
component type. Primitive data sizes may be 8, 16, 32, 64, 128, or platform defined.
Reals must have a data size greater than 8. Algebraic primitives may optionally
specify that they are big or little endian by appending `be` or `le` respectively.

The lexical primitives are glyphs, strings, and substrings. A glyph represents a
single Unicode code point, and is 32 bits wide. A string is an owned buffer
of UTF-8 encoded text. Substrings are "fat pointers" to string buffers, with
a specified length.

The logical primitives are booleans and bitflags. A boolean is 8 bits wide, and
can be true or false. Bitflags are a compressed array of booleans, and may be
8, 16, 32, 64, or 128 bits wide.

The two special types are pointers and the empty type. The pointer type is wide enough to
fit a native pointer, and is intentionally distinct from `wsize` in order to support
the [CHERI](https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/) architecture
(see [this article](https://tratt.net/laurie/blog/2022/making_rust_a_better_fit_for_cheri_and_other_platforms.html)).
The empty type exists only implicitly, and has a size of zero. It is primarily used in
combination with fallible types to represent the absence of a value. No variable or constant
can be of the empty type. It is represented as an empty block `{}`.

```yaml
primitive:
    algebraic:
        whole: w8 w16 w32 w64 w128 wsize
        integer: i8 i16 i32 i64 i128 isize
        real: r16 r32 r64 r128
        complex: c32 c64 c128 c256
        quaternion: q64 q128 q256 q512
    lexical: glyph string substr
    logical: bool b8 b16 b32 b64 b128
    special: ptr {}
```
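
For comparison, here is roughly how a few of these claims look in present-day Rust, one of the languages this page borrows from. It is only a sketch under assumptions: `layout_facts` and `w32_bytes` are hypothetical names, and Rust's `char`, `&str`, `*const u8`, and `u32` stand in for `glyph`, `substr`, `ptr`, and `w32`.

```rust
use std::mem::size_of;

/// Returns the widths in bytes of (glyph, substring, pointer),
/// using Rust's `char`, `&str`, and `*const u8` as stand-ins.
pub fn layout_facts() -> (usize, usize, usize) {
    (size_of::<char>(), size_of::<&str>(), size_of::<*const u8>())
}

/// Encodes a 32-bit whole as big endian and as little endian bytes,
/// analogous to storing it as a `w32be` or a `w32le`.
pub fn w32_bytes(value: u32) -> ([u8; 4], [u8; 4]) {
    (value.to_be_bytes(), value.to_le_bytes())
}
```

A Rust `&str` is a pointer plus a length, which is the same "fat pointer" shape described for substrings.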

## operators
```yaml
math: + - / * %
logical: not and or xor
bitwise: ~ & | ^ << >>
comparative: < <= > >= == !=
```

## variables
Variables and constants are strongly typed, but the type may be inferred or
explicit. Re-declared variables shadow the previous declaration for the current
scope. Variables are declared with `:=` and constants with `::`. Uninitialized
variables are zeroed by default. Multiple declaration and assignment are allowed.
```rust
// Implicit type
v1 := 10;
c1 :: true;
// Explicit type
v2 : i64 = 20;
c2 : bool : false;

v1 = 15; // Assignment
v3, v4 := 'π', 3.14159; // Multiple declaration
v3, v4 = 'τ', 6.28318; // Multiple assignment
{
    v2 := "Hello World"; // Name shadowing
    v4, v5 := 4.0, q128(1.0, 0.0, 0.0, 0.0);
}
// v2 == 20, v4 == 6.28318
```

## conditionals
There are two kinds of conditionals: if-else statements and when statements. The
when statement is similar to the switch statement in other languages. It must
exhaustively cover every case, or include a default case. Both can be used
as the right-hand side of an assignment.
```rust
n := 15;
sign := if n < 0 {
    print("n is negative");
    -1
} else {
    print("n is positive");
    1
};

s := ":)";
when s {
    ":(" -> {
        println("Sad face");
    },
    ":)" -> {
        println("Happy face");
    },
    ":|" -> {
        println("Neutral face");
    },
    default(val) -> {
        println("I don't recognize this face:");
        println(val);
    }
}
```
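
Rust already treats both constructs as expressions, so it makes a handy point of comparison. The sketch below is real Rust, not the hypothetical language; `sign` and `describe_face` are names made up for illustration, and they return values instead of printing.

```rust
/// `if`/`else` used as the right-hand side of a binding.
pub fn sign(n: i64) -> i64 {
    let sign = if n < 0 { -1 } else { 1 };
    sign
}

/// An exhaustive `match`. The final arm binds any unmatched value,
/// much like `default(val)` above.
pub fn describe_face(s: &str) -> String {
    match s {
        ":(" => "Sad face".to_string(),
        ":)" => "Happy face".to_string(),
        ":|" => "Neutral face".to_string(),
        other => format!("I don't recognize this face: {other}"),
    }
}
```

Removing the last arm makes the Rust compiler reject the `match` as non-exhaustive, which is the behavior wanted here.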

## blocks
Blocks are surrounded by `{}` and optionally evaluate to some type. A block
may be given a name using the `@` operator. By default, a block captures all
variables in the scope in which it is declared. A block can instead capture
only specific named variables from the scope in which it is evaluated, using the
capture syntax `[]`.

The `break` keyword is used to escape a block early. A block name can be
provided to break out of multiple nested blocks at once. If the escaped block
evaluates to a type, a value must be provided to `break`. If the final expression
within a block does not end in a semicolon, the block will evaluate to that
expression.
```rust
sum := 0;
for x : 0..100 @x_block {
    for y : 0..100 @y_block {
        for z : 0..100 @z_block {
            if x == 16 {
                break @x_block;
            }
            else if y == 64 and z == 32 {
                break @y_block;
            }
            sum += x + y + z;
        }
    }
}

outside := "foo";
sum2 := [^sum] { // Captures mutable reference to `sum`
    /* outside = "bar"; */ // ERROR: only `sum` is captured
    sum *= 2
} // == sum * 2
```
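
Named blocks are close to Rust's loop and block labels. The following is a small real-Rust sketch of the same two ideas, breaking out of nested loops by name and breaking out of a block with a value. The function names and the tiny loop bounds are made up for illustration.

```rust
/// Sums x + y + z over a small grid, leaving nested loops early by label.
pub fn labelled_sum() -> u32 {
    let mut sum = 0;
    'x_block: for x in 0..3 {
        'y_block: for y in 0..3 {
            for z in 0..3 {
                if x == 2 {
                    break 'x_block; // leaves all three loops
                } else if y == 1 && z == 1 {
                    break 'y_block; // leaves the y and z loops
                }
                sum += x + y + z;
            }
        }
    }
    sum
}

/// A labelled block that evaluates to a value; `break` must supply one.
pub fn first_even(values: &[u32]) -> Option<u32> {
    'search: {
        for &v in values {
            if v % 2 == 0 {
                break 'search Some(v);
            }
        }
        None
    }
}
```

Rust has no capture list for plain blocks, so the `[^sum]` syntax has no direct counterpart in this sketch.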

## functions
Functions and lambdas are not distinct types, and do not have distinct syntax.
A function may be declared anywhere a variable may, including inside other
functions or as a parameter. A function may have any number of ordered
parameters, and may be variadic. Functions may have default parameters,
and default parameters may refer to parameters which come before them.
Multiple values may be returned from a function.

A function may accept any number of receiver types. Each receiver type
implements the function as a "method". Multiple receivers are written as
a parenthesized, order-dependent list of values of the given types.
A type can use the methods defined by any receiver function in the current
scope. Receiver types can be any type, including primitives.

Function overloading is valid so long as no two functions share both a name
and call signature. Function names can also be shadowed just like variables.

A function body is a block. This means it can optionally capture local variables
from where it is called. Unlike regular blocks, function blocks do not capture
locals automatically. Functions implicitly return the last expression of their
body if it does not end in a semicolon, and can return early by using
`break` without a block name. Note that block captures and names are not part of
a function's signature.
```rust
// Receiver types come before parameters, and may be elided
fizzbuzz :: (number: wsize) -> wsize {
    sum := 0;
    if number == 0 {
        println("Input cannot be zero");
        break sum; // Early return
    }
    for i : 0..number {
        if (i % 3 == 0) and (i % 5 == 0) {
            println("fizzbuzz");
            sum += 1;
        }
        else if i % 3 == 0 {
            println("fizz");
        }
        else if i % 5 == 0 {
            println("buzz");
        }
    }
    sum // return the number of fizzbuzzes
}
f1 := fizzbuzz(15);
f2 := fizzbuzz(number: 30); // Parameters may be explicitly named

// With a single receiver
fizzbuzz :: (number: wsize)() -> wsize { fizzbuzz(number) }
f3 := 15.fizzbuzz();

// With multiple receivers
fizzbuzz :: (number1: wsize, number2: wsize)() -> wsize, wsize {
    number1.fizzbuzz(), number2.fizzbuzz()
}
f4, f5 := (15, 30).fizzbuzz();

// With capture
sum_locals :: () [f1: wsize, f2: wsize, f3: wsize, f4: wsize] {
    f1 + f2 + f3 + f4
}

f6 := sum_locals(); // Captures local variables implicitly
```
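
The closest thing I know of to receiver types on primitives is Rust's extension trait pattern. Here is a rough real-Rust sketch; the traits `Fizzbuzz` and `FizzbuzzPair` are hypothetical names, and the tuple implementation only approximates multiple receivers.

```rust
/// Hypothetical extension trait: gives `usize` a `fizzbuzz` method,
/// roughly what a single receiver type does above.
pub trait Fizzbuzz {
    fn fizzbuzz(self) -> usize;
}

impl Fizzbuzz for usize {
    /// Counts the values in 0..self divisible by both 3 and 5.
    fn fizzbuzz(self) -> usize {
        (0..self).filter(|i| i % 3 == 0 && i % 5 == 0).count()
    }
}

/// A second hypothetical trait, implemented on a tuple to
/// approximate a method with two receivers.
pub trait FizzbuzzPair {
    fn fizzbuzz(self) -> (usize, usize);
}

impl FizzbuzzPair for (usize, usize) {
    fn fizzbuzz(self) -> (usize, usize) {
        (self.0.fizzbuzz(), self.1.fizzbuzz())
    }
}
```

With both traits in scope, `15usize.fizzbuzz()` and `(15usize, 30usize).fizzbuzz()` both resolve. Rust needs a trait per receiver shape, which is exactly the ceremony the receiver syntax is meant to avoid.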

## fallibility
A type which is followed by a question mark `?` is marked as fallible. A fallible
type may or may not contain a value, and must be checked before it is accessed.
All types implicitly convert to their fallible counterpart.
```rust
something : w64? = 5;
nothing : w64? = {};

// Conditional assignment
if num := something {
    print("Found a value");
} else {
    print("No value exists");
}

// Most boolean operations work on fallible types
maybe := something or nothing;
```

Functions may return a fallible type. Placing a question mark after
a fallible value will immediately return from the current
function if that value is empty.
```rust
divide :: (numerator: r64, divisor: r64) -> r64? {
    if divisor == 0.0 {
        break; // Failure case, returns the empty type
    }
    numerator / divisor
}

inverse_squared :: (val: r64) -> r64? {
    inv : r64 = divide(1.0, val)?;
    inv * inv
}
```
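
Fallible types behave much like Rust's `Option`, and the trailing `?` is borrowed from there. For reference, the same two functions written as real Rust, with `f64` standing in for `r64`:

```rust
/// Returns `None` instead of dividing by zero.
pub fn divide(numerator: f64, divisor: f64) -> Option<f64> {
    if divisor == 0.0 {
        return None; // failure case
    }
    Some(numerator / divisor)
}

/// `?` returns early with `None` when `divide` fails.
pub fn inverse_squared(val: f64) -> Option<f64> {
    let inv = divide(1.0, val)?;
    Some(inv * inv)
}
```

The main difference is that Rust requires the explicit `Some(...)` wrapper, where the language sketched here converts a value to its fallible counterpart implicitly.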

## generators
Capturing variables within a function allows for generator functions.
A generator function can produce different values each time it is called.
They are often used to lazily evaluate an open-ended sequence, or to iterate
over a collection. The `:` operator inside a for loop expression will call
a function repeatedly until the function fails to return a value.
```rust
range :: (minimum: i64, maximum: i64) -> (() -> i64?) {
    current := minimum;
    () [minimum, maximum, current] {
        last := current;
        current += 1;
        if current <= maximum {
            break last;
        } else {
            break;
        }
    }
}

iterable := range(0, 10);
for i : iterable {
    print(i);
} // Prints 0-9

iterable := 0..10; // Equivalent to range(0, 10)
```
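
The same generator can be approximated in real Rust with a closure that owns its counter. This is a sketch rather than a translation: `range` here is a hypothetical free function, and `std::iter::from_fn` plays the role of the `:` operator by calling the closure until it returns `None`.

```rust
/// Returns a closure that yields minimum, minimum + 1, ... and then
/// `None` once `maximum` is reached (the upper bound is exclusive).
pub fn range(minimum: i64, maximum: i64) -> impl FnMut() -> Option<i64> {
    let mut current = minimum;
    move || {
        if current < maximum {
            let last = current;
            current += 1;
            Some(last)
        } else {
            None
        }
    }
}

/// Drains a generator into a vector by calling it until it fails.
pub fn collect_all(generator: impl FnMut() -> Option<i64>) -> Vec<i64> {
    std::iter::from_fn(generator).collect()
}
```

Calling `collect_all(range(0, 10))` gathers the values 0 through 9, matching the loop above.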

## move, clone, refer
By default, parameters and receivers are "moved" into the function body. This
means that they can no longer be accessed by the surrounding scope. A variable
may be cloned instead to replicate the classic C "pass by value" behavior. The
`clone` method is implemented on all primitives and may be implemented on any
user defined type.

Referenced types are prefixed by `&` or `^`, for immutable and mutable
references respectively. Cloning a reference results in a clone of the
underlying value. Receiver arguments can be either type of reference. Values
will implicitly cast to their referenced equivalent for the purpose of method
calls, but not vice versa.
```rust
value : wsize = 10;
reference : &wsize = &2;

divide :: (numerator: wsize)(divisor: &wsize) {
    numerator / divisor
}

num1 := value.divide(reference); // `value` moves out of scope
num2 := num1.clone().divide(reference); // `num1` stays in scope
num3 := reference.clone().divide(&num1);
```
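
Rust draws the same three-way distinction, although its integers are copied rather than moved, so the sketch below uses `String` to make the move visible. It is real Rust with made-up function names; `&mut` corresponds to `^`.

```rust
/// Takes ownership of `text`; the caller cannot use it afterwards.
pub fn consume(text: String) -> usize {
    text.len()
}

/// Borrows `text` immutably, like a `&` parameter.
pub fn inspect(text: &str) -> usize {
    text.len()
}

/// Borrows `text` mutably, like a `^` parameter.
pub fn shout(text: &mut String) {
    text.push('!');
}
```

Calling `consume(s.clone())` keeps `s` usable afterwards, while `consume(s)` ends its life in the calling scope.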

## aliased types
A type can be aliased, giving it a new name. The alias is treated as a
completely different type, and will not implicitly cast to the original type.
```rust
my_string_alias :: string;
normal_string : string = "Hello World";
aliased_string : my_string_alias = "Goodbye World";
/* normal_string = aliased_string; */ // ERROR: incompatible types
```
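
In Rust a `type` alias is fully interchangeable with the original, so the nearest equivalent to this behavior is the newtype pattern. A minimal real-Rust sketch, with `MyStringAlias` and `greet` as hypothetical names:

```rust
/// A newtype: the same data as `String`, but a distinct type
/// that never converts to `String` implicitly.
#[derive(Debug, Clone, PartialEq)]
pub struct MyStringAlias(pub String);

/// Accepts only the newtype; passing a plain `&String` is a type error.
pub fn greet(name: &MyStringAlias) -> String {
    format!("Goodbye {}", name.0)
}
```

The wrapped value has to be reached through `.0`, a small cost that the aliasing described here would avoid.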

## structures
A structure is defined by aliasing the `struct` type. It may contain named or
unnamed fields, but not a mix of both. Like local variables, uninitialized struct
fields are zeroed by default.
```rust
// Struct with named fields
Position :: struct {
    x: r64, y: r64
};

origin : Position = {x: 0.0, y: 0.0};
// With unnamed fields
Color :: struct {
    r64, r64, r64
};
red : Color = {0.0, 0.0, 0.0};
// Accessing unnamed fields
red.0 = 1.0;
red.1 = 0.2;
red.2 = 0.2;
// red == {1.0, 0.2, 0.2};
```
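
Both struct forms map directly onto Rust, where unnamed fields make a tuple struct. In this real-Rust sketch, `f64` stands in for `r64` and the derived `Default` stands in for zero-by-default fields, which Rust does not provide on its own. The helper `red` is a made-up name.

```rust
/// Struct with named fields.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct Position {
    pub x: f64,
    pub y: f64,
}

/// Struct with unnamed fields (a tuple struct in Rust).
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct Color(pub f64, pub f64, pub f64);

/// Starts from a zeroed color and fills in the fields by index.
pub fn red() -> Color {
    let mut red = Color::default(); // every field is 0.0
    red.0 = 1.0;
    red.1 = 0.2;
    red.2 = 0.2;
    red
}
```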

## defer
Execution of a block can be deferred until the end of the current scope. Deferred
blocks execute in the reverse of the order in which they were declared.
```rust
{
    println("First");
    defer { println("Fourth"); }
    defer { println("Third"); }
    println("Second");
}
```
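
Rust has no `defer`, but a value with a `Drop` implementation gives the same ordering, because locals are dropped in reverse declaration order at the end of their scope. A sketch in real Rust, where the `Defer` guard and `run` are hypothetical names and messages are logged to a vector instead of printed:

```rust
use std::cell::RefCell;

/// Hypothetical guard: appends its message to the log when dropped.
pub struct Defer<'a> {
    log: &'a RefCell<Vec<&'static str>>,
    message: &'static str,
}

impl Drop for Defer<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.message);
    }
}

/// Replays the example above and returns the order of the messages.
pub fn run() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        log.borrow_mut().push("First");
        let _fourth = Defer { log: &log, message: "Fourth" };
        let _third = Defer { log: &log, message: "Third" };
        log.borrow_mut().push("Second");
    } // guards are dropped here, most recently declared first
    log.into_inner()
}
```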

## namespaces
All source files implicitly exist within their own namespace, which must be
named at the top of the file. An inner namespace can also be declared. Members
of a namespace are accessed using the dot `.` operator.
```rust
ns1 :: {
    foo :: () {
        println("foo");
    }

    ns2 :: {
        bar :: () {
            println("bar");
        }
    }
}

ns1.foo();
ns1.ns2.bar();
```
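
Nested namespaces line up with Rust's nested modules, with `::` in place of the dot. A small real-Rust sketch of the same layout, returning strings rather than printing them:

```rust
pub mod ns1 {
    /// Member of the outer namespace.
    pub fn foo() -> &'static str {
        "foo"
    }

    pub mod ns2 {
        /// Member of the inner namespace.
        pub fn bar() -> &'static str {
            "bar"
        }
    }
}
```

Members are then reached as `ns1::foo()` and `ns1::ns2::bar()`. Rust also requires `pub` on each item, whereas visibility is not addressed in the design above.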
@@ -1,2 +1,3 @@
 # writings
+* [language ideas](./2-language-ideas.html)
 * [on build systems](./1-build-systems.html)