language ideas started

Logan 2024-09-23 01:10:17 -05:00
parent 618f7a73ce
commit bf0d65f176
4 changed files with 387 additions and 5 deletions

@@ -1,3 +1,8 @@
+@font-face {
+font-family: 'outfit';
+src: url(/res/Outfit.woff);
+}
@font-face {
font-family: 'atkinson';
src: url(/res/atkinson-regular.woff2);
@@ -14,7 +19,7 @@
}
body {
-font-family: 'merriweather';
+font-family: 'outfit';
background-image: url("/res/grey.png");
background-repeat: repeat;
height: 100%;
@@ -28,8 +33,8 @@ main {
background-color: #ffeedd;
margin: auto;
max-width: 40em;
-line-height: 1.8em;
-font-size: 1.2em;
+line-height: 1.4em;
+font-size: 1.4em;
}
p {
@@ -45,7 +50,9 @@ code {
background: #1e1e22;
color: #dddde1;
padding: 0.1em;
-margin: 0.1em 0em 0.1em;
+padding-left: 0.3em;
+padding-right: 0.3em;
+border-radius: 0.4em;
}
table {
@@ -104,7 +111,7 @@ figcaption {
line-height: 110%;
}
-img,iframe,code {
+img,iframe {
border-radius: 0.4em;
}

BIN
src/res/Outfit.woff Normal file

Binary file not shown.

@@ -0,0 +1,374 @@
# language ideas
Programming languages are something I think about a lot. This
is a list of design choices I think are interesting, and would
like to implement some day. Treat this page as documentation for
a hypothetical language. Ideas are stolen liberally from
[zig](https://ziglang.org/), [odin](https://odin-lang.org/), jai,
[go](https://go.dev/), and [rust](https://www.rust-lang.org/).
## guiding principles
A good language is comprehensive but not complex. It should provide a
reasonable implementation for every common algorithm and data type, as well
as many uncommon ones. Its use of concepts and keywords should be economical;
each should perform a unique function that would not be possible in its
absence. Language features should map closely onto the real-world mathematical
and theoretical ideas they derive from. It should be fast, simple, and
succinct without compromise.
## primitives
Algebraic primitives are written as a single character followed by a data
size. They include whole (unsigned), integer (signed), real (floating point),
complex, and quaternion types. The components of complex and quaternion numbers
are reals. Whole and integer sizes may be 8, 16, 32, 64, 128, or platform
defined (`wsize`, `isize`). Reals must have a data size greater than 8. The size
of a complex or quaternion type is the combined size of its components, so `c32`
holds two `r16` values and `q64` holds four. Algebraic primitives may optionally
specify that they are big or little endian by appending `be` or `le` respectively.

The lexical primitives are glyphs, strings, and substrings. A glyph represents a
single Unicode code point and is 32 bits wide. A string is an owned buffer
of UTF-8 encoded text. A substring is a "fat pointer" into a string buffer: a
pointer paired with a length.

The logical primitives are booleans and bitflags. A boolean is 8 bits wide and
can be true or false. Bitflags are a packed array of booleans, and may be
8, 16, 32, 64, or 128 bits wide.

The two special types are pointers and the empty type. The pointer type is wide enough to
fit a native pointer, and is intentionally distinct from `wsize` in order to support
the [CHERI](https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/) architecture
(see [this article](https://tratt.net/laurie/blog/2022/making_rust_a_better_fit_for_cheri_and_other_platforms.html)).
The empty type exists only implicitly and has a size of zero. It is primarily used in
combination with fallible types to represent the absence of a value. No variable or constant
can be of the empty type. It is written as an empty block `{}`.
```yaml
primitive:
  algebraic:
    whole: w8 w16 w32 w64 w128 wsize
    integer: i8 i16 i32 i64 i128 isize
    real: r16 r32 r64 r128
    complex: c32 c64 c128 c256
    quaternion: q64 q128 q256 q512
  lexical: glyph string substr
  logical: bool b8 b16 b32 b64 b128
  special: ptr {}
```
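For comparison, here is a rough Rust sketch of a few of the size and layout claims above. It is only an analogue in an existing language, not the hypothetical one: `char` stands in for `glyph`, `&str` for `substr`, and the helper names are made up for illustration. The two-word size of `&str` is what mainstream targets give and is assumed here.

```rust
use std::mem::size_of;

/// Width in bytes of a `char`, the closest Rust analogue to `glyph`.
fn glyph_width_bytes() -> usize {
    size_of::<char>()
}

/// A `&str` is a pointer plus a length, much like `substr`.
/// Returns its size measured in machine words.
fn substr_width_words() -> usize {
    size_of::<&str>() / size_of::<usize>()
}

/// Byte layouts of one value in big- and little-endian order,
/// roughly what `w32be` and `w32le` would store.
fn endian_layouts(value: u32) -> ([u8; 4], [u8; 4]) {
    (value.to_be_bytes(), value.to_le_bytes())
}
```

Calling `endian_layouts(0x11223344)` gives the same four bytes in opposite orders, which is all the `be` and `le` suffixes would need to change.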
## operators
```yaml
math: + - / * %
logical: not and or xor
bitwise: ~ & | ^ << >>
comparative: < <= > >= == !=
```
## variables
Variables and constants are strongly typed, but their types may be either
inferred or written explicitly. Re-declared variables shadow the previous
declaration for the current scope. Variables are declared with `:=` and
constants with `::`. Uninitialized variables are zeroed by default. Multiple
declaration and assignment are allowed.
```rust
// Implicit type
v1 := 10;
c1 :: true;
// Explicit type
v2 : i64 = 20;
c2 : bool : false;
v1 = 15; // Assignment
v3, v4 := 'π', 3.14159; // Multiple declaration
v3, v4 = 'τ', 6.28318; // Multiple assignment
{
    v2 := "Hello World"; // Name shadowing
    v4, v5 := 4.0, q128(1.0, 0.0, 0.0, 0.0);
}
// v2 == 20, v4 == 6.28318
```
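Scoped shadowing behaves much the same way in Rust, so a small sketch may help pin down the intended semantics. This is an analogue only, and `shadowing_demo` is a made-up name.

```rust
/// Hypothetical helper: shadows two names in an inner scope and
/// returns the outer values once that scope has ended.
fn shadowing_demo() -> (i64, f64) {
    let v2: i64 = 20; // explicit type, like `v2 : i64 = 20`
    let v4 = 6.28318; // inferred type, like `v4 := 6.28318`
    {
        let v2 = "Hello World"; // shadows the outer `v2` with a new type
        let v4 = 4.0; // shadows the outer `v4`
        assert_eq!(v2.len(), 11);
        assert_eq!(v4, 4.0);
    }
    // The inner declarations are gone; the outer values are untouched.
    (v2, v4)
}
```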
## conditionals
There are two kinds of conditionals: if-else statements and when statements. The
when statement is similar to the switch statement in other languages. A when
statement must exhaustively cover every case or include a default case. Both
kinds can be used as the right-hand side of an assignment.
```rust
n := 15;
sign := if n < 0 {
    print("n is negative");
    -1
} else {
    print("n is not negative");
    1
}
s := ":)";
when s {
    ":(" -> {
        println("Sad face");
    },
    ":)" -> {
        println("Happy face");
    },
    ":|" -> {
        println("Neutral face");
    },
    default(val) -> {
        println("I don't recognize this face:");
        println(val);
    }
}
```
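Rust's `if` and `match` expressions are a close existing analogue of the two conditionals, so here is a hedged sketch of the same idea. The function names are invented, and the catch-all arm `other` plays the role of the default case.

```rust
/// `if`/`else` used as an expression: the chosen branch is the value.
fn sign(n: i64) -> i64 {
    if n < 0 { -1 } else { 1 }
}

/// An exhaustive match: the final arm binds any value the others missed.
fn describe_face(face: &str) -> String {
    match face {
        ":(" => "Sad face".to_string(),
        ":)" => "Happy face".to_string(),
        ":|" => "Neutral face".to_string(),
        other => format!("I don't recognize this face: {}", other),
    }
}
```

Removing the last arm makes the Rust version fail to compile, which is the exhaustiveness rule in practice.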
## blocks
Blocks are surrounded by `{}` and optionally evaluate to some type. A block
may be given a name using the `@` operator. By default, blocks capture all
variables within the scope in which they are declared. A block can optionally
capture only specific named variables from the scope in which it is evaluated
using the capture syntax `[]`.

The `break` keyword is used to escape a block early. A block name can be
provided to break out of multiple nested blocks at once. If the escaped block
evaluates to a type, a value must be provided to `break`. If the final expression
within a block does not end in a semicolon, the block evaluates to that
expression.
```rust
sum := 0;
for x : 0..100 @x_block {
    for y : 0..100 @y_block {
        for z : 0..100 @z_block {
            if x == 16 {
                break @x_block;
            }
            else if y == 64 and z == 32 {
                break @y_block;
            }
            sum += x + y + z;
        }
    }
}
outside := "foo";
sum2 := [^sum] { // Captures mutable reference to `sum`
    /* outside = "bar"; */ // ERROR: only `sum` is captured
    sum *= 2
} // == sum * 2
```
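Named blocks and `break` with a value have a near equivalent in Rust's loop and block labels. The sketch below is an analogue under that assumption, with made-up function names; Rust has no counterpart to the `[]` capture list on plain blocks, so that part is not shown.

```rust
/// Breaks out of both loops at once using a label on the outer loop.
/// Returns the first pair of factors (both at least 2) of `target`.
fn first_factor_pair(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for x in 2..target {
        for y in 2..target {
            if x * y == target {
                found = Some((x, y));
                break 'outer; // leaves the inner and the outer loop
            }
        }
    }
    found
}

/// A labelled block evaluates to whatever value is passed to `break`,
/// or to its final expression if no `break` is reached.
fn clamp_to_byte(n: i64) -> u8 {
    'value: {
        if n < 0 {
            break 'value 0;
        }
        if n > 255 {
            break 'value 255;
        }
        n as u8
    }
}
```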
## functions
Functions and lambdas are not distinct types, and do not have distinct syntax.
A function may be declared anywhere a variable may, including inside other
functions or as a parameter. A function may have any number of ordered
parameters, and may be variadic. Functions may have default parameters,
and default parameters may refer to parameters which come before them.
Multiple values may be returned from a function.

A function may accept any number of receiver types. A receiver type
implements the function as a "method". Multiple receiver parameters refer to
a parenthesized list of objects of the given types, which is order dependent.
Types can use methods defined by any receiver function within the current
scope. Receiver types can be any type, including primitives.

Function overloading is valid so long as no two functions share both a name
and call signature. Function names can also be shadowed just like variables.

A function body is a block. This means it can optionally capture local variables
from where it is called. Unlike regular blocks, function blocks do not capture
locals automatically. Functions implicitly return the last expression of their
body if it does not end in a semicolon, and can return early by using
`break` without a block name. Note that block captures and names are not part of
a function's signature.
```rust
// Receiver types come before parameters, and may be elided
fizzbuzz :: (number: wsize) -> wsize {
    sum := 0;
    if number == 0 {
        println("Input cannot be zero");
        break sum; // Early return
    }
    for i : 0..number {
        if (i % 3 == 0) and (i % 5 == 0) {
            println("fizzbuzz");
            sum += 1;
        }
        else if i % 3 == 0 {
            println("fizz");
        }
        else if i % 5 == 0 {
            println("buzz");
        }
    }
    sum // Return the number of fizzbuzzes
}
f1 := fizzbuzz(15);
f2 := fizzbuzz(number: 30); // Parameters may be explicitly named
// With a single receiver
fizzbuzz :: (number: wsize)() -> wsize { fizzbuzz(number) }
f3 := 15.fizzbuzz();
// With multiple receivers
fizzbuzz :: (number1: wsize, number2: wsize)() -> wsize, wsize {
    number1.fizzbuzz(), number2.fizzbuzz()
}
f4, f5 := (15, 30).fizzbuzz();
// With capture
sum_locals :: () [f1: wsize, f2: wsize, f3: wsize, f4: wsize] {
    f1 + f2 + f3 + f4
}
f6 := sum_locals(); // Captures `f1` through `f4` from the calling scope
```
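The closest thing Rust has to a receiver on a primitive is an extension trait, so here is a hedged sketch of the single-receiver case. The trait and method names are invented, and only the counting part of the example is reproduced; multiple receivers and call-site captures have no direct Rust equivalent.

```rust
/// Hypothetical extension trait: lets a method be called on a primitive.
trait FizzBuzz {
    /// Counts the values in `0..self` divisible by both 3 and 5.
    fn fizzbuzz_count(self) -> usize;
}

impl FizzBuzz for usize {
    fn fizzbuzz_count(self) -> usize {
        // The range starts at 0, so 0 itself counts as a match.
        (0..self).filter(|i| i % 3 == 0 && i % 5 == 0).count()
    }
}
```

With the trait in scope, `15usize.fizzbuzz_count()` reads like the `15.fizzbuzz()` call above.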
## fallibility
A type which is followed by a question mark `?` is marked as fallible. A value
of a fallible type may or may not contain a value, and must be checked before
it is accessed. All types implicitly convert to their fallible counterpart.
```rust
something : w64? = 5;
nothing : w64? = {};
// Conditional assignment
if num := something {
    print("Found a value");
} else {
    print("No value exists");
}
// Most boolean operations work on fallible types
maybe := something or nothing;
```
Functions may return a fallible type. Placing a question mark after
a fallible value will immediately return from the current
function if that value is empty. Otherwise the expression evaluates to the
contained value.
```rust
divide :: (numerator: r64, divisor: r64) -> r64? {
    if divisor == 0.0 {
        break; // Failure case, returns the empty type
    }
    numerator / divisor
}
inverse_squared :: (val: r64) -> r64? {
    inv : r64 = divide(1.0, val)?;
    inv * inv
}
```
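Fallible types map closely onto Rust's `Option`, and the postfix `?` behaves the same way there. A minimal sketch of the two functions above, assuming that mapping and checking the divisor for zero (the `checked_` and `_opt` names are made up to keep the sketch distinct):

```rust
/// Returns `None` (the "empty" case) when the divisor is zero.
fn checked_divide(numerator: f64, divisor: f64) -> Option<f64> {
    if divisor == 0.0 {
        return None;
    }
    Some(numerator / divisor)
}

/// `?` returns early with `None` if the division failed,
/// otherwise it unwraps the contained value.
fn inverse_squared_opt(val: f64) -> Option<f64> {
    let inv = checked_divide(1.0, val)?;
    Some(inv * inv)
}
```

One difference worth noting: Rust requires the explicit `Some(...)` wrapper, whereas the language described here converts a value to its fallible counterpart implicitly.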
## generators
Capturing variables within a function allows for generator functions.
A generator function can produce different values each time it is called.
They are often used to lazily evaluate an open-ended sequence, or to iterate
over a collection. The `:` operator inside a for loop expression will call
a function repeatedly until the function fails to return a value.
```rust
range :: (minimum: i64, maximum: i64) -> (() -> i64?) {
    current := minimum;
    () [minimum, maximum, current] {
        last := current;
        current += 1;
        if current <= maximum {
            break last;
        } else {
            break;
        }
    }
}
iterable := range(0, 10);
for i : iterable {
    print(i);
} // Prints 0-9
iterable := 0..10; // Equivalent to range(0, 10)
```
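The same generator shape can be sketched in Rust as a closure that owns its counter and returns `Option`. This is an analogue rather than the language above; `counting_range` and `drain` are made-up names.

```rust
/// Returns a closure that yields `minimum..maximum` one value per call,
/// then `None` forever once the range is exhausted.
fn counting_range(minimum: i64, maximum: i64) -> impl FnMut() -> Option<i64> {
    let mut current = minimum; // state captured by the closure below
    move || {
        if current < maximum {
            let last = current;
            current += 1;
            Some(last)
        } else {
            None
        }
    }
}

/// Calls a generator until it fails to return a value,
/// which is what the `:` operator in a for loop would do.
fn drain(mut next: impl FnMut() -> Option<i64>) -> Vec<i64> {
    let mut out = Vec::new();
    while let Some(value) = next() {
        out.push(value);
    }
    out
}
```

In real Rust code, `std::iter::from_fn` wraps a closure like this into an iterator that a `for` loop can consume directly.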
## move, clone, refer
By default, parameters and receivers are "moved" into the function body. This
means that they can no longer be accessed by the surrounding scope. A variable
may be cloned instead to replicate the classic C "pass by value" behavior. The
`clone` method is implemented on all primitives and may be implemented on any
user defined type.

Referenced types are prefixed by `&` or `^`, for immutable and mutable
references respectively. Cloning a reference results in a clone of the
underlying value. Receiver arguments can be either type of reference. Values
will implicitly cast to their referenced equivalent for the purpose of method
calls, but not vice-versa.
```rust
value : wsize = 10;
reference : &wsize = &2;
divide :: (numerator: wsize)(divisor: &wsize) -> wsize {
    numerator / divisor
}
num1 := value.divide(reference); // `value` moves out of scope
num2 := num1.clone().divide(reference); // `num1` stays in scope
num3 := reference.clone().divide(&num1);
```
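Rust's ownership rules are the obvious model here, so a short sketch may make the move and clone behavior concrete. One caveat: Rust primitives are `Copy` and never move, so this sketch wraps the number in a made-up `Amount` type to get the move semantics described above.

```rust
/// Hypothetical non-`Copy` wrapper, so passing it by value moves it.
#[derive(Clone, Debug, PartialEq)]
struct Amount(u64);

/// Takes the numerator by value (moved) and the divisor by reference.
fn divide_by(numerator: Amount, divisor: &u64) -> Amount {
    Amount(numerator.0 / *divisor)
}

fn move_and_clone_demo() -> (Amount, Amount) {
    let value = Amount(10);
    let reference = &2;
    let num1 = divide_by(value, reference); // `value` is moved and unusable afterwards
    let num2 = divide_by(num1.clone(), reference); // the clone moves, `num1` stays
    (num1, num2)
}
```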
## aliased types
A type can be aliased, giving it a new name. The alias is treated as a
completely different type, and will not implicitly cast to the original type.
```rust
my_string_alias :: string;
normal_string : string = "Hello World";
aliased_string : my_string_alias = "Goodbye World";
/* normal_string = aliased_string; */ // ERROR: incompatible types
```
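In Rust terms this is closer to the newtype pattern than to a `type` alias, since a Rust `type` alias stays interchangeable with the original. A rough sketch under that reading, with invented names:

```rust
/// Hypothetical newtype mirroring `my_string_alias`: a distinct type
/// that merely wraps a `String`.
#[derive(Debug, PartialEq)]
struct MyStringAlias(String);

/// Accepts only a plain `String`.
fn shout(s: String) -> String {
    s.to_uppercase()
}

/// `shout(alias)` would not compile, because `MyStringAlias` is not
/// `String`; the inner value has to be taken out explicitly.
fn shout_alias(alias: MyStringAlias) -> String {
    shout(alias.0)
}
```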
## structures
A structure is defined by aliasing the `struct` type. A structure may contain
named or unnamed fields, but not a mix of both. Like local variables,
uninitialized struct fields are zeroed by default.
```rust
// Struct with named fields
Position :: struct {
    x: r64, y: r64
};
origin : Position = {x: 0.0, y: 0.0};
// With unnamed fields
Color :: struct {
    r64, r64, r64
};
red : Color = {0.0, 0.0, 0.0};
// Accessing unnamed fields
red.0 = 1.0;
red.1 = 0.2;
red.2 = 0.2;
// red == {1.0, 0.2, 0.2};
```
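Rust's named structs and tuple structs line up with the two forms above, and `#[derive(Default)]` gives a comparable zeroed starting point for numeric fields. A sketch, assuming that correspondence (`make_red` is a made-up helper):

```rust
/// Struct with named fields.
#[derive(Debug, Default, PartialEq)]
struct Position {
    x: f64,
    y: f64,
}

/// Tuple struct: fields are unnamed and accessed by index.
#[derive(Debug, Default, PartialEq)]
struct Color(f64, f64, f64);

/// Starts from the all-zero default and fills in each field by index.
fn make_red() -> Color {
    let mut color = Color::default();
    color.0 = 1.0;
    color.1 = 0.2;
    color.2 = 0.2;
    color
}
```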
## defer
Execution of a block can be deferred until the end of the current scope. Deferred
blocks execute in the reverse of the order in which they were declared.
```rust
{
    println("First");
    defer { println("Fourth"); }
    defer { println("Third"); }
    println("Second");
}
```
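Rust has no `defer` keyword, but a guard value whose `Drop` runs a closure gives the same ordering, because locals are dropped in reverse declaration order. The sketch below swaps in that technique; `Defer` and `run_in_order` are made-up names, and the output is recorded in a list rather than printed so the order can be checked.

```rust
use std::cell::RefCell;

/// Hypothetical guard: runs its closure when it goes out of scope.
struct Defer<F: FnMut()>(F);

impl<F: FnMut()> Drop for Defer<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

/// Records the order in which the steps actually run.
fn run_in_order() -> Vec<&'static str> {
    let log: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    {
        log.borrow_mut().push("First");
        let _fourth = Defer(|| log.borrow_mut().push("Fourth"));
        let _third = Defer(|| log.borrow_mut().push("Third"));
        log.borrow_mut().push("Second");
    } // guards are dropped here, most recently declared first
    log.into_inner()
}
```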
## namespaces
All source files implicitly exist within their own namespace, which must be
named at the top of the file. An inner namespace can also be declared. Members
of a namespace are accessed using the dot `.` operator.
```rust
ns1 :: {
    foo :: () {
        println("foo");
    }
    ns2 :: {
        bar :: () {
            println("bar");
        }
    }
}
ns1.foo();
ns1.ns2.bar();
```
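Nested modules are the nearest Rust equivalent, with `::` in place of the dot operator and an explicit `pub` on anything visible from outside. A small sketch for comparison; the functions return their names instead of printing so the calls can be checked.

```rust
mod ns1 {
    /// Member of the outer namespace.
    pub fn foo() -> &'static str {
        "foo"
    }

    /// Inner namespace, reached as `ns1::ns2`.
    pub mod ns2 {
        pub fn bar() -> &'static str {
            "bar"
        }
    }
}
```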

@@ -1,2 +1,3 @@
# writings
+* [language ideas](./2-language-ideas.html)
* [on build systems](./1-build-systems.html)