a small compiler with explicit architecture
quasar is a mini compiler written in rust that compiles a simplified c-like language into arm64 assembly.
my main focus was learning how low-level systems work, down to the nitty-gritty details.
- a regex-based lexer
- a recursive descent parser
- a dedicated ast
- semantic analysis with scopes
- an optimizer which does:
  - constant folding
  - constant propagation
  - dead code elimination
  - if folding
- code generation into arm64 asm
- i am on x86, so i cross-compile and run the output under qemu, and i used multi-arch gdb to step through practice assembly files.
each stage has a strict input/output contract.
source code -> lexer -> parser -> semantic analysis -> optimizer -> codegen.
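as a rough illustration of the first arrow in that pipeline, here is a minimal lexer sketch with an explicit contract: source text in, either a full token list or an error out. this is not quasar's actual code. the real lexer is regex-based, while this sketch swaps in a hand-written scanner so it needs nothing outside std. `Token`, `lex`, and the keyword list are all assumed names.

```rust
// hypothetical token set for the language described in this readme
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Int(i64),
    Str(String),
    Ident(String),
    Keyword(&'static str), // assumed keywords: int, string, if, else, print
    Symbol(char),          // + - < > = ; ( ) { }
}

// contract: raw source in, complete token list or an error message out.
// the parser never sees a partial result.
pub fn lex(src: &str) -> Result<Vec<Token>, String> {
    const KEYWORDS: [&str; 5] = ["int", "string", "if", "else", "print"];
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c.is_ascii_digit() {
            // integer literal: consume a run of digits
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            let value = text
                .parse::<i64>()
                .map_err(|e| format!("bad integer {text}: {e}"))?;
            tokens.push(Token::Int(value));
        } else if c.is_ascii_alphabetic() || c == '_' {
            // identifier or keyword
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            let keyword = KEYWORDS.iter().copied().find(|k| *k == text);
            match keyword {
                Some(k) => tokens.push(Token::Keyword(k)),
                None => tokens.push(Token::Ident(text)),
            }
        } else if c == '"' {
            // string literal without escape sequences (a simplification)
            let start = i + 1;
            i = start;
            while i < chars.len() && chars[i] != '"' {
                i += 1;
            }
            if i == chars.len() {
                return Err("unterminated string literal".to_string());
            }
            tokens.push(Token::Str(chars[start..i].iter().collect()));
            i += 1; // skip the closing quote
        } else if "+-<>=;(){}".contains(c) {
            tokens.push(Token::Symbol(c));
            i += 1;
        } else {
            return Err(format!("unexpected character {c:?} at offset {i}"));
        }
    }
    Ok(tokens)
}
```

calling `lex("int x = 10; print(x + 1);")` yields 12 tokens, starting with `Keyword("int")` and `Ident("x")`. the later stages can follow the same shape: one function per stage, one input type, one output type.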
supported types:
- int
- string
- bool (internal, produced by comparisons)
statements:
    int x = 10;
    print(x + 1);
    if (x > y) {
        print(x);
    } else {
        print(y);
    }
    {
        int x = 1;
        print(x);
    }
expressions:
- integer literals
- string literals
- identifiers
- binary operators: + - < >
- assignment expressions
semantic rules:
- variables must be declared before use
- a variable cannot be redeclared in the same scope
- an int variable requires an int expression as its initializer
- if conditions must be boolean
- boolean values are produced by comparison operators
semantic errors are collected and reported all at once.
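to make the scope and error-collection rules concrete, here is a minimal sketch of a checker that keeps a stack of scopes and pushes every error into a list instead of stopping at the first one. the `Type`, `Expr`, `Stmt`, and `Checker` names are hypothetical and not taken from quasar's source, and the rule that `+ - < >` accept only int operands is an assumption.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Type { Int, Str, Bool }

// hypothetical ast shapes, just enough for the rules above
pub enum Expr {
    Int(i64),
    Str(String),
    Var(String),
    Binary(Box<Expr>, char, Box<Expr>), // op is one of + - < >
}

pub enum Stmt {
    Decl(Type, String, Expr),       // int x = 10;
    Print(Expr),
    If(Expr, Vec<Stmt>, Vec<Stmt>), // the else branch may be empty
    Block(Vec<Stmt>),
}

pub struct Checker {
    scopes: Vec<HashMap<String, Type>>, // innermost scope is last
    pub errors: Vec<String>,            // collected, reported at the end
}

impl Checker {
    pub fn new() -> Self {
        Checker { scopes: vec![HashMap::new()], errors: Vec::new() }
    }

    // search from the innermost scope outwards
    fn lookup(&self, name: &str) -> Option<Type> {
        self.scopes.iter().rev().find_map(|s| s.get(name).copied())
    }

    // returns None when the type is unknown because of an earlier error,
    // so one mistake is not reported several times
    fn expr(&mut self, e: &Expr) -> Option<Type> {
        match e {
            Expr::Int(_) => Some(Type::Int),
            Expr::Str(_) => Some(Type::Str),
            Expr::Var(name) => {
                let ty = self.lookup(name);
                if ty.is_none() {
                    self.errors.push(format!("use of undeclared variable `{name}`"));
                }
                ty
            }
            Expr::Binary(lhs, op, rhs) => {
                let l = self.expr(lhs);
                let r = self.expr(rhs);
                match (l, r) {
                    // assumption: all four operators take int operands
                    (Some(Type::Int), Some(Type::Int)) => {
                        Some(if *op == '<' || *op == '>' { Type::Bool } else { Type::Int })
                    }
                    (Some(_), Some(_)) => {
                        self.errors.push(format!("operator `{op}` expects int operands"));
                        None
                    }
                    _ => None,
                }
            }
        }
    }

    fn block(&mut self, stmts: &[Stmt]) {
        self.scopes.push(HashMap::new());
        for s in stmts {
            self.stmt(s);
        }
        self.scopes.pop();
    }

    pub fn stmt(&mut self, s: &Stmt) {
        match s {
            Stmt::Decl(ty, name, init) => {
                if let Some(found) = self.expr(init) {
                    if found != *ty {
                        self.errors.push(format!(
                            "`{name}` declared as {ty:?} but initialized with {found:?}"
                        ));
                    }
                }
                let scope = self.scopes.last_mut().expect("global scope always exists");
                if scope.insert(name.clone(), *ty).is_some() {
                    self.errors.push(format!("`{name}` redeclared in the same scope"));
                }
            }
            Stmt::Print(e) => {
                self.expr(e);
            }
            Stmt::If(cond, then_branch, else_branch) => {
                if let Some(t) = self.expr(cond) {
                    if t != Type::Bool {
                        self.errors.push("if condition must be boolean".to_string());
                    }
                }
                self.block(then_branch);
                self.block(else_branch);
            }
            Stmt::Block(body) => self.block(body),
        }
    }
}
```

feeding every top-level statement through `stmt` and reading `errors` afterwards gives the "all at once" behaviour. in this sketch, redeclaring a name inside a nested block is allowed because the block pushes a fresh scope.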
the optimizer operates purely on the ast and runs multiple passes until convergence.
optimizations include:
- constant folding
- constant propagation
- algebraic simplification
- dead code elimination
- if condition folding
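a small sketch of the "run passes until convergence" idea, covering only constant folding, the `x + 0` / `x - 0` simplification, and if condition folding. it leaves out constant propagation and general dead code elimination. the ast types and the `fold`, `pass`, and `optimize` names are hypothetical, and 64-bit wrapping arithmetic is an assumption about how quasar treats ints.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Int(i64),
    Bool(bool), // internal, only produced by folding a comparison
    Var(String),
    Binary(Box<Expr>, char, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    Print(Expr),
    If(Expr, Vec<Stmt>, Vec<Stmt>),
}

// bottom-up constant folding plus two algebraic identities
pub fn fold(e: &Expr) -> Expr {
    match e {
        Expr::Binary(l, op, r) => {
            let (l, r) = (fold(l), fold(r));
            let simplified = match (&l, *op, &r) {
                // assumption: ints wrap on overflow
                (Expr::Int(a), '+', Expr::Int(b)) => Some(Expr::Int(a.wrapping_add(*b))),
                (Expr::Int(a), '-', Expr::Int(b)) => Some(Expr::Int(a.wrapping_sub(*b))),
                (Expr::Int(a), '<', Expr::Int(b)) => Some(Expr::Bool(a < b)),
                (Expr::Int(a), '>', Expr::Int(b)) => Some(Expr::Bool(a > b)),
                // x + 0, x - 0, 0 + x
                (x, '+', Expr::Int(0)) | (x, '-', Expr::Int(0)) | (Expr::Int(0), '+', x) => {
                    Some(x.clone())
                }
                _ => None,
            };
            simplified.unwrap_or_else(|| Expr::Binary(Box::new(l), *op, Box::new(r)))
        }
        other => other.clone(),
    }
}

// one pass over a statement list: fold expressions and replace an `if`
// whose condition is a known boolean with the branch that would run.
// splicing the branch into the parent list is only safe here because this
// toy ast has no declarations; with scopes the branch should stay a block.
pub fn pass(stmts: &[Stmt]) -> Vec<Stmt> {
    let mut out = Vec::new();
    for s in stmts {
        match s {
            Stmt::Print(e) => out.push(Stmt::Print(fold(e))),
            Stmt::If(cond, then_branch, else_branch) => match fold(cond) {
                Expr::Bool(true) => out.extend(pass(then_branch)),
                Expr::Bool(false) => out.extend(pass(else_branch)),
                cond => out.push(Stmt::If(cond, pass(then_branch), pass(else_branch))),
            },
        }
    }
    out
}

// repeat the pass until the ast stops changing
pub fn optimize(mut ast: Vec<Stmt>) -> Vec<Stmt> {
    loop {
        let next = pass(&ast);
        if next == ast {
            return ast;
        }
        ast = next;
    }
}
```

with this sketch, `if (1 + 2 > 2) { print(x + 0); } else { print(0); }` reduces to a single `print(x)`. comparing the whole ast after each pass is the simplest convergence test. a per-pass "changed" flag would avoid the extra comparison.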
the target architecture is arm64 (aarch64).
design:
- stack-based variable allocation
- simple temporary register pool
- booleans lowered as 0 / 1
- print implemented as a printf call through the system abi
the code generator assumes the ast is semantically valid.
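the design points above can be sketched roughly like this: variables get 8-byte stack slots, expressions are evaluated into registers taken from a small pool, and comparisons are lowered to 0 / 1 with `cset`. this is an illustration under assumptions, not quasar's real code generator. the `Codegen` type, the choice of `x9` to `x12` as temporaries, and the slot layout are all hypothetical, and the sketch leaves out the prologue that reserves stack space as well as the printf call.

```rust
use std::collections::HashMap;

pub enum Expr {
    Int(i64),
    Var(String),
    Binary(Box<Expr>, char, Box<Expr>), // op is one of + - < >
}

pub struct Codegen {
    pub asm: Vec<String>,
    slots: HashMap<String, u32>, // variable name -> byte offset from sp
    free: Vec<&'static str>,     // temporary registers not currently in use
}

impl Codegen {
    pub fn new() -> Self {
        // x9..x15 are caller-saved temporaries under the aapcs64 calling convention
        Codegen {
            asm: Vec::new(),
            slots: HashMap::new(),
            free: vec!["x12", "x11", "x10", "x9"],
        }
    }

    // reserve the next 8-byte stack slot for a variable.
    // a flat map is a simplification: it does not model shadowing in nested scopes.
    pub fn declare(&mut self, name: &str) -> u32 {
        let offset = 8 * self.slots.len() as u32;
        self.slots.insert(name.to_string(), offset);
        offset
    }

    // emit code that leaves the value of `e` in a register, and return that register
    pub fn expr(&mut self, e: &Expr) -> &'static str {
        match e {
            Expr::Int(n) => {
                let r = self.free.pop().expect("out of temporary registers");
                // assumes n fits a single mov immediate; larger values
                // would need movz/movk or a literal load
                self.asm.push(format!("mov {r}, #{n}"));
                r
            }
            Expr::Var(name) => {
                // indexing panics on an unknown name, which is acceptable
                // because the ast is assumed to be semantically valid
                let offset = self.slots[name.as_str()];
                let r = self.free.pop().expect("out of temporary registers");
                self.asm.push(format!("ldr {r}, [sp, #{offset}]"));
                r
            }
            Expr::Binary(lhs, op, rhs) => {
                let l = self.expr(lhs);
                let r = self.expr(rhs);
                match *op {
                    '+' => self.asm.push(format!("add {l}, {l}, {r}")),
                    '-' => self.asm.push(format!("sub {l}, {l}, {r}")),
                    '<' | '>' => {
                        let cond = if *op == '<' { "lt" } else { "gt" };
                        self.asm.push(format!("cmp {l}, {r}"));
                        // boolean lowered as 0 / 1
                        self.asm.push(format!("cset {l}, {cond}"));
                    }
                    _ => panic!("unknown operator {op}"),
                }
                self.free.push(r); // the right operand's register is free again
                l
            }
        }
    }
}
```

after declaring `x` and `y`, generating code for `x + 1 > y` with this sketch produces:

    ldr x9, [sp, #0]
    mov x10, #1
    add x9, x9, x10
    ldr x10, [sp, #8]
    cmp x9, x10
    cset x9, gt

the pool runs out on deeply nested expressions. a real implementation would need to spill to the stack at that point.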
timings:
- lexing: 15ms
- parsing: 30us
- semantic: 400us
- optimization: 200us
- codegen: 300us
- assemble: 350ms
- runtime: 25ms
requirements:
- rust
- aarch64-linux-gnu-gcc
- qemu-aarch64
commands:
- cargo run
- cargo run --release
generated files:
- out.s: arm64 assembly
- out: executable binary
https://rasmalai123.medium.com/compiler-b9a614f9ef7b
i have upgraded the compiler since writing this blog post, so the code and design choices may no longer match it.