This repository contains AuDaLaC, the compiler for the Autonomous Data Language (AuDaLa) that I wrote as part of my 2023 Embedded Systems master thesis "Towards relaxed memory semantics for the Autonomous Data Language". This work was done as part of the Formal System Analysis research group of the Technical University Eindhoven. I was supervised by dr. ir. Thomas Neele.
This work presents an alternative operational semantics for the Autonomous Data Language (AuDaLa) with relaxed memory consistency and incoherent memory. We show how the memory operations of our semantics can be safely mapped onto the NVIDIA PTX virtual ISA and demonstrate that our semantics performs faster than the original when executing AuDaLa programs on GPUs. We translate our operational semantics into an axiomatic memory consistency model and formally check, for a bounded program size, its correspondence with PTX’s memory consistency model using the Alloy model finder. We conclude by presenting AuDaLaC, our compiler targeting the CUDA platform, with which we explore several different strategies to compile AuDaLa programs. We demonstrate in several case studies that AuDaLa implementations can perform faster than sequential implementations.
The CUDA toolkit needs to be installed. The GPU should support at least architecture -arch=sm_60
with support for Unified Virtual Address Space.
The Rust programming language should be installed (with cargo) to build the compiler.
Basic usage:
cargo run -- MyAuDaLaFile.adl -o MyAuDaLaFile.cu
The resulting CUDA file (.cu) can be compiled with NVCC. See the Makefiles for examples. The resulting binary will ask for a .init
file, see this file for an example that defines the parity game on slide 23 of these slides.
Useful options include --memorder
, --fp_strat
, --schedule_strat
, and --time
. When debugging a program that will not terminate, the --printunstable
option can be useful.
When a program creates more instances than specified in the supplied .init
file, memory should be allocated for these instances through the --nrofinstances
option. Like so:
cd tests/benchmarks/SCC
cargo run -- SCC_FB.adl -o SCC_FB.cu --nrofinstances NodeSet=10000
Run cargo run -- -h
to see all CLI options.
cd tests/benchmarks/SCC
cargo run -- SCC_FB.adl -o SCC_FB.cu --nrofinstances NodeSet=10000
make SCC_FB.out
./SCC_FB.out testcases/random_fb_1000_1289.init
cargo run -- SCC_COL.adl -o SCC_COL.cu
make SCC_COL.out
./SCC_COL.out testcases/random_col_1000_1289.init
cd tests/benchmarks/SPM
cargo run -- SPM.adl -o SPM.cu
make SPM.out
./SPM.out testcases/dining/dining_4.invariantly_inevitably_eat.init
cd tests/benchmarks/prefix_sum
cargo run -- prefix_sum.adl -o prefix_sum.cu
make prefix_sum.out
./prefix_sum.out testcases/prefix_sum_1000.init
cd tests/benchmarks/synthesis
cargo run -- synthesis.adl -o synthesis.cu
make synthesis.out
./synthesis.out testcases/synthesis_1000_1280.init
To run the unittests, simply run cargo test
. If the benchmarks should also be run, run:
export BENCHMARK=true
cargo test -- --nocapture --test-threads=1