minipython-compiler

A slightly optimizing compiler for minipython (see here for an interpreter). Minipython code is parsed (correctly, with indentation, this time), then translated to an IR. The translation stage applies some very basic optimizations and infers the creation time of variables. C code is then generated from the IR; the generated code depends only on stdio.h.

CLI

MiniPython compiler
Compiles MiniPython programs

USAGE:
    minipython-c.exe [OPTIONS] <FILE>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -o, --out <FILE>    Sets the output file name

ARGS:
    <FILE>    Input file

Limitations

The indentation-based syntax is fully supported this time (no comments at block ends needed). Functions can only be declared at the top level and cannot be nested. Error reporting is very basic: there is no location information after parser errors, and usually only the first error is reported. The datatype used in C is unsigned long long int, which the C standard guarantees to be at least 64 bits wide. Still, for a language that has to encode everything as integers, this is rather limiting. Performance is quite good as long as the generated C is compiled with -O3, not because the minipython compiler is smart, but because C compilers are really smart.

Conclusion

I chose minipython as a simple project to learn how to write a compiler in Rust.

Overall, the experience was decently smooth. LALRPOP is a really great parser generator. It's decently easy to use, has good error messages, and, thanks to cargo, is very well integrated into the build process. The only caveat was that I had to write my own lexer, but that was necessary anyway, because I processed indentation in the lexer so that the grammar seen by the parser stays Type-2 (context-free). Writing the lexer took some time due to the complexity of handling indentation, comments etc. correctly, but Rust turns out to be well suited to the task, thanks to its flexible control flow (returning from inside loops etc.) and its iterators.
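To illustrate what processing indentation in the lexer can look like, here is a minimal sketch. It is not the project's actual lexer: the `Token` type and the `layout` function are hypothetical names, the input lines are only illustrative, and the sketch assumes that indentation uses spaces only and that blank lines are ignored. It keeps a stack of open indentation levels and emits `Indent` and `Dedent` tokens, so the parser only ever sees ordinary tokens.

```rust
/// Hypothetical token type; only the layout-related tokens are modeled.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Indent,
    Dedent,
    Newline,
    /// The rest of a line, left unsplit to keep the sketch short.
    Line(String),
}

/// Turns leading spaces into Indent/Dedent tokens using a stack of
/// currently open indentation levels. Assumes spaces only (no tabs).
pub fn layout(source: &str) -> Result<Vec<Token>, String> {
    let mut levels = vec![0usize];
    let mut tokens = Vec::new();
    for (number, raw) in source.lines().enumerate() {
        let content = raw.trim_start_matches(' ');
        // Blank lines do not affect the block structure.
        if content.trim().is_empty() {
            continue;
        }
        let width = raw.len() - content.len();
        if width > *levels.last().unwrap() {
            // Deeper than the current block: open a new one.
            levels.push(width);
            tokens.push(Token::Indent);
        } else {
            // Shallower: close blocks until a matching level is found.
            while width < *levels.last().unwrap() {
                levels.pop();
                tokens.push(Token::Dedent);
            }
            if width != *levels.last().unwrap() {
                return Err(format!("line {}: inconsistent dedent", number + 1));
            }
        }
        tokens.push(Token::Line(content.trim_end().to_string()));
        tokens.push(Token::Newline);
    }
    // Close every block that is still open at the end of the input.
    while levels.len() > 1 {
        levels.pop();
        tokens.push(Token::Dedent);
    }
    Ok(tokens)
}
```

With this approach, a grammar can treat `Indent` and `Dedent` like opening and closing braces, which is what keeps it context-free.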

At the same time, Rust's string model (owned, mutable String values and borrowed &str slices) makes writing the AST a bit more awkward, because it requires either keeping the entire source code alive just for the slices into it, frequent cloning, or string interning. I went with the last solution. While string interning is good for performance, it also makes printing, debugging and testing a bit harder. Cloning would probably have been the better choice, but Rust inspired me to be very performance-aware, even where it is really unnecessary. The usual caveat about premature optimization applies especially to Rust, a language that is very explicit about performance costs.
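A minimal sketch of string interning follows, to make the trade-off concrete. The `Symbol` and `Interner` names are hypothetical and not taken from the project; the sketch simply assumes a table that maps each distinct name to a small integer handle.

```rust
use std::collections::HashMap;

/// Hypothetical handle for an interned string; cheap to copy and compare.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Symbol(u32);

/// Maps each distinct string to one Symbol and back.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, Symbol>,
    names: Vec<String>,
}

impl Interner {
    /// Returns the existing symbol for `name`, or allocates a new one.
    pub fn intern(&mut self, name: &str) -> Symbol {
        if let Some(&symbol) = self.ids.get(name) {
            return symbol;
        }
        let symbol = Symbol(self.names.len() as u32);
        self.names.push(name.to_string());
        self.ids.insert(name.to_string(), symbol);
        symbol
    }

    /// Looks up the text behind a symbol; needs access to the interner.
    pub fn resolve(&self, symbol: Symbol) -> &str {
        &self.names[symbol.0 as usize]
    }
}
```

AST nodes can then store a `Symbol` instead of a `String` or `&str`, so they need neither lifetimes nor clones of the text. The cost is visible when debugging: formatting a symbol with `{:?}` yields something like `Symbol(0)` rather than the name, and recovering the name requires passing the interner to `resolve`.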

Working with trees (the AST and the IR tree) in Rust is somewhat hard due to the ownership and lifetime semantics and requires some cloning (or a different strategy, like a memory arena). Still, Rust's enums and pattern matching are comparable to those of ML-family functional languages, and this is a definite advantage when writing compilers. Rust just makes working with naively modeled trees hard, and that is a disadvantage when writing optimizations, analysis phases and AST-to-IR conversions.
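Both sides of this can be seen in a small sketch of constant folding over an expression tree. The `Expr` type and the `fold` function are hypothetical and far simpler than a real AST or IR; wrapping arithmetic is an assumption chosen to match how unsigned integers behave in C.

```rust
/// Hypothetical expression tree; children are boxed so the enum has a
/// known size.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Const(u64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Folds constant subexpressions bottom-up. The tree is taken by value
/// and rebuilt, which avoids cloning but consumes the input.
pub fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Add(lhs, rhs) => match (fold(*lhs), fold(*rhs)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_add(b)),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(lhs, rhs) => match (fold(*lhs), fold(*rhs)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_mul(b)),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        // Constants and variables are already in their simplest form.
        other => other,
    }
}
```

The nested `match` reads much like ML code, but the `Box::new` calls and the decision to consume the tree (rather than borrow it and clone the parts that survive) are the kind of ownership bookkeeping that a garbage-collected language would not require.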

Generating code in C was obviously quite easy due to the limited nature of minipython. I would have preferred to compile to LLVM via inkwell, but I sadly couldn't get it to work on Windows, so I decided to compile to C instead.
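As a rough idea of why emitting C is straightforward, here is a sketch of a tiny code generator. Everything in it is an assumption made for illustration: the `IrExpr` and `IrStmt` types, the `Print` statement, and the shape of the emitted C are hypothetical and do not reflect the project's real IR or output. It only relies on the facts stated above, namely that values are unsigned long long int and that the generated code needs nothing beyond stdio.h.

```rust
/// Hypothetical IR expression, for illustration only.
pub enum IrExpr {
    Const(u64),
    Var(String),
    Add(Box<IrExpr>, Box<IrExpr>),
}

/// Hypothetical IR statement, for illustration only.
pub enum IrStmt {
    Assign(String, IrExpr),
    Print(IrExpr),
}

/// Renders an expression as a fully parenthesized C expression.
fn emit_expr(expr: &IrExpr) -> String {
    match expr {
        IrExpr::Const(value) => format!("{}ULL", value),
        IrExpr::Var(name) => name.clone(),
        IrExpr::Add(lhs, rhs) => format!("({} + {})", emit_expr(lhs), emit_expr(rhs)),
    }
}

/// Emits a complete C program: declarations first, then the statements.
pub fn emit_program(declared: &[&str], body: &[IrStmt]) -> String {
    let mut out = String::from("#include <stdio.h>\n\nint main(void) {\n");
    for name in declared {
        out.push_str(&format!("    unsigned long long int {} = 0;\n", name));
    }
    for stmt in body {
        match stmt {
            IrStmt::Assign(name, value) => {
                out.push_str(&format!("    {} = {};\n", name, emit_expr(value)));
            }
            IrStmt::Print(value) => {
                out.push_str(&format!("    printf(\"%llu\\n\", {});\n", emit_expr(value)));
            }
        }
    }
    out.push_str("    return 0;\n}\n");
    out
}
```

Because every value has the same C type, the generator needs no type information at all; each IR node maps to one line or one subexpression of C text.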
