enigma's Introduction

Enigma

Enigma VM

An implementation of the Erlang VM in Rust. We aim to be complete, correct and fast, in that order of importance.

OTP 22+ compatible (sans the distributed bits for now) — all your code should eventually run on Enigma unchanged. Deprecated opcodes won't be supported.

Why?

Because it's fun and I've been learning a lot. BEAM and HiPE are awesome, but they're massive (~300k SLOC). A small implementation makes it easier for new people to learn Erlang internals. We also get a platform to quickly iterate on ideas for inclusion into BEAM.

Installation

The only prerequisite for building Enigma is Rust. Use rustup to install the latest nightly Rust. At this time we no longer support stable or beta, because we rely on async/await, which is scheduled to land in stable some time in Q3 2019.

To boot up OTP you will also need to compile the standard library. At the moment, that relies on the BEAM build system:

git submodule update --init --depth 1
cd otp
./otp_build setup -a
make libs
make local_setup

We hope to simplify this step in the future (once enigma can run the compiler).

Run cargo run to fetch dependencies, then build and run the VM. By default, it will boot up the Erlang shell (iex also works, but has some rendering bugs).

Expect crashes, but a lot of the functionality is already available.

Pre-built binaries for various platforms will be available, once we reach a certain level of stability.

Feature status

We implement most of the opcodes, and about half of all BIFs. You can view a detailed progress breakdown on opcodes or BIFs.

Roadmap

  • Get the full OTP kernel/stdlib to boot (init:start).
  • Get the Eshell to run.
  • Get OTP tests to run (works with a custom runner currently).
  • Get the erlang compiler to work.
  • Get IEx to run.
  • Get OTP tests to pass.

Features

  • Floating point math
  • Spawn & message sending
  • Lambdas / anonymous functions
  • Exceptions & stack traces
  • Process dictionary
  • Signal queue
  • Links & monitors
  • Timers
  • Maps
  • Binaries
  • File IO
    • open/read/close/read_file/write
    • Filesystem interaction
  • External NIFs
  • Ports (might never be fully supported, we provide a few boot-critical ones as builtins: tty, fd)
  • External Term representation
    • Decoding
    • Encoding
  • ETS
    • PAM implementation
    • All table types partially, but we do not provide any concurrency guarantees
  • Regex (some support exists for basic matching)
  • Garbage Collection (arena-based per-process at the moment)
  • inet via socket nifs
  • Code reloading
  • Tracing/debugging support
  • Load-time instruction transform
  • Load-time instruction specialization engine

Goals, ideas & experiments

Process scheduling is implemented on top of rust futures:

  • A process is simply a long running future, scheduled on top of tokio-threadpool work-stealing queue
  • A timer is a delay/timeout future relying on tokio-timer time-wheel
  • Ports are futures we can await on
  • File I/O is AsyncRead/AsyncWrite awaitable
  • NIF/BIFs are futures that yield at certain points to play nice with reductions (allows a much simpler yielding implementation)
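
The yielding idea in the list above can be sketched with the standard library alone. This is a minimal illustration, not Enigma's scheduler: YieldNow, REDUCTIONS_PER_SLICE, run_process and block_on_counting are hypothetical names, and a tiny hand-written executor stands in for tokio-threadpool.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A future that returns `Pending` exactly once, handing control back to the executor.
pub struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            // Ask to be polled again; we only want to give up our turn.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Hypothetical reduction budget per time slice (name and value are assumptions).
pub const REDUCTIONS_PER_SLICE: u32 = 4;

/// A toy "process": performs `work` units and yields whenever the budget runs out.
pub async fn run_process(work: u32) -> u32 {
    let mut reductions = REDUCTIONS_PER_SLICE;
    let mut done = 0;
    for _ in 0..work {
        if reductions == 0 {
            YieldNow { yielded: false }.await;
            reductions = REDUCTIONS_PER_SLICE;
        }
        reductions -= 1;
        done += 1;
    }
    done
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal single-future executor: polls until ready and reports the number of polls.
pub fn block_on_counting<F: Future>(fut: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

With a budget of 4, ten units of work suspend the future twice, so a real scheduler would get two chances to run other processes in between.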

Future possibilities:

  • Write more documentation about more sparsely documented BEAM aspects (binary matching, time wheel, process monitors, ...)
  • Explore using immix as a GC for Erlang
  • Eir runtime
  • JIT via Eir
  • BIF as a generator function (yield to suspend/on reduce)
  • Provide built-in adapter modules for hyper as a Plug Adapter / HTTP client.
  • Cross-compile to WebAssembly (runtime)

Initial non-goals

Until the VM reaches a certain level of completeness, it doesn't make sense to consider these.

  • Distributed Erlang nodes
  • Tracing / debugging support
  • BEAM compatible NIFs / FFI

Note: NIF/FFI ABI compatibility with OTP is going to be quite some work. But, a rust-style NIF interface will be available. It would also probably be possible to make an adapter compatible with rustler.

Contributing

Contributors are very welcome!

The easiest way to get started is to look at the notes folder and pick a BIF or an opcode to implement. Take a look at src/bif.rs and the bif folder to see how other BIFs are implemented. There are also a few open issues with the good first issue tag, which would also be a good introduction to the internals.

Alternatively, search the codebase for TODO, FIXME or unimplemented!. These mark places where a partial implementation exists but a bit more work needs to be done.

Test coverage is currently lacking, and there are varying levels of documentation; I will be addressing these soon.

We also have a #enigma channel on the Elixir Slack.

enigma's People

Contributors

archseer, icedragon200, nobbz, potatosalad, ryansiu1995

enigma's Issues

Boolean logic BIFs

  • ubif erlang:'and'/2
  • ubif erlang:'or'/2
  • ubif erlang:'xor'/2
  • ubif erlang:'not'/1
  • ubif erlang:'>'/2 sgt_2
  • ubif erlang:'>='/2 sge_2
  • ubif erlang:'<'/2 slt_2
  • ubif erlang:'=<'/2 sle_2
  • ubif erlang:'=:='/2 seq_2
  • ubif erlang:'=='/2 seqeq_2
  • ubif erlang:'=/='/2 sneq_2
  • ubif erlang:'/='/2 sneqeq_2

Look at the related opcode implementations (Opcode::IsLe, etc.)
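
As a rough sketch of what the boolean BIFs could look like, here is a version over a toy Term enum. The Term and Exception types and the bif_erlang_* names are assumptions made for illustration; Enigma's real Term and BIF signature (which takes the VM and process as well) differ.

```rust
/// Toy term type for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Term {
    Atom(&'static str),
    Int(i64),
}

#[derive(Debug, PartialEq)]
pub enum Exception {
    Badarg,
}

/// Only the atoms `true` and `false` are booleans; anything else is badarg.
fn to_bool(term: &Term) -> Result<bool, Exception> {
    match term {
        Term::Atom("true") => Ok(true),
        Term::Atom("false") => Ok(false),
        _ => Err(Exception::Badarg),
    }
}

fn from_bool(value: bool) -> Term {
    Term::Atom(if value { "true" } else { "false" })
}

/// Shared helper: both operands are always validated (these operators are strict).
fn binary_op(args: &[Term], op: fn(bool, bool) -> bool) -> Result<Term, Exception> {
    match args {
        [a, b] => Ok(from_bool(op(to_bool(a)?, to_bool(b)?))),
        _ => Err(Exception::Badarg),
    }
}

pub fn bif_erlang_and_2(args: &[Term]) -> Result<Term, Exception> {
    binary_op(args, |a, b| a & b)
}

pub fn bif_erlang_or_2(args: &[Term]) -> Result<Term, Exception> {
    binary_op(args, |a, b| a | b)
}

pub fn bif_erlang_xor_2(args: &[Term]) -> Result<Term, Exception> {
    binary_op(args, |a, b| a ^ b)
}

pub fn bif_erlang_not_1(args: &[Term]) -> Result<Term, Exception> {
    match args {
        [a] => Ok(from_bool(!to_bool(a)?)),
        _ => Err(Exception::Badarg),
    }
}
```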

no_std compatibility

We currently rely on some parts of std, but having a no_std build would be interesting for embedded devices and WebAssembly. Note that a lot of the std stuff exists in core, so we could get far with just conditional imports.

https://github.com/serde-rs/serde/blob/bb99b31eb0a55393101f9c80cd959b3a739ad70f/serde/src/lib.rs#L141-L143

  • Fix all the imports to import core:: + alloc versions on the no_std feature flag.
  • Make sure all the dependencies are no_std capable.
  • Move any OS specific code to a sys subfolder.
  • Abstract the tokio runtime so we can switch to juliex or any other no_std capable executor. Look into fuchsia's executor as an example.

Fill in the unit tests for implemented BIFs

Some of the unit tests are not implemented yet.
We need someone to fill them in.

  • Math functions
  • List functions
  • Chrono functions

The following unit tests are incomplete (the exception cases are not tested):

  • erlang functions

Conversion functions

It's time to get all these out of the way.

x_to_list

  • erlang:atom_to_list/1
  • erlang:binary_to_list/1
  • erlang:binary_to_list/3
  • erlang:float_to_list/1
  • erlang:float_to_list/2
  • erlang:integer_to_list/1
  • erlang:integer_to_list/2
  • erlang:pid_to_list/1
  • erlang:tuple_to_list/1
  • erlang:ref_to_list/1
  • erlang:port_to_list/1
  • erlang:fun_to_list/1
  • erlang:bitstring_to_list/1

list_to_x

  • erlang:list_to_atom/1
  • erlang:list_to_binary/1
  • erlang:list_to_float/1
  • erlang:list_to_integer/1
  • erlang:list_to_pid/1
  • erlang:list_to_port/1
  • erlang:list_to_ref/1
  • erlang:list_to_tuple/1
  • string:list_to_integer/1
  • erlang:list_to_integer/2
  • string:list_to_float/1
  • erlang:list_to_existing_atom/1
  • erlang:list_to_bitstring/1

binary_to_x

  • erlang:binary_to_float/1
  • erlang:binary_to_term/1
  • erlang:binary_to_atom/2
  • erlang:binary_to_existing_atom/2
  • erlang:binary_to_integer/1
  • erlang:binary_to_integer/2

x_to_binary

  • erlang:atom_to_binary/2
  • erlang:float_to_binary/1
  • erlang:float_to_binary/2
  • erlang:integer_to_binary/1
  • erlang:integer_to_binary/2
  • erlang:term_to_binary/1
  • erlang:term_to_binary/2

Special

  • unicode:characters_to_binary/2
  • unicode:characters_to_list/2
  • erlang:iolist_to_binary/1
  • erlang:iolist_to_iovec/1

Travis Integration

I think it is the right time to get Travis involved.
At first, I think we can build and test with the stable version of Rust.
Pull requests would then be required to pass all the unit tests.

Unable to compile

Hi @archseer,

I'm very new to Rust (familiar with Erlang), so please forgive me.
I am attempting to compile Enigma and receiving the following message: "to use a constant of type value::Term in a pattern, value::Term must be annotated with #[derive(PartialEq, Eq)]".
From what I can tell this seems to be related to the APPLY_2 const but again, I'm not familiar at all with Rust.
Any idea what is causing this?

rustc 1.39.0-nightly (72b2abfd6 2019-08-29)
rustup 1.18.3 (435397f48 2019-05-22)
cargo 1.39.0-nightly (22f7dd049 2019-08-27)

Timers

Timers include:

  • bif erlang:start_timer/3
  • bif erlang:start_timer/4
  • bif erlang:send_after/3
  • bif erlang:send_after/4
  • bif erlang:cancel_timer/1
  • bif erlang:cancel_timer/2
  • bif erlang:read_timer/1
  • bif erlang:read_timer/2

This also covers the WaitTimeout and Send with timeout instructions. The implementation should use a hierarchical timing wheel; we can probably base it on https://github.com/tokio-rs/tokio/tree/master/tokio-timer/src/wheel

If we decide to integrate with Tokio for async I/O (yielding the BIF when it would block), we could avoid copying the implementation. If we copy it, we probably want a dedicated thread that keeps track of the wheel and schedules processes when their timers expire (similar to what the process::spawn internals do).
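
To make the timing wheel idea concrete, here is a deliberately simplified single-level wheel (the hierarchical version cascades several of these at coarser resolutions). TimerWheel, start_timer and tick are hypothetical names for this sketch and are not taken from tokio-timer.

```rust
/// Single-level timing wheel: each slot holds (absolute deadline tick, payload).
pub struct TimerWheel<T> {
    slots: Vec<Vec<(u64, T)>>,
    now: u64,
}

impl<T> TimerWheel<T> {
    pub fn new(size: usize) -> Self {
        assert!(size > 0, "wheel needs at least one slot");
        TimerWheel {
            slots: (0..size).map(|_| Vec::new()).collect(),
            now: 0,
        }
    }

    /// Schedule `payload` to fire `delay` ticks from now (a delay of 0 fires on the next tick).
    pub fn start_timer(&mut self, delay: u64, payload: T) {
        let deadline = self.now + delay.max(1);
        let idx = (deadline % self.slots.len() as u64) as usize;
        self.slots[idx].push((deadline, payload));
    }

    /// Advance one tick and return the payloads whose deadline has been reached.
    /// Entries that share the slot but belong to a later rotation stay put.
    pub fn tick(&mut self) -> Vec<T> {
        self.now += 1;
        let now = self.now;
        let idx = (now % self.slots.len() as u64) as usize;
        let (due, later): (Vec<_>, Vec<_>) = std::mem::take(&mut self.slots[idx])
            .into_iter()
            .partition(|(deadline, _)| *deadline <= now);
        self.slots[idx] = later;
        due.into_iter().map(|(_, payload)| payload).collect()
    }
}
```

In a VM the payload would be something like a process id plus the message to deliver, and the thread driving tick would reschedule the owning process when its timer fires.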

Error handler

We need to place stubs in the exports registry for modules that aren't loaded yet but are referenced elsewhere. These should call the error handler (which, in the Erlang stdlib, will try to load the module or raise an error).

http://erlang.org/doc/man/error_handler.html

Search for references to error_handler inside beam_emu.c and enigma for a quickstart.
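
A small sketch of the stub idea, assuming a registry keyed by module/function/arity. ExportRegistry, Export and Call are hypothetical types invented for this example; the only OTP fact relied on is that error_handler:undefined_function/3 is the entry point for calls to undefined functions.

```rust
use std::collections::HashMap;

/// (module, function, arity)
pub type Mfa = (&'static str, &'static str, u8);

#[derive(Debug, Clone, PartialEq)]
pub enum Export {
    /// Points at loaded code (the `usize` stands in for a code pointer).
    Loaded(usize),
    /// Placeholder for a function whose module has not been loaded yet.
    Stub,
}

#[derive(Debug, PartialEq)]
pub enum Call {
    Jump(usize),
    /// Redirect to `error_handler:undefined_function(M, F, Args)`.
    ErrorHandler(Mfa),
}

#[derive(Default)]
pub struct ExportRegistry {
    entries: HashMap<Mfa, Export>,
}

impl ExportRegistry {
    /// Called by the loader for every external reference it sees.
    /// Leaves an existing entry untouched, otherwise installs a stub.
    pub fn reference(&mut self, mfa: Mfa) {
        self.entries.entry(mfa).or_insert(Export::Stub);
    }

    /// Called when the defining module is loaded; replaces any stub.
    pub fn load(&mut self, mfa: Mfa, addr: usize) {
        self.entries.insert(mfa, Export::Loaded(addr));
    }

    /// Decide what a call to `mfa` should do.
    pub fn resolve(&self, mfa: Mfa) -> Call {
        match self.entries.get(&mfa) {
            Some(Export::Loaded(addr)) => Call::Jump(*addr),
            _ => Call::ErrorHandler(mfa),
        }
    }
}
```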

Tuple BIFs

  • bif erlang:make_tuple/2
  • bif erlang:append_element/2
  • bif erlang:make_tuple/3
  • bif erlang:setelement/3
  • bif erlang:tuple_to_list/1

Most of these can be done in a very similar way to the tup macros:

let tuple = value::tuple($heap, 2);
unsafe {
    std::ptr::write(&mut tuple[0], $element1);
    std::ptr::write(&mut tuple[1], $element2);
}
Term::from(tuple)

For setelement and any other functions that copy the old tuple, we can 1) allocate a new tuple, 2) use std::ptr::copy_nonoverlapping to do a fast memcpy of the old tuple into the new one, and 3) set the new elements.
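
The three steps can be sketched as follows, using a plain slice of u64 words in place of Enigma's heap-allocated tuple. The setelement function below and its Option-based error handling are assumptions for illustration only.

```rust
use std::ptr;

/// Copy `old` into a fresh allocation and overwrite the element at the
/// 1-based `index`. Returns `None` where Erlang would raise badarg.
pub fn setelement(index: usize, old: &[u64], value: u64) -> Option<Vec<u64>> {
    if index == 0 || index > old.len() {
        return None;
    }
    // 1) Allocate the new tuple.
    let mut new: Vec<u64> = Vec::with_capacity(old.len());
    unsafe {
        // 2) Fast memcpy of the old elements.
        // SAFETY: `new` has capacity for `old.len()` elements, the two buffers
        // are distinct allocations, and `u64` is `Copy`, so every element is
        // initialized before `set_len` exposes it.
        ptr::copy_nonoverlapping(old.as_ptr(), new.as_mut_ptr(), old.len());
        new.set_len(old.len());
    }
    // 3) Set the new element.
    new[index - 1] = value;
    Some(new)
}
```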

Testing macro for BIF

I think we can trim the testing code with a macro for calling the BIF function.
During development, I kept typing the following code over and over.

let vm = vm::Machine::new();
let module: *const module::Module = std::ptr::null();
let process = process::allocate(&vm.state, module).wrap();
let args = vec![xxxxxxx];
let res = bif_the_function_I_want_to_test(&vm, &process, &args);

Most of this code is setup rather than the test itself. There is so much duplication that I think it could be replaced by a macro acting as a higher-order function, like the following one.

macro_rules! test_it {
    ($func:ident; $($arg:expr),*) => {{
        // Same setup code as above, then call $func(&vm, &process, &[$($arg),*])
    }};
}

I can do this in Elixir, but I'm not sure whether it is valid in Rust.
Does this sound like a valid solution to you, @archseer?
If it is okay, I will do it and refactor the unit tests.

ETF encoding

We currently only do (partially) ETF decoding. This isn't important just yet, but we'll need to implement encoding in the future. We should aim to implement an encoder inside etf.rs, then expose that as a serde protocol. Take inspiration from https://github.com/obmarg/serde_eetf
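
For a sense of scale, a bare-bones encoder for a handful of term types might look like the sketch below. The Term enum and function names are hypothetical and do not reflect etf.rs. The tag values come from the External Term Format specification. Atoms are written as SMALL_ATOM_UTF8_EXT, which newer OTP releases emit by default, while older releases used ATOM_EXT.

```rust
/// Toy term type covering just enough cases for the sketch.
pub enum Term {
    Int(i32),
    Atom(String),
    Tuple(Vec<Term>),
    Nil,
}

const VERSION: u8 = 131;
const SMALL_INTEGER_EXT: u8 = 97;
const INTEGER_EXT: u8 = 98;
const SMALL_TUPLE_EXT: u8 = 104;
const NIL_EXT: u8 = 106;
const SMALL_ATOM_UTF8_EXT: u8 = 119;

fn encode_term(term: &Term, out: &mut Vec<u8>) -> Result<(), &'static str> {
    match term {
        // Integers 0..=255 fit in a single unsigned byte.
        Term::Int(i) if (0..=255).contains(i) => {
            out.push(SMALL_INTEGER_EXT);
            out.push(*i as u8);
        }
        // Everything else that fits in 32 bits is big-endian signed.
        Term::Int(i) => {
            out.push(INTEGER_EXT);
            out.extend_from_slice(&i.to_be_bytes());
        }
        Term::Atom(name) => {
            if name.len() > 255 {
                return Err("atom too long for SMALL_ATOM_UTF8_EXT");
            }
            out.push(SMALL_ATOM_UTF8_EXT);
            out.push(name.len() as u8);
            out.extend_from_slice(name.as_bytes());
        }
        Term::Tuple(elements) => {
            if elements.len() > 255 {
                return Err("tuple too large for SMALL_TUPLE_EXT");
            }
            out.push(SMALL_TUPLE_EXT);
            out.push(elements.len() as u8);
            for element in elements {
                encode_term(element, out)?;
            }
        }
        Term::Nil => out.push(NIL_EXT),
    }
    Ok(())
}

/// Encode a term with the leading version byte, like `erlang:term_to_binary/1`.
pub fn term_to_binary(term: &Term) -> Result<Vec<u8>, &'static str> {
    let mut out = vec![VERSION];
    encode_term(term, &mut out)?;
    Ok(out)
}
```

Bignums, floats, lists, binaries, maps and large tuples are left out. A serde layer would sit on top of an encoder like this rather than replace it.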

Fail to build otp due to missing libraries

After Issue #34 was resolved, I was able to update otp and start the build. However, the build failed due to the following errors:

=== Entering application tools
make[3]: Entering directory '/home/yima/play/enigma/otp/lib/tools/c_src'
MAKE opt
make[4]: Entering directory '/home/yima/play/enigma/otp/lib/tools/c_src'
CC ../obj/x86_64-unknown-linux-gnu/opt/emem/erl_memory.o
CC ../obj/x86_64-unknown-linux-gnu/opt/emem/erl_memory_trace_block_table.o
LD ../bin/x86_64-unknown-linux-gnu/emem
/usr/bin/ld: cannot find -lerts_r
/usr/bin/ld: cannot find -lethread
/usr/bin/ld: cannot find -lerts_internal_r
collect2: error: ld returned 1 exit status
x86_64-unknown-linux-gnu/Makefile:182: recipe for target '../bin/x86_64-unknown-linux-gnu/emem' failed
make[4]: *** [../bin/x86_64-unknown-linux-gnu/emem] Error 1

Fail to update submodule otp

I am getting this when I follow the instruction to update otp. Do we have a workaround?

~/play/enigma$ git submodule update
fatal: reference is not a tree: 7bcc044cb00b8dfb8eb2eb220dbda8a395af72a7
Unable to checkout '7bcc044cb00b8dfb8eb2eb220dbda8a395af72a7' in submodule path 'otp'

Separate the bif implementation and bif.rs

Currently, some of the BIF implementations are written in bif.rs and some are in the bif folder.
We can gather them all into the bif folder instead.
Then bif.rs can serve as an index of the current BIFs, and the files in bif hold the implementations with their unit tests.

  • erlang function
  • math function
  • list function

Rebase problem in maps

Hi @archseer,
it seems the code in maps was affected by a rebase.
Some of the implementation has been wiped out. Can you please help fix the problem?
Thank you!

Code reloading / Module versions

Update the module registry to store RcModule (Arc) instead of Box, and update all *const Module to be RcModule. Loading the same module into the VM should swap it in the registry, meaning all new module registry lookups will return the new module. Meanwhile, references to the old Module will still work until they all die, at which point the module will be dropped.
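
A minimal sketch of that swap-in behaviour, assuming a registry guarded by an RwLock. The Module fields and the ModuleRegistry API here are invented for the example; only the RcModule = Arc<Module> idea comes from the description above.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

#[derive(Debug)]
pub struct Module {
    pub name: String,
    /// Hypothetical version counter, used here only to tell instances apart.
    pub version: u32,
}

pub type RcModule = Arc<Module>;

#[derive(Default)]
pub struct ModuleRegistry {
    modules: RwLock<HashMap<String, RcModule>>,
}

impl ModuleRegistry {
    /// Insert or replace a module; returns the previous instance, if any.
    /// Existing `RcModule` clones keep the old code alive until they are dropped.
    pub fn load(&self, module: Module) -> Option<RcModule> {
        self.modules
            .write()
            .unwrap()
            .insert(module.name.clone(), Arc::new(module))
    }

    /// New lookups always see the most recently loaded instance.
    pub fn lookup(&self, name: &str) -> Option<RcModule> {
        self.modules.read().unwrap().get(name).cloned()
    }
}
```

A process that looked up the module before a reload keeps running the old code through its own Arc, and the old Module is freed when the last such reference goes away.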

to_opcode unsafe / undefined behavior

Just saw this project on Hacker News and noticed that to_opcode is a problem. If the input falls outside the opcode range, it produces undefined behavior. It probably needs to either panic via an assert or return an Option for unknown opcodes.
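
One possible shape for the Option-returning variant is sketched below. The Opcode enum here lists only four opcodes as a stand-in for the full set, and the explicit match is just one way to avoid an unchecked transmute.

```rust
/// A small stand-in for the full opcode enum.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Opcode {
    Label = 1,
    FuncInfo = 2,
    IntCodeEnd = 3,
    Call = 4,
}

/// Checked conversion: unknown bytes yield `None` instead of undefined behavior.
pub fn to_opcode(byte: u8) -> Option<Opcode> {
    match byte {
        1 => Some(Opcode::Label),
        2 => Some(Opcode::FuncInfo),
        3 => Some(Opcode::IntCodeEnd),
        4 => Some(Opcode::Call),
        _ => None,
    }
}
```

The loader can then turn None into a load error for the module instead of continuing with a bogus instruction.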

Regex

This issue concerns the re module.

We want to use the regex crate, because it's mostly PCRE compatible, but drops lookaheads and some other features to make it execute in linear time. This way we'd be able to estimate the reductions based on the pattern and binary length.

This issue first requires binaries to be implemented (refs #20).

Tests cannot be run

Running cargo test from root directory results in the following error:

output
warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
package:   /home/cgm/devel/rust/enigma/enigma/Cargo.toml
workspace: /home/cgm/devel/rust/enigma/Cargo.toml
   Compiling quickcheck v0.9.0
   Compiling instruction-codegen v0.1.0 (/home/cgm/devel/rust/enigma/instruction-codegen)
error[E0433]: failed to resolve: use of undeclared type or module `trybuild`
 --> instruction-codegen/tests/progress.rs:3:13
  |
3 |     let t = trybuild::TestCases::new();
  |             ^^^^^^^^ use of undeclared type or module `trybuild`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0433`.
error: could not compile `instruction-codegen`.
warning: build failed, waiting for other jobs to finish...
error[E0432]: unresolved import `instruction`
  --> instruction-codegen/tests/01-parse-header.rs:23:5
   |
23 | use instruction::ins;
   |     ^^^^^^^^^^^ use of undeclared type or module `instruction`

error: cannot determine resolution for the macro `ins`
  --> instruction-codegen/tests/01-parse-header.rs:25:1
   |
25 | ins!(
   | ^^^
   |
   = note: import resolution is stuck, try simplifying macro imports

error: aborting due to 2 previous errors

For more information about this error, try `rustc --explain E0432`.
error: could not compile `instruction-codegen`.
warning: build failed, waiting for other jobs to finish...
error: build failed

GC

Probably either immix or orca. Each process gets its own heap consisting of blocks that we GC independently.

At the moment we just use blocks as a sort of an arena GC: they belong to a process and get deallocated when a process drops (though no destructors are called, and we need to ensure the process unregisters in the process registry!!!)

Immix

Orca

Golang GC

Signal handling / Monitors & links

Monitors & links

  • Track parent process
  • Add a link/monitor tree to each process
  • Modify process::spawn to accept extra flags which register the required tree structures
  • On process exit, dispatch exit signals

Signal queue

  • Modify mailbox to hold a queue of enum Signal { Message(Term), Exit(Reason), ... }
  • Check the mailbox on each process swap-in / (swap-out?)
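
A rough sketch of the signal queue described above, with strings standing in for Term and Reason. The Process struct, deliver and handle_signals are hypothetical names, and the special handling Erlang gives to the normal and kill exit reasons is left out.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq)]
pub enum Signal {
    Message(String),
    Exit(String),
}

#[derive(Default)]
pub struct Process {
    pub trap_exit: bool,
    /// Incoming signals, in arrival order.
    signals: VecDeque<Signal>,
    /// Messages visible to `receive`.
    pub messages: VecDeque<String>,
}

impl Process {
    /// Called by senders; only enqueues, never inspects.
    pub fn deliver(&mut self, signal: Signal) {
        self.signals.push_back(signal);
    }

    /// Called on swap-in. Moves messages into the message queue and returns
    /// `Some(reason)` if an exit signal should terminate the process.
    pub fn handle_signals(&mut self) -> Option<String> {
        while let Some(signal) = self.signals.pop_front() {
            match signal {
                Signal::Message(msg) => self.messages.push_back(msg),
                // A trapping process sees the exit as an ordinary message.
                Signal::Exit(reason) if self.trap_exit => {
                    self.messages.push_back(format!("{{'EXIT', {}}}", reason));
                }
                Signal::Exit(reason) => return Some(reason),
            }
        }
        None
    }
}
```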

Multiple build errors

When I ran "cargo run", once it got into compiling enigma, it encountered multiple errors, such as:

error[E0432]: unresolved import byteorder
--> src/etf.rs:313:5
|
313 | use byteorder::{BigEndian, WriteBytesExt};
| ^^^^^^^^^ help: a similar path exists: libflate::byteorder

error[E0433]: failed to resolve: use of undeclared type or module md5
--> src/bif/erlang.rs:14:18
|
14 | let digest = md5::compute(bytes);
| ^^^ use of undeclared type or module md5
error[E0599]: no method named write_u8 found for type &mut std::vec::Vec<u8> in the current scope
--> src/etf.rs:330:21
|
330 | res.write_u8(i as u8)?;
| ^^^^^^^^
|
= help: items from traits can only be used if the trait is in scope
= note: the following trait is implemented but not in scope, perhaps add a use for it:
use byteorder::io::WriteBytesExt;

The last error occurs on multiple lines in etf.rs.

In addition, there are many warnings about unused imports and macro definitions, such as:

warning: unused macro definition
--> src/exception.rs:208:1
|
208 | / macro_rules! native_exception {
209 | | ($x:expr) => {
210 | | $x & Reason::EXF_NATIVE
211 | | };
212 | | }
| |_^

warning: unused import: BigInt
--> src/bif.rs:11:26
|
11 | use crate::value::{self, BigInt, Cons, Term, CastFrom, CastInto, Tuple, Variant};
| ^^^^^^
|
= note: #[warn(unused_imports)] on by default

And warnings about deprecated features, such as:

warning: ... range patterns are deprecated
--> src/bif/erlang.rs:230:51
|
230 | Variant::Integer(i @ 0...255) => {
| ^^^ help: use ..= for an inclusive range
|
= note: #[warn(ellipsis_inclusive_range_patterns)] on by default
warning: trait objects without an explicit dyn are deprecated
--> src/ets/error.rs:88:32
|
88 | fn cause(&self) -> Option<&StdError> {
| ^^^^^^^^ help: use dyn: dyn StdError

Binaries

  • Binary matching opcodes
  • Binary building opcodes
  • NIFs

Submodule failing to fetch unadvertised object

Seems a similar problem to #34

When I run git submodule update --init --depth 1 I get the following error:

Submodule 'otp' (https://github.com/erlang/otp) registered for path 'otp'
Cloning into '/home/cgm/devel/rust/enigma/otp'...
remote: Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
error: Server does not allow request for unadvertised object 9c35d42191003d1325ad99189d7327f87c9f6fd4
Fetched in submodule path 'otp', but it did not contain 9c35d42191003d1325ad99189d7327f87c9f6fd4. Direct fetching of that commit failed.

Loader transform engine

BEAM implements a transform engine that rewrites BEAM opcodes during the load phase; that way it can implement optimizations that are independent of the OTP version.

https://github.com/erlang/otp/blob/master/erts/emulator/internal_doc/beam_makeops.md

The problem is that it's a large Perl script that generates a bunch of C arrays. The loader then uses those arrays as a series of simple instructions in a state machine. The state machine part should be easy; for the generator, we might just fork the Perl script to generate Rust arrays, or implement some form of Rust codegen.

https://github.com/erlang/otp/blob/49eb25c80514b1e221cc516c423bbc45ba7cc107/erts/emulator/beam/beam_load.c#L5207

This is a time-consuming issue, but it will yield a lot of performance improvements (optimized BIF calls, instruction combining, etc.).
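
To illustrate the kind of rewrite such an engine performs, here is a hand-written peephole pass that fuses a move into x0 followed by return into one combined instruction. The Instr enum and the single hard-coded rule are assumptions for this sketch; the real engine is table-driven and generated from rule files rather than written by hand.

```rust
/// Toy instruction set for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum Instr {
    Move { src: u32, dst: u32 },
    Return,
    /// Combined form of `Move { dst: 0, .. }` followed by `Return`.
    MoveReturn { src: u32 },
    Call(u32),
}

/// Load-time pass: scan the instruction stream and apply the one rewrite rule.
pub fn transform(input: &[Instr]) -> Vec<Instr> {
    let mut out = Vec::with_capacity(input.len());
    let mut i = 0;
    while i < input.len() {
        match (&input[i], input.get(i + 1)) {
            // Rule: move S x0 | return => move_return S
            (Instr::Move { src, dst: 0 }, Some(Instr::Return)) => {
                out.push(Instr::MoveReturn { src: *src });
                i += 2;
            }
            // No rule matched: copy the instruction through unchanged.
            (other, _) => {
                out.push(other.clone());
                i += 1;
            }
        }
    }
    out
}
```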
