Assembly Block

crate Docs Apache2/MIT licensed Build Status

This crate provides a macro that translates tokens into a string that mostly works as a template for the core::arch::asm! macro.

Add

cargo add asm_block

Example

use core::arch::asm;
use asm_block::asm_block;

macro_rules! f {
    ($a: tt, $b: tt, $c: tt, $d: tt, $k: tt, $s: literal, $t: literal, $tmp: tt) => {
        asm_block! {
            mov $tmp, $c;
            add $a, $k;
            xor $tmp, $d;
            and $tmp, $b;
            xor $tmp, $d;
            lea $a, [$a + $tmp + $t];
            rol $a, $s;
            add $a, $b;
        }
    };
}

asm!(
    f!(eax, ebx, ecx, edx, [ebp + 4], 7, 0xd76aa478, esi),
    f!({a}, {b}, {c}, {d}, {x0}, 7, 0xd76aa478, {t}),
    f!({a:e}, {b:e}, {c:e}, {d:e}, [{x} + 4], 7, 0xd76aa478, {t:e}),
    
    a = out(reg) _,
    b = out(reg) _,
    c = out(reg) _,
    d = out(reg) _,
    x0 = out(reg) _,
    x = out(reg) _,
    t = out(reg) _,
);

Design

asm_block follows very simple rules and mostly relies on the whitespace leniency of the underlying assembler.

Transformation rules:

  • Convert ; to \n.
  • No space before and after @, :.
  • Must have a space after .<ident>.
  • Without violating the previous rule, no space before a period (.).
  • Concatenate everything inside a pair of { and } without any space.
  • Transcribe all the other tokens as-is (by stringify!), and add a space afterwards.

This should work for most assembly code. We have checked that a space after $, #, !, %, :, or = does not invalidate assembly code for the x86_64 and aarch64 targets.
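To make the rules above concrete, here is a minimal sketch that applies them to a list of tokens that has already been split. It is a simplified model for illustration, not the crate's implementation: render and trim_spaces are hypothetical names, the real macro works on Rust token trees at compile time, and the way this sketch resolves conflicts between rules is an assumption.

```rust
/// Remove trailing spaces so the next token attaches directly.
/// Hypothetical helper for this sketch.
fn trim_spaces(out: &mut String) {
    while out.ends_with(' ') {
        out.pop();
    }
}

/// Render pre-split tokens following the transformation rules listed above.
/// Hypothetical model, not the code used by asm_block.
fn render(tokens: &[&str]) -> String {
    let mut out = String::new();
    let mut depth = 0usize; // nesting level of `{ ... }`
    let mut after_dot = false; // previous token was `.`
    let mut keep_space = false; // trailing space follows a `.<ident>` and must stay

    for &tok in tokens {
        let was_after_dot = after_dot;
        let was_keep = keep_space;
        after_dot = false;
        keep_space = false;
        match tok {
            // `;` ends an instruction.
            ";" => out.push('\n'),
            // No space before or after `@` and `:`.
            "@" | ":" => {
                trim_spaces(&mut out);
                out.push_str(tok);
            }
            // No space before `.`, unless that space follows a `.<ident>`.
            "." => {
                if !was_keep {
                    trim_spaces(&mut out);
                }
                out.push('.');
                after_dot = true;
            }
            // Everything inside `{ ... }` is concatenated without spaces.
            "{" => {
                depth += 1;
                out.push('{');
            }
            "}" => {
                depth = depth.saturating_sub(1);
                out.push('}');
                if depth == 0 {
                    out.push(' ');
                }
            }
            // Any other token is copied as-is, followed by a space.
            _ => {
                out.push_str(tok);
                if depth == 0 {
                    out.push(' ');
                    keep_space = was_after_dot;
                }
            }
        }
    }
    out
}
```

In this model, the tokens of mov {a:e}, [{x} + 4]; render as "mov {a:e} , [ {x} + 4 ] \n": the placeholders stay compact, while the extra spaces around the comma and brackets are left for the assembler to ignore.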

Motivation

Consider the following code using x86_64 assembly:

unsafe fn f() -> u64 {
    let mut x = 20;
    asm!(
        ".macro mad x, y",
        "  mul x, y",
        "  lea x, [x + y]",
        ".endm",
        "mad {x}, 5",
        
        x = inout(reg) x
    );
    x
}

If we want to reuse mad in another function, we must copy the macro definition verbatim and rename it. Otherwise, compilation fails because of a name collision.

unsafe fn f() -> u64 {
    let mut x = 20;
    asm!(
        ".macro mad x, y",
        "  mul x, y",
        "  lea x, [x + y]",
        ".endm",
        "mad {x}, 5",
        
        x = inout(reg) x
    );
    x
}

unsafe fn g() -> u64 {
    let mut x = 10;
    asm!(
        // Only compiles if we remove this macro definition
        // or rename it to another name
        ".macro mad x, y",
        "  mul x, y",
        "  lea x, [x + y]",
        ".endm",
        "mad {x}, 8",
        
        x = inout(reg) x
    );
    x
}

The above code fails with

error: macro 'mad' is already defined

If we omit the definition of mad in g(), the code compiles, but only when g() is emitted after f(). It is unclear which function should house the definition, so the only sane option is to put it in a global_asm! block. Even then, it is hard to guarantee that the definition is emitted before its first use.

It is natural to resort to a Rust macro in this case, but because asm! accepts a template string, substituting metavariables becomes tedious.

macro_rules! mad {
    ($x: ident, $y: literal) => {
        concat!(
            "mul {", stringify!($x), "}, ", stringify!($y), "\n",
            "lea {", stringify!($x), "}, [{", stringify!($x), "}+", stringify!($y), "]"
        )
    };
}

unsafe fn f() -> u64 {
    let mut x = 20;
    asm!(
        mad!(x, 5),
        
        x = inout(reg) x
    );
    x
}

This approach has several drawbacks:

  • The definition is very noisy, making it hard to read and comprehend. It is much worse if the definition becomes longer, and much much worse if rustfmt attempts to format it.
  • It is easy to forget , and \n when the definition becomes longer.
  • mad! can only accept a named register as the first argument and a literal as the second argument. We cannot call mad!(x, rbx) or mad!([rax], rbp), which we could do with the assembler macro. Trying to fix this by changing ident and literal to tt is also problematic, since stringify!({x}) becomes "{ x }", which is an invalid placeholder.
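The last drawback can be observed with the standard library alone. The sketch below passes {x} as a single tt and stringifies it. raw_placeholder, compact, and placeholder_forms are hypothetical names, and the exact spacing that stringify! produces is not guaranteed by the language, so the code only relies on the tokens themselves.

```rust
/// Spelling that `stringify!` produces for a brace group passed as one `tt`.
macro_rules! raw_placeholder {
    ($x:tt) => {
        stringify!($x)
    };
}

/// Remove all whitespace, giving the form a template placeholder needs.
/// Hypothetical helper; this is not how asm_block builds its output.
fn compact(s: &str) -> String {
    s.split_whitespace().collect()
}

/// Return the raw spelling and its whitespace-free form.
fn placeholder_forms() -> (&'static str, String) {
    let raw = raw_placeholder!({x});
    (raw, compact(raw))
}
```

Stripping whitespace repairs the placeholder, but it is not a general fix: applied to a whole instruction it would also join operands such as dword ptr. That is why the spacing rules have to depend on the token.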

This crate tries to address this by providing a macro that makes it easier to compose assembly code.

use asm_block::asm_block;

macro_rules! mad {
    ($x: tt, $y: tt) => {
        asm_block! {
            mul $x, $y;
            lea $x, [$x + $y];
        }
    };
}

#[rustfmt::skip::macros(mad)]
unsafe fn f() -> u64 {
    let mut x = 20;
    asm!(
        mad!({x}, 5),
        
        x = inout(reg) x
    );
    x
}

Now we are able to make calls like mad!({x}, rbx), mad!([rax], rbp), and mad!({x:e}, [rsp - 4]). And this looks much cleaner.

Limitations

  • Because of how Rust tokenizes macro input, strings enclosed by ' are not supported.
  • asm_block! mostly consumes tokens one by one, so long assembly code can hit the recursion limit. Users need to add #![recursion_limit = "<a_larger_value>"] when they encounter this error.
  • rustfmt will format mad!({x}, 5) into mad!({ x }, 5). While this won't make any difference in the emitted assembly code, it is confusing to read when the user is expecting a format placeholder. Users can add #[rustfmt::skip::macros(mad)] to prevent rustfmt from formatting the interior of mad! calls.
  • Some assemblers use ; as the comment starter, but we use it as the instruction delimiter, so assembly comments may not work properly. Users are strongly advised to stick to Rust comments.
  • tt cannot capture multiple tokens, so to make mad!(dword ptr [rax], ebp) possible, the calling convention of mad! needs to be changed. For example:
    use asm_block::asm_block;
    
    macro_rules! mad {
        ([{ $($x: tt)+ }], $y: tt) => {
            asm_block! {
                mul $($x)+, $y;
                lea $($x)+, [$($x)+ + $y];
            }
        };
        ($x: tt, $y: tt) => { mad!([{ $x }], $y) };
    }
    But mad! must be called with mad!([{ dword ptr [rax] }], ebp) instead.
  • Currently we don't have an escape hatch to manually inject assembly if the macro is not able to emit the correct assembly code.
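Two of the limitations above come from how macro_rules! handles token trees: one-by-one consumption costs one level of recursion per token, and a tt fragment matches exactly one token or one delimited group. The sketch below shows both with a token-counting macro. count_tokens and token_counts are hypothetical names used only here, and this is an assumed illustration of the general pattern, not the crate's code.

```rust
/// Count token trees by peeling off one `tt` per recursive expansion.
/// Each token costs one level of macro recursion, so a long enough input
/// exceeds the default limit unless the crate root raises it with
/// `#![recursion_limit = "..."]`.
macro_rules! count_tokens {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tokens!($($tail)*) };
}

/// Token counts for two short instructions.
fn token_counts() -> (usize, usize) {
    (
        // mov, eax, `,`, ebx, `;`
        count_tokens!(mov eax, ebx;),
        // lea, eax, `,`, the whole `[eax + 4]` group, `;`
        count_tokens!(lea eax, [eax + 4];),
    )
}
```

Both calls yield 5. The bracketed operand [eax + 4] counts as a single tt, whereas dword ptr [rax] would be three, which is why the mad! example above wraps such operands in [{ ... }].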

License

Licensed under either of the Apache License, Version 2.0 or the MIT license, at your option.

Contributors

johnmave126
