rellume's Introduction

Rellume: Lift machine code to LLVM IR

Rellume is a lifter for x86-64/AArch64/RISC-V64 machine code to LLVM IR with a focus on the performance of the lifted code. The generated LLVM IR can be compiled and executed again, for example using LLVM's JIT compiler, ideally with the same (or even better) performance as the original code. Special care is taken to model SIMD instructions and pointers in a way that lets the optimizer generate efficient code. The lifter operates on a set of specified instructions (or decodes the control flow automatically) and creates an LLVM-IR function with the same semantics. These functions operate on a generic structure containing the virtual CPU state, but can be wrapped for an arbitrary calling convention.
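
The idea of a function that works on a CPU-state structure and is then wrapped for a normal calling convention can be sketched in plain C++. This is only an illustration under stated assumptions: `CpuState`, `lifted_add`, and `call_lifted` are hypothetical names, the field layout is invented for the example, and `lifted_add` is hand-written code standing in for what a lifter might emit for guest code that adds `rdi` and `rsi` into `rax`. It is not Rellume's API or its real state layout.

```cpp
#include <cstdint>

// Hypothetical virtual CPU state. The field set and layout are illustrative
// assumptions, not Rellume's real structure.
struct CpuState {
    std::uint64_t rip;
    std::uint64_t rax;
    std::uint64_t rdi;
    std::uint64_t rsi;
};

// Stand-in for a lifted function: it touches guest registers only through
// the state structure passed by pointer.
void lifted_add(CpuState* state) {
    state->rax = state->rdi + state->rsi;
}

// Wrapper exposing an ordinary two-argument calling convention. Arguments go
// into the guest registers that carry the first two integer arguments under
// the System V x86-64 ABI (rdi, rsi); the result is read back from rax.
std::uint64_t call_lifted(void (*lifted)(CpuState*),
                          std::uint64_t a, std::uint64_t b) {
    CpuState state{};
    state.rdi = a;
    state.rsi = b;
    lifted(&state);
    return state.rax;
}
```

With these definitions, `call_lifted(lifted_add, 2, 40)` runs the stand-in through the wrapper and returns the guest `rax` value. Once such a wrapper is inlined, an optimizer can in principle remove the intermediate structure entirely, which is why the state-structure design need not cost performance.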

Use Cases

  • Binary rewriting:
    • Binary Translation: translating machine code to a different architecture while making use of compiler optimizations. This is implemented in Instrew.
    • Performance improvement: specialization for runtime data, e.g. known parameters or memory locations. This is implemented in the LLVM back-end of BinOpt.
    • Instrumentation: insert tracing and interception code in hot code paths, where high quality machine code is required.
  • Binary analysis: existing tooling for analysis of LLVM IR code can be re-used for binary code.

Example

See examples/ for usage examples.

Publications

  • Alexis Engelke. Optimizing Performance Using Dynamic Code Generation. Dissertation. Technical University of Munich, Munich, 2021. (Thesis)
  • Alexis Engelke and Martin Schulz. Instrew: Leveraging LLVM for High Performance Dynamic Binary Instrumentation. VEE'20, March 2020. (Please cite this paper when referring to Rellume.)
  • Alexis Engelke and Josef Weidendorfer. Using LLVM for Optimized Light-Weight Binary Re-Writing at Runtime. In Proceedings of the 22nd int. Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS 2017). Orlando, US, 2017 (PDF of pre-print version)

License

LGPLv2.1+

rellume's People

Contributors

aengelke, okitec, timafrolov

rellume's Issues

How to use rellume?

How do I use this tool to lift binaries to LLVM IR?
I cannot find any documentation describing installation and usage.

x86?

Is there any plan to support x86 or ARM?

More advanced examples?

Hi! Do you have examples that go beyond a single function as byte code? I'm wondering where to start with an entire binary. Thanks!

Thoughts on Anvill?

Hello,

Maintainer of McSema, Remill, and Anvill here :-) Your project is very interesting, and we take a similar approach to yours in one of our research projects, Anvill. Anvill, like McSema, uses Remill for instruction semantics, so we can apply it to 32-bit x86 as well as AArch64 machine code. We've found that using an SROA-based approach, similar to what you do and inspired by the SATURN deobfuscator, generates very nice bitcode. What are your thoughts on Anvill, and are you interested in collaboration?

CreateInsertElement is ambiguous

Hello,

I get an error while compiling:

../src/a64/main.cc: In member function 'void rellume::aarch64::Lifter::SetScalar(farmdec::Reg, llvm::Value*)':
../src/a64/main.cc:1032:38: error: call of overloaded 'CreateInsertElement(llvm::Value*&, llvm::Value*&, long unsigned int)' is ambiguous
1032 | fullvec = irb.CreateInsertElement(fullvec, val, 0uL);
| ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
../src/x86-64/lifter-operand.cc: In member function 'void rellume::x86_64::Lifter::OpStoreVec(rellume::Instr::Op, llvm::Value*, bool, rellume::x86_64::Alignment)':
../src/x86-64/lifter-operand.cc:276:39: error: call of overloaded 'CreateInsertElement(llvm::Value*&, llvm::Value*&, long unsigned int)' is ambiguous
276 | full = irb.CreateInsertElement(full, value, 0ul);
| ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
../src/x86-64/lifter-sse.cc: In member function 'void rellume::x86_64::Lifter::LiftSseMovScalar(const rellume::Instr&, rellume::Facet)':
../src/x86-64/lifter-sse.cc:126:52: error: call of overloaded 'CreateInsertElement(llvm::Value*&, llvm::Value*&, long unsigned int)' is ambiguous
126 | llvm::Value* zext = irb.CreateInsertElement(zero, src, 0ul);

I'm using LLVM version 15.0.7. What could be the problem?

note: candidate: 'llvm::Value* llvm::IRBuilderBase::CreateInsertElement(llvm::Value*, llvm::Value*, llvm::Value*, const llvm::Twine&)'
2298 | Value *CreateInsertElement(Value *Vec, Value *NewElt, Value *Idx,

Thanks!
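
One plausible explanation for the error above, offered as an assumption rather than a confirmed diagnosis: the builder also has an overload taking the index as a `uint64_t`, and on the reporter's platform `uint64_t` is not `unsigned long`. The literal `0ul` then needs an integral conversion to reach the integer overload, and because it is a zero-valued integer literal it is also a null pointer constant that converts to `llvm::Value*`. Both conversions have the same rank, so the call is ambiguous. The sketch below reproduces the overload shape with stand-in types (`Value`, `insert_element`, and `insert_at_zero` are hypothetical and not LLVM code) and shows a call that picks the integer overload regardless of platform.

```cpp
#include <cstdint>
#include <string>

struct Value {};  // stand-in for llvm::Value, not the real class

// Two overloads mirroring the assumed shape of the candidates: one takes the
// index as a pointer, the other as a 64-bit unsigned integer.
std::string insert_element(Value*, Value*, Value* /*idx*/) {
    return "pointer index";
}
std::string insert_element(Value*, Value*, std::uint64_t /*idx*/) {
    return "integer index";
}

std::string insert_at_zero(Value* vec, Value* elt) {
    // insert_element(vec, elt, 0ul);
    //   ^ ambiguous where std::uint64_t is unsigned long long: the literal
    //     converts equally well to Value* (null pointer) and to uint64_t.
    // Passing a value that already has type std::uint64_t is an exact match
    // for the integer overload, so overload resolution has a unique winner.
    return insert_element(vec, elt, std::uint64_t{0});
}
```

Applied to the reported lines, the analogous change would be to pass the index as `uint64_t{0}` instead of `0ul`; whether that matches the fix adopted upstream is not stated in this thread.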

Test error of example codes

Hi,

I have cloned your code and built it. However, the LLVM IR generated by the example program cannot be compiled again.

The generated LLVM IR is as follows:

; Function Attrs: null_pointer_is_valid
define void @0(ptr noalias nocapture align 16 dereferenceable(400) %0) #0 {
  %2 = getelementptr i8, ptr %0, i64 0
  %3 = getelementptr i8, ptr %0, i64 8
  %4 = getelementptr i8, ptr %0, i64 16
  %5 = getelementptr i8, ptr %0, i64 24
  %6 = getelementptr i8, ptr %0, i64 32
  %7 = getelementptr i8, ptr %0, i64 40
  %8 = getelementptr i8, ptr %0, i64 48
  %9 = getelementptr i8, ptr %0, i64 56
  %10 = getelementptr i8, ptr %0, i64 64
  %11 = getelementptr i8, ptr %0, i64 72
  %12 = getelementptr i8, ptr %0, i64 80
  %13 = getelementptr i8, ptr %0, i64 88
  %14 = getelementptr i8, ptr %0, i64 96
  %15 = getelementptr i8, ptr %0, i64 104
  %16 = getelementptr i8, ptr %0, i64 112
  %17 = getelementptr i8, ptr %0, i64 120
  %18 = getelementptr i8, ptr %0, i64 128
  %19 = getelementptr i8, ptr %0, i64 136
  %20 = getelementptr i8, ptr %0, i64 137
  %21 = getelementptr i8, ptr %0, i64 138
  %22 = getelementptr i8, ptr %0, i64 139
  %23 = getelementptr i8, ptr %0, i64 140
  %24 = getelementptr i8, ptr %0, i64 141
  %25 = getelementptr i8, ptr %0, i64 142
  %26 = getelementptr i8, ptr %0, i64 144
  %27 = getelementptr i8, ptr %0, i64 152
  %28 = getelementptr i8, ptr %0, i64 160
  %29 = getelementptr i8, ptr %0, i64 176
  %30 = getelementptr i8, ptr %0, i64 192
  %31 = getelementptr i8, ptr %0, i64 208
  %32 = getelementptr i8, ptr %0, i64 224
  %33 = getelementptr i8, ptr %0, i64 240
  %34 = getelementptr i8, ptr %0, i64 256
  %35 = getelementptr i8, ptr %0, i64 272
  %36 = getelementptr i8, ptr %0, i64 288
  %37 = getelementptr i8, ptr %0, i64 304
  %38 = getelementptr i8, ptr %0, i64 320
  %39 = getelementptr i8, ptr %0, i64 336
  %40 = getelementptr i8, ptr %0, i64 352
  %41 = getelementptr i8, ptr %0, i64 368
  %42 = getelementptr i8, ptr %0, i64 384
  %43 = getelementptr i8, ptr %0, i64 400
  %44 = load i64, ptr %0, align 4
  %45 = load i64, ptr %10, align 4
  %46 = load i64, ptr %9, align 4
  %47 = load i64, ptr %7, align 4
  br label %48

48:                                               ; preds = %1
  %49 = sub i64 %45, %46
  %50 = icmp slt i64 %49, 0
  %51 = icmp eq i64 %45, %46
  %52 = icmp slt i64 %45, %46
  %53 = icmp ne i1 %50, %52
  %54 = icmp ne i1 %50, %53
  %55 = xor i1 %54, true
  br i1 %55, label %60, label %59

56:                                               ; preds = %60
  %57 = phi i64 [ %65, %60 ]
  store i64 %57, ptr %0, align 4
  store i64 %61, ptr %3, align 4
  %58 = ptrtoint ptr %64 to i64
  store i64 %58, ptr %7, align 4
  ret void

59:                                               ; preds = %48
  br label %60

60:                                               ; preds = %59, %48
  %61 = phi i64 [ %45, %48 ], [ %46, %59 ]
  %62 = phi i64 [ %47, %48 ], [ %47, %59 ]
  %63 = inttoptr i64 %62 to ptr
  %64 = getelementptr i64, ptr %63, i64 1
  %65 = load i64, ptr %63, align 4
  br label %56
}

Clang reports this error:

(.text+0x17): undefined reference to `main'

It's obvious that something is wrong with your generation process regarding the function name. BTW, the phi node %57 also looks incorrect with respect to its expected incoming values. Will you fix these bugs in the future?
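
For readers following this report, two general LLVM facts may help. The message `undefined reference to 'main'` comes from the linker: the module defines only the unnamed function `@0`, so it can be compiled to an object file but cannot be linked into an executable without a separate driver that provides `main`. A phi node with a single incoming value is valid IR when its block has a single predecessor, as block 56 does here. What the function computes can be modelled by hand in C++. In the sketch below the byte offsets are read from the `getelementptr` lines above, while the register names attached to them are an assumption based on the usual x86-64 register order; `lifted_model` and `run_model` are hypothetical helpers, not generated code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Offsets from the IR; the register names are assumed, the IR shows offsets only.
constexpr std::size_t kRip = 0, kRax = 8, kRsp = 40, kRsi = 56, kRdi = 64;

static std::uint64_t load64(const unsigned char* p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
static void store64(unsigned char* p, std::uint64_t v) {
    std::memcpy(p, &v, sizeof v);
}

// Hand-written equivalent of the lifted function @0.
void lifted_model(unsigned char* state) {
    std::uint64_t a = load64(state + kRdi);   // %45
    std::uint64_t b = load64(state + kRsi);   // %46
    std::uint64_t sp = load64(state + kRsp);  // %47
    // %55 is true when a >= b as signed values; block 59 runs otherwise.
    bool ge = static_cast<std::int64_t>(a) >= static_cast<std::int64_t>(b);
    std::uint64_t result = ge ? a : b;        // %61
    // Return sequence: read the return address from the guest stack (%65),
    // store it as the new instruction pointer, and advance the stack pointer.
    auto* slot = reinterpret_cast<const unsigned char*>(
        static_cast<std::uintptr_t>(sp));
    store64(state + kRip, load64(slot));
    store64(state + kRax, result);
    store64(state + kRsp, sp + 8);
}

// Driver: builds a 400-byte state (dereferenceable(400) in the IR) and a
// one-slot fake guest stack holding a dummy return address.
std::int64_t run_model(std::int64_t a, std::int64_t b) {
    unsigned char state[400] = {};
    std::uint64_t stack[1] = {0x1234};
    store64(state + kRdi, static_cast<std::uint64_t>(a));
    store64(state + kRsi, static_cast<std::uint64_t>(b));
    store64(state + kRsp, reinterpret_cast<std::uintptr_t>(stack));
    lifted_model(state);
    return static_cast<std::int64_t>(load64(state + kRax));
}
```

Under these assumptions the lifted code is a signed maximum of two arguments followed by a return, so `run_model(3, 7)` yields 7. A real driver would call the compiled `@0` with a pointer to such a state buffer instead of `lifted_model`.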

Small fixes

The project fails to build on my system for two reasons:

  1. I have llvm 14.0.6 and the current meson build config refuses to work with llvm <14. If rellume is strongly bound to llvm version < 14, I suggest adding this information to README.md; otherwise, fix the llvm_version variable in the topmost meson.build.
  2. The Meson build system refuses to compile the project with the following error:
    examples/meson.build:1:0: ERROR: No host machine compiler for 'examples/simple-x86-64.c'.
    Quick googling led me to the conclusion that I should change the first line to
    project('rellume', ['cpp', 'c'], meson_version: '>=0.49',
    The attached file contains the proposed fixes (the .txt extension was added due to upload restrictions)
    meson.build.txt
