
rellic's People

Contributors

2over12, alessandrogario, artemdinaburg, cypok, ekilmer, frabert, konchunas, meme, ninja3047, oldsj, pgoodman, rustomas, surovic, tetsuo-cpp, xlauko

rellic's Issues

IRToASTVisitor.cpp: Check failed: array->isString() ConstantArray is not a string. SIGABRT (Abort)

I tried to compile SQLite with Clang to LLVM IR and then decompile it using Rellic. I expect there to be more issues than one before this is possible.

This is the first issue encountered.

u@x1 ~/D/rellic_play> ./rellic-build/rellic-decomp-4.0 --input s.ll --output=s.c
F0203 08:57:37.886811 20497 IRToASTVisitor.cpp:116] Check failed: array->isString() ConstantArray is not a string
*** Check failure stack trace: ***
    @           0xa441e8  google::LogMessage::Flush()
    @           0xa4766c  google::LogMessageFatal::~LogMessageFatal()
    @           0x697936  rellic::(anonymous namespace)::CreateLiteralExpr()
fish: “./rellic-build/rellic-decomp-4.…” terminated by signal SIGABRT (Abort)

To reproduce:

wget https://www.sqlite.org/2018/sqlite-amalgamation-3250200.zip
unzip sqlite-amalgamation-3250200.zip
~/Desktop/rellic_play/rellic-build/libraries/llvm/bin/clang -S -emit-llvm -o sqlite.ll sqlite-amalgamation-3250200/sqlite3.c
~/Desktop/rellic_play/rellic-build/libraries/llvm/bin/clang -S -emit-llvm -o shell.ll sqlite-amalgamation-3250200/shell.c
~/Desktop/rellic_play/rellic-build/libraries/llvm/bin/llvm-link -o s.ll shell.ll sqlite.ll
./rellic-build/rellic-decomp-4.0 --input s.ll --output=s.c

I've attached s.ll as produced with the above steps.

Cheers,
Robin

Build fails

Hi,

Following the instructions in README, I'm getting this error when running the build.sh script:

[ 72%] Building CXX object CMakeFiles/rellic-decomp-4.0.dir/rellic/AST/Util.cpp.o
/var/tmp/xchalup4/rellic-build/libraries/llvm/bin/clang++  -DGFLAGS_IS_A_DLL=0 -DGOOGLE_GLOG_DLL_DECL="" -DNDEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -isystem /var/tmp/xchalup4/rellic-build -isystem /var/tmp/xchalup4/rellic -isystem /var/tmp/xchalup4/rellic-build/libraries/llvm/include -isystem /var/tmp/xchalup4/rellic-build/libraries/z3/include -isystem /var/tmp/xchalup4/rellic-build/libraries/glog/include -isystem /var/tmp/xchalup4/rellic-build/libraries/gflags/include  -O2 -g -DNDEBUG   -Wall -Wextra -Wno-unused-parameter -Wno-c++98-compat -Wno-unreachable-code-return -Wno-nested-anon-types -Wno-extended-offsetof -Wno-variadic-macros -Wno-return-type-c-linkage -Wno-c99-extensions -Wno-ignored-attributes -Wno-unused-local-typedef -Wno-unknown-pragmas -Wno-unknown-warning-option -fPIC -fno-omit-frame-pointer -fvisibility-inlines-hidden -fno-exceptions -fno-asynchronous-unwind-tables -Wgnu-alignof-expression -Wno-gnu-anonymous-struct -Wno-gnu-designator -Wno-gnu-zero-variadic-macro-arguments -Wno-gnu-statement-expression -gdwarf-2 -g3 -O3 -Werror -pedantic -fopenmp=libomp -std=c++11 -o CMakeFiles/rellic-decomp-4.0.dir/rellic/AST/Util.cpp.o -c /var/tmp/xchalup4/rellic/rellic/AST/Util.cpp
/var/tmp/xchalup4/rellic/rellic/AST/IRToASTVisitor.cpp:335:31: error: no member named 'indices' in 'llvm::GetElementPtrInst'
    for (auto &gep_idx : inst.indices()) {
                         ~~~~ ^
1 error generated.
make[2]: *** [CMakeFiles/rellic-decomp-4.0.dir/build.make:157: CMakeFiles/rellic-decomp-4.0.dir/rellic/AST/IRToASTVisitor.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/var/tmp/xchalup4/rellic/rellic/AST/ExprCombine.cpp:109:17: error: no matching function for call to 'ignoringParens'
            has(ignoringParens(binaryOperator(stmt().bind("binop")))))) {}
                ^~~~~~~~~~~~~~
/var/tmp/xchalup4/rellic-build/libraries/llvm/include/clang/ASTMatchers/ASTMatchers.h:728:25: note: candidate function not viable: no known conversion from
      'clang::ast_matchers::internal::BindableMatcher<clang::Stmt>' to 'const internal::Matcher<QualType>' for 1st argument
AST_MATCHER_P(QualType, ignoringParens,
                        ^
/var/tmp/xchalup4/rellic-build/libraries/llvm/include/clang/ASTMatchers/ASTMatchersMacros.h:130:32: note: expanded from macro 'AST_MATCHER_P'
  AST_MATCHER_P_OVERLOAD(Type, DefineMatcher, ParamType, Param, 0)
                               ^
/var/tmp/xchalup4/rellic-build/libraries/llvm/include/clang/ASTMatchers/ASTMatchersMacros.h:150:57: note: expanded from macro 'AST_MATCHER_P_OVERLOAD'
  inline ::clang::ast_matchers::internal::Matcher<Type> DefineMatcher(         \
                                                        ^
1 error generated.
make[2]: *** [CMakeFiles/rellic-decomp-4.0.dir/build.make:131: CMakeFiles/rellic-decomp-4.0.dir/rellic/AST/ExprCombine.cpp.o] Error 1

Handle switch in CreateEdgeCond

In https://github.com/lifting-bits/rellic/blob/master/rellic/AST/GenerateAST.cpp#L135, SwitchInst is not supported.

Compile the following with remill-clang-4.0 -emit-llvm -O3 -c -o example.bc and decompile:

#include <stdint.h>

uint32_t target(uint32_t n) {
  uint32_t mod = n % 4;
  uint32_t result = 0;

  if (mod == 0) {
    result = (n | 0xbaaad0bf) * (2 ^ n);
  } else if (mod == 1) {
    result = (n & 0xbaaad0bf) * (3 + n);
  } else if (mod == 2) {
    result = (n ^ 0xbaaad0bf) * (4 | n);
  } else {
    result = (n + 0xbaaad0bf) * (5 & n);
  }

  return result;
}

You will see something similar to (instruction print was added):

F1110 21:21:03.700402 59636 GenerateAST.cpp:159] Unknown terminator instruction: switch
*** Check failure stack trace: ***
    @          0x1b4733d  google::LogMessage::Fail()
    @          0x1b49834  google::LogMessage::SendToLog()
    @          0x1b46dbb  google::LogMessage::Flush()
    @          0x1b4a459  google::LogMessageFatal::~LogMessageFatal()
    @           0x7c039d  rellic::GenerateAST::CreateEdgeCond()
SIGABRT (Abort)

Would be willing to work on this, with some guidance.
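
As a starting point for discussion, here is a minimal sketch of what the edge conditions of a switch terminator mean. This is not rellic code: case_edge_cond, default_edge_cond and their parameters are hypothetical names used only for illustration. The idea is that the edge to a case block is taken when the switch operand equals that case value, and the default edge is taken when no case value matches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: condition for the edge to case `idx`.
 * The edge is taken when the operand equals that case value. */
bool case_edge_cond(uint32_t operand, const uint32_t *case_vals,
                    size_t num_cases, size_t idx) {
  assert(case_vals != NULL && idx < num_cases);
  return operand == case_vals[idx];
}

/* Hypothetical helper: condition for the default edge.
 * The edge is taken when the operand matches none of the case values. */
bool default_edge_cond(uint32_t operand, const uint32_t *case_vals,
                       size_t num_cases) {
  assert(case_vals != NULL || num_cases == 0);
  for (size_t i = 0; i < num_cases; ++i) {
    if (operand == case_vals[i]) {
      return false;
    }
  }
  return true;
}
```

For the example above, assuming the optimizer turns the if/else chain on n % 4 into a switch with cases 0, 1 and 2, the final else branch would correspond to the default edge.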

Over-eager optimizations

Some of our optimizations are too eager and remove some needed math.

For example, something like:

  uint8_t brake_switch = (buf[4] & 0b00001100) >> 2;
...
  if (brake_switch){
...

Will decompile to:

if (((arg0[4U])) != '\x00') {

This is missing some critical shifts and bitops in the conditional.
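
To make the difference concrete, here is a small sketch comparing the two conditions. The function names are made up for illustration, and the mask is written as 0x0C because binary literals are not standard before C23.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original source: extract bits 2..3 of byte 4, then test them. */
int brake_switch_original(const uint8_t *buf) {
  assert(buf != NULL);
  uint8_t brake_switch = (buf[4] & 0x0C) >> 2; /* 0x0C == 0b00001100 */
  return brake_switch != 0;
}

/* Shape of the reported decompiled condition: mask and shift are gone. */
int brake_switch_decompiled(const uint8_t *buf) {
  assert(buf != NULL);
  return buf[4] != '\x00';
}
```

With buf[4] == 0x01 the original condition is false, because neither masked bit is set, while the decompiled condition is true.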

Output improvements: large integers as hex

This is an aesthetic output improvement. When we see large integers, let's output them as hex.

Examples:

      if ((*(unsigned char *)6295756UL) != (*(unsigned char *)6295754UL)) {
        *(unsigned char *)6295810UL = '\x01';

should all be hex.
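
A rough sketch of the formatting rule, assuming a hypothetical cutoff above which values are printed as hex. HEX_THRESHOLD and format_ulong_literal are illustrative names, not part of rellic, and the right cutoff is open for discussion.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical cutoff: values above this are printed in hex. */
#define HEX_THRESHOLD UINT64_C(0xFFFF)

/* Writes an unsigned long literal for `value` into `out`.
 * Returns the number of characters snprintf would have written. */
int format_ulong_literal(uint64_t value, char *out, size_t out_size) {
  assert(out != NULL && out_size > 0);
  if (value > HEX_THRESHOLD) {
    return snprintf(out, out_size, "0x%" PRIx64 "UL", value);
  }
  return snprintf(out, out_size, "%" PRIu64 "UL", value);
}
```

With this rule, 6295756UL from the example would be printed as 0x6010ccUL, which is easier to recognize as an address.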

Z3ConvVisitor.cpp:157 Check failed: iter != c_decl_map.end()

Tried a simple function that gives me an odd error on Z3ConvVisitor; invocation and bitcode below:

$ ./rellic-decomp-8.0 --input /store/artem/git/test/x86.bc --output /dev/stdout

F1121 12:21:17.359514 25086 Z3ConvVisitor.cpp:157] Check failed: iter != c_decl_map.end()
*** Check failure stack trace: ***
    @           0xcc82dd  google::LogMessage::Fail()
    @           0xcca76a  google::LogMessage::SendToLog()
    @           0xcc7ced  google::LogMessage::Flush()
    @           0xccb6f9  google::LogMessageFatal::~LogMessageFatal()
    @           0x800fb1  rellic::Z3ConvVisitor::GetCValDecl()

x86.bc.gz

Aborted at 1611523302 (unix time)

after running ./rellic-build/tools/rellic-decomp-9.0 --input foo.exe.bc --output foo.c

(note that the path to the decompiler is different than mentioned in README)

i get:

error: unknown target triple '', please use -triple or -arch
*** Aborted at 1611523302 (unix time) try "date -d @1611523302" if you are using GNU date ***
PC: @                0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 21849 (TID 0x7fd16427fec0) from PID 0; stack trace: ***
    @     0x7fd1646893c0 (/home/rofl0r/ub/root/usr/lib/x86_64-linux-gnu/libpthread-2.31.so+0x153bf)
    @          0x1bcc80c (/home/rofl0r/ub/root/root/rellic-build/tools/rellic-decomp-9.0+0x1bcc80b)
    @           0x825922 (/home/rofl0r/ub/root/root/rellic-build/tools/rellic-decomp-9.0+0x825921)
Segmentation fault

ftr, foo.exe.bc was created by retdec-4.0 from a 2.8 meg win32 binary (retdec subsequently spent 5 days running on a single core in retdec-llvmir2hll which i subsequently killed, as i figured rellic might be able to do the same, and it pretty much looked like retdec was hung in an infinite loop as memory usage was constant at 32GB RAM for the whole time)

Type inference and correspondence between representations

While doing type translation and type casting of C expressions, I ran into a lot of trouble with the different semantics of operations between Z3, LLVM IR and C. For example, C allows numeric constants to have the types int, long and long long in their signed and unsigned versions. LLVM IR routinely contains numeric constants of i1 and i8 types, which would naturally map to C types like char. Another example is the conflation of C pointers and integers into bitvector sorts when Z3 is involved. It's impossible to tell whether a 64-bit-wide Z3_BV_SORT is a char* or a long long.

The issue becomes even more complex when typing of expressions is involved. LLVM IR has every instruction (a value) explicitly typed, and this type can differ from what the result type of an equivalent C expression would be.

My proposal would be to only directly translate variable and constant types between representations (Z3, IR, C). Expression types would be inferred using the type semantics of the given representation, without referring to the expression types of any other representation. However, the result types of expressions should correspond between the representations.

For example, if the IR instruction and i8 %a, 1, where %a is an i32, yields an i8, the equivalent C expression must yield an i8-equivalent type, namely unsigned char. So the equivalent C expression would be (unsigned char)(a & 1U).

The correspondence check can be implemented using glog CHECK() macros.
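
A minimal sketch of such a correspondence check in plain C, using assert() as a stand-in for CHECK(). C_EXPR_BITS and and_i8 are hypothetical names, and the sketch assumes CHAR_BIT is 8.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: width in bits of a C expression's type. */
#define C_EXPR_BITS(expr) (sizeof(expr) * CHAR_BIT)

/* Models an IR value of type i8 computed from an i32 input. */
unsigned char and_i8(uint32_t a) {
  /* Without the cast, (a & 1U) is at least as wide as unsigned int. */
  assert(C_EXPR_BITS(a & 1U) >= 32);
  /* With the cast, the C result width matches the IR result type i8. */
  assert(C_EXPR_BITS((unsigned char)(a & 1U)) == 8);
  return (unsigned char)(a & 1U);
}
```

The point is that the check compares only result widths between representations, while each side derives its expression type by its own rules.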

Translate C type casts to Z3 uninterpreted functions

This is especially important when re-generating C pointers from Z3 expressions. Currently the C type information is lost when translated to Z3 formulae. An example solution would be:

(unsigned char *)64 -> (IntegralToPointer |unsigned char *| 64) -> (unsigned char *)64

The sort of the Z3 expression would remain (_ BitVec 64) as it is currently. Uninterpreted Z3 sorts cannot be used, since we would lose Z3 bitvector semantics and I currently don't know how to add interpretation to uninterpreted Z3 sorts.

The above solution would also require building clang::CastKind -> z3::func_decl and clang::Type -> z3::func_decl mappings. Possibly clang::BuiltinType instead of general clang::Type.

Adding interpretations to Z3 casting functions like IntegralToPointer could also be possible. The default being simply returning the second argument, others could model actual C semantics (bitwise extensions, truncations, etc).

Z3 functions derived from types, such as |unsigned char *| could have an interpretation that would return their bitwidth to aid with casting semantics.

Install target for Makefile

Having an install target for rellic will help when packaging rellic to include the necessary binaries and library artifacts, so that they can be distributed in a more lightweight package.

With PR #48 , users have the ability to use Docker to build, test, and run rellic. However, the Dockerfile in that PR produces an image that is fairly large, at around 2GB, with the libraries directory (inside the build directory) accounting for 1.4GB of that total.

It would be nice to use the install target to produce smaller Docker images that only include the necessary build artifacts required to develop with/run rellic.

Unknown LLVM type in large x64 bit game lifted from McSema

Please contact me if you want the actual bitcode (it is 300mb of llvm bitcode generated by McSema!)

Output from program------

F0618 14:25:43.245985 34890 IRToASTVisitor.cpp:84] Unknown LLVM Type
*** Check failure stack trace: ***
    @          0x1b7eeed  google::LogMessage::Fail()
    @          0x1b813e4  google::LogMessage::SendToLog()
    @          0x1b7e96b  google::LogMessage::Flush()
    @          0x1b82009  google::LogMessageFatal::~LogMessageFatal()
    @           0x7e50b6  rellic::(anonymous namespace)::GetQualType()

GenerateAST.cpp:159 Unknown terminator instruction

I was trying to see what would happen if I decompiled some bitcode generated by Rust. I get:

F1121 05:25:09.980633  5741 GenerateAST.cpp:159] Unknown terminator instruction
*** Check failure stack trace: ***
    @           0xcc82dd  google::LogMessage::Fail()
    @           0xcca76a  google::LogMessage::SendToLog()
    @           0xcc7ced  google::LogMessage::Flush()
    @           0xccb6f9  google::LogMessageFatal::~LogMessageFatal()
    @           0x781a73  rellic::GenerateAST::CreateEdgeCond()

Bitcode file is attached:
foo.bc.gz

Invalid operand in `IRToASTVisitor.cpp:264`

F20201123 21:49:35.359668   145 IRToASTVisitor.cpp:264] Invalid operand
*** Check failure stack trace: ***
    @          0x1b653fc  google::LogMessageFatal::~LogMessageFatal()
    @           0x7c66d4  rellic::IRToASTVisitor::GetOperandExpr()

Spec is attached.

Version output:

rellic-decomp-9.0 --version
rellic-decomp-9.0 version unknown
Commit Hash: 7bacf0bb2dceff6cf6ca4fbafc4d76eae996bb35
Commit Date: 2020-11-04 19:46:27 -0500
Last commit by: Peter Goodman [[email protected]]
Commit Subject: [Merge pull request #76 from lifting-bits/fix_compat_llvm10_llvm11]

Uncommitted changes were present during build.
Using LLVM 9.0.0

bfce8c4a-2dd5-11eb-92b6-0242ac110002_rx_message_routine_gxk6ovju.spec.gz

LLVM 9.0 support

Hi, I'm trying to decompile some rust generated bitcode to see the output's equivalent c structure. I think if rellic supported llvm 9.0, it would be possible.

I'm using noodle to test this. It's a dependency free serialization/deserialization library that is quick to compile.

cargo rustc --release -- --emit=llvm-bc will produce bitcode in ./target/release/deps/

The error in rellic is:

docker run --rm -t -i -v $(pwd):/test -w /test -u $(id -u):$(id -g) rellic-decomp-80 --input /test/target/release/deps/noodle-2a10fa4336dab943.bc --output /dev/stdout
F1128 18:38:31.484282     7 Util.cpp:89] Unable to parse module file /test/target/release/deps/noodle-2a10fa4336dab943.bc: Unknown attribute kind (60) (Producer: 'LLVM9.0.0-rust-1.39.0-stable' Reader: 'LLVM 8.0.0')
*** Check failure stack trace: ***
    @          0x1c7651d  google::LogMessage::Fail()
    @          0x1c789aa  google::LogMessage::SendToLog()
    @          0x1c75f2d  google::LogMessage::Flush()
    @          0x1c79939  google::LogMessageFatal::~LogMessageFatal()
    @           0x843d13  rellic::LoadModuleFromFile()
Aborted (core dumped)

key takeaway here: (Producer: 'LLVM9.0.0-rust-1.39.0-stable' Reader: 'LLVM 8.0.0')

obviously I'm using rellic built for llvm 8.0, and didn't really expect it to work.

if build.sh is run twice, it errors out

first it failed because ninja wasn't installed, so i called the build.sh again, resulting in:

CMake Error at cmake/vcpkg_helper.cmake:17 (message):
  Please define a path to VCPKG_ROOT.  See
  https://github.com/trailofbits/cxx-common for more details.  Or if you
  don't want to use vcpkg dependencies, add '-DUSE_SYSTEM_DEPENDENCIES=ON'
Call Stack (most recent call first):
  CMakeLists.txt:22 (include)

i then rm -rf'd the rellic-build dir, downloaded the 500 MB of stuff again, just to get

  Could NOT find Git (missing: GIT_EXECUTABLE)

(i assumed if i cloned the repo before entering the ubuntu rootfs i wouldnt need it)

running build.sh again results in the same error as above.
i guess i have to wait another hour to download the 500MB again and see what comes before the next error.

Opt-in Phi Node elimination phase

Currently rellic will (correctly! -- since they cannot be represented in C) bail on encountering LLVM Phi nodes.

We do not transform the code automatically because we do not want to make changes to the bitcode the user gave us. An error message is produced letting them know to run reg2mem, which will remove Phi nodes.

This is a reasonable solution for interactive use, but for massive testing, we should include a way for rellic to opt-in to phi node elimination.

Suggestion to have something like --remove-phi-nodes that will run a simplified Phi node eliminator prior to converting to C. (reg2mem does a few other things as well; we are only after Phi removal).
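
For readers unfamiliar with the transformation, here is a C-level sketch of the idea. In SSA form the loop-carried values below would be phi nodes. After elimination each one becomes a memory slot that is written on every incoming edge and read where the phi used to be. The function and variable names are illustrative only and do not come from rellic or LLVM.

```c
#include <assert.h>

/* Sums the integers in [0, n). The two *_slot variables stand in for
 * the stack slots that replace the phi nodes for `i` and `sum`. */
int sum_below(int n) {
  assert(n >= 0);
  int i_slot = 0;   /* store on the loop entry edge */
  int sum_slot = 0; /* store on the loop entry edge */
  while (i_slot < n) {     /* load where the phi for i used to be */
    sum_slot += i_slot;    /* store on the back edge */
    i_slot += 1;           /* store on the back edge */
  }
  return sum_slot;         /* load where the phi for sum used to be */
}
```

Since plain loads and stores map directly onto C variables, this form can be emitted without any phi handling in the decompiler itself.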

Implement VisitFunctionDecl for simplifying expressions with function calls

During simplification of expressions with function calls in the expression, the Z3 conversion visitor will fail to convert function calls properly as per: https://github.com/lifting-bits/rellic/blob/master/rellic/AST/Z3ConvVisitor.cpp#L327

I am not entirely sure as to how this should be implemented, I was thinking something along the lines of emitting an opaque Z3 interpreted function through (how do I get the parameters in there?) and then "interpreting" it as a function call right through. Does this sound like the optimal way? Open to suggestions, would be willing to implement.

rellic::CondBasedRefine::VisitCompoundStmt: SIGSEGV (Address boundary error)

I was really excited to see the release of Rellic and wanted to take it out for a spin. Minimal programs are successfully decompiled (e.g. the LLVM IR produced by Clang 4.0 for the input C source int main() {return 42;}).

However, I ran into an issue that causes an out of bounds crash.

u@x1 ~/D/r/rellic-build> ./rellic-decomp-4.0 --input input.ll --output output.c
*** Aborted at 1548759746 (unix time) try "date -d @1548759746" if you are using GNU date ***
PC: @                0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 2499 (TID 0x7f56e8afcf40) from PID 0; stack trace: ***
    @     0x7f56e8f463c0 (unknown)
    @           0x6662d3 rellic::CondBasedRefine::VisitCompoundStmt()
fish: “./rellic-decomp-4.0 --input inp…” terminated by signal SIGSEGV (Address boundary error)

Based on the following LLVM IR input:

; ModuleID = 'original.c'
source_filename = "original.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: noinline nounwind uwtable
define i32 @foo(i32, i32) #0 {
  %3 = alloca i32, align 4
  %4 = alloca i32, align 4
  %5 = alloca i32, align 4
  %6 = alloca i32, align 4
  store i32 %0, i32* %3, align 4
  store i32 %1, i32* %4, align 4
  store i32 0, i32* %5, align 4
  store i32 0, i32* %6, align 4
  br label %7

; <label>:7:                                      ; preds = %17, %2
  %8 = load i32, i32* %6, align 4
  %9 = icmp ne i32 %8, 42
  br i1 %9, label %10, label %20

; <label>:10:                                     ; preds = %7
  %11 = load i32, i32* %3, align 4
  %12 = load i32, i32* %5, align 4
  %13 = add i32 %12, %11
  store i32 %13, i32* %5, align 4
  %14 = load i32, i32* %4, align 4
  %15 = load i32, i32* %5, align 4
  %16 = urem i32 %15, %14
  store i32 %16, i32* %5, align 4
  br label %17

; <label>:17:                                     ; preds = %10
  %18 = load i32, i32* %6, align 4
  %19 = add i32 %18, 1
  store i32 %19, i32* %6, align 4
  br label %7

; <label>:20:                                     ; preds = %7
  %21 = load i32, i32* %5, align 4
  ret i32 %21
}

; Function Attrs: noinline nounwind uwtable
define i32 @main() #0 {
  %1 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  %2 = call i32 @foo(i32 10, i32 20)
  ret i32 %2
}

attributes #0 = { noinline nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = !{!"clang version 4.0.1 (tags/RELEASE_401/final)"}

Which is produced with Clang 4.0 (the bundled version) from this C source:

unsigned int foo(unsigned int a, unsigned int b) {
	unsigned int sum = 0;
	for (unsigned int i = 0; i != 42; i++) {
		sum += a;
		sum %= b;
	}
	return sum;
}

int main() {
	return foo(10, 20);
}

The condition inside if statement could not be a function

When I try to decompile the bc file generated from the following code

#include <stdio.h>
#include <string.h>


int main () {
   char str1[20];
   char str2[20];
   int result;

   //Assigning the value to the string str1
   strcpy(str1, "hello");

   //Assigning the value to the string str2
   strcpy(str2, "helLO WORLD");

   //This will compare the first 3 characters
   if(strncmp(str1, str2, 3) > 0) {
      printf("ASCII value of first unmatched character of str1 is greater than str2");
   } else if(result < 0) {
      printf("ASCII value of first unmatched character of str1 is less than str2");
   } else {
      printf("Both the strings str1 and str2 are equal");
   }

   return 0;
}

with

$ clang -emit-llvm -c xxx -o xxx.bc
$ ./rellic-build/tools/rellic-decomp-10.0 --input xxx.bc --output xxx_generated.c

I received the following error:

F0215 16:30:39.637413 27356 Z3ConvVisitor.cpp:324] Unimplemented FunctionDecl visitor11
*** Check failure stack trace: ***
    @     0x556a801557ac  google::LogMessageFatal::~LogMessageFatal()
    @     0x556a7ed39368  rellic::Z3ConvVisitor::VisitFunctionDecl()
Aborted (core dumped)

I added more logging to bool Z3ConvVisitor::VisitFunctionDecl(clang::FunctionDecl *func) in rellic/AST/Z3ConvVisitor.cpp, as follows:

bool Z3ConvVisitor::VisitFunctionDecl(clang::FunctionDecl *func) {
  //https://clang.llvm.org/doxygen/structclang_1_1DeclarationNameInfo.html
  LOG(INFO) << "Name of function is : "<<func->getNameInfo().getName().getAsString();
  LOG(INFO) << "Declare Info is " << func->getNameInfo().getAsString();
  DLOG(INFO) << "VisitFunctionDecl";
  LOG(FATAL) << "Unimplemented FunctionDecl visitor11";
  return true;
}

It showed that the failure occurs at strncmp:

I0215 16:10:10.213157 25450 Z3ConvVisitor.cpp:320] Name of function is : strncmp
I0215 16:10:10.213160 25450 Z3ConvVisitor.cpp:321] Declare Info is strncmp
F0215 16:10:10.213161 25450 Z3ConvVisitor.cpp:324] Unimplemented FunctionDecl visitor11

Moving the strncmp call out of the if condition makes everything work.

I tried to fix this but am not sure where to start. Is this a bug, or is there a reason this should not be fixed?

Here is the gdb backtrace:

__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff6ec4921 in __GI_abort () at abort.c:79
#2  0x00005555572569fd in google::DumpStackTraceAndExit() ()
#3  0x000055555724eedc in google::LogMessage::SendToLog() ()
#4  0x000055555724f4ce in google::LogMessage::Flush() ()
#5  0x00005555572527ac in google::LogMessageFatal::~LogMessageFatal() ()
#6  0x0000555555e36368 in rellic::Z3ConvVisitor::VisitFunctionDecl (this=<optimized out>, 
    func=0x555558d327c0) at /home/muqi/decompile_tool/rellic/rellic/AST/Z3ConvVisitor.cpp:324
#7  0x0000555555e3ef8b in clang::RecursiveASTVisitor<rellic::Z3ConvVisitor>::WalkUpFromFunctionDecl (D=0x555558d327c0, this=<optimized out>)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/DeclNodes.inc:401
#8  rellic::Z3ConvVisitor::TraverseFunctionDecl (func=0x555558d327c0, this=<optimized out>)
    at /home/muqi/decompile_tool/rellic/rellic/AST/Z3ConvVisitor.h:91
#9  clang::RecursiveASTVisitor<rellic::Z3ConvVisitor>::TraverseDecl (this=0x555558d3a000, 
    D=0x555558d327c0)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/DeclNodes.inc:401
#10 0x0000555555e3a30b in rellic::Z3ConvVisitor::GetOrCreateZ3Decl (
    this=this@entry=0x555558d3a000, c_decl=c_decl@entry=0x555558d327c0)
    at /home/muqi/decompile_tool/rellic/rellic/AST/Z3ConvVisitor.cpp:262
#11 0x0000555555e3a40f in rellic::Z3ConvVisitor::VisitDeclRefExpr (this=0x555558d3a000, 
    c_ref=0x555558d32f68)
    at /home/muqi/decompile_tool/rellic/rellic/AST/Z3ConvVisitor.cpp:672
#12 0x0000555555e3a51c in clang::RecursiveASTVisitor<rellic::Z3ConvVisitor>::TraverseStmt (
    this=this@entry=0x555558d3a000, S=S@entry=0x555558d332e0, Queue=0x0)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/RecursiveASTVisitor.h:653
#13 0x0000555555e3a6bb in rellic::Z3ConvVisitor::GetOrCreateZ3Expr (this=0x555558d3a000, 
    c_expr=c_expr@entry=0x555558d332e0)
    at /home/muqi/decompile_tool/rellic/rellic/AST/Z3ConvVisitor.cpp:254
#14 0x0000555555e09dc3 in rellic::Z3CondSimplify::SimplifyCExpr (
    this=this@entry=0x555558d370b0, c_expr=0x555558d332e0)
    at /home/muqi/decompile_tool/rellic/rellic/AST/Z3CondSimplify.cpp:44
#15 0x0000555555e0a320 in rellic::Z3CondSimplify::VisitIfStmt (
    this=this@entry=0x555558d370b0, stmt=0x555558d33600)
    at /home/muqi/decompile_tool/rellic/rellic/AST/Z3CondSimplify.cpp:56
#16 0x0000555555e0ad99 in clang::RecursiveASTVisitor<rellic::Z3CondSimplify>::WalkUpFromIfStmt (S=<optimized out>, this=0x555558d370d0)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/StmtNodes.inc:121
#17 clang::RecursiveASTVisitor<rellic::Z3CondSimplify>::PostVisitStmt (
    this=this@entry=0x555558d370d0, S=<optimized out>)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/StmtNodes.inc:121
#18 0x0000555555e0a608 in clang::RecursiveASTVisitor<rellic::Z3CondSimplify>::TraverseStmt (
    this=this@entry=0x555558d370d0, S=<optimized out>, Queue=0x0)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/RecursiveASTVisitor.h:653
#19 0x0000555555e323d0 in clang::RecursiveASTVisitor<rellic::Z3CondSimplify>::TraverseFunctionHelper (this=this@entry=0x555558d370d0, D=D@entry=0x555558d39660)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/RecursiveASTVisitor.h:2072
#20 0x0000555555e32557 in clang::RecursiveASTVisitor<rellic::Z3CondSimplify>::TraverseFunctionDecl (this=0x555558d370d0, D=0x555558d39660)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/RecursiveASTVisitor.h:2077
#21 0x0000555555e0a554 in clang::RecursiveASTVisitor<rellic::Z3CondSimplify>::TraverseDeclContextHelper (this=this@entry=0x555558d370d0, DC=DC@entry=0x555558d29c00)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/RecursiveASTVisitor.h:1410
#22 0x0000555555e223aa in clang::RecursiveASTVisitor<rellic::Z3CondSimplify>::TraverseDeclContextHelper (DC=<optimized out>, this=0x555558d370d0)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/Decl.h:105
#23 clang::RecursiveASTVisitor<rellic::Z3CondSimplify>::TraverseTranslationUnitDecl (
    this=0x555558d370d0, D=0x555558d29bd8)
    at /home/muqi/decompile_tool/lifting-bits-downloads/vcpkg_ubuntu-18.04_llvm-10_amd64/installed/x64-linux-rel/include/clang/AST/RecursiveASTVisitor.h:1511
#24 0x0000555555e0a4c1 in rellic::Z3CondSimplify::runOnModule (this=0x555558d370b0, 
    module=...) at /home/muqi/decompile_tool/rellic/rellic/AST/Z3CondSimplify.cpp:73
#25 0x0000555555f525a8 in llvm::legacy::PassManagerImpl::run(llvm::Module&) ()
#26 0x0000555555ccf1e3 in (anonymous namespace)::GeneratePseudocode (output=..., module=..., 
    this=<optimized out>) at /home/muqi/decompile_tool/rellic/tools/decomp/Decomp.cpp:95
#27 main (argc=<optimized out>, argv=<optimized out>)
    at /home/muqi/decompile_tool/rellic/tools/decomp/Decomp.cpp:202
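
The workaround mentioned in this report, moving the strncmp call out of the if condition, could look like the sketch below. compare_prefix is a hypothetical helper written for illustration. It only shows the source-level shape and makes no claim about why the Z3 conversion then succeeds.

```c
#include <assert.h>
#include <string.h>

/* Compares the first 3 characters of two strings.
 * Returns 1, -1 or 0 for greater, less or equal. */
int compare_prefix(const char *str1, const char *str2) {
  assert(str1 != NULL && str2 != NULL);
  /* The call is hoisted into a variable, so the if condition
   * contains only a variable reference and a comparison. */
  int result = strncmp(str1, str2, 3);
  if (result > 0) {
    return 1;
  } else if (result < 0) {
    return -1;
  }
  return 0;
}
```

For the strings in the report, "hello" and "helLO WORLD", the first three characters are identical, so the function returns 0.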

Build error using llvm 10 on mac

During the build of rellic I get a number of compilation errors:

[ 19%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/Compat/Stmt.cpp.o
[ 19%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/Compat/Expr.cpp.o
[ 23%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/InferenceRule.cpp.o
[ 26%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/CXXToCDecl.cpp.o
[ 30%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/DeadStmtElim.cpp.o
[ 34%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/CondBasedRefine.cpp.o
[ 38%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/ExprCombine.cpp.o
[ 42%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/GenerateAST.cpp.o
[ 46%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/IRToASTVisitor.cpp.o
[ 50%] Building CXX object rellic/CMakeFiles/rellic.dir/AST/LoopRefine.cpp.o
/Users/brianmosher/Documents/repos/rellic/rellic/AST/IRToASTVisitor.cpp:87:57: error: too few arguments to function call, expected 5, have 4
          clang::ArrayType::ArraySizeModifier::Normal, 0);
                                                        ^
/usr/local/share/trailofbits/libraries/llvm/include/clang/AST/ASTContext.h:1349:3: note: 'getConstantArrayType' declared here
  QualType getConstantArrayType(QualType EltTy, const llvm::APInt &ArySize,
  ^
In file included from /Users/brianmosher/Documents/repos/rellic/rellic/AST/ExprCombine.cpp:20:
/Users/brianmosher/Documents/repos/rellic/rellic/AST/ExprCombine.h:27:34: error: expected class name
class ExprCombine : public llvm::ModulePass,

Examining the caller and the ASTContext.h header, there does seem to be a mismatch in the number of args and expected parameters.

rellic/rellic/AST/IRToASTVisitor.cpp:

case llvm::Type::ArrayTyID: {
  auto arr = llvm::cast<llvm::ArrayType>(type);
  auto elm = GetQualType(arr->getElementType());
  result = ast_ctx.getConstantArrayType(
      elm, llvm::APInt(32, arr->getNumElements()),
      clang::ArrayType::ArraySizeModifier::Normal, 0);
} break;

libraries/llvm/include/clang/AST/ASTContext.h:

QualType getConstantArrayType(QualType EltTy, const llvm::APInt &ArySize,
                              const Expr *SizeExpr,
                              ArrayType::ArraySizeModifier ASM,
                              unsigned IndexTypeQuals) const;

This is with llvm-10 installed using pkgman.py from cxx-common.

Improve handling of C++ template functions

The rellic-headergen tool should produce a functioning C header equivalent from the following code.

#include <utility>

class MyClass {
  std::pair<int, int> my_pair;
  std::pair<int, int> MyMethod(std::pair<int, int> pair);
};

So far it produces

struct pair {
    int first;
    int second;
};
void _ZNSt4pairIiiEC1Ev(struct pair *this);
void _ZNSt4pairIiiEC1ERKiS2_(struct pair *this, const int &__a, const int &__b);
struct MyClass {
    std::pair<int, int> my_pair;
};
std::pair<int, int> _ZN7MyClass8MyMethodESt4pairIiiE(struct MyClass *this, std::pair<int, int> pair);

A good way to evaluate the results is to compare the LLVM IR produced by code that uses the original C++ header and the generated C header. The shape and size of class.MyClass in the LLVM IR from both headers should be the same. Name mangling should be the same as well.
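
As a cheaper complement to diffing the LLVM IR, the shape of the original C++ type and its C equivalent can also be compared directly. The following is a minimal sketch under that assumption; `pair_c`, `OffsetIn` and `SameLayout` are hypothetical names, and the check covers size, alignment and member offsets only, not name mangling.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// C-style equivalent of std::pair<int, int>, mirroring the generated struct.
struct pair_c {
  int first;
  int second;
};

// Byte offset of a member inside an object, measured at run time so it
// does not rely on offsetof being supported for the C++ type.
template <typename Obj, typename Member>
std::size_t OffsetIn(const Obj &obj, const Member &member) {
  return static_cast<std::size_t>(reinterpret_cast<const char *>(&member) -
                                  reinterpret_cast<const char *>(&obj));
}

// True when both types agree on size, alignment and member offsets.
inline bool SameLayout() {
  std::pair<int, int> cxx{};
  pair_c c{};
  return sizeof(cxx) == sizeof(c) &&
         alignof(std::pair<int, int>) == alignof(pair_c) &&
         OffsetIn(cxx, cxx.first) == OffsetIn(c, c.first) &&
         OffsetIn(cxx, cxx.second) == OffsetIn(c, c.second);
}
```

A check like this catches layout drift early, but the IR comparison described above remains the stronger test because it also exercises mangled names and calling conventions.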

in China

[-] Library version is libraries-llvm40-ubuntu1604-amd64
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  9  258M    9 24.9M    0     0  25152      0  2:59:30  0:17:18  2:42:12     0
curl: (56) SSL read: error:00000000:lib(0):func(0):reason(0), errno 104
[x] Unable to download cxx-common build libraries-llvm40-ubuntu1604-amd64.
[x] Build aborted.

[-] Library version is libraries-llvm40-ubuntu1604-amd64
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  258M    0  645k    0     0    624      0   5d 00h  0:17:38   5d 00h     0
curl: (56) SSL read: error:00000000:lib(0):func(0):reason(0), errno 104
[x] Unable to download cxx-common build libraries-llvm40-ubuntu1604-amd64.
[x] Build aborted.

[-] Library version is libraries-llvm40-ubuntu1604-amd64
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 12  258M   12 32.6M    0     0  15420      0  4:52:47  0:37:00  4:15:47     0
curl: (56) SSL read: error:00000000:lib(0):func(0):reason(0), errno 104
[x] Unable to download cxx-common build libraries-llvm40-ubuntu1604-amd64.
[x] Build aborted.

[-] Library version is libraries-llvm40-ubuntu1604-amd64
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  258M    0     0    0     0      0      0 --:--:--  0:09:46 --:--:--     0
curl: (56) SSL read: error:00000000:lib(0):func(0):reason(0), errno 104
[x] Unable to download cxx-common build libraries-llvm40-ubuntu1604-amd64.
[x] Build aborted.

Handling LLVM PHI instructions

So far we've been dealing with phi instructions by requiring preprocessing via reg2mem. This can lead to a large number of alloca instructions and, consequently, to a lot of local variables in the C output of rellic. So the question is whether there is a better way to do it.
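
One direction, offered purely as an illustration and not as rellic's approach, is to introduce a variable only for each phi and assign to it at the end of every predecessor block, rather than demoting every cross-block value to memory. The sketch models this on a toy representation; `Phi`, `BlockAssignments` and `DemotePhis` are hypothetical names, and the code deliberately ignores the ordering problems that arise when phis in the same block read each other's results.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A phi node: result name plus (predecessor block, incoming value) pairs.
struct Phi {
  std::string result;
  std::vector<std::pair<std::string, std::string>> incoming;
};

// Assignments to append to the end of each predecessor block.
using BlockAssignments = std::map<std::string, std::vector<std::string>>;

// Turns each incoming edge of each phi into an assignment
// `result = value;` placed in the corresponding predecessor block.
BlockAssignments DemotePhis(const std::vector<Phi> &phis) {
  BlockAssignments out;
  for (const Phi &phi : phis) {
    for (const auto &edge : phi.incoming) {
      out[edge.first].push_back(phi.result + " = " + edge.second + ";");
    }
  }
  return out;
}
```

For a phi `x` with incoming values `1` from block `then` and `y` from block `else`, this yields `x = 1;` in `then` and `x = y;` in `else`, so only one local variable is needed for the phi itself.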

Handle `switch` lowering

We should start lowering switch, which will both enable us to generate better bitcode and fix a source of non-translation.
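
The issue does not say how the lowering should be done. As a rough sketch of the general idea, a switch can be rewritten as a chain of equality tests followed by a jump to the default target. `SwitchCase` and `LowerSwitchToBranches` are hypothetical names, and the output is C-like text for illustration only, not LLVM IR.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// One switch case: the matched constant and the label it jumps to.
using SwitchCase = std::pair<long, std::string>;

// Rewrites a switch over `cond` as a linear chain of compare-and-branch
// statements, ending with an unconditional jump to the default label.
std::string LowerSwitchToBranches(const std::string &cond,
                                  const std::vector<SwitchCase> &cases,
                                  const std::string &default_label) {
  std::string out;
  for (const SwitchCase &c : cases) {
    out += "if (" + cond + " == " + std::to_string(c.first) + ") goto " +
           c.second + ";\n";
  }
  out += "goto " + default_label + ";\n";
  return out;
}
```

A linear chain is the simplest form; for switches with many cases, a balanced tree of comparisons keeps the number of tests per path logarithmic.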

Output improvements: Emit #include<file> directives

It is very helpful if the output code has #include directives.

Understandably, we cannot identify every custom header that may exist.

However, we can handle the case of things in the standard OS headers for the target OS, or at least the standard C library.

The mapping does not have to be perfect; it just has to provide better quality output than what exists now.
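
A minimal sketch of such a mapping, assuming a hand-written and deliberately incomplete table from standard C library function names to headers. `KnownHeaders` and `IncludesFor` are hypothetical names; functions that are not in the table are simply skipped.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Small, incomplete table from C library functions to their headers.
const std::map<std::string, std::string> &KnownHeaders() {
  static const std::map<std::string, std::string> table = {
      {"printf", "stdio.h"},  {"puts", "stdio.h"},
      {"malloc", "stdlib.h"}, {"free", "stdlib.h"},
      {"memcpy", "string.h"}, {"strlen", "string.h"},
  };
  return table;
}

// Returns sorted, de-duplicated #include lines for the external functions
// found in the table; unknown names produce no output.
std::vector<std::string> IncludesFor(const std::vector<std::string> &externs) {
  std::set<std::string> headers;
  for (const std::string &name : externs) {
    auto it = KnownHeaders().find(name);
    if (it != KnownHeaders().end()) headers.insert(it->second);
  }
  std::vector<std::string> lines;
  for (const std::string &h : headers) lines.push_back("#include <" + h + ">");
  return lines;
}
```

Declarations covered by an emitted header could then be omitted from the output, while anything the table does not know keeps its explicit declaration.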

Build error

Hi, I am trying to install rellic on Ubuntu 18.04, but I hit the following error when running ./scripts/build.sh:

FAILED: rellic/CMakeFiles/rellic.dir/AST/Util.cpp.o
/home/lb/mvm/rellic/rellic-build/libraries/llvm/bin/clang++ -DGFLAGS_IS_A_DLL=0 -DGOOGLE_GLOG_DLL_DECL="" -DNDEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -isystem . -isystem ../ -isystem libraries/llvm/include -isystem libraries/z3/include -isystem libraries/glog/include -isystem libraries/gflags/include -O2 -g -DNDEBUG -Wall -Wextra -Wno-unused-parameter -Wno-c++98-compat -Wno-unreachable-code-return -Wno-nested-anon-types -Wno-extended-offsetof -Wno-variadic-macros -Wno-return-type-c-linkage -Wno-c99-extensions -Wno-ignored-attributes -Wno-unused-local-typedef -Wno-unknown-pragmas -Wno-unknown-warning-option -fPIC -fno-omit-frame-pointer -fvisibility-inlines-hidden -fno-exceptions -fno-asynchronous-unwind-tables -Wgnu-alignof-expression -Wno-gnu-anonymous-struct -Wno-gnu-designator -Wno-gnu-zero-variadic-macro-arguments -Wno-gnu-statement-expression -gdwarf-2 -g3 -O3 -Werror -pedantic -std=c++14 -MD -MT rellic/CMakeFiles/rellic.dir/AST/Util.cpp.o -MF rellic/CMakeFiles/rellic.dir/AST/Util.cpp.o.d -o rellic/CMakeFiles/rellic.dir/AST/Util.cpp.o -c ../rellic/AST/Util.cpp
../rellic/AST/Util.cpp:206:20: error: no matching constructor for initialization of 'clang::MemberExpr'
return new (ctx) clang::MemberExpr(base, is_arrow, clang::SourceLocation(),
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libraries/llvm/include/clang/AST/Expr.h:2848:3: note: candidate constructor not viable: requires 9 arguments, but 8 were provided
MemberExpr(Expr *Base, bool IsArrow, SourceLocation OperatorLoc,
^
libraries/llvm/include/clang/AST/Expr.h:2852:3: note: candidate constructor not viable: requires single argument 'Empty', but 8 arguments were provided
MemberExpr(EmptyShell Empty)
^
libraries/llvm/include/clang/AST/Expr.h:2807:7: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 8 were provided
class MemberExpr final
^
1 error generated.

thanks a lot!

Test Failure on Mac with Z3

There is a test failure that appears only on macOS, as discovered in #78.

The error is shown below. It appears with LLVM versions 9, 10, and 11, with Z3 version 4.8.9:

Running tests...
/usr/local/Cellar/cmake/3.19.1/bin/ctest --force-new-ctest-process 
Test project /Users/runner/work/rellic/rellic/rellic-build
    Start 1: test_roundtrip
1/1 Test #1: test_roundtrip ...................***Failed    9.33 sec
.F.................
======================================================================
FAIL: test_assert (__main__.TestRoundtrip)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/runner/work/rellic/rellic/scripts/roundtrip.py", line 112, in test
    roundtrip(self, args.rellic, path, args.clang, args.timeout)
  File "/Users/runner/work/rellic/rellic/scripts/roundtrip.py", line 78, in roundtrip
    decompile(self, rellic, rt_bc, rt_c, timeout)
  File "/Users/runner/work/rellic/rellic/scripts/roundtrip.py", line 59, in decompile
    self.assertEqual(p.returncode, 0, "rellic-decomp failure: %s" % p.stderr)
AssertionError: -6 != 0 : rellic-decomp failure: F1211 15:53:30.115931 156609984 Z3ConvVisitor.cpp:65] Check failed: expr.is_bv() z3::expr is not a bitvector!
*** Check failure stack trace: ***
    @        0x1016d323f  google::LogMessageFatal::~LogMessageFatal()
    @        0x1016cf7e9  google::LogMessageFatal::~LogMessageFatal()


----------------------------------------------------------------------
Ran 19 tests in 8.666s

FAILED (failures=1)

Syntax errors due to incorrect declaration order

C output for more complex bitcode (i.e. mcsema output) has syntax errors due to the order in which function, type and global variable declarations are emitted.

Possible fixes:

  • forward declare functions and types and add definitions separately
  • try to reorder globals
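
The first option could look roughly like the sketch below, which emits type forward declarations, then type definitions, then function prototypes, then function definitions. `Decl` and `EmitTwoPhase` are hypothetical names; global variables are left out, and type definitions that embed other types by value would still need to be ordered among themselves.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A top-level entity: a forward declaration plus its full definition.
struct Decl {
  bool is_type;            // true for struct/union types, false for functions
  std::string forward;     // e.g. "struct node;" or "int len(struct node *n);"
  std::string definition;  // full type or function body
};

// Emits declarations in four passes so that every name is declared
// before any function body or prototype refers to it.
std::string EmitTwoPhase(const std::vector<Decl> &decls) {
  std::string out;
  // Pass 1: type forward declarations, so later text can name the types.
  for (const Decl &d : decls)
    if (d.is_type) out += d.forward + "\n";
  // Pass 2: type definitions, so function bodies can access members.
  for (const Decl &d : decls)
    if (d.is_type) out += d.definition + "\n";
  // Pass 3: function prototypes, so bodies can call each other freely.
  for (const Decl &d : decls)
    if (!d.is_type) out += d.forward + "\n";
  // Pass 4: function definitions.
  for (const Decl &d : decls)
    if (!d.is_type) out += d.definition + "\n";
  return out;
}
```

With this ordering, a function listed before the struct it uses in the input still ends up after that struct in the output.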

Rellic crash during translation

Assertion failed: V.getBitWidth() == C.getIntWidth(type) && "Integer type is not the correct size for constant.", file C:\saturn\build_llvm\llvm-8.0.1.src\tools\clang\lib\AST\Expr.cpp, line 787
*** Aborted at 1566331860 (unix time) try "date -d @1566331860" if you are using GNU date ***
@ 0x7ffe06c3c31d raise
@ 0x7ffe06c3d321 abort
@ 0x7ffe06c3ed5e _get_wpgmptr
@ 0x7ffe06c3ec55 _get_wpgmptr
@ 0x7ffe06c3efe1 _wassert
@ 0x7ff74bce9f07 public: __cdecl clang::IntegerLiteral::IntegerLiteral(class clang::ASTContext const & __ptr64,class llvm::APInt const & __ptr64,class clang::QualType,class clang::SourceLocation) __ptr64
@ 0x7ff74bce9fd2 public: static class clang::IntegerLiteral * __ptr64 __cdecl clang::IntegerLiteral::Create(class clang::ASTContext const & __ptr64,class llvm::APInt const & __ptr64,class clang::QualType,class clang::SourceLocation)
@ 0x7ff74bc36b2c rellic::IRToASTVisitor::CreateLiteralExpr
@ 0x7ff74bc374cc rellic::IRToASTVisitor::GetOrCreateStmt
@ 0x7ff74bc372f9 rellic::IRToASTVisitor::GetOperandExpr
@ 0x7ff74bc397e1 rellic::IRToASTVisitor::visitBinaryOperator
@ 0x7ff74bc374ec rellic::IRToASTVisitor::GetOrCreateStmt
@ 0x7ff74bc2b7a4 rellic::GenerateAST::CreateBasicBlockStmts
@ 0x7ff74bc2ba0e rellic::GenerateAST::CreateRegionStmts
@ 0x7ff74bc2c6b6 rellic::GenerateAST::StructureAcyclicRegion
@ 0x7ff74bc2d7d9 rellic::GenerateAST::StructureRegion
@ 0x7ff74bc35072 std::_Func_impl_no_alloc<`lambda at C:\rellic\rellic\AST\GenerateAST.cpp:372:24',void,llvm::Region *>::_Do_call
@ 0x7ff74bc2ddc5 rellic::GenerateAST::runOnModule
@ 0x7ff74be62d51 public: bool __cdecl llvm::legacy::PassManagerImpl::run(class llvm::Module & __ptr64) __ptr64
@ 0x7ff74bcd67b8 main
@ 0x7ff74d566884 __scrt_common_main_seh
@ 0x7ffe078e7bd4 BaseThreadInitThunk
@ 0x7ffe090ece71 RtlUserThreadStart
lifted.txt
