bap's Introduction

Binary Analysis Platform

Join the chat at https://gitter.im/BinaryAnalysisPlatform/bap

Overview

The Carnegie Mellon University Binary Analysis Platform (CMU BAP) is a suite of utilities and libraries that enables analysis of binary programs. BAP supports x86, x86-64, ARM, MIPS, and PowerPC, and new architectures can be added using plugins. BAP includes various analyses, a standard interpreter, a microexecution interpreter, and a symbolic executor. BAP features its own domain-specific language, Primus Lisp, that is used for implementing analyses, specifying verification conditions, modeling functions (writing stubs), and even interfacing with the SMT solver. The toolkit repository includes various examples of program analysis tools that can be implemented with BAP and can be used as a starting point (in addition to the tutorial) for implementing custom analyses. BAP can be used as a framework with a single bap utility that is extended with plugins, or it can be used as a library embedded in a user application, which can be written in OCaml or, using C bindings, in any other language. We also provide some minimal support for Python to make it easier to start learning BAP.

BAP was developed at CMU CyLab and is sponsored by grants from the United States Department of Defense, Siemens, Boeing, ForAllSecure, and the Korea government; see sponsors for more information. BAP is used in various institutions and serves as a backbone for many interesting projects.

Installation

Using pre-built packages

We provide binary packages packed for Debian and Red Hat derivatives. For other distributions we provide tgz archives. To install bap on a Debian derivative:

wget https://github.com/BinaryAnalysisPlatform/bap/releases/download/v2.5.0/{bap,libbap,libbap-dev}_2.5.0.deb
sudo dpkg -i {bap,libbap,libbap-dev}_2.5.0.deb

From sources

Our binary packages do not include the OCaml development environment. If you are going to write an analysis in OCaml, you need to install BAP from the source code, either using opam or by cloning and building this repository directly. The opam method is the recommended one. Once opam is installed, the following three commands should install the platform in a newly created switch.

opam init --comp=4.14.1   # initializes opam and installs the OCaml compiler
opam install bap          # installs bap and its dependencies
eval $(opam env)          # activates the opam environment

Or, if you already have a switch where you would like to install bap, then just do

opam install bap

The opam install bap command will also try to install the system dependencies of BAP using your operating system package manager. If it fails due to a missing system dependency, try installing it manually and then repeat the opam install bap command. If it still doesn't work, do not hesitate to drop by our chat and seek help there. It is manned with friendly people that will be happy to help.

The instructions above will get you the latest stable release of BAP. If you're interested in our rolling releases, which are automatically updated every time a commit to the master branch happens, then you can create a new switch that uses our testing repository:

opam switch create bap-testing --repos \
    default,bap=git+https://github.com/BinaryAnalysisPlatform/opam-repository#testing 4.14.1
opam install bap

After it is added, the bap repository will take precedence over the stable repository and you will get the freshly picked BAP packages straight from the farm.

If you want to build BAP manually or just want to tinker with BAP internals, then you can clone this repository and build it yourself. We suggest starting with a fresh environment in which BAP is not installed, to prevent clashes, or, even better, using a local switch, e.g.,

git clone git@github.com:BinaryAnalysisPlatform/bap.git && cd bap
opam switch create . --deps-only
dune build && dune install

The snippet above will clone bap, create a fresh local switch, install the necessary dependencies, including the system ones, and, finally, build and install bap with dune. Alternatively, if you already have a switch where you want to build and install bap, you can use

git clone git@github.com:BinaryAnalysisPlatform/bap.git && cd bap
opam install . --deps-only
dune build && dune install

to install bap and its dependencies into the currently selected switch.

Using

BAP, like Docker or Git, is driven by a single command-line utility called bap. Just type bap in your shell and it will print a message that shows BAP's capabilities. The disassemble command will take a binary program, disassemble it, lift it into the architecture-agnostic intermediate representation, build a control flow graph, and finally apply staged user-defined analyses in the form of disassembling passes. Finally, the --dump option (-d for short) will output the resulting program in the specified format. This is the default command, so you don't even need to specify it, e.g., the following will disassemble and dump the /bin/echo binary on your machine:

bap /bin/echo -d

Note that, unlike objdump, this command will build the control flow graph of a program. If you just want to dump each instruction of a binary one after another (the so-called linear sweep disassembler mode), then you can use the objdump command, e.g.,

bap objdump /bin/echo --show-{insn=asm,bil}

If your input is a blob of machine code, not an executable, then you can use the raw loader, e.g.,

bap objdump /bin/echo --loader=raw --raw-base=0x400000 --show-{insn=asm,bil}

The raw loader takes a few parameters, like offsets, lengths, and base addresses, which makes it a Swiss Army knife that you can use as a can opener for formats that are not known to BAP. The raw loader works for all commands that open files, e.g., if the raw loader is used together with the disassemble command, BAP will still automatically identify function starts and build a suitable CFG without even knowing where the code is in the binary:

bap /bin/echo --loader=raw --raw-base=0x400000 -d

If you would like to play manually with bytes, e.g., type the instruction encoding manually and see how BAP disassembles it and what semantics it has, then mc is the command you're looking for. It is named after the corresponding utility in LLVM and stands for machine code. It has the same interface as the objdump command except that it takes an ASCII encoding of an instruction instead of a binary file, e.g.,

bap mc --show-{insn=asm,bil} -- 48 83 ec 08

or

bap mc --show-{insn=asm,bil} "\x48\x83\xec\x08"

It recognizes a few input formats (including the one that llvm-mc uses for its -show-encoding option). Consult the documentation for more detailed information.

Extending

Writing your own analysis

BAP is a plugin-based framework: if you want to develop a new analysis, you can write a plugin, build it, and install it, and it will work with the rest of BAP without any recompilation. There are many extension points that you can use to add new analyses, change existing ones, or even build your own applications. We will start with a simple example that registers a disassembling pass with the disassemble command. Suppose that we want to write an analysis that estimates the ratio of jump instructions to the total number of instructions in the binary. We will start by creating an empty file named jmp.ml in an empty folder (the folder name doesn't matter). Next, using our favorite text editor, we will put the following code into it:

open Core_kernel
open Bap_main
open Bap.Std

let counter = object
  inherit [int * int] Term.visitor
  method! enter_term _ _ (jmps,total) = jmps,total+1
  method! enter_jmp _ (jmps,total) = jmps+1,total
end

let main proj =
  let jmps,total = counter#run (Project.program proj) (0,0) in
  printf "ratio = %d/%d = %g\n" jmps total (float jmps /. float total)

let () = Extension.declare @@ fun _ctxt ->
   Project.register_pass' main;
   Ok ()

Now we can build, install, and run our analysis using the following commands:

bapbuild jmp.plugin
bapbundle install jmp.plugin
bap /bin/echo --pass=jmp

Let's briefly go through the code. The counter object is a visitor whose state consists of a pair of counters. The first counter keeps track of the number of jmp terms, and the second counter is incremented every time we enter any term. The main function just runs the counter and prints the output. We declare our extension using the Extension.declare function from the Bap_main library. An extension is just a function that receives the context (which could be used to obtain configuration parameters). In this function, we register our main function as a pass using the Project.register_pass' function.

A little bit more complex example, as well as an example that uses Python, can be found in our tutorial.

Building a plugin with dune

You can also build and install bap plugins using dune. For that, you need to define a library and add a plugin stanza that uses this library. Below is a template dune file:

(library
 (name FOO)
 (public_name OUR-FOO.plugin)
 (libraries bap bap-main))

(plugin
 (name FOO)
 (package OUR-FOO)
 (libraries OUR-FOO.plugin)
 (site (bap-common plugins)))

Everything that is capitalized in the above snippet is a placeholder that you should substitute with appropriate private and public names for your plugin. Notice that the .plugin extension is not necessary, but is perceived as a good convention.

Interactive REPL

BAP also ships an interactive toplevel utility, baptop. This is a shell-like utility that interactively evaluates OCaml expressions and prints their values. It will load BAP libraries and initialize all plugins for you, so you can interactively explore the vast world of BAP. The baptop utility can also serve as a non-interactive interpreter, so that you can run your OCaml scripts, e.g., baptop myscript.ml, or you can even specify it using a shebang at the top of your file, e.g., #!/usr/bin/env baptop. We built baptop using UTop, but you can easily use any other OCaml toplevel, including ocaml itself; just load the bap.top library, e.g., for the vanilla ocaml toplevel use the following directives:

#use "topfind";;
#require "bap.top";;

Learning

We understand that BAP is huge and it is easy to get lost. We're working constantly on improving the documentation, ensuring that every single function in the BAP API is thoroughly documented. But writing higher-level guidelines in the form of manuals or tutorials is much harder and very time consuming, especially given how different the goals of our fellow researchers and users are. Therefore we employ a backward-chaining approach and prefer to answer real questions rather than prematurely trying to address all possible questions. We will be happy to see you in our chat, which features a searchable archive indexed by Google.

We occasionally write on our blog and wiki and encourage everyone to contribute to both of them. You can also post your questions on stackoverflow or discuss BAP on the OCaml board. We also have a cute discord channel, which has much less traffic than our gitter.

Contributing

BAP is built by the community and we welcome all contributions from authors who are willing to share them under the MIT license. If you don't think that your analysis or tool suits this repository (e.g., it has limited use, is not fully ready, doesn't meet our standards, etc.), then you can consider contributing to our bap-plugins repository, which is a collection of useful BAP plugins that are not mature enough to be included in the main distribution. Alternatively, you can consider extending our toolkit with your tool.

Of course, there is no need to submit your work to one of our repositories. BAP is a plugin-based framework and your code could be hosted anywhere and have any license (including proprietary). If you want to make your work available to the community it would be a good idea to release it via opam.

Sponsors

  • ForAllSecure

  • Boeing

  • DARPA VET Project

  • Siemens AG

  • Institute for Information & communications Technology Promotion(IITP) grant funded by the Korea government(MSIT) (No.2015-0-00565, Development of Vulnerability Discovery Technologies for IoT Software Security)

Please contact us if you would like to become a sponsor or are seeking a deeper collaboration.

bap's People

Contributors

a-benlolo, aoidos-bin, bmourad01, dbrumley, ddcc, dijamner, drvink, ethan42, feseal, gitoleg, gitter-badger, heyitsanthony, hirrolot, ivg, jaybosamiya, kennethadammiller, maurer, maverickwoo, murmour, parkre, percontation, philzook58, phosphorus15, pranjalsingh008, rabidcicada, rvantonder, smorimoto, stephengroat, thestr4ng3r, xvilka


bap's Issues

bap-server cache eviction is not implemented

Currently we store all memory chunks provided by our clients. But since we're using memory mapping, they don't cost us very much. Unlike qira, bap has never been killed by the OOM killer. But... sooner or later it will happen.

Disassembler output inconsistent hex/decimal use

readbin outputs addresses in hex, but immediate operands in decimal. Example:
readbin /Users/dbrumley/git/GitHub/binaryanalysisplatform/x86-binaries/elf/binutils/gcc_binutils_32_O0_a
readbin:
....
0804C7D9 (SUB32ri8 ESP ESP 48) | subl $48, %esp

objdump:
...
804c7d9: 83 ec 30 sub $0x30,%esp

Also, it would be good to standardize to lower case.
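
As an illustration of the requested format, here is a minimal OCaml sketch that uses only the standard Printf module. The helper names show_addr and show_imm are hypothetical and are not part of readbin, and the sketch assumes non-negative values.

```ocaml
(* Hypothetical helpers, not part of readbin: render addresses and
   immediates in lowercase hex, in the style of the objdump line above. *)
let show_addr (a : int) : string = Printf.sprintf "%x" a
let show_imm (n : int) : string = Printf.sprintf "$0x%x" n

let () =
  (* prints: 804c7d9: sub $0x30,%esp *)
  Printf.printf "%s: sub %s,%%esp\n" (show_addr 0x0804C7D9) (show_imm 48)
```

With both fields going through the same kind of formatter, the hex/decimal choice and the letter case are decided in one place.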

piqi always required

Even if we do not specify --enable-serialization piqi is still required, and compilation breaks with:

Exception: PropList.Not_set ("piqic_path", None)

P.S. this piqi produces more problems than it solves

warnings in oasis

oasis outputs too many warnings that don't have any connection with real problems, like a warning about a piqi module that it can't find. It looks like everybody is scared by them. It would be nice to address them, or maybe just to hide them.

make reinstall may error

On OS X, make reinstall fails. It seems reinstall runs uninstall then install. However, install isn't removing bap from opam.

davids-air-5:bap dbrumley$ make reinstall
ocaml setup.ml -reinstall
W: Nothing to install for findlib library 'types_test'
W: Nothing to install for findlib library 'image_test'
W: Nothing to install for findlib library 'dwarf_test'
ocamlfind: Package bap is already installed

  • (file /Users/dbrumley/.opam/401bap/lib/bap/META already exists)
    E: Failure("Command ''/Users/dbrumley/.opam/401bap/bin/ocamlfind' install bap lib/bap/META _build/lib/bap_dwarf/bap_dwarf.cmx _build/lib/bap_dwarf/dwarf_data.cmx _build/lib/bap_dwarf/dwarf_fbi.cmx _build/lib/bap_dwarf/dwarf_input.cmx _build/lib/bap_dwarf/dwarf_leb128.cmx _build/lib/bap_dwarf/dwarf_types.cmx _build/lib/bap_dwarf/bap_dwarf.cmi _build/lib/bap_dwarf/dwarf_data.cmi _build/lib/bap_dwarf/dwarf_fbi.cmi _build/lib/bap_dwarf/dwarf_input.cmi _build/lib/bap_dwarf/dwarf_leb128.cmi _build/lib/bap_dwarf/dwarf_types.cmi _build/lib/bap_dwarf/dwarf.cmxs _build/lib/bap_dwarf/dwarf.a _build/lib/bap_dwarf/dwarf.cmxa _build/lib/bap_dwarf/dwarf.cma lib/bap_dwarf/dwarf_types.ml lib/bap_dwarf/dwarf_leb128.mli lib/bap_dwarf/dwarf_input.mli lib/bap_dwarf/dwarf_fbi.mli lib/bap_dwarf/dwarf_data.mli lib/bap_dwarf/bap_dwarf.ml _build/lib/bap_elf/bap_elf.cmx _build/lib/bap_elf/elf_parse.cmx _build/lib/bap_elf/elf_types.cmx _build/lib/bap_elf/bap_elf.cmi _build/lib/bap_elf/elf_parse.cmi _build/lib/bap_elf/elf_types.cmi _build/lib/bap_elf/elf.cmxs _build/lib/bap_elf/elf.a _build/lib/bap_elf/elf.cmxa _build/lib/bap_elf/elf.cma lib/bap_elf/elf_types.ml lib/bap_elf/elf_parse.mli lib/bap_elf/bap_elf.ml _build/lib/bap_image/image_elf.cmx _build/lib/bap_image/image_elf.cmi _build/lib/bap_image/elf_backend.cmxs _build/lib/bap_image/elf_backend.a _build/lib/bap_image/elf_backend.cmxa _build/lib/bap_image/elf_backend.cma lib/bap_image/image_elf.mli _build/lib/bap_image/bap_image.cmx _build/lib/bap_image/image_backend.cmx _build/lib/bap_image/image_types.cmx _build/lib/bap_image/bap_image.cmi _build/lib/bap_image/image_backend.cmi _build/lib/bap_image/image_types.cmi _build/lib/bap_image/bap_image.cmxs _build/lib/bap_image/bap_image.a _build/lib/bap_image/bap_image.cmxa _build/lib/bap_image/bap_image.cma lib/bap_image/image_types.ml lib/bap_image/image_backend.ml lib/bap_image/bap_image.mli _build/lib/bap/bap_install_printers.cmx _build/lib/bap/bap_install_printers.cmi 
_build/lib/bap/top.cmxs _build/lib/bap/top.a _build/lib/bap/top.cmxa _build/lib/bap/top.cma lib/bap/bap_install_printers.ml _build/lib/bap_types/conceval.cmx _build/lib/bap_types/conceval.cmi _build/lib/bap_types/conceval.cmxs _build/lib/bap_types/conceval.a _build/lib/bap_types/conceval.cmxa _build/lib/bap_types/conceval.cma lib/bap_types/conceval.mli _build/lib/bap_types/bap_types.cmx _build/lib/bap_types/bap_addr.cmx _build/lib/bap_types/bap_arch.cmx _build/lib/bap_types/bap_bil.cmx _build/lib/bap_types/bap_bitvector.cmx _build/lib/bap_types/bap_common.cmx _build/lib/bap_types/bap_exp.cmx _build/lib/bap_types/bap_integer.cmx _build/lib/bap_types/bap_integer_intf.cmx _build/lib/bap_types/bap_regular.cmx _build/lib/bap_types/bap_size.cmx _build/lib/bap_types/bap_stmt.cmx _build/lib/bap_types/bap_type.cmx _build/lib/bap_types/bap_var.cmx _build/lib/bap_types/bap_types.cmi _build/lib/bap_types/bap_addr.cmi _build/lib/bap_types/bap_arch.cmi _build/lib/bap_types/bap_bil.cmi _build/lib/bap_types/bap_bitvector.cmi _build/lib/bap_types/bap_common.cmi _build/lib/bap_types/bap_exp.cmi _build/lib/bap_types/bap_integer.cmi _build/lib/bap_types/bap_integer_intf.cmi _build/lib/bap_types/bap_regular.cmi _build/lib/bap_types/bap_size.cmi _build/lib/bap_types/bap_stmt.cmi _build/lib/bap_types/bap_type.cmi _build/lib/bap_types/bap_var.cmi _build/lib/bap_types/types.cmxs _build/lib/bap_types/types.a _build/lib/bap_types/types.cmxa _build/lib/bap_types/types.cma lib/bap_types/bap_var.mli lib/bap_types/bap_type.mli lib/bap_types/bap_stmt.mli lib/bap_types/bap_size.mli lib/bap_types/bap_regular.mli lib/bap_types/bap_integer_intf.ml lib/bap_types/bap_integer.mli lib/bap_types/bap_exp.mli lib/bap_types/bap_common.ml lib/bap_types/bap_bitvector.mli lib/bap_types/bap_bil.ml lib/bap_types/bap_arch.mli lib/bap_types/bap_addr.mli lib/bap_types/bap_types.ml _build/lib/bap/bap.cmx _build/lib/bap/bap_plugin.cmx _build/lib/bap/bap_plugins.cmx _build/lib/bap/bap.cmi _build/lib/bap/bap_plugin.cmi 
_build/lib/bap/bap_plugins.cmi _build/lib/bap/bap.cmxs _build/lib/bap/bap.a _build/lib/bap/bap.cmxa _build/lib/bap/bap.cma lib/bap/bap_plugins.mli lib/bap/bap_plugin.mli lib/bap/bap.ml' terminated with error code 2")
    make: *** [reinstall] Error 1

disassembly should be run in a separate thread

Bap-server has a pool of disassemblers, but they all run in one thread.
Not sure that it will give a significant speed up, especially for short runs, but we can try. It can be done either by using preemptive threads or lwt jobs.

Program crashes on Mac OS X

This is a known issue that anyone can hit, so I will post it here as a precaution.

If the llvm library is compiled with the new libc++ library, and bap is compiled with GNU g++ with the libstdc++ library, then we will crash, since these libraries and compilers are not very ABI compatible. The table below summarizes the expected (and tested) behavior:

bap            llvm            result
gcc/stdc++     clang/c++       👎
gcc/c++        clang/c++       👍
gcc/stdc++     clang/stdc++    👍
clang/stdc++   clang/stdc++    👍
clang/stdc++   clang/c++       👍
clang/c++      clang/c++       👍

Bap's configuration scripts will try to avoid these problems and set the proper flags based on the assumption that on linux we use libstdc++, while on mac os x libc++ is used by default. So, for the end user this should just work. But if you're playing with --with-cxx options and with-cxxlibs flags, then make sure that they're coherent with llvm.

add more architectures

Currently bap supports only arm, x86 and x86_64 architectures. But since it uses LLVM it can provide some level of support for many more architectures.

We should extend Arch module in a Bap_types library with more architectures, so that readbin and bap-mc can work on them.

Wiki update for developing against bap

For a bap developer setup (built from github), it is easier to install the right dependencies by doing:

$ opam install bap --deps-only

The reasoning here is that, if you are interested in developing specifically against bap, this will install the exact dependencies it is built on, so you don't have to worry about opam upgrade. It can replace the following on the wiki:

Installing OCaml dependencies

The easiest way to install the OCaml dependencies of bap is to use
the opam package manager:

$ opam install $(cat opam.deps)

If you are using a development version, e.g., you have just cloned this from
github, then you will also need the oasis package in order to create a build
environment.

$ opam install oasis

different memory references in ARM lifter

The following code:

  begin(dired_dump_obstack_ENTRY) 
      0000a48c: 00 48 2d e9    push {r11, lr}      ; STMDB_UPD(SP,SP,0xe,Nil,R11,LR)
      0000a490: 04 b0 8d e2    add r11, sp, #0x4   ; ADDri(R11,SP,0x4,0xe,Nil,Nil)
      0000a494: 20 d0 4d e2    sub sp, sp, #0x20   ; SUBri(SP,SP,0x20,0xe,Nil,Nil)
      0000a498: 20 00 0b e5    str r0, [r11, #-32] ; STRi12(R0,R11,0xffffffe0,0xe,Nil)
      0000a49c: 24 10 0b e5    str r1, [r11, #-36] ; STRi12(R1,R11,0xffffffdc,0xe,Nil)
      0000a4a0: 24 30 1b e5    ldr r3, [r11, #-36] ; LDRi12(R3,R11,0xffffffdc,0xe,Nil)
      0000a4a4: 0c 30 0b e5    str r3, [r11, #-12] ; STRi12(R3,R11,0xfffffff4,0xe,Nil)
      0000a4a8: 0c 30 1b e5    ldr r3, [r11, #-12] ; LDRi12(R3,R11,0xfffffff4,0xe,Nil)
      0000a4ac: 0c 30 93 e5    ldr r3, [r3, #12]   ; LDRi12(R3,R3,0xc,0xe,Nil)
      0000a4b0: 03 20 a0 e1    mov r2, r3          ; MOVr(R2,R3,0xe,Nil,Nil)
      0000a4b4: 0c 30 1b e5    ldr r3, [r11, #-12] ; LDRi12(R3,R11,0xfffffff4,0xe,Nil)
      0000a4b8: 08 30 93 e5    ldr r3, [r3, #8]    ; LDRi12(R3,R3,0x8,0xe,Nil)
      0000a4bc: 02 30 63 e0    rsb r3, r3, r2      ; RSBrr(R3,R3,R2,0xe,Nil,Nil)
      0000a4c0: 23 31 a0 e1    lsr r3, r3, #2      ; MOVsi(R3,R3,0x13,0xe,Nil,Nil)
      0000a4c4: 10 30 0b e5    str r3, [r11, #-16] ; STRi12(R3,R11,0xfffffff0,0xe,Nil)
      0000a4c8: 10 30 1b e5    ldr r3, [r11, #-16] ; LDRi12(R3,R11,0xfffffff0,0xe,Nil)
      0000a4cc: 00 00 53 e3    cmp r3, #0x0        ; CMPri(R3,0x0,0xe,Nil)
      0000a4d0: 4b 00 00 0a    beq #0x12c          ; Bcc(0x12c,0x0,CPSR)
  end(dired_dump_obstack_ENTRY)

Outputs BIL that references the memory using plenty of variable names, like mem, m2, src, although all loads and stores should have side effects only on one memory, namely the machine memory.

  begin(dired_dump_obstack_ENTRY) {
    orig_base_3161 := SP
    mem := mem with [orig_base_3161 + 0xFFFFFFFC:32, el]:u32 <- LR
    mem := mem with [orig_base_3161 + 0xFFFFFFF8:32, el]:u32 <- R11
    SP := SP - 0x8:32
    R11 := SP + 0x4:32
    SP := SP - 0x20:32
    m1 := m2 with [R11 + 0xFFFFFFE0:32, el]:u32 <- R0
    m1 := m2 with [R11 + 0xFFFFFFDC:32, el]:u32 <- R1
    R3 := src[R11 + 0xFFFFFFDC:32, el]:u32
    m1 := m2 with [R11 + 0xFFFFFFF4:32, el]:u32 <- R3
    R3 := src[R11 + 0xFFFFFFF4:32, el]:u32
    R3 := src[R3 + 0xC:32, el]:u32
    R2 := R3
    R3 := src[R11 + 0xFFFFFFF4:32, el]:u32
    R3 := src[R3 + 0x8:32, el]:u32
    R3 := R2 - R3
    unshifted_3203 := R3
    R3 := unshifted_3203 >> 0x2:32
    m1 := m2 with [R11 + 0xFFFFFFF0:32, el]:u32 <- R3
    R3 := src[R11 + 0xFFFFFFF0:32, el]:u32
    orig1_3213 := R3
    orig2_3214 := 0x0:32
    dest_3211 := R3 - 0x0:32
    CF := orig2_3214 <= orig1_3213
    VF := high:1[(orig1_3213 ^ orig2_3214) & (orig1_3213 ^ dest_3211)]
    NF := high:1[dest_3211]
    ZF := dest_3211 = 0x0:32
    if (ZF = true) {
      jmp dired_dump_obstack_0x178
    }
  }

Documentation for bap

Hi,

Can I get documentation on how to install and use BAP, and on which versions of ocaml and opam should be used? I am facing problems with the version numbers of ocaml and opam. If I run "./configure" in the bap folder, then I get the following error.

./configure
File "preconfig.ml", line 24, characters 14-27:
Warning 3: deprecated: String.create
Use Bytes.create instead.
File "preconfig.ml", line 29, characters 12-25:
Warning 3: deprecated: String.create
Use Bytes.create instead.
Exception: Sys_error "myocamlbuild.ml: No such file or directory".

Current version of ocaml - 4.02.1
opam - 1.2.0

If I use ocaml 4.00.1, then I get an unbound value |> error.

Thank you.

pack to opam

bap will be much easier to install with opam install bap. Also, packing into opam will give us access to mac os x testing.

opam

The opam file that comes with bap is unreliable. It would be nice to
give 'opam install bap' an option to specify where to find llvm-config.
I don't know how to do this.

Another issue, is that 'opam remove' tries to run ./configure, which
can fail miserably and leave the .opam directory in a messy state.
This is annoying because opam will automatically attempt to remove
bap (and then recompile it) when you install something else (like utop).

It would be more robust to change the opam file as follows:

remove: [
["ocamlfind" "remove" "bap"]
["ocamlfind" "remove" "core_lwt"]
["rm" "-f" "%{bin}%/bap-mc" "%{bin}%/bap-server" "%{bin}%/baptop"
"%{bin}%/bapbuild" "%{bin}%/train" "%{bin}%/readbin"
"%{bin}%/fbi" "%{bin}%/byteweight"]
]

Bruno

insn printer requires tabulation

Currently the instruction printer requires tabulation with proper tab stops set up. This should be either rewritten without tabulation, or something clever should be done; otherwise printing is broken.

integrating byteweight into bap.

So, we have byteweight merged into bap, but there are a few issues that we need to discuss. We should understand that currently it is mostly not a part of bap, but more of a demo application. That's not bad, but it is not enough.

What we should do next is split it into library and application parts, so that we can grab some neat stuff from byteweight and use it inside bap itself. Also, we need to make a plugin of byteweight. But before doing this we should figure out what kind of service it provides. Currently in BAP there is only one service, named bap.image, that provides facilities to load and parse binary files. So it is time to add a new service. Now we should try to figure out an interface for the service. Indeed, we need to figure out two interfaces, one for the backend (i.e., the service provider) and another for the frontend (the service itself) (cf. elf_backend and image).

So, let's start with the frontend. Two variants came to my mind: something like a function start identifier (FSI) or a function boundaries identifier (FBI). Currently, only dwarf can provide the latter. But since dwarf can't be used in real conditions, we can forget about it. Also we have elf itself, which can provide some useful information even for a stripped binary. But afaik it can also provide only function starts (correct me if I'm wrong, but all that we can rely on is the dynsym table coupled with the relocation table, and they give us only starting locations). So, my idea is, instead of starting with FBI and then downcasting it to FSI, we should start with the latter.

Another question is symbol names. I think that function boundaries and function names are orthogonal ideas and shouldn't be mixed. It would be a better idea to have a separate service that resolves names.

So back to FSI. What this service actually can provide is a predicate over a binary that marks certain addresses as starts of functions, which gives us image -> addr seq or mem -> arch -> addr seq. The problem with these interfaces is that they don't grant any access to file metainformation, so we can't implement any providers that rely on it (like dwarf or elf). That means that the FSI backend should work on a lower level, directly with the file, so we come out with Bigstring.t -> arch -> addr seq. Also, having in mind some other possible backend implementations, like ones based on llvm code, we can make it even a little bit more low-level: Bigstring.t -> arch -> addr -> bool. So, I'm eager to hear from others. Everyone is welcome.
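
To make the lowest-level signature discussed above concrete, here is a rough, self-contained OCaml sketch. Everything in it is an assumption made for illustration: the arch, addr, and data types are simplified stand-ins for BAP's own types (Bytes.t replaces Bigstring.t), addresses are treated as plain file offsets, and the FSI_backend, Toy, and starts names are hypothetical. It needs OCaml 4.14 for Seq.init.

```ocaml
(* Simplified stand-ins for BAP's types; all names here are hypothetical. *)
type arch = ARM | X86 | X86_64
type addr = int64
type data = Bytes.t            (* stands in for Bigstring.t *)

(* The predicate form of the backend: data -> arch -> addr -> bool. *)
module type FSI_backend = sig
  val is_start : data -> arch -> addr -> bool
end

(* The sequence form is recovered by filtering every offset of the data
   with the predicate. *)
let starts (module B : FSI_backend) (data : data) (arch : arch) : addr Seq.t =
  Seq.init (Bytes.length data) Int64.of_int
  |> Seq.filter (B.is_start data arch)

(* Toy backend: marks every 0x55 byte as a function start. This is NOT a
   real heuristic, it only exercises the interface. *)
module Toy : FSI_backend = struct
  let is_start data _arch addr =
    let i = Int64.to_int addr in
    i >= 0 && i < Bytes.length data && Bytes.get data i = '\x55'
end

let () =
  let data = Bytes.of_string "\x90\x55\x48\x55" in
  starts (module Toy) data X86_64
  |> Seq.iter (fun a -> Printf.printf "start at offset %Ld\n" a)
```

The sketch shows why the predicate form is the more primitive one: the addr seq interface can be derived from it, while a backend stays free to decide each address independently.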

static linking with llvm

It seems that the dynamic libraries aren't installed by default on debian, so maybe it's better to provide static linking?

automate byteweight test

bap-byteweight has all the needed stuff on the board, it can output symbols found in the symbol table with bap-byteweight symbols file, as well as output symbols found with byteweight bap-byteweight find file. Given this, we can build a test suite, that will check, that we have zero false negatives.
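
The core check of such a test suite is a set difference: every address from the symbol table must also be reported by byteweight. Below is a minimal OCaml sketch of that check using only the standard library. Parsing the output of the two bap-byteweight commands is deliberately left out, since its format is not described here, and the Addrs and false_negatives names are hypothetical.

```ocaml
(* Sets of function start addresses, represented as int64. *)
module Addrs = Set.Make (Int64)

(* Addresses present in the symbol table but missed by byteweight. *)
let false_negatives ~(expected : int64 list) ~(found : int64 list) : Addrs.t =
  Addrs.diff (Addrs.of_list expected) (Addrs.of_list found)

let () =
  (* Made-up addresses, for illustration only. *)
  let expected = [0x400000L; 0x400010L] in
  let found = [0x400010L; 0x400020L] in
  let missed = false_negatives ~expected ~found in
  Addrs.iter (fun a -> Printf.printf "missed function start: 0x%Lx\n" a) missed;
  (* A real test would fail here when [missed] is not empty. *)
  Printf.printf "%d false negative(s)\n" (Addrs.cardinal missed)
```

Extra addresses found by byteweight (false positives) do not affect the result, which matches the stated goal of checking only for zero false negatives.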

CI script misses `baptop` errors

Travis system didn't fire an error, when baptop broke, here is the excerpt from the "green" build:

+./test.sh coreutils
Package requires itself: bap
File "bil.ml", line 1, characters 0-12:
Error: Unbound module Bap

bap executables depend on opam

Since we're looking for plugins using a compiled-in reference to the opam folder, bap executables can't be shipped as is. We need to figure out the best way to ship bap executables as a bundle, with the plugin system configured to search in a plugin folder.

warnings in OCaml 4.02

Build system produces a few nasty warnings in OCaml 4.02

  1. preconf.ml uses String.create, and OCaml 4.02 asks us to use Bytes, which we can't use since we're
    bootstrapping the environment. Maybe using String.make will help. Otherwise we can move
    to Bigarrays.
  2. Some nasty warning about unused tags in a tags file. Can be easily fixed with a few keystrokes
    added to myocamlbuild.ml.in.
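
For the first item, the two alternatives to String.create look roughly as follows. This is a generic standard-library sketch, not code from preconf.ml, and the blank and zeroed names are hypothetical. String.make is available in old and new compilers alike, while the Bytes module only exists from OCaml 4.02 on, which is the bootstrapping concern mentioned above.

```ocaml
(* String.make allocates an already initialized string and triggers no
   deprecation warning. *)
let blank (n : int) : string = String.make n ' '

(* Bytes.create returns a mutable buffer with unspecified contents, so it
   is filled before use. Requires OCaml >= 4.02. *)
let zeroed (n : int) : Bytes.t =
  let b = Bytes.create n in
  Bytes.fill b 0 n '\000';
  b

let () =
  Printf.printf "%S %d\n" (blank 3) (Bytes.length (zeroed 8))
```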

BAP versioning number inconsistency

The BAP oasis file currently says version 0.2. The previous BAP (before moving to github) went up through 0.8. Therefore, it does not seem to make sense to call this 0.2.

The original idea was to call this series BAP 1.x, as it is not backward compatible. There is a serious con here: 1.x might imply stability as opposed to newness.

We need a new name. I'm open to better solutions than 1.x. I would prefer 0.9 myself, and just say it's completely backward incompatible.

`Disasm.Basic.insn_of_mem` should return consumed memory

At present, the first coordinate of the returned value of Disasm.Basic.insn_of_mem is the input memory. The doc says it returns the consumed memory (and this is indeed what a user would wish to know).

Example baptop inputs:

open Core_kernel.Std;;
open Or_error;;
open Bap.Std;;
let disassembler = Disasm.Basic.create ~backend:"llvm" "x86_64";;
let bigstr = Bigstring.of_string @@ String.init ~f:(fun _ -> '\x90') 100;;
let base = Addr.of_int64 0L;;
let mem = Memory.create LittleEndian base bigstr;;
let memok = ok_exn mem;;
let x = disassembler >>= fun d -> Disasm.Basic.insn_of_mem d memok;;
let a, b, c = ok_exn x;;
let memory_size mem = Memory.to_buffer mem |> Bigsubstring.length;;
let _ = printf "%d bytes to encode %S\n"
    (memory_size a)
    (Disasm.Basic.Insn.asm @@ uw b);;

Output:

100 bytes to encode "\tnop"
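For contrast, the documented contract can be illustrated with a toy decoder that returns only the bytes it consumed. This is not BAP's API: `toy_insn_of_mem` is a hypothetical stand-in that models memory as a string and recognises only the one-byte x86 NOP (0x90).

```ocaml
(* Toy stand-in for a disassembler: memory is a plain string and the only
   known instruction is the one-byte x86 NOP (0x90). *)
let toy_insn_of_mem (mem : string) =
  if String.length mem > 0 && mem.[0] = '\x90'
  then Some (String.sub mem 0 1, "nop") (* consumed slice, mnemonic *)
  else None

let () =
  let mem = String.make 100 '\x90' in
  match toy_insn_of_mem mem with
  | Some (consumed, asm) ->
    Printf.printf "%d bytes to encode %S\n" (String.length consumed) asm
  | None -> print_endline "decoding failed"
```

With the same 100-byte input, the first component here is one byte long, which is what a caller needs in order to advance to the next instruction.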

link_exn from Image.section_of_symbol

To reproduce on Trusty with 76c7cf2:

$ cat a.c
#include<stdio.h>
int main() {
  printf("hello world\n");
}
$ gcc -g a.c
$ baptop

and then:

# let open Core_kernel.Std in let open Or_error in let open Bap.Std in
Image.create "a.out" >>= fun (image, errs) ->
  printf "%d\n" @@ List.length errs;
  return @@ Table.iteri (Image.symbols image) ~f:(fun m s ->
    Image.section_of_symbol image s |> Fn.ignore);;
0
Exception:
("link_exn: unbound value "
 ((name main) (is_function true) (is_debug true) (locations (((addr ((z 0x40052D) (w 64) (signed false))) (len 16)) ())))).

bap-mc should be moved into readbin

It can and should reuse most of readbin's facilities, such as pretty printing and optimizations. I think that it can even reuse its command-line interface.

BAP Python API

In order to make BAP usable to the majority of people doing binary analysis today, it needs to have a functional, stable, and documented Python API.

README is outdated

BAP currently has many more requirements than the README lists. It would also be a good idea to advertise the Python bindings.

incorrect jump decoding in ARM

In the following code, emitted for 00 88 bd e8 (pop {r11, pc}), the last statement should come before the jump:

  begin(emit_mandatory_arg_note_0x28) {
    orig_base_2605 <- SP
    R11 <- mem[orig_base_2605 + 0x0:32, el]:u32
    jmp mem[orig_base_2605 + 0x4:32, el]:u32
    SP <- SP + 0x8:32
  }
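Why the ordering matters can be shown with a toy interpreter. This is a sketch under the assumption that a jump ends execution of the block, so anything placed after it is never reached. The `stmt` type and `run` function are hypothetical and much simpler than BAP's BIL.

```ocaml
(* Toy statement language, not BAP's BIL: just enough to show the issue. *)
type stmt =
  | Add_sp of int (* SP <- SP + n *)
  | Jmp           (* transfer control; assumed to end the block *)

(* Execute statements in order and return the final SP.
   Statements after a Jmp are never executed. *)
let rec run sp = function
  | [] -> sp
  | Jmp :: _ -> sp
  | Add_sp n :: rest -> run (sp + n) rest

let () =
  let emitted = [Jmp; Add_sp 8] in  (* order shown in the report *)
  let expected = [Add_sp 8; Jmp] in (* SP update before the jump *)
  Printf.printf "emitted: SP=0x%x, expected: SP=0x%x\n"
    (run 0x1000 emitted) (run 0x1000 expected)
```

In the emitted order the stack pointer keeps its old value, so the two popped words are never released.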

APT Dependencies Incomplete for Ubuntu 14.04

I started with the official Ubuntu Trusty 64 Vagrant box and followed the instructions in the BAP README. I found that the list of system package dependencies was incomplete.

Installing OPAM from the Ubuntu package manager did not install ocaml-native-compilers. The BAP installation failed while building the ocamlfind dependency and I had to revert OPAM and begin the installation again. Additionally I needed the m4 package for a different BAP dependency. Unfortunately I didn't write down which one in my notes, but I had to revert OPAM and begin the BAP installation again.

The last error I had before I gave up:

===== ERROR while installing ssl.0.4.7 =====
# opam-version 1.1.1
# os           linux
# command      ./configure --prefix /home/vagrant/.opam/system
# path         /home/vagrant/.opam/system/build/ssl.0.4.7
# compiler     system (4.01.0)
# exit-code    1
# env-file     /home/vagrant/.opam/system/build/ssl.0.4.7/ssl-21466-427793.env
# stdout-file  /home/vagrant/.opam/system/build/ssl.0.4.7/ssl-21466-427793.out
# stderr-file  /home/vagrant/.opam/system/build/ssl.0.4.7/ssl-21466-427793.err
### stdout ###
# ...[truncated]
# checking for ocamldep... /usr/bin/ocamldep
# checking for ocamllex... /usr/bin/ocamllex
# checking for ocamlyacc... /usr/bin/ocamlyacc
# checking for ocamldoc... /usr/bin/ocamldoc
# checking for ocamlmktop... /usr/bin/ocamlmktop
# checking for gcc... (cached) gcc
# checking whether we are using the GNU C compiler... (cached) yes
# checking whether gcc accepts -g... (cached) yes
# checking for gcc option to accept ISO C89... (cached) none needed
# checking for SSL_new in -lssl... no
### stderr ###

One last note: the README should recommend that users run opam update before the installation. The installation instructions don't say to do that, and it took me a while to figure out why BAP was an "invalid package name." The corresponding apt-get message is Unable to locate package.

reimplement elf parser

The current parser uses the bitstring library, which requires us to convert the whole binary to a string and load it into memory.
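As a small illustration of the alternative, the sketch below inspects a file by reading only the bytes it needs through a channel, here the four-byte ELF magic. It uses only the OCaml standard library and is not a proposal for the parser's actual design. `read_prefix` and `is_elf` are hypothetical helpers.

```ocaml
(* Read at most the first [n] bytes of a file without loading the rest. *)
let read_prefix path n =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () ->
      let len = min n (in_channel_length ic) in
      really_input_string ic len)

(* An ELF file starts with the bytes 0x7f 'E' 'L' 'F'. *)
let is_elf path = read_prefix path 4 = "\x7fELF"

let () =
  if Array.length Sys.argv > 1 then
    Printf.printf "%s: %s\n" Sys.argv.(1)
      (if is_elf Sys.argv.(1) then "ELF" else "not ELF")
```

A full parser would continue the same way, seeking to each header or section and reading only its bytes.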

Install instructions don't work on mac.

Here is the output on my mac:
$ ./configure --prefix=$(opam config var prefix)
Exception:
Failure
"Unable to load environment, the file '/Users/dbrumley/git/GitHub/binaryanalysisplatform/bap/setup.data' doesn't exist.".
$ opam config var prefix
/Users/dbrumley/.opam/401bap

As an FYI:
$ oasis setup
W: Cannot find source file matching module 'Stmt_piqi' in library serialization
W: Cannot find source file matching module 'Stmt_piqi_ext' in library serialization

invalid argument on certain arm binaries

The ARM lifter fails while lifting certain instructions. The place of failure is always the same:
bap_disasm_arm_mem_shift.ml:49:23: got register instead of imm, where a register appears in the place of the offset.
