
udis86's People

Contributors

andersk, bjoernd, brendanlong, doomhammer, ebfe, falconkirtaran, felipensp, ghghost, jamieiles, justinstenning, mbarbu, radare, sbasalaev, serval2412, sgraf812, stephenfewer, turboencabulator, vegard, vmt

udis86's Issues

Warnings in syn.c

There are some warnings in syn.c that could be fixed with a trivial patch.

CC [M]  syn.o
syn.c: In function ‘ud_syn_print_addr’:
syn.c:140:9: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘int64_t’ [-Wformat]
syn.c:147:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat]
syn.c: In function ‘ud_syn_print_imm’:
syn.c:174:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat]
syn.c: In function ‘ud_syn_print_mem_disp’:
syn.c:192:5: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat]
syn.c:203:7: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘int64_t’ [-Wformat]
syn.c:205:7: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘int64_t’ [-Wformat]

Get udis86 into Ubuntu

Hello, maintainer of the Ruby FFI bindings for udis86 here. I want to automate the testing of my bindings on Travis CI, which uses git hooks to run tests in an Ubuntu 12.04 VM. Unfortunately, Ubuntu does not appear to have any udis86 packages. We should try to get udis86 into the Ubuntu package repository.

Tag new releases

Hello!
I'd like to include your program in MacPorts. It would be much easier for me to maintain the port if you could tag new releases. Thanks for your work!

missing sign in udis86 operand value

Hello!
From the udis86 disassembler I need to get information such as

ulong adrconst; // Constant part of address

But I can't find a similar variable in udis86, so I am trying to work it out by analyzing the fields of the ud object structure.

00000000: 8945EC mov [di][-014],ax

When I try this instruction in the OllyDbg disassembler I get
adrconst (constant part of address) = 0xffffffec

But in the udis86 object structure I see:
mnemonic = UD_Imov
operand type = UD_OP_MEM
operand value (sdword) = 0x000000ec

So why do I get 0xEC instead of 0xFFFFFFEC? How can I determine that this instruction uses a negative value (-014)?

I hope that I explained the problem clearly. I will be grateful for any answers!

API changed between 1.7 and 1.7.1

The layouts of ud_operand and ud have changed.

Are those part of the public API? Seems like they shouldn't change in minor point releases?

Is there a way to get the version of ud so a client library could determine which layout to use when using it from another language via an FFI bridge approach?

outsq shouldn't exist

The Intel documentation says explicitly that the 'outs' instruction is not affected by REX.W, so it never has a 64-bit operand.

ud_insn_mnemonic() function

The documentation points to a nonexistent function called ud_insn_mnemonic(). Has it been removed, or did it never exist at all?

FSTSW/FNSTSW store x87 FPU instruction bad decode

udis86 doesn't properly decode most, if not all, 32-bit "fstsw" and "fstcw" instructions.
Example:
When I decode "fstsw ax" (bytes 9B DF E0), the decoder says it is just one byte in size, and ud_insn_asm() returns the text "wait".

Oddly, this must be a particularly tricky thing to decode, because I have not seen one disassembler get it right. For example, the same problem exists in BeaEngine (where I also submitted this issue several months ago, with no response yet).

P.S. udis86 (despite this issue) seems to be one of the, if not the most, complete around.

SIMD instructions should include operand size/type

Currently, the SIMD instructions in optable.xml just use V and W (typically) to describe their arguments. While this is sufficient for basic disassembly, (1) it is different from the integer instructions, which do include the argument size, and (2) it does not support enhanced disassembly such as a disassembler that prints out the referenced value for instructions that reference constant values in memory. I advocate upgrading these to Vss, Vsd, Vps, Vpd, etc., as in the Intel documentation, with those getting mapped appropriately by scripts/ud_itab.py, e.g., Vss mapped to OP_V + SZ_D.

Documentation for optable.xml

I've written basic documentation for the most important elements in optable.xml (pfx, opc, opr). I think this should go at the head of the optable.xml file.

Autoconf issues on ArchLinux

Trying to build udis86 on recent ArchLinux fails during the autoconf phase:

$> ./autogen.sh
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force -I m4
autoreconf: configure.ac: tracing
autoreconf: running: libtoolize --copy --force
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `build'.
libtoolize: copying file `build/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
autoreconf: running: /usr/bin/autoconf --force
autoreconf: running: /usr/bin/autoheader --force
autoreconf: running: automake --add-missing --copy --force-missing
automake: warnings are treated as errors
/usr/share/automake-1.13/am/ltlibrary.am: warning: 'libudis86.la': linking libtool libraries using a non-POSIX
/usr/share/automake-1.13/am/ltlibrary.am: archiver requires 'AM_PROG_AR' in 'configure.ac'
libudis86/Makefile.am:8: while processing Libtool library 'libudis86.la'
tests/Makefile.am:58: warning: call oprtest_generate,64: non-POSIX variable name
tests/Makefile.am:58: (probably a GNU make extension)
tests/Makefile.am:59: warning: call oprtest_generate,32: non-POSIX variable name
tests/Makefile.am:59: (probably a GNU make extension)
tests/Makefile.am:60: warning: call oprtest_generate,16: non-POSIX variable name
tests/Makefile.am:60: (probably a GNU make extension)
tests/Makefile.am:117: warning: call diff_test_asm,"diff": non-POSIX variable name
tests/Makefile.am:117: (probably a GNU make extension)
tests/Makefile.am:122: warning: call diff_test_asm,"refup": non-POSIX variable name
tests/Makefile.am:122: (probably a GNU make extension)
autoreconf: automake failed with exit status: 1
autogen: autoreconf -i failed.

$> autoconf --version
autoconf (GNU Autoconf) 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+/Autoconf: GNU GPL version 3 or later
http://gnu.org/licenses/gpl.html, http://gnu.org/licenses/exceptions.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by David J. MacKenzie and Akim Demaille.

$> automake --version
automake (GNU automake) 1.13.1
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv2+: GNU GPL version 2 or later http://gnu.org/licenses/gpl-2.0.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Tom Tromey [email protected]
and Alexandre Duret-Lutz [email protected].

This behavior seems to stem from a change that came in automake 1.12 according to http://lists.gnu.org/archive/html/bug-automake/2012-05/msg00009.html

x86 VMX instruction set is not recognized

Hello,
The x86 VMX instruction set is not recognized by udis86:

[XX] disasm: x86: [VMCALL]
[XX] disasm: x86: [VMLAUNCH]
[XX] disasm: x86: [VMRESUME]
[XX] disasm: x86: [VMXOFF]

From the radare2-regressions

64 bit decoding error

Hi,

I think I have found a bug with either the library or the udcli tool. For testing purposes I assembled the following program (nasm syntax):

mov eax, ebx
mov [ecx], eax
push qword [r8]
sfence
iretq

I created an object file using nasm and converted the object file code into binary using objcopy. When I disassemble the object file using objdump I see the following (correct) output:

objdump -d test.o
0: 89 d8 mov %ebx,%eax
2: 67 89 01 mov %eax,(%ecx)
5: 41 ff 30 pushq (%r8)
8: 0f ae f8 sfence
b: 48 cf iretq

When I use udcli with the binary file I get the following (incorrect) output:

./udcli < test.bin
0000000000000000 89d8 mov eax, ebx
0000000000000002 678901 mov [bx+di], eax
0000000000000005 41 inc ecx
0000000000000006 ff30 push dword [eax]
0000000000000008 0faef8 sfence
000000000000000b 48 dec eax
000000000000000c cf iretd

Segment prefix not printed by udcli for cmpsd, cmpsw, cmpsb instructions

The last three instructions are missing their "gs" segment prefix in the printout:

000000000000196c a7 cmpsd
000000000000196d 66a7 cmpsw
000000000000196f a6 cmpsb
0000000000001970 65a7 cmpsd
0000000000001972 6566a7 cmpsw
0000000000001975 65a6 cmpsb

By way of comparison, here are the corresponding versions of lodsd, lodsw, lodsb:

0000000000001977 ad lodsd
0000000000001978 66ad lodsw
000000000000197a ac lodsb
000000000000197b 65ad gs lodsd
000000000000197d 6566ad gs lodsw
0000000000001980 65ac gs lodsb

x64 mode 066h & REX.W bug?

Hello,

udis86 v1.7.2 in 64-bit mode disassembles the sequence "66 48 68 01 23 45 67" as a 5-byte "push 2301h" instruction.
It should be a 7-byte "push 067452301h" (the REX.W bit must supersede the 066h data-size override prefix).

udis86 actually loses REX.W in resolve_mode() because of invalid flags in u->itab_entry->prefix for the 068h opcode, and falls back to an invalid 16-bit disassembly.
Quick brute-forcing also shows the same issue for the 0E8h and 0E9h opcodes (I checked only 1-byte opcodes, and only the combination of 066h and REX.W).

Wrong decoding of memory operand sizes

Given the example x64 instruction "83 3d 29 d4 20 00 00" I get the following result in Intel syntax:

cmp dword ptr [rip+0x20d429], 0x00

While it IS correct that the displacement argument inside the instruction is 32 bits wide, this decoding doesn't reflect the fact that the memory access is at byte level, and thus

cmp byte ptr [rip+0x20d429], 0x00

is the correct decoding of this instruction. The word ptr and dword ptr versions would use primary opcode 0x81, with and without the operand size prefix.

TL;DR: The operand size designation byte/word/dword should reflect the size of accessed memory, not the size of the used address offset.

/rm= not needed for lfence, mfence, prefetch, sfence

These instructions in optable.xml have each of the 8 /rm= cases broken out separately, but this is unnecessary, since the definitions in all cases are identical. They should be collapsed back to a single definition with no /rm=.

Decode offset of operand inside instruction

It would be nice if there were an easy way to figure out where the lvalue information of an operand physically came from. For an x64 instruction like "83 3d 29 d4 20 00 00", it would be nice to get an additional field "insn_offset" or similar in each operand structure, containing the offset of each operand (2 and 6 in this case). Currently I haven't found a straightforward way to get this information without redoing all the parsing work or guessing the offsets by pattern-matching the lvalues against the byte stream.

ud_opcode.py does not parse some instructions correctly

If an instruction has a cpuid child node that comes after a def child node, then the parsed insnDef will not have the correct cpuid. This broke the avx form of movlps.

The solution is to make sure that def child nodes always come after the cpuid and vendor child nodes, or we can just iterate over the instruction child nodes twice. The latter solution:

diff --git a/scripts/ud_opcode.py b/scripts/ud_opcode.py
index d858b0e..cc20358 100644
--- a/scripts/ud_opcode.py
+++ b/scripts/ud_opcode.py
@@ -591,7 +591,8 @@ class UdOpcodeTables(object):
                     vendor = node.firstChild.data.split()
                 elif node.localName == 'cpuid':
                     cpuid = node.firstChild.data.split()
-                elif node.localName == 'def':
+            for node in insnNode.childNodes:
+                if node.localName == 'def':
                     insnDef = { 'pfx' : [] }
                     for node in node.childNodes:
                         if not node.localName:

udis86 does not recognize rdrand

rdrand isn't correctly decoded:

  echo 48 0f c7 f0 | ~/src/udis86/udcli/udcli -64 -x
  0000000000000000 480fc7f0         invalid

and this comes up in the kernel:

  # define RDRAND_LONG  ".byte 0x48,0x0f,0xc7,0xf0"

But I can't find a way to describe this in the xml that doesn't cause overwriting of existing opcodes, it looks like it clashes with the cmpxchg opcodes.

repe prefix prints out as rep in udcli

00000000000019b0 f3a7 rep cmpsd
00000000000019b2 f366a7 rep cmpsw
00000000000019b5 f3a6 rep cmpsb

0000000000001a08 f3af rep scasd
0000000000001a0a f366af rep scasw
0000000000001a0d f3ae rep scasb

These should all have "repe" instead of "rep".
