vmt / udis86

Disassembler Library for x86 and x86-64

Home Page: http://udis86.sourceforge.net

License: BSD 2-Clause "Simplified" License

Shell 0.64% Python 37.38% CSS 0.90% XSLT 1.22% C 53.78% Assembly 6.08%

udis86's Issues

Unnecessary type == OP_F check in decode.c:decode_operand

The OP_F case reads:

  if (type == OP_F) {
    u->br_far  = 1;
  }

But this code can only be executed if type == OP_F, so the check should be removed.

Warnings in syn.c

There are some warnings in syn.c; they could be fixed with a trivial patch.

CC [M]  syn.o
syn.c: In function ‘ud_syn_print_addr’:
syn.c:140:9: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘int64_t’ [-Wformat]
syn.c:147:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat]
syn.c: In function ‘ud_syn_print_imm’:
syn.c:174:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat]
syn.c: In function ‘ud_syn_print_mem_disp’:
syn.c:192:5: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat]
syn.c:203:7: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘int64_t’ [-Wformat]
syn.c:205:7: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘int64_t’ [-Wformat]
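
One common way to silence this class of warning is to use the format macros from <inttypes.h> rather than hard-coding "%lx" or "%ld". The sketch below is only an illustration: format_hex64 and format_dec64 are hypothetical helpers, not functions from syn.c, and it assumes a hosted C99 environment.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of udis86): print a uint64_t as hex.
 * PRIx64 expands to the conversion that matches uint64_t on the target,
 * so the same source is warning-free on 32-bit and 64-bit builds. */
int format_hex64(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "0x%" PRIx64, value);
}

/* Hypothetical helper: signed counterpart, using PRId64 instead of "%ld". */
int format_dec64(char *buf, size_t len, int64_t value)
{
    return snprintf(buf, len, "%" PRId64, value);
}
```

Where <inttypes.h> is not available (the "CC [M]" line suggests a kernel-module build), casting the argument to unsigned long long and printing with "%llx" is the usual alternative.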

udis86 cannot into SSE

https://www.dropbox.com/s/aolwqzmqo7ffw7p/test
https://www.dropbox.com/s/yxi4lhjlhwfai2n/test.c

Mis-spelled mnemonics in optable.xml: fldlpi, fpxtract, fncstp

These should be fldpi, fxtract, and fincstp respectively.

optable.xml is missing popcnt (f3 0f b8) and maskmovdqu (66 0f f7)

With these additions (and the ones reported in issues #22 and #34), optable.xml is complete (pre-AVX), checked against the opcode tables in the Intel reference documentation.

Missing instructions: pextrw (66 0f 3a 15), cmpxchg16b, all the AES instructions

These need to be added.

Hello, maintainer of the Ruby FFI bindings for udis86 here. I want to automate the testing of my bindings on Travis CI, which uses git hooks to run tests in an Ubuntu 12.04 VM. Unfortunately, Ubuntu does not appear to have any udis86 packages. We should try to get udis86 into the Ubuntu package repository.

Tag new releases

Hello!
I'd like to include your program in MacPorts. It would be much easier for me to maintain the port if you could tag new releases. Thanks for your work!

Using udis86 inside a kernel module

I'm having some problems using this library because of https://github.com/vmt/udis86/blob/master/libudis86/types.h#L48, which isn't found when compiling the library for use inside a Linux kernel module.

I found that just commenting out this line makes everything compile and work fine.
Any idea whether that's the proper fix for my problem?
Or is my problem caused by something other than this line?

movddup, movsldup, movshdup separate /mod=11 and /mod=!11 unnecessarily

These instructions have separate /mod=11 and /mod=!11 definitions, but the definitions are identical. I think the definitions should be merged and the /mod= removed.

missing sign in udis86 operand value

Hello!
From udis86 disassembler I need to get such information as

ulong adrconst; // Constant part of address

But in udis86 I can't find a similar variable, so I tried to work it out by analyzing the fields of the udis86 object structure.

00000000: 8945EC mov [di][-014],ax

When I try this instruction in the OllyDbg disassembler I get
adrconst (constant part of address) = 0xffffffec

But in udis86 object structure I see:
mnemonic = UD_Imov
operand type = UD_OP_MEM
operand value (sdword) = 0x000000ec

So why do I get 0xEC instead of 0xFFFFFFEC? How can I determine that this instruction uses a negative value (-014)?

I hope that I explained the problem clearly. I will be grateful for any answers!
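
The byte EC in 8945EC is an 8-bit displacement, so the value only becomes 0xFFFFFFEC once it is sign-extended from 8 bits. As far as I can tell from the headers, the operand's offset field gives the displacement width in bits and lval.sbyte gives the signed 8-bit view, but treat that as an assumption to check against your version. The sketch below is standalone and does not call udis86; sign_extend is a hypothetical helper.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits of `raw`.
 * `bits` is the displacement width in bits (8, 16, 32 or 64). */
int64_t sign_extend(uint64_t raw, unsigned bits)
{
    uint64_t sign = (uint64_t)1 << (bits - 1); /* position of the sign bit */
    uint64_t mask = (sign << 1) - 1;           /* keeps only the low `bits` bits */

    raw &= mask;
    /* Flipping the sign bit and subtracting it back propagates the sign
     * into the upper bits without relying on signed shifts. */
    return (int64_t)((raw ^ sign) - sign);
}
```

With this, sign_extend(0xEC, 8) is -20 (that is, -0x14), and truncating the result to 32 bits gives the 0xFFFFFFEC that OllyDbg reports.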

fclex/fnclex

$ rasm2 -a x86 -d dbe2
fclex
$ rasm2 -a x86 -d 9bdbe2
wait
fclex

According to http://www.rz.uni-karlsruhe.de/rz/docs/VTune/reference/vc87.htm, both disassemblies are wrong: dbe2 should be fnclex, and 9bdbe2 should be fclex.

Documentation for OP_* and SZ_* is needed

I've written documentation for these values; I think it should go in decode.h interleaved with the actual definitions of the enumerated types.

API changed between 1.7 and 1.7.1

The layouts of ud_operand and ud have changed.

Are those part of the public API? Seems like they shouldn't change in minor point releases?

Is there a way to get the version of ud so a client library could determine which layout to use when using it from another language via an FFI bridge approach?

outsq shouldn't exist

The Intel documentation says explicitly that the 'outs' instruction is not affected by REX.W, so it never has a 64-bit operand.

ud_insn_mnemonic() function

The documentation points to a nonexistent function called ud_insn_mnemonic(). Has it been removed, or did it never exist at all?

Add AVX instructions.

Add support for the AVX class of instructions.

pextrd and pinsrd have /o=16 definitions

I don't think these instructions can take 16-bit operands, so these definitions should be removed.

OP_R case in decode.c:decode_operand doesn't check MODRM_MOD == 3

OP_R requires that the modrm byte designate a register (like OP_N and OP_U), but the code to check this is missing.
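
For reference, the check being asked for amounts to testing the top two bits of the ModRM byte. This is a standalone sketch with hypothetical helper names; it is not the MODRM_MOD macro from decode.c, although that macro presumably computes the same field.

```c
#include <stdint.h>

/* ModRM layout: mod = bits 7-6, reg = bits 5-3, rm = bits 2-0. */
unsigned modrm_mod(uint8_t b) { return (b >> 6) & 3; }
unsigned modrm_reg(uint8_t b) { return (b >> 3) & 7; }
unsigned modrm_rm(uint8_t b)  { return b & 7; }

/* Returns 1 when the ModRM byte designates a register operand (mod == 3),
 * which is what an OP_R-style operand would require before decoding. */
int modrm_is_register(uint8_t modrm)
{
    return modrm_mod(modrm) == 3;
}
```

For example, d8 (from "89 d8", mov eax, ebx) has mod == 3, while 45 (from "89 45 ec") has mod == 1 and refers to memory.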

Mystery instruction in optable.xml: mnemonic db, no definition

I think this should be removed.

FSTSW/FNSTSW store x87 FPU instruction bad decode

udis86 doesn't properly decode most, if not all, 32-bit "fstsw" and "fstcw" instructions.
Example:
When I decode "fstsw ax" (bytes 9B DF E0), the decoder says it is just one byte long and ud_insn_asm() returns the text "wait".

Oddly, this must be a particularly tricky thing to decode, because I have not seen a single disassembler get it right. For example, the same problem exists in BeaEngine (where I also submitted this issue several months ago, with no response yet).

P.S. udis86 (despite this issue) seems to be one of the most complete disassemblers around, if not the most complete.

Is it possible to get instruction dependency graph?

Just curious.

eg:

(a) mov ecx, 4
(b) mov ebx, 5
(c) shr ebx, cl // depends on (a) and (b)

Conditional jumps would obviously complicate this.

SIMD instructions should include operand size/type

Currently, the SIMD instructions in optable.xml just use V and W (typically) to describe their arguments. While this is sufficient for basic disassembly, (1) it is different from the integer instructions, which do include the argument size, and (2) it does not support enhanced disassembly such as a disassembler that prints out the referenced value for instructions that reference constant values in memory. I advocate upgrading these to Vss, Vsd, Vps, Vpd, etc., as in the Intel documentation, with those getting mapped appropriately by scripts/ud_itab.py, e.g., Vss mapped to OP_V + SZ_D.

Documentation for optable.xml

I've written basic documentation for the most important elements in optable.xml (pfx, opc, opr). I think this should go at the head of the optable.xml file.

How to compile libudis86.so.0 ?

Running configure, make, and make install left only a .a behind, but I need a .so.

REPE instructions incorrectly disassembled

Originally reported here: radareorg/radare2#97

According to the Intel® 64 and IA-32 Architectures Software Developer’s Manual,

"rep cmpsd" should be "repe cmpsd"
"rep scasd" should be "repe scasd"

example:

% rasm2 -d F3A7
rep cmpsd
% rasm2 -d F3AF
rep scasd

ATT syntax rendering, REPE is not rendered

Bug is here:
https://github.com/vmt/udis86/blob/master/libudis86/syn-att.c#L160

  } else if (u->pfx_rep) { // <--------------- should be u->pfx_repe
    ud_asmprintf(u, "repe ");
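
Independent of which flag the printer tests, the naming rule from the Intel manual is that F3 reads as "repe" only in front of the string instructions that test ZF (CMPS and SCAS), and as plain "rep" in front of the others. A minimal sketch of that rule, with a hypothetical function name and no dependency on udis86:

```c
#include <stdint.h>

/* Hypothetical helper: name of an F3 prefix for a one-byte string opcode.
 * CMPS (A6/A7) and SCAS (AE/AF) compare and set ZF, so F3 means "repe";
 * MOVS, STOS, LODS, INS and OUTS take a plain "rep". */
const char *f3_prefix_name(uint8_t opcode)
{
    switch (opcode) {
    case 0xA6: case 0xA7: /* cmpsb, cmpsw/cmpsd/cmpsq */
    case 0xAE: case 0xAF: /* scasb, scasw/scasd/scasq */
        return "repe";
    default:
        return "rep";
    }
}
```

So F3 A7 would print as "repe cmpsd" while F3 A5 stays "rep movsd".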

Autoconf issues on ArchLinux

Trying to build udis86 on recent ArchLinux fails during the autoconf phase:

$> ./autogen.sh
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force -I m4
autoreconf: configure.ac: tracing
autoreconf: running: libtoolize --copy --force
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `build'.
libtoolize: copying file `build/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
autoreconf: running: /usr/bin/autoconf --force
autoreconf: running: /usr/bin/autoheader --force
autoreconf: running: automake --add-missing --copy --force-missing
automake: warnings are treated as errors
/usr/share/automake-1.13/am/ltlibrary.am: warning: 'libudis86.la': linking libtool libraries using a non-POSIX
/usr/share/automake-1.13/am/ltlibrary.am: archiver requires 'AM_PROG_AR' in 'configure.ac'
libudis86/Makefile.am:8: while processing Libtool library 'libudis86.la'
tests/Makefile.am:58: warning: call oprtest_generate,64: non-POSIX variable name
tests/Makefile.am:58: (probably a GNU make extension)
tests/Makefile.am:59: warning: call oprtest_generate,32: non-POSIX variable name
tests/Makefile.am:59: (probably a GNU make extension)
tests/Makefile.am:60: warning: call oprtest_generate,16: non-POSIX variable name
tests/Makefile.am:60: (probably a GNU make extension)
tests/Makefile.am:117: warning: call diff_test_asm,"diff": non-POSIX variable name
tests/Makefile.am:117: (probably a GNU make extension)
tests/Makefile.am:122: warning: call diff_test_asm,"refup": non-POSIX variable name
tests/Makefile.am:122: (probably a GNU make extension)
autoreconf: automake failed with exit status: 1
autogen: autoreconf -i failed.

$> autoconf --version
autoconf (GNU Autoconf) 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+/Autoconf: GNU GPL version 3 or later
http://gnu.org/licenses/gpl.html, http://gnu.org/licenses/exceptions.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by David J. MacKenzie and Akim Demaille.

$> automake --version
automake (GNU automake) 1.13.1
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv2+: GNU GPL version 2 or later http://gnu.org/licenses/gpl-2.0.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Tom Tromey
and Alexandre Duret-Lutz.

This behavior seems to stem from a change that came in automake 1.12 according to http://lists.gnu.org/archive/html/bug-automake/2012-05/msg00009.html

x86 VMX instruction set is not recognized

Hello,
x86 VMX instruction set is not recognized by udis86

[XX] disasm: x86: [VMCALL]
[XX] disasm: x86: [VMLAUNCH]
[XX] disasm: x86: [VMRESUME]
[XX] disasm: x86: [VMXOFF]

From the radare2-regressions

64 bit decoding error

Hi,

I think I have found a bug with either the library or the udcli tool. For testing purposes I assembled the following program (nasm syntax):

mov eax, ebx
mov [ecx], eax
push qword [r8]
sfence
iretq

I created an object file using nasm and converted the object file code into binary using objcopy. When I disassemble the object file using objdump I see the following (correct) output:

objdump -d test.o
0: 89 d8 mov %ebx,%eax
2: 67 89 01 mov %eax,(%ecx)
5: 41 ff 30 pushq (%r8)
8: 0f ae f8 sfence
b: 48 cf iretq

When I use udcli with the binary file I get the following (incorrect) output:

./udcli < test.bin
0000000000000000 89d8 mov eax, ebx
0000000000000002 678901 mov [bx+di], eax
0000000000000005 41 inc ecx
0000000000000006 ff30 push dword [eax]
0000000000000008 0faef8 sfence
000000000000000b 48 dec eax
000000000000000c cf iretd
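
The udcli output above looks like what 32-bit decoding would produce: bytes 40 through 4F are inc/dec in 16- and 32-bit mode and only become REX prefixes in 64-bit mode. udcli has a -64 switch (it is used in another report on this page), so retrying with that switch may be all that is needed. The sketch below only illustrates the prefix rule; the function names are hypothetical and nothing here calls udis86.

```c
#include <stdint.h>

/* Returns 1 if `byte` acts as a REX prefix in the given decode mode.
 * `mode_bits` is 16, 32 or 64; REX exists only in 64-bit mode. */
int is_rex_prefix(uint8_t byte, int mode_bits)
{
    return mode_bits == 64 && (byte & 0xF0) == 0x40;
}

/* REX.W (bit 3) selects a 64-bit operand size. */
int rex_w(uint8_t rex) { return (rex >> 3) & 1; }

/* REX.B (bit 0) extends the ModRM rm / base register, e.g. rax -> r8. */
int rex_b(uint8_t rex) { return rex & 1; }
```

In the listing above, 41 carries REX.B (turning the base register of ff 30 into r8) and 48 carries REX.W (turning cf into iretq), which matches the objdump output.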

Build requires running autogen.sh, but this isn't documented

Either README or INSTALL should say that ./autogen.sh is required before running the build steps currently documented at the beginning of INSTALL.

Segment prefix not printed by udcli for cmpsd, cmpsw, cmpsb instructions

The last three instructions are missing their "gs" segment prefix in the printout:

000000000000196c a7 cmpsd
000000000000196d 66a7 cmpsw
000000000000196f a6 cmpsb
0000000000001970 65a7 cmpsd
0000000000001972 6566a7 cmpsw
0000000000001975 65a6 cmpsb

By way of comparison, here are the corresponding versions of lodsd, lodsw, lodsb:

0000000000001977 ad lodsd
0000000000001978 66ad lodsw
000000000000197a ac lodsb
000000000000197b 65ad gs lodsd
000000000000197d 6566ad gs lodsw
0000000000001980 65ac gs lodsb

x64 mode 066h & REX.W bug?

Hello,

udis86 v1.7.2 in 64-bit mode disassembles the sequence "66 48 68 01 23 45 67" as a 5-byte "push 2301h" instruction.
But it should be a 7-byte "push 067452301h" (the REX.W bit must supersede the 066h data-size override prefix).

udis86 actually loses REX.W in resolve_mode() because of invalid flags in u->itab_entry->prefix for the 068h opcode, and falls back to invalid 16-bit decoding.
Quick brute-forcing shows the same issue for the 0E8h and 0E9h opcodes (I checked only 1-byte opcodes, and only the combination of 066h and REX.W).
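
The precedence being described can be written down in a few lines. This is a sketch of the architectural rule, not of resolve_mode(); all three function names are hypothetical, and the general-purpose default of 32 bits in 64-bit mode is used even though push itself defaults to 64 (the immediate width is 4 bytes either way).

```c
/* Effective operand size in bits.
 * mode_bits: 16, 32 or 64; has_66: 66 prefix present; rex_w: REX.W set. */
int effective_operand_size(int mode_bits, int has_66, int rex_w)
{
    if (mode_bits == 64) {
        if (rex_w)
            return 64;          /* REX.W overrides the 66 prefix */
        return has_66 ? 16 : 32;
    }
    if (mode_bits == 32)
        return has_66 ? 16 : 32;
    return has_66 ? 32 : 16;    /* 16-bit mode: 66 toggles to 32 */
}

/* Width in bytes of a "z"-sized immediate: 2 for 16-bit operands,
 * otherwise 4 (sign-extended when the operand size is 64). */
int imm_z_bytes(int operand_bits)
{
    return operand_bits == 16 ? 2 : 4;
}

/* Total length of a "push imm" (opcode 68) encoding in 64-bit mode,
 * counting the optional 66 and REX bytes plus the opcode byte. */
int push_imm_length64(int has_66, int has_rex, int rex_w)
{
    int bits = effective_operand_size(64, has_66, has_rex && rex_w);
    return (has_66 ? 1 : 0) + (has_rex ? 1 : 0) + 1 + imm_z_bytes(bits);
}
```

push_imm_length64(1, 1, 1) gives the expected 7 bytes, and dropping REX.W, as in push_imm_length64(1, 1, 0), gives the 5 bytes that udis86 currently reports.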

optable.xml: lmsw, smsw have identical /mod=11, /mod=!11 definitions

The definitions should be merged into a single one that doesn't mention /mod=.

Wrong decoding of memory operand sizes

Given the example x64 instruction "83 3d 29 d4 20 00 00" I get the following result in Intel syntax:

cmp dword ptr [rip+0x20d429], 0x00

While it IS correct that the argument in the displacement is 32 bit inside the instruction this decoding of the instruction doesn't reflect the fact that the memory access is on byte level and thus

cmp byte ptr [rip+0x20d429], 0x00

is the correct decoding of this instruction. The word ptr and dword ptr versions would use primary opcode 0x81, with and without an operand-size prefix.

TL;DR: The operand size designation byte/word/dword should reflect the size of accessed memory, not the size of the used address offset.

cmp (83 /reg=7) should sign-extend its immediate operand

I.e., its second operand should be sIb rather than Ib.

Ordering of instructions in optable.xml is random from SSSE3 on

Instructions through SSE3 are in alphabetical order by mnemonic within each group, but from SSSE3 on, they seem to be in random order. I'd like to put every group in alphabetical order. Comments?

optable.xml: <class> and <vendor> can be children of either <instruction> or <def>

Is this deliberate? If so, I would like to document it; if not, I would like to fix it. I don't have an opinion either way, but since I'm going to be adding instruction definitions, I want to do it consistently one way or the other.

can't disassemble MULSD, MULSS commands

66f3f20f59ff disassembling:

invalid

should be 'MULSD xmm7, xmm7'

66f2f30f59ff disassembling:

invalid

should be 'MULSS xmm7, xmm7'

f2660f59ff disassembling:

invalid

should be 'MULSD xmm7, xmm7'

source is here http://habrahabr.ru/company/intel/blog/200658/

can be easily reproduced also in radareorg/radare2#368
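
The three expected results above are consistent with the following selection rule: F2 and F3 take precedence over 66, and when both F2 and F3 are present the last one wins. I am assuming that rule from the examples and from how hardware is usually described as behaving; it is not taken from the udis86 source, and select_mandatory_prefix is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Pick the prefix that selects the SSE opcode variant.
 * Returns 0xF2, 0xF3, 0x66, or 0 when none of them is present.
 * Assumed rule: the last F2/F3 wins, and either beats 66. */
uint8_t select_mandatory_prefix(const uint8_t *prefixes, size_t n)
{
    uint8_t rep = 0;  /* last F2 or F3 seen, if any */
    int has_66 = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (prefixes[i] == 0xF2 || prefixes[i] == 0xF3)
            rep = prefixes[i];
        else if (prefixes[i] == 0x66)
            has_66 = 1;
    }
    if (rep)
        return rep;
    return has_66 ? 0x66 : 0;
}
```

Under that rule, 66 f3 f2 and f2 66 both select F2 (mulsd for opcode 0f 59), and 66 f2 f3 selects F3 (mulss).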

ud_opr_is_gpr named wrong in extern.h

There's a function called ud_opr_is_gpr in extern.h, but it's actually defined in udis86.c as ud_opr_isgpr.

/rm= not needed for lfence, mfence, prefetch, sfence

These instructions in optable.xml have each of the 8 /rm= cases broken out separately, but this is unnecessary, since the definitions in all cases are identical. They should be collapsed back to a single definition with no /rm=.

Decode offset of operand inside instruction

It would be nice if there were an easy way to figure out where the lvalue information of an operand physically came from. For an x64 instruction like "83 3d 29 d4 20 00 00", it would be nice to get an additional field "insn_offset" or similar in each operand structure containing the offset of that operand (2 and 6 in this case). So far I haven't found a straightforward way to get this information without redoing all the parsing work or guessing the offsets by pattern-matching the lvalues against the byte stream.

diff --git a/scripts/ud_opcode.py b/scripts/ud_opcode.py
index d858b0e..cc20358 100644
--- a/scripts/ud_opcode.py
+++ b/scripts/ud_opcode.py
@@ -591,7 +591,8 @@ class UdOpcodeTables(object):
                     vendor = node.firstChild.data.split()
                 elif node.localName == 'cpuid':
                     cpuid = node.firstChild.data.split()
-                elif node.localName == 'def':
+            for node in insnNode.childNodes:
+                if node.localName == 'def':
                     insnDef = { 'pfx' : [] }
                     for node in node.childNodes:
                         if not node.localName:

udis86 does not recognize rdrand

rdrand isn't correctly decoded:

  echo 48 0f c7 f0 | ~/src/udis86/udcli/udcli -64 -x
  0000000000000000 480fc7f0         invalid

and this comes up in the kernel:

  # define RDRAND_LONG  ".byte 0x48,0x0f,0xc7,0xf0"

But I can't find a way to describe this in the XML that doesn't overwrite existing opcodes; it looks like it clashes with the cmpxchg opcodes.

repe prefix prints out as rep in udcli

00000000000019b0 f3a7 rep cmpsd
00000000000019b2 f366a7 rep cmpsw
00000000000019b5 f3a6 rep cmpsb

0000000000001a08 f3af rep scasd
0000000000001a0a f366af rep scasw
0000000000001a0d f3ae rep scasb

These should all have "repe" instead of "rep".
