vmt / udis86
Disassembler Library for x86 and x86-64
Home Page: http://udis86.sourceforge.net
License: BSD 2-Clause "Simplified" License
The OP_F case reads:
if (type == OP_F) {
u->br_far = 1;
}
But this code can only be executed if type == OP_F, so the check should be removed.
There are some warnings in syn.c that could be fixed with a trivial patch.
CC [M] syn.o
syn.c: In function ‘ud_syn_print_addr’:
syn.c:140:9: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘int64_t’ [-Wformat]
syn.c:147:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat]
syn.c: In function ‘ud_syn_print_imm’:
syn.c:174:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat]
syn.c: In function ‘ud_syn_print_mem_disp’:
syn.c:192:5: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat]
syn.c:203:7: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘int64_t’ [-Wformat]
syn.c:205:7: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘int64_t’ [-Wformat]
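One common way to silence this class of warning is to use the fixed-width format macros from <inttypes.h> instead of hard-coding %ld or %lx. The sketch below is only an illustration of that technique; format_disp is a hypothetical helper, not a function from syn.c, and the real patch may look different.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not a udis86 function): writes a signed 64-bit
 * displacement into buf as a sign followed by hex digits, e.g. "-0x14".
 * Returns whatever snprintf returns. */
int format_disp(char *buf, size_t len, int64_t disp)
{
    assert(buf != NULL && len > 0);
    if (disp < 0) {
        /* Negate in unsigned arithmetic so INT64_MIN cannot overflow. */
        uint64_t mag = (uint64_t)0 - (uint64_t)disp;
        return snprintf(buf, len, "-0x%" PRIx64, mag);
    }
    /* PRIx64 expands to the right length modifier for uint64_t on the
     * target, whether that type is long or long long. */
    return snprintf(buf, len, "+0x%" PRIx64, (uint64_t)disp);
}
```

Casting the argument to the exact type the macro expects keeps the call correct on both 32-bit and 64-bit builds.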
These should be fldpi, fxtract, and fincstp respectively.
These need to be added.
Hello, maintainer of the Ruby FFI bindings for udis86 here. I want to automate the testing of my bindings on Travis CI, which uses git hooks to run tests in an Ubuntu 12.04 VM. Unfortunately, Ubuntu does not appear to have any udis86 packages. We should try to get udis86 into the Ubuntu package repository.
Hello!
I'd like to include your program in MacPorts. It would be much easier for me to maintain the port if you could tag new releases. Thanks for your work!
I'm having some problems using this library because of https://github.com/vmt/udis86/blob/master/libudis86/types.h#L48, which isn't found when compiling the library for use inside a Linux kernel module.
I found that just commenting out this line makes everything compile and work fine.
Any idea whether that's the proper fix for my problem?
Or maybe my problem isn't because of this line but because of something else?
These instructions have separate /mod=11 and /mod=!11 definitions, but the definitions are identical. I think the definitions should be merged and the /mod= removed.
Hello!
From udis86 disassembler I need to get such information as
ulong adrconst; // Constant part of address
But in udis86 I can't find a similar variable, so I am trying to work it out by analyzing the fields of the ud object structure.
00000000: 8945EC mov [di][-014],ax
When I try this instruction in the OllyDbg disassembler I get
adrconst (constant part of address) = 0xffffffec
But in udis86 object structure I see:
mnemonic = UD_Imov
operand type = UD_OP_MEM
operand value (sdword) = 0x000000ec
So why do I get 0xEC instead of 0xFFFFFFEC? How can I determine that this instruction is working with a negative value (-014)?
I hope that I explained the problem clearly. I will be grateful for any answers!
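The bytes 89 45 EC carry an 8-bit displacement, 0xEC, so the raw field only becomes 0xFFFFFFEC (that is, -0x14) once it is sign-extended from its encoded width. The sketch below shows that step in isolation. sign_extend is an illustrative helper, not a udis86 API, and it assumes the caller can obtain the displacement width in bits from the operand; which field of the operand structure holds that width is not stated in this report.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper: sign-extends the low `bits` bits of raw to 64 bits. */
int64_t sign_extend(uint64_t raw, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    if (bits == 64)
        return (int64_t)raw;

    uint64_t mask = (UINT64_C(1) << bits) - 1;  /* keeps the low `bits` bits */
    uint64_t sign = UINT64_C(1) << (bits - 1);  /* position of the sign bit  */

    raw &= mask;
    /* (raw ^ sign) - sign leaves small values unchanged and turns values
     * with the sign bit set into the matching negative number. */
    return (int64_t)((raw ^ sign) - sign);
}
```

With an 8-bit width, sign_extend(0xEC, 8) is -20, and truncating that result to 32 bits gives 0xFFFFFFEC, the value OllyDbg reports.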
$ rasm2 -a x86 -d dbe2
fclex
$ rasm2 -a x86 -d 9bdbe2
wait
fclex
According to http://www.rz.uni-karlsruhe.de/rz/docs/VTune/reference/vc87.htm, both disassemblies are wrong.
I've written documentation for these values; I think it should go in decode.h interleaved with the actual definitions of the enumerated types.
The layout of ud_operand and ud has changed.
Are those part of the public API? Seems like they shouldn't change in minor point releases?
Is there a way to get the version of ud so a client library could determine which layout to use when using it from another language via an FFI bridge approach?
The Intel documentation says explicitly that the 'outs' instruction is not affected by REX.W, so it never has a 64-bit operand.
The documentation points to a nonexistent function called ud_insn_mnemonic(). Has it been removed, or did it never exist at all?
Add support for the AVX class of instructions.
I don't think these instructions can take 16-bit operands, so these definitions should be removed.
OP_R requires that the modrm byte designate a register (like OP_N and OP_U), but the code to check this is missing.
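For readers unfamiliar with the encoding, the check being asked for amounts to testing the two-bit mod field of the ModRM byte: a value of 3 (binary 11) selects a register, anything else selects memory. The sketch below illustrates that test on a raw byte. register_only_operand is a hypothetical name and does not correspond to any function in the decoder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the register number encoded in the rm field
 * when the ModRM byte designates a register (mod == 3), or -1 when it
 * designates memory, in which case a decoder would reject the encoding. */
int register_only_operand(uint8_t modrm)
{
    unsigned mod = modrm >> 6;   /* bits 7..6 */
    unsigned rm  = modrm & 7;    /* bits 2..0 */
    int result = (mod == 3) ? (int)rm : -1;

    assert(result >= -1 && result <= 7);
    return result;
}
```

For example, 0xC1 (mod 3, rm 1) yields register 1, while 0x01 (mod 0) yields -1.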
I think this should be removed.
udis86 doesn't properly decode most, if not all, 32-bit "fstsw" and "fstcw" instructions.
Example:
When I decode "fstsw ax" (bytes 9B DF E0), the decoder says the instruction is just one byte in size, and "ud_insn_asm()" returns the text "wait".
Oddly, this must be a particularly tricky thing to decode, because I have not seen a single disassembler get it right. For example, the same problem exists in BeaEngine (where I also submitted this issue several months ago, with no response yet).
P.S. Despite this issue, udis86 seems to be one of the most complete disassemblers around, if not the most complete.
Just curious.
eg:
(a) mov ecx, 4
(b) mov ebx, 5
(c) shr ebx, cl // depends on (a) and (b)
It would obviously cause problems for conditional jumps.
Currently, the SIMD instructions in optable.xml just use V and W (typically) to describe their arguments. While this is sufficient for basic disassembly, (1) it is different from the integer instructions, which do include the argument size, and (2) it does not support enhanced disassembly such as a disassembler that prints out the referenced value for instructions that reference constant values in memory. I advocate upgrading these to Vss, Vsd, Vps, Vpd, etc., as in the Intel documentation, with those getting mapped appropriately by scripts/ud_itab.py, e.g., Vss mapped to OP_V + SZ_D.
I've written basic documentation for the most important elements in optable.xml (pfx, opc, opr). I think this should go at the head of the optable.xml file.
Running configure, make, and make install left only a .a file behind, but I need to get a .so from somewhere.
Originally reported here: radareorg/radare2#97
According to the Intel® 64 and IA-32 Architectures Software Developer’s Manual,
"rep cmpsd" should be "repe cmpsd"
"rep scasd" should be "repe scasd"
example:
% rasm2 -d F3A7
rep cmpsd
% rasm2 -d F3AF
rep scasd
Bug is here:
https://github.com/vmt/udis86/blob/master/libudis86/syn-att.c#L160
} else if (u->pfx_rep) { // <--------------- should be u->pfx_repe
ud_asmprintf(u, "repe ");
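The naming rule the report relies on can be sketched independently of the syntax printer: the F3 prefix is written "repe" only in front of the compare and scan string instructions, and "rep" in front of the others. rep_prefix_name below is a hypothetical helper written for illustration, and the opcode list covers only the one-byte cmps and scas encodings mentioned in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: picks the mnemonic for a repeat prefix byte
 * (0xF2 or 0xF3) given the one-byte opcode that follows it. */
const char *rep_prefix_name(uint8_t prefix, uint8_t opcode)
{
    assert(prefix == 0xF2 || prefix == 0xF3);

    if (prefix == 0xF2)
        return "repne";

    switch (opcode) {
    case 0xA6: case 0xA7:   /* cmpsb, cmpsw/cmpsd */
    case 0xAE: case 0xAF:   /* scasb, scasw/scasd */
        return "repe";      /* the repeat also tests the zero flag */
    default:
        return "rep";       /* movs, stos, lods, ins, outs */
    }
}
```

Under this rule F3 A7 prints as "repe cmpsd" and F3 AF as "repe scasd", matching the expected output quoted above.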
Trying to build udis86 on recent ArchLinux fails during the autoconf phase:
$> ./autogen.sh
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force -I m4
autoreconf: configure.ac: tracing
autoreconf: running: libtoolize --copy --force
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `build'.
libtoolize: copying file `build/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
autoreconf: running: /usr/bin/autoconf --force
autoreconf: running: /usr/bin/autoheader --force
autoreconf: running: automake --add-missing --copy --force-missing
automake: warnings are treated as errors
/usr/share/automake-1.13/am/ltlibrary.am: warning: 'libudis86.la': linking libtool libraries using a non-POSIX
/usr/share/automake-1.13/am/ltlibrary.am: archiver requires 'AM_PROG_AR' in 'configure.ac'
libudis86/Makefile.am:8: while processing Libtool library 'libudis86.la'
tests/Makefile.am:58: warning: call oprtest_generate,64: non-POSIX variable name
tests/Makefile.am:58: (probably a GNU make extension)
tests/Makefile.am:59: warning: call oprtest_generate,32: non-POSIX variable name
tests/Makefile.am:59: (probably a GNU make extension)
tests/Makefile.am:60: warning: call oprtest_generate,16: non-POSIX variable name
tests/Makefile.am:60: (probably a GNU make extension)
tests/Makefile.am:117: warning: call diff_test_asm,"diff": non-POSIX variable name
tests/Makefile.am:117: (probably a GNU make extension)
tests/Makefile.am:122: warning: call diff_test_asm,"refup": non-POSIX variable name
tests/Makefile.am:122: (probably a GNU make extension)
autoreconf: automake failed with exit status: 1
autogen: autoreconf -i failed.
$> autoconf --version
autoconf (GNU Autoconf) 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+/Autoconf: GNU GPL version 3 or later
http://gnu.org/licenses/gpl.html, http://gnu.org/licenses/exceptions.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by David J. MacKenzie and Akim Demaille.
$> automake --version
automake (GNU automake) 1.13.1
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv2+: GNU GPL version 2 or later http://gnu.org/licenses/gpl-2.0.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Tom Tromey
and Alexandre Duret-Lutz.
This behavior seems to stem from a change that came in automake 1.12 according to http://lists.gnu.org/archive/html/bug-automake/2012-05/msg00009.html
Hello,
x86 VMX instruction set is not recognized by udis86
[XX] disasm: x86: [VMCALL]
[XX] disasm: x86: [VMLAUNCH]
[XX] disasm: x86: [VMRESUME]
[XX] disasm: x86: [VMXOFF]
From the radare2-regressions
Hi,
I think I have found a bug with either the library or the udcli tool. For testing purposes I assembled the following program (nasm syntax):
mov eax, ebx
mov [ecx], eax
push qword [r8]
sfence
iretq
I created an object file using nasm and converted the object file code into binary using objcopy. When I disassemble the object file using objdump I see the following (and correct) output:
objdump -d test.o
0: 89 d8 mov %ebx,%eax
2: 67 89 01 mov %eax,(%ecx)
5: 41 ff 30 pushq (%r8)
8: 0f ae f8 sfence
b: 48 cf iretq
When I use udcli with the binary file I get the following (incorrect) output:
./udcli < test.bin
0000000000000000 89d8 mov eax, ebx
0000000000000002 678901 mov [bx+di], eax
0000000000000005 41 inc ecx
0000000000000006 ff30 push dword [eax]
0000000000000008 0faef8 sfence
000000000000000b 48 dec eax
000000000000000c cf iretd
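The udcli listing is what these bytes look like when they are decoded outside 64-bit mode: 0x41 and 0x48 are inc ecx and dec eax there, but REX prefixes in 64-bit mode, and 67 89 01 uses 16-bit addressing ([bx+di]) only when the default address size is 32 bits. One possible explanation, which this report does not confirm, is that udcli was run without being told to use 64-bit mode. The sketch below shows the mode-dependent meaning of the 0x40..0x4F range. is_rex_prefix is an illustrative helper, not a udis86 function.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper: returns 1 if `byte` acts as a REX prefix when
 * decoding in the given mode (16, 32 or 64), and 0 otherwise. Outside
 * 64-bit mode the same byte values are one-byte inc/dec instructions. */
int is_rex_prefix(uint8_t byte, int mode)
{
    assert(mode == 16 || mode == 32 || mode == 64);
    return mode == 64 && (byte & 0xF0) == 0x40;
}
```

With mode 64, the bytes 41 ff 30 start with a prefix and decode as a single push; with mode 32, 41 is an instruction of its own, which is the split seen in the listing.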
Either README or INSTALL should say that ./autogen.sh is required before running the build steps currently documented at the beginning of INSTALL.
The last three instructions are missing their "gs" segment prefix in the printout:
000000000000196c a7 cmpsd
000000000000196d 66a7 cmpsw
000000000000196f a6 cmpsb
0000000000001970 65a7 cmpsd
0000000000001972 6566a7 cmpsw
0000000000001975 65a6 cmpsb
By way of comparison, here are the corresponding versions of lodsd, lodsw, lodsb:
0000000000001977 ad lodsd
0000000000001978 66ad lodsw
000000000000197a ac lodsb
000000000000197b 65ad gs lodsd
000000000000197d 6566ad gs lodsw
0000000000001980 65ac gs lodsb
Hello,
udis86 v1.7.2 in 64-bit mode disassembles the sequence "66 48 68 01 23 45 67" as a 5-byte "push 2301h" instruction.
But it should be a 7-byte "push 067452301h" (the REX.W bit must supersede the 066h data-size override prefix).
udis86 actually loses REX.W in resolve_mode() because of invalid flags in u->itab_entry->prefix for opcode 068h, and falls into invalid 16-bit mode disassembly.
Quick brute-forcing also shows the same issue for opcodes 0E8h and 0E9h (I checked only 1-byte opcodes, and only the combination of 066h and REX.W).
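The precedence being described can be written as a small decision function: in 64-bit mode REX.W is consulted first, then the 066h prefix, then the instruction's default. This is a sketch of that rule only, not the body of resolve_mode(). opsize_long_mode and its parameters are hypothetical names.

```c
#include <assert.h>

/* Hypothetical helper: effective operand size in bits in 64-bit mode.
 *   rex_w     - nonzero if a REX prefix with W=1 is present
 *   pfx_66    - nonzero if the 066h operand-size prefix is present
 *   default64 - nonzero if the instruction defaults to a 64-bit
 *               operand size in 64-bit mode (push is one example) */
int opsize_long_mode(int rex_w, int pfx_66, int default64)
{
    int size;

    if (rex_w)
        size = 64;                  /* REX.W wins over 066h */
    else if (pfx_66)
        size = 16;
    else
        size = default64 ? 64 : 32;

    assert(size == 16 || size == 32 || size == 64);
    return size;
}
```

For the reported bytes both flags are set, so the result is 64 rather than 16. The immediate of a 64-bit push is still encoded in 4 bytes, which accounts for the 7-byte length the report expects.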
The definitions should be merged into a single one that doesn't mention /mod=.
Given the example x64 instruction "83 3d 29 d4 20 00 00" I get the following result in Intel syntax:
cmp dword ptr [rip+0x20d429], 0x00
While it IS correct that the displacement inside the instruction is 32 bits wide, this decoding doesn't reflect the fact that the memory access is at byte level, and thus
cmp byte ptr [rip+0x20d429], 0x00
is the correct decoding of this instruction. The word ptr and dword ptr versions would use primary opcode 0x81, with and without an operand-size prefix.
TL;DR: The operand size designation byte/word/dword should reflect the size of accessed memory, not the size of the used address offset.
I.e., its second operand should be sIb rather than Ib.
Instructions through SSE 3 are in alpha order by mnemonic within each group, but from SSSE 3 on, they seem to be in random order. I'd like to put every group in alpha order. Comments?
Is this deliberate? If so, I would like to document it; if not, I would like to fix it. I don't have an opinion either way, but since I'm going to be adding instruction definitions, I want to do it consistently one way or the other.
66f3f20f59ff disassembling:
invalid
should be 'MULSD xmm7, xmm7'
66f2f30f59ff disassembling:
invalid
should be 'MULSS xmm7, xmm7'
f2660f59ff disassembling:
invalid
should be 'MULSD xmm7, xmm7'
source is here http://habrahabr.ru/company/intel/blog/200658/
can be easily reproduced also in radareorg/radare2#368
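The three expected results above are consistent with a simple selection rule: F2 and F3 take priority over 66 regardless of order, and when both F2 and F3 appear, the one closest to the opcode wins. The sketch below encodes that rule as inferred from these examples. mandatory_prefix is a hypothetical helper, and the rule itself should be checked against the vendor manuals before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scans the legacy prefix bytes that precede an
 * 0F-escaped opcode and returns the byte that selects the SSE variant:
 * 0xF2, 0xF3, 0x66, or 0 if none of them is present. */
uint8_t mandatory_prefix(const uint8_t *pfx, size_t n)
{
    uint8_t last_rep = 0;   /* most recent F2 or F3 seen */
    int seen_66 = 0;

    assert(pfx != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (pfx[i] == 0xF2 || pfx[i] == 0xF3)
            last_rep = pfx[i];
        else if (pfx[i] == 0x66)
            seen_66 = 1;
    }

    if (last_rep)
        return last_rep;    /* F2/F3 outrank 66 */
    return seen_66 ? 0x66 : 0;
}
```

Applied to the reports above, 66 f3 f2 selects F2 (mulsd), 66 f2 f3 selects F3 (mulss), and f2 66 selects F2 (mulsd).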
There's a function called ud_opr_is_gpr in extern.h, but it's actually defined in udis86.c as ud_opr_isgpr.
These instructions in optable.xml have each of the 8 /rm= cases broken out separately, but this is unnecessary, since the definitions in all cases are identical. They should be collapsed back to a single definition with no /rm=.
It would be nice if there were an easy way to figure out where the lvalue information of an operand physically came from. Considering an x64 instruction like "83 3d 29 d4 20 00 00", it would be nice to get an additional field, "insn_offset" or similar, in each operand structure containing the offset of that operand (2 and 6 in this case). So far I haven't found a straightforward way to get this information without redoing all the parsing work or guessing the offsets by pattern-matching the lvalues against the byte stream.
The hexcode C5FE7F442420 fails to decode.
It should decode to vmovdqu ymmword ptr [rsp + 0x20], ymm0
The former mnemonics are currently in optable.xml, but the latter are more consistent, given that the 'n' forms are not used for the other conditions, and should be used instead.
I think these are the last missing pre-AVX instructions.
This is a 32-bit operand, but if REX.W=1 in 64-bit mode, the operand is sign-extended to 64 bits. So I think the operand should be sIz rather than Iz.
For consistency with repne.
If an instruction has a cpuid child node that comes after a def child node, then the parsed insnDef will not have the correct cpuid. This broke the avx form of movlps.
The solution is to make sure that def child nodes always come after the cpuid and vendor child nodes, or we can just iterate over the instruction child nodes twice. The latter solution:
diff --git a/scripts/ud_opcode.py b/scripts/ud_opcode.py
index d858b0e..cc20358 100644
--- a/scripts/ud_opcode.py
+++ b/scripts/ud_opcode.py
@@ -591,7 +591,8 @@ class UdOpcodeTables(object):
vendor = node.firstChild.data.split()
elif node.localName == 'cpuid':
cpuid = node.firstChild.data.split()
- elif node.localName == 'def':
+ for node in insnNode.childNodes:
+ if node.localName == 'def':
insnDef = { 'pfx' : [] }
for node in node.childNodes:
if not node.localName:
rdrand isn't correctly decoded:
echo 48 0f c7 f0 | ~/src/udis86/udcli/udcli -64 -x
0000000000000000 480fc7f0 invalid
and this comes up in the kernel:
# define RDRAND_LONG ".byte 0x48,0x0f,0xc7,0xf0"
But I can't find a way to describe this in the xml that doesn't cause overwriting of existing opcodes, it looks like it clashes with the cmpxchg opcodes.
00000000000019b0 f3a7 rep cmpsd
00000000000019b2 f366a7 rep cmpsw
00000000000019b5 f3a6 rep cmpsb
0000000000001a08 f3af rep scasd
0000000000001a0a f366af rep scasw
0000000000001a0d f3ae rep scasb
These should all have "repe" instead of "rep".