asm-differ's People

Contributors

1superchip, abaresk, angheloalf, bates64, cadmic, danebou, dragorn421, ellipticellipsis, ethteck, fuuzetsu, gamestabled, jacobly0, jdflyer, leoetlino, mc-muffin, mkst, monsterdruide1, mr-wiseguy, n8pjl, omniblade, onlymx13, pixel-stuck, roblabla, seekyct, simonlindholm, snuffysasa, sozud, synray, thar0, zbanks

asm-differ's Issues

no longer works with Python 3.12

When running diff.py with Python 3.12.1, I get a number of warnings, and it is no longer able to correctly parse the map file.

/diff.py CARDRenameAsync
/home/cameron/Sources/smb-decomp/./diff.py:1240: SyntaxWarning: invalid escape sequence '\S'
  + "(\S+)",
/home/cameron/Sources/smb-decomp/./diff.py:3200: SyntaxWarning: invalid escape sequence '\.'
  if source_line and re.fullmatch(".*\.c(?:pp)?:\d+", source_line):
/home/cameron/Sources/smb-decomp/./diff.py:1024: DeprecationWarning: ast.Num is deprecated and will be removed in Python 3.14; use ast.Constant instead
  if isinstance(node, ast.Num):  # <number>
/home/cameron/Sources/smb-decomp/./diff.py:1025: DeprecationWarning: Attribute n is deprecated and will be removed in Python 3.14; use value instead
  return node.n
Not able to find function in map file.
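
The warnings above point at two patterns: regex strings written without the raw-string prefix, and use of the deprecated ast.Num / node.n. A minimal sketch of how both could be rewritten follows. The names SOURCE_LINE_RE and eval_number are made up for illustration and are not taken from diff.py, and this only addresses the warnings; it is not confirmed to fix the map file lookup.

```python
import ast
import re

# Raw strings keep "\." and "\d" as regex escapes instead of (invalid)
# Python string escapes, which is what triggers the SyntaxWarning.
SOURCE_LINE_RE = re.compile(r".*\.c(?:pp)?:\d+")


def eval_number(node: ast.AST):
    """Return the numeric value of a literal node (hypothetical helper).

    ast.Constant with .value replaces the deprecated ast.Num with .n.
    """
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    raise ValueError("expected a numeric literal")


if __name__ == "__main__":
    print(bool(SOURCE_LINE_RE.fullmatch("src/card.c:120")))
    print(eval_number(ast.parse("0x10", mode="eval").body))
```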

Gray out branch-likely delay slots that match branch target-1

If both LHS and RHS match the pattern

branchlikely .target
  x

x
.target:

it's worth not highlighting diffs for the first x, because it's just an artifact of an assembler optimization; the real diff is at the second x. We used to do this unconditionally, because IDO only ever emits branch likelies of this form, but with GCC it needs actual pattern matching.
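
The pattern match described above could be sketched roughly as follows. This is only an illustration under stated assumptions: each line is a hypothetical (address, mnemonic, operands, branch_target) tuple, instructions are 4 bytes wide, and the mnemonic set is a partial list of MIPS branch-likely instructions. None of this reflects how diff.py actually stores lines. The function would be run on each side, and the first x grayed out only when both sides report it.

```python
# Partial list of MIPS branch-likely mnemonics (assumption: extend as needed).
BRANCH_LIKELY = {"beql", "bnel", "bgezl", "bgtzl", "blezl", "bltzl", "bc1fl", "bc1tl"}


def matching_delay_slots(lines):
    """Return addresses of delay slots that duplicate the instruction at target-1.

    lines: list of (address, mnemonic, operands, branch_target_or_None).
    """
    by_addr = {addr: (mn, ops) for addr, mn, ops, _ in lines}
    found = set()
    for i in range(len(lines) - 1):
        addr, mn, _ops, target = lines[i]
        if mn not in BRANCH_LIKELY or target is None:
            continue
        slot_addr, slot_mn, slot_ops, _ = lines[i + 1]
        before_target = target - 4  # instruction just before the target label
        if before_target == slot_addr:
            continue  # the branch skips nothing, so there is no duplicate
        if by_addr.get(before_target) == (slot_mn, slot_ops):
            found.add(slot_addr)
    return found


if __name__ == "__main__":
    example = [
        (0, "beql", "v0,zero", 16),
        (4, "lw", "a0,0(sp)", None),   # delay slot: first x
        (8, "nop", "", None),
        (12, "lw", "a0,0(sp)", None),  # second x, just before the target
        (16, "jr", "ra", None),        # .target
    ]
    print(matching_delay_slots(example))
```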

Improve diff algorithm

It sometimes lines up blocks against each other that are quite far apart. It feels like it might be operating in a divide-and-conquer fashion that doesn't work well for asm diffing. Something like Levenshtein distance might work better.

Using jal's as markers to line things up against might work, but is kinda specific.
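
For reference, a plain Levenshtein distance over instruction sequences can be sketched as below. It compares mnemonics only, which is an assumption made for brevity; a real alignment would also need to recover the edit path and would probably want cheaper costs for near-matches. It is not what diff.py currently does.

```python
def levenshtein(a, b):
    """Edit distance between two sequences, with unit insert/delete/replace costs."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(
                min(
                    prev[j] + 1,             # delete x from a
                    cur[j - 1] + 1,          # insert y into a
                    prev[j - 1] + (x != y),  # keep or replace
                )
            )
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    lhs = ["lw", "addiu", "jal", "nop"]
    rhs = ["lw", "jal", "nop"]
    print(levenshtein(lhs, rhs))
```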

Setting LESS in the environment may cause -w mode to exit prematurely

My environment sets the LESS variable to LESS=-RXF.
One particular component of this, -F or --quit-if-one-screen, causes less to exit when it's not necessary to page things. I find this very useful and would like to keep it this way.
However, it completely breaks the functionality of diff.py -w when enabled, causing it to always exit.

Considering asm-differ is using less in this rather unconventional manner, I think it's worth fixing in asm-differ. One simple solution is to unset the LESS environment variable from within asm-differ; another is to look for an option in the less man page that disables the -F behavior.
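
The first suggestion (unsetting LESS for the pager only) might look like the sketch below. The helper names pager_env and open_pager are hypothetical, and the exact less arguments asm-differ passes are not shown in this issue, so "-R" here is just a placeholder.

```python
import os
import subprocess


def pager_env(base=None):
    """Return a copy of the environment without LESS, so -F cannot leak in."""
    env = dict(os.environ if base is None else base)
    env.pop("LESS", None)
    return env


def open_pager():
    # Placeholder invocation; only the child process sees the modified env,
    # the user's own shell settings are left untouched.
    return subprocess.Popen(["less", "-R"], stdin=subprocess.PIPE, env=pager_env())


if __name__ == "__main__":
    print(pager_env({"LESS": "-RXF", "PATH": "/usr/bin"}))
```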

Handle hex immediates

See the commit message of c1acb7a. These occur in branches and jal's in MIPS objdump output, which we probably need to special-case (?).

Does asm-differ support AltiVec PowerPC 64?

I want to add PS3 support to decomp.me, but the maintainers said I need to check whether asm-differ supports it. I tried to figure it out myself but don't know how to use it correctly.

Here's the command I used:

 python3 ../asm-differ/diff.py -f EBOOT.asm start

output:

Traceback (most recent call last):
 File "C:\cygwin\home\693982\asm-differ\diff.py", line 3664, in <module>
   main()
 File "C:\cygwin\home\693982\asm-differ\diff.py", line 3551, in main
   diff_settings.apply(settings, args)  # type: ignore
   ^^^^^^^^^^^^^^^^^^^
AttributeError: module 'diff_settings' has no attribute 'apply'

I don't know if I'm using it wrong, or if asm-differ doesn't support the instruction set. If it doesn't, I would like to know how to add support.
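
Going only by the traceback, diff.py calls diff_settings.apply(settings, args), so the AttributeError suggests a diff_settings module was found that does not define apply. A minimal sketch of such a module is below. The config keys and file names are assumptions for illustration and should be checked against the asm-differ README; this says nothing about whether the PS3 instruction set itself is supported.

```python
# diff_settings.py (sketch): placed where diff.py can import it.


def apply(config, args):
    """Fill in project-specific settings; called by diff.py at startup."""
    # All keys and paths below are illustrative assumptions.
    config["baseimg"] = "EBOOT.elf"        # original binary (placeholder name)
    config["myimg"] = "build/EBOOT.elf"    # rebuilt binary (placeholder name)
    config["mapfile"] = "build/EBOOT.map"  # linker map (placeholder name)


if __name__ == "__main__":
    settings = {}
    apply(settings, None)
    print(settings)
```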

MSVC name mangling seems to cause an exception.

The following asm snippet seems to cause an exception when used within decomp.me, where I'm trying to implement MSVC support. When the function is changed to no longer be part of a class, the exception disappears and the diff works:

i386-pc-msdosdjgpp-objdump --disassemble --disassemble-zeroes --line-numbers --start-address=0 -m i386 --no-show-raw-insn /mnt/d/Code/teststraw.o

/mnt/d/Code/teststraw.o:     file format coff-go32


Disassembly of section .text:

00000000 <?Get_From@Straw@@EAEXPAV1@@Z>:
   0:   push   %esi
   1:   mov    0x8(%esp),%esi
   5:   push   %edi
   6:   mov    %ecx,%edi
   8:   cmp    %esi,0x4(%edi)
   b:   je     3e <?Get_From@Straw@@EAEXPAV1@@Z+0x3e>
   d:   test   %esi,%esi
   f:   je     26 <?Get_From@Straw@@EAEXPAV1@@Z+0x26>
  11:   mov    0x8(%esi),%ecx
  14:   test   %ecx,%ecx
  16:   je     26 <?Get_From@Straw@@EAEXPAV1@@Z+0x26>
  18:   mov    (%ecx),%eax
  1a:   push   $0x0
  1c:   call   *0x4(%eax)
  1f:   movl   $0x0,0x8(%esi)
  26:   mov    0x4(%edi),%eax
  29:   test   %eax,%eax
  2b:   je     34 <?Get_From@Straw@@EAEXPAV1@@Z+0x34>
  2d:   movl   $0x0,0x8(%eax)
  34:   test   %esi,%esi
  36:   mov    %esi,0x4(%edi)
  39:   je     3e <?Get_From@Straw@@EAEXPAV1@@Z+0x3e>
  3b:   mov    %edi,0x8(%esi)
  3e:   pop    %edi
  3f:   pop    %esi
  40:   ret    $0x4
  43:   nop
  44:   nop
  45:   nop
  46:   nop
  47:   nop
  48:   nop
  49:   nop
  4a:   nop
  4b:   nop
  4c:   nop
  4d:   nop
  4e:   nop
  4f:   nop

Add an API for getting the JSON diff as an object instead of a string

Motivated by decomp.me.

In #52 I added the JsonFormatter for returning the diff results as a JSON blob. This is fine for CLI use, but decomp.me imports diff.py directly and uses it as a library. If we're worried about diff performance in decomp.me, diff.py could expose a function for getting the diff result as a dict, rather than as a serialized str.
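
One possible shape for this, sketched with a hypothetical formatter class (the real JsonFormatter's methods and output layout are not shown in this issue): build the dict once, and make the string form a thin wrapper over it so CLI output stays unchanged.

```python
import json


class DictFormatter:
    """Hypothetical formatter exposing both a dict and a JSON string."""

    def table_dict(self, header, rows):
        # Library callers use this directly and skip serialization.
        return {"header": header, "rows": rows}

    def table(self, header, rows):
        # CLI path: same data, serialized as before.
        return json.dumps(self.table_dict(header, rows))


if __name__ == "__main__":
    fmt = DictFormatter()
    rows = [{"base": "lw a0,0(sp)", "current": "lw a0,4(sp)"}]
    print(fmt.table_dict({"base": "TARGET", "current": "CURRENT"}, rows))
```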

SH2 Improvements

I'm running into some issues with data that make decompiling SH2 annoying. With SH2, there are only 8-bit immediates, so most data is loaded with pc-relative instructions, and the data is interspersed with the function. This scratch is an example of the sort of issues: https://decomp.me/scratch/SUNET

  • Lines 18-20 in the asm are actually a jump table.
  • The 0xB0 in the switch is at line 5A in the asm.
  • Lines 5C-5E in the asm are the pointer to D_800A7734.

Longer functions have this issue worse, since the pc-relative offset is limited, so there will be a block of instructions and a jump, a block of data, the next block of instructions and a jump, a block of data, etc.

I've been thinking about different ways to solve this issue and was wondering if anyone has suggestions. Do any other architectures have this sort of problem?

I've thought about:

  • Replace objdump with a better disassembler for the SH2 case? My disassembler has GNU as-compatible output, but it's written in Rust, so that would be a significant dependency.
  • Add more parsing to asm-differ to try to make the objdump output better? I'm not sure exactly how much additional parsing is needed, but it seems like it would basically be re-implementing a disassembler for certain patterns.
  • (Not really asm-differ related) Allow linker arguments in decomp.me so that the pointers can be set to the right locations? This would help with cases like struct->offset where the offset and the base pointer are added together by gcc.

Let ".rodata+0x..." match any global name

A bit hacky, but this is useful for seeing differences:

            elif tag == 'replace':
                if line1.split("\t")[0] == line2.split("\t")[0] and '%' in line1 and '%' in line2 and 'rodata' in line2:
                    line1 = f'{original1}'
                    line2 = f'{original2}'
                else:

Line numbers break with mwcceppc dwarf-2

When CodeWarrior is passed the -sym dwarf-2 flag, it outputs a full path for the source file name. That path includes a colon as part of the drive name and therefore breaks this regex:

r"^[^ \t<>:][^\t<>:]*:[0-9]+( \(discriminator [0-9]+\))?$", row

Would allowing colons here break other things?
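
To make the failure concrete, the sketch below runs the regex quoted above against a Windows-style path and compares it with one narrower alternative: allowing only an optional leading drive letter rather than colons anywhere. RELAXED_RE and the sample paths are illustrative assumptions, and whether this variant is safe for every objdump output asm-differ sees is untested.

```python
import re

# Regex as quoted in this issue.
OLD_RE = re.compile(r"^[^ \t<>:][^\t<>:]*:[0-9]+( \(discriminator [0-9]+\))?$")

# Hypothetical variant: permit a single "X:" drive prefix, nothing more.
RELAXED_RE = re.compile(
    r"^(?:[A-Za-z]:)?[^ \t<>:][^\t<>:]*:[0-9]+( \(discriminator [0-9]+\))?$"
)

if __name__ == "__main__":
    samples = [
        "src/main.c:42",
        r"C:\proj\src\main.c:42",
        "   0:   push   %esi",
    ]
    for s in samples:
        print(repr(s), bool(OLD_RE.match(s)), bool(RELAXED_RE.match(s)))
```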

Custom pager

Implementing our own less equivalent would make it possible to:

  • refresh the display without flicker
  • keep scroll position
  • capture keystrokes/mouse clicks, so we can add new functionality:
    • highlighting individual registers on click
    • switching between the last K revisions with arrow keys

Would need to support {,page }{up,down}, g, G, q, and probably (annoyingly) search (/, n, N); I don't think the rest of less is useful.
