
splat's Introduction

splat


A binary splitting tool to assist with decompilation and modding projects

Currently, only N64, PSX, and PS2 binaries are supported. More platforms may come in the future.

Please check out the wiki for more information including examples of projects that use splat.

Installing

The recommended way to install is from the PyPI release, via pip:

python3 -m pip install -U splat64[mips]

The brackets specify which optional dependencies to install along with splat. Refer to Optional dependencies for the list of available groups.

If you use a requirements.txt file in your repository, then you can add this library with the following line:

splat64[mips]>=0.23.2,<1.0.0

Optional dependencies

  • mips: Required when using the N64, PSX or PS2 platforms.
  • dev: Installs all the available dependency groups and other packages for development.

Gamecube / Wii

For Gamecube / Wii projects, see decomp-toolkit!

splat's People

Contributors

angheloalf, bates64, berendbutje, dragorn421, drahsid, ellipticellipsis, ethteck, gillou68310, hensldm, lavos1, lightningth, m0liusx, marijnvdwerf, megamech, mkst, mr-wiseguy, nim-ka, sage-of-mirrors, simonlindholm, sozud, unnunu, vatuu, xeeynamo, z64a


splat's Issues

Hi/Lo pairs that are far apart aren't generated

For example in this assembly,

/* 48028 80047428 3C06800A */  lui        $a2, 0x800a
/* 4802C 8004742C 24620008 */  addiu      $v0, $v1, 8
/* 48030 80047430 AC820008 */  sw         $v0, 8($a0)
/* 48034 80047434 3C02DF00 */  lui        $v0, 0xdf00
/* 48038 80047438 AC620000 */  sw         $v0, ($v1)
/* 4803C 8004743C AC600004 */  sw         $zero, 4($v1)
/* 48040 80047440 90C35DB8 */  lbu        $v1, 0x5db8($a2)
/* 48044 80047444 3062000F */  andi       $v0, $v1, 0xf
/* 48048 80047448 10400011 */  beqz       $v0, .L80047490
/* 4804C 8004744C 24C45DB8 */   addiu     $a0, $a2, 0x5db8
/* 48050 80047450 306200F0 */  andi       $v0, $v1, 0xf0
/* 48054 80047454 5040000F */  beql       $v0, $zero, .L80047494
/* 48058 80047458 3C04F0FF */   lui       $a0, %hi(D_F0FF0001)
/* 4805C 8004745C 90820001 */  lbu        $v0, %lo(D_F0FF0001)($a0)
/* 48060 80047460 5040000C */  beql       $v0, $zero, .L80047494
/* 48064 80047464 3C04F0FF */   lui       $a0, %hi(D_F0FF0002)
/* 48068 80047468 90820002 */  lbu        $v0, %lo(D_F0FF0002)($a0)
/* 4806C 8004746C 10400009 */  beqz       $v0, .L80047494
/* 48070 80047470 3C04F0FF */   lui       $a0, 0xf0ff
/* 48074 80047474 24C55DB8 */  addiu      $a1, $a2, 0x5db8

Both addiu instructions and the lbu that reference $a2 should use %lo, and the first instruction (the lui into $a2) should use %hi.

Make code files a class

Todo:

  • Make a code "file" class
    • Maybe named subsection?
  • Implement should_run at the file level
  • Code files' "subtypes" should be proper segment types, maybe as a class member

Manual linker sections in code segments

The proposal is to add a new subsection type that will create a new linker section in the generated linker script:

- [0x123E0, c]
- [0x12FC0, linker_section, code_lib]
- [0x12FC0, c, libultra/os/initialize]

Detect jumptables during disassembly and emit them during rodata disassembly

As the title says, n64splat currently does not detect jumptables; it just emits the addresses as raw hexadecimal numbers with .word assembler directives. To automate the process of making jumptables for use with mips_to_c, n64splat should detect jump tables by looking for a load from a symbol followed by a jump to the register that was loaded into, and then automatically add labels where needed inside the jumptable function.

.bss support

Proposed yaml format:

  - type: code
    start: 0xE0000
    vram: 0x802C3000
    files:
    - [0xe79b0, c, code_e79b0_len_1920]
    - [0xe92d0, c, si]
    - [0xFE650, .data, code_e79b0_len_1920]
    - [0xFE660, data, si]
    - [0xFE730, .rodata, code_e79b0_len_1920]
    - [0xFE748, rodata, si]
    - [0, bss, code_e79b0_len_1920, 0x100] # Creates a code_e79b0_len_1920.bss.s file with a length of 0x100 bss data and adds a linker addition to manually link it in
    - [0, .bss, code_e79b0_len_1920, 0x100] # Uses the .bss section of code_e79b0_len_1920 and sets the length to be 0x100

Pointer style can confuse function name parsing

If I have a function defined as follows:

/* 0x800360F0 */ void *Malloc(struct heap_node *heap, s32 size);

The logic in get_funcs_defined_in_c will consider the asterisk to be part of the name, which causes downstream issues.

This can be avoided by keeping the asterisk from touching the function name. It seems unintentional to have this tool enforce a pointer style though.
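A minimal sketch of name extraction that does not depend on pointer style. It is not the actual logic of get_funcs_defined_in_c; the regular expression and the helper name function_name are assumptions made for illustration. The idea is to strip comments first and then take the identifier immediately before the opening parenthesis, so a leading asterisk can never be captured.

```python
import re
from typing import Optional

# Hypothetical patterns, not taken from splat's source.
COMMENT_RE = re.compile(r"/\*.*?\*/", re.S)
# An identifier followed by "(": a "*" cannot be part of the captured group.
NAME_RE = re.compile(r"([A-Za-z_]\w*)\s*\(")


def function_name(declaration: str) -> Optional[str]:
    """Return the function name in a C declaration, or None if there is none."""
    without_comments = COMMENT_RE.sub("", declaration)
    match = NAME_RE.search(without_comments)
    return match.group(1) if match else None


if __name__ == "__main__":
    decl = "/* 0x800360F0 */ void *Malloc(struct heap_node *heap, s32 size);"
    print(function_name(decl))
```

The same helper gives the same answer for void* Malloc(...) and void * Malloc(...), so no pointer style is enforced.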

Option to skip extraction of portions of rom

It would be useful if splat had the option to mark a given entry in the yaml as a region in the rom to be skipped, instead of extracting it into data. For example, I would use this to integrate a microcode disassembly into a decomp, since I would no longer need to extract the text or data of the given microcode from the rom.

If this feature were added, it would be nice if there was an option to retain the linker script entry for the file, despite the fact that it was skipped, with the expectation that the build system would be responsible for producing the given file.

An example of how this feature may look is a line in the yaml like so:

...
    - [0x002160, "bin", "n_aspMain_code"]
    - [0x002DC0, "skip", "f3dex2_code"] 
    - [0x004150, "c", "codeseg1/sched"]
...

The entry marked "skip" would not produce any binary upon running splat, but would still emit the normal linker script entry like so:

...
        BUILD_DIR/bin/n_aspMain_code.o(.data);
        BUILD_DIR/bin/f3dex2_code.o(.data);
        BUILD_DIR/src/codeseg1/sched.o(.text);
...

It would also be useful if the user could indicate what section to use for the given output file, but defaulting to data would most likely cover most cases.

A final suggestion for this feature in regards to the linker entry would be that if a given skipped region has no name, it does not produce a linker script entry.

Rework symbol detection and resolution

We need to simplify and re-think symbols. Random thoughts below:

  • When resolving symbols, the current section should be kept in mind. We need some way to separate symbols by rom address so n64splat can know what is what, especially when identical ram ranges are concerned

  • Clearer hierarchy for resolving pointers in code/data

  • Create a Symbol class that can store information about a symbol such as vram, name, rom, overlay, places and methods accessed, etc

will add more as I think of things

Add --segments cli flag to split given segments only

If supplied a normal list of segments, split only those.

If supplied any segments with a ! in front of their name, split all segments except for those.

If the supplied list contains "positive" and "negative" segments, bail and tell the user that's not supported.
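The three rules above could be sketched as follows. The function name select_segments and the choice to raise ValueError are assumptions for illustration, not part of splat.

```python
from typing import List, Sequence


def select_segments(all_names: Sequence[str], requested: Sequence[str]) -> List[str]:
    """Pick the segments to split from a hypothetical --segments argument list."""
    positive = [name for name in requested if not name.startswith("!")]
    negative = [name[1:] for name in requested if name.startswith("!")]

    # Mixing "positive" and "negative" segments is not supported: bail out.
    if positive and negative:
        raise ValueError("mixing positive and negative segments is not supported")

    if negative:
        return [name for name in all_names if name not in negative]
    if positive:
        return [name for name in all_names if name in positive]
    # Nothing requested: split everything.
    return list(all_names)


if __name__ == "__main__":
    names = ["header", "boot", "main"]
    print(select_segments(names, ["boot"]))
    print(select_segments(names, ["!boot"]))
```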

incbin segment type

This would create an asm file that .incbins from the target and adds a linker entry for this file
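As a rough sketch, the generated asm file could look like the output of the function below. The function name incbin_asm, its parameters, and the exact file layout are assumptions; the .incbin "file", skip, count form is the GNU assembler syntax.

```python
def incbin_asm(section: str, target_path: str, start: int, end: int) -> str:
    """Build the text of an asm file that includes bytes [start, end) of the target."""
    size = end - start
    return (
        f'.section {section}\n'
        f'\n'
        # .incbin "file", skip, count: skip `start` bytes, then include `size` bytes.
        f'.incbin "{target_path}", 0x{start:X}, 0x{size:X}\n'
    )


if __name__ == "__main__":
    print(incbin_asm(".data", "baserom.z64", 0x40, 0x1000))
```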

Add crc to yaml

Ensure the rom is correct by comparing crc values as a safety check
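One possible shape for this check, as a sketch only: read the two checksum words stored in the ROM header and compare them with values from the yaml. This assumes a big-endian (.z64) image with the checksums at offsets 0x10 and 0x14, matching the header layout quoted in the Aidyn Chronicles issue on this page. It compares the stored values and does not recompute the checksum. The function names are hypothetical.

```python
import struct
from typing import Tuple


def header_crcs(rom: bytes) -> Tuple[int, int]:
    """Return (CRC1, CRC2) as stored in a big-endian N64 ROM header."""
    return struct.unpack(">II", rom[0x10:0x18])


def rom_matches(rom: bytes, expected_crc1: int, expected_crc2: int) -> bool:
    """Safety check: do the header checksums match the expected values?"""
    return header_crcs(rom) == (expected_crc1, expected_crc2)


if __name__ == "__main__":
    # Fake 0x40-byte header carrying the checksums from the rominfo output on this page.
    fake_rom = bytes(0x10) + struct.pack(">II", 0x112051D2, 0x68BEF8AC) + bytes(0x28)
    print(rom_matches(fake_rom, 0x112051D2, 0x68BEF8AC))
```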

Consider data types when disassembling data/rodata

As the title implies, n64splat doesn't currently take most data types into account when disassembling data/rodata. For example, the original assembly might be loading a byte, halfword, or even a double, and n64splat won't properly account for this, instead simply opting to use .word for most cases. In my opinion, n64splat should use context from the assembly to properly infer the types.

Symbols are sometimes too large

I recently made some changes so that symbols are given the proper type if they fall on an odd byte, but there are now cases where symbols disassemble too much data. We need to create dummy symbols to re-align to even offsets after a symbol ends in an odd place.

Improve C defined function detection / add C declared variable detection

We try to do this, but it clearly has some issues as #11 mentions

Conversely, functions defined in C code but non-matching should be disassembled

  • Run cpp on c files to figure out which functions are defined
  • Improve detection of defined functions
  • Add support for detection of declared (non-externed) variables.
  • Mark these symbols as defined so they don't get written to undefined_syms_auto

pad segment type

This type would just insert 0 bytes (move the linker forward) in the specified range

Library linking in yaml

It would be useful to be able to specify linking files from a library in the splat yaml. Here is one possible example for how it could be done:

- [0x0013A0, .rodata, libultra.a, sched]

This would generate the corresponding line in the linker script

BUILD_DIR/lib/libultra.a:sched.o(.rodata);

Or maybe just

lib/libultra.a:sched.o(.rodata);

I don't know which of those two linker script options would be better in the long run.

Migrate rodata to .s files

This will probably involve doing a pre-split phase that collects things before finally writing them out to disk

Missing label when splitting Aidyn Chronicles

TL;DR: bgez $zero, .L800B7B54 is emitted, but there is no label at that location.

The following asm is generated (code_B8740.s):

.include "macro.inc"

# assembler directives
.set noat      # allow manual use of $at
.set noreorder # don't insert nops after branches
.set gp=64     # allow use of 64-bit general purpose registers

.section .text, "ax"

glabel func_800B7B40
/* B8740 800B7B40 409AF000 */  mtc0       $k0, $30
/* B8744 800B7B44 3C1A800B */  lui        $k0, 0x800b
/* B8748 800B7B48 275A7D30 */  addiu      $k0, $k0, 0x7d30
/* B874C 800B7B4C 03400008 */  jr         $k0
/* B8750 800B7B50 00000000 */   nop
/*                                     <<=== .L800B7B54: is missing here */
/* B8754 800B7B54 00000000 */  nop
/* B8758 800B7B58 00000000 */  nop
/* B875C 800B7B5C 00000000 */  nop
/* B8760 800B7B60 00000000 */  nop
/* B8764 800B7B64 00000000 */  nop

glabel func_800B7B68
/* B8768 800B7B68 3C1A8000 */  lui        $k0, 0x8000
/* B876C 800B7B6C 375A0194 */  ori        $k0, $k0, 0x194
/* B8770 800B7B70 03400008 */  jr         $k0
/* B8774 800B7B74 00000000 */   nop
/* B8778 800B7B78 409AF000 */  mtc0       $k0, $30
/* B877C 800B7B7C 3C1A800B */  lui        $k0, 0x800b
/* B8780 800B7B80 275A7D30 */  addiu      $k0, $k0, 0x7d30
/* B8784 800B7B84 03400008 */  jr         $k0
/* B8788 800B7B88 00000000 */   nop
/* B878C 800B7B8C 00000000 */  nop
/* B8790 800B7B90 00000000 */  nop
/* B8794 800B7B94 00000000 */  nop
/* B8798 800B7B98 00000000 */  nop
/* B879C 800B7B9C 00000000 */  nop
/* B87A0 800B7BA0 3C1A8000 */  lui        $k0, 0x8000
/* B87A4 800B7BA4 375A0014 */  ori        $k0, $k0, 0x14
/* B87A8 800B7BA8 03400008 */  jr         $k0
/* B87AC 800B7BAC 00000000 */   nop
/* B87B0 800B7BB0 23BDFFF8 */  addi       $sp, $sp, -8
/* B87B4 800B7BB4 AFBF0000 */  sw         $ra, ($sp)
/* B87B8 800B7BB8 3C08800B */  lui        $t0, 0x800b
/* B87BC 800B7BBC 25087B40 */  addiu      $t0, $t0, 0x7b40
/* B87C0 800B7BC0 3C09800B */  lui        $t1, 0x800b
/* B87C4 800B7BC4 25297B54 */  addiu      $t1, $t1, 0x7b54
/* B87C8 800B7BC8 3C0A8000 */  lui        $t2, 0x8000
/* B87CC 800B7BCC 354A0180 */  ori        $t2, $t2, 0x180
/* B87D0 800B7BD0 3C0B800B */  lui        $t3, 0x800b
/* B87D4 800B7BD4 256B7B54 */  addiu      $t3, $t3, 0x7b54
.L800B7BD8:
/* B87D8 800B7BD8 8D4C0000 */  lw         $t4, ($t2)
/* B87DC 800B7BDC 8D0D0000 */  lw         $t5, ($t0)
/* B87E0 800B7BE0 25080004 */  addiu      $t0, $t0, 4
/* B87E4 800B7BE4 AD2C0000 */  sw         $t4, ($t1)
/* B87E8 800B7BE8 25290004 */  addiu      $t1, $t1, 4
/* B87EC 800B7BEC AD4D0000 */  sw         $t5, ($t2)
/* B87F0 800B7BF0 150BFFF9 */  bne        $t0, $t3, .L800B7BD8
/* B87F4 800B7BF4 254A0004 */   addiu     $t2, $t2, 4
/* B87F8 800B7BF8 3C08800B */  lui        $t0, 0x800b
/* B87FC 800B7BFC 25087B78 */  addiu      $t0, $t0, 0x7b78
/* B8800 800B7C00 3C09800B */  lui        $t1, 0x800b
/* B8804 800B7C04 25297B8C */  addiu      $t1, $t1, 0x7b8c
/* B8808 800B7C08 3C0A8000 */  lui        $t2, 0x8000
/* B880C 800B7C0C 3C0B800B */  lui        $t3, 0x800b
/* B8810 800B7C10 256B7B8C */  addiu      $t3, $t3, 0x7b8c
.L800B7C14:
/* B8814 800B7C14 8D4C0000 */  lw         $t4, ($t2)
/* B8818 800B7C18 8D0D0000 */  lw         $t5, ($t0)
/* B881C 800B7C1C 25080004 */  addiu      $t0, $t0, 4
/* B8820 800B7C20 AD2C0000 */  sw         $t4, ($t1)
/* B8824 800B7C24 25290004 */  addiu      $t1, $t1, 4
/* B8828 800B7C28 AD4D0000 */  sw         $t5, ($t2)
/* B882C 800B7C2C 150BFFF9 */  bne        $t0, $t3, .L800B7C14
/* B8830 800B7C30 254A0004 */   addiu     $t2, $t2, 4
/* B8834 800B7C34 0C02DF17 */  jal        func_800B7C5C
/* B8838 800B7C38 00000000 */   nop
/* B883C 800B7C3C 0C02DF1E */  jal        func_800B7C78
/* B8840 800B7C40 00000000 */   nop
/* B8844 800B7C44 2408F7FE */  addiu      $t0, $zero, -0x802
/* B8848 800B7C48 3C01800F */  lui        $at, 0x800f
/* B884C 800B7C4C AC288428 */  sw         $t0, -0x7bd8($at)
/* B8850 800B7C50 8FBF0000 */  lw         $ra, ($sp)
/* B8854 800B7C54 03E00008 */  jr         $ra
/* B8858 800B7C58 23BD0008 */   addi      $sp, $sp, 8
/* B885C 800B7C5C 3C088000 */  lui        $t0, 0x8000
/* B8860 800B7C60 25091FF0 */  addiu      $t1, $t0, 0x1ff0
.L800B7C64:
/* B8864 800B7C64 BD010000 */  cache      1, ($t0)
/* B8868 800B7C68 1509FFFE */  bne        $t0, $t1, .L800B7C64
/* B886C 800B7C6C 25080010 */   addiu     $t0, $t0, 0x10
/* B8870 800B7C70 03E00008 */  jr         $ra
/* B8874 800B7C74 00000000 */   nop
/* B8878 800B7C78 3C088000 */  lui        $t0, 0x8000
/* B887C 800B7C7C 25093FE0 */  addiu      $t1, $t0, 0x3fe0
.L800B7C80:
/* B8880 800B7C80 BD000000 */  cache      0, ($t0)
/* B8884 800B7C84 1509FFFE */  bne        $t0, $t1, .L800B7C80
/* B8888 800B7C88 25080020 */   addiu     $t0, $t0, 0x20
/* B888C 800B7C8C 03E00008 */  jr         $ra
/* B8890 800B7C90 00000000 */   nop
/* B8894 800B7C94 40026000 */  mfc0       $v0, $12
/* B8898 800B7C98 03E00008 */  jr         $ra
/* B889C 800B7C9C 00000000 */   nop
/* B88A0 800B7CA0 40846000 */  mtc0       $a0, $12
/* B88A4 800B7CA4 03E00008 */  jr         $ra
/* B88A8 800B7CA8 00000000 */   nop
/* B88AC 800B7CAC 23BDFFF0 */  addi       $sp, $sp, -0x10
/* B88B0 800B7CB0 AFA40000 */  sw         $a0, ($sp)
/* B88B4 800B7CB4 AFA50004 */  sw         $a1, 4($sp)
/* B88B8 800B7CB8 AFA60008 */  sw         $a2, 8($sp)
/* B88BC 800B7CBC AFA7000C */  sw         $a3, 0xc($sp)
/* B88C0 800B7CC0 40025000 */  mfc0       $v0, $10
/* B88C4 800B7CC4 2004001F */  addi       $a0, $zero, 0x1f
/* B88C8 800B7CC8 3C05800F */  lui        $a1, 0x800f
/* B88CC 800B7CCC 24A58620 */  addiu      $a1, $a1, -0x79e0
.L800B7CD0:
/* B88D0 800B7CD0 40840000 */  mtc0       $a0, $0
/* B88D4 800B7CD4 2084FFFF */  addi       $a0, $a0, -1
/* B88D8 800B7CD8 42000001 */  tlbr
/* B88DC 800B7CDC 00000000 */  nop
/* B88E0 800B7CE0 00000000 */  nop
/* B88E4 800B7CE4 00000000 */  nop
/* B88E8 800B7CE8 00000000 */  nop
/* B88EC 800B7CEC 40062800 */  mfc0       $a2, $5
/* B88F0 800B7CF0 40075000 */  mfc0       $a3, $10
/* B88F4 800B7CF4 ACA60000 */  sw         $a2, ($a1)
/* B88F8 800B7CF8 ACA70004 */  sw         $a3, 4($a1)
/* B88FC 800B7CFC 40061000 */  mfc0       $a2, $2
/* B8900 800B7D00 40071800 */  mfc0       $a3, $3
/* B8904 800B7D04 ACA60008 */  sw         $a2, 8($a1)
/* B8908 800B7D08 ACA7000C */  sw         $a3, 0xc($a1)
/* B890C 800B7D0C 0481FFF0 */  bgez       $a0, .L800B7CD0
/* B8910 800B7D10 20A5FFF0 */   addi      $a1, $a1, -0x10
/* B8914 800B7D14 40825000 */  mtc0       $v0, $10
/* B8918 800B7D18 AFA40000 */  sw         $a0, ($sp)
/* B891C 800B7D1C AFA50004 */  sw         $a1, 4($sp)
/* B8920 800B7D20 AFA60008 */  sw         $a2, 8($sp)
/* B8924 800B7D24 AFA7000C */  sw         $a3, 0xc($sp)
/* B8928 800B7D28 03E00008 */  jr         $ra
/* B892C 800B7D2C 23BD0010 */   addi      $sp, $sp, 0x10
/* B8930 800B7D30 3C1A800F */  lui        $k0, 0x800f
/* B8934 800B7D34 275A80F0 */  addiu      $k0, $k0, -0x7f10
/* B8938 800B7D38 FF410328 */  sd         $at, 0x328($k0)
/* B893C 800B7D3C FF420330 */  sd         $v0, 0x330($k0)
/* B8940 800B7D40 40016800 */  mfc0       $at, $13
/* B8944 800B7D44 8F420338 */  lw         $v0, 0x338($k0)
/* B8948 800B7D48 00010882 */  srl        $at, $at, 2
/* B894C 800B7D4C 3021001F */  andi       $at, $at, 0x1f
/* B8950 800B7D50 10200008 */  beqz       $at, .L800B7D74
/* B8954 800B7D54 00221006 */   srlv      $v0, $v0, $at
/* B8958 800B7D58 30420001 */  andi       $v0, $v0, 1
/* B895C 800B7D5C 14400054 */  bnez       $v0, .L800B7EB0
/* B8960 800B7D60 00000000 */   nop
.L800B7D64:
/* B8964 800B7D64 DF410328 */  ld         $at, 0x328($k0)
/* B8968 800B7D68 DF420330 */  ld         $v0, 0x330($k0)
/* B896C 800B7D6C 0401FF79 */  bgez       $zero, .L800B7B54  /* branch that has no label */
/* B8970 800B7D70 00000000 */   nop
... etc

This results in an error during compilation:

cpp -P -DBUILD_DIR=build -o build/aidyn_chronicles.ld aidyn_chronicles.ld
mips-linux-gnu-ld: build/asm/code_B8740.o: in function `func_800B7B68':
(.text+0x22c): undefined reference to `.L800B7B54'
make: *** [Makefile:69: build/aidyn_chronicles.us.elf] Error 1

Example splat config to help reproduce:

name: Aidyn_Chronicles (North America)
basename: aidyn_chronicles
options:
  find_file_boundaries: True
  compiler: "IDO"
  modes:
  - all
segments:
  - name: header
    type: header
    start: 0x0
    vram: 0
    files:
    - [0x0, header, header]
  - [0x40, bin] # tbd
  - type:  code
    start: 0x00001000
    vram:  0x80000400
    files:
    - [0x00001000, "bin"] # TODO: figure out whats code and whats not
    - [0x00003300, "asm"]
    - [0x3490, "asm"]
    - [0x5080, "asm"]
    - [0x90B0, "asm"]
    - [0x12330, "asm"]
    - [0x15A20, "asm"]
    - [0x16870, "asm"]
    - [0x17700, "asm"]
    - [0x187B0, "asm"]
    - [0x19130, "asm"]
    - [0x1B030, "asm"]
    - [0x1BBB0, "asm"]
    - [0x1BD50, "asm"]
    - [0x1CB10, "asm"]
    - [0x1F9A0, "asm"]
    - [0x20390, "asm"]
    - [0x20A20, "asm"]
    - [0x24B60, "asm"]
    - [0x252F0, "asm"]
    - [0x278D0, "asm"]
    - [0x27950, "asm"]
    - [0x27EC0, "asm"]
    - [0x28E50, "asm"]
    - [0x29E20, "asm"]
    - [0x2BC40, "asm"]
    - [0x2CBE0, "asm"]
    - [0x2E390, "asm"]
    - [0x2EF80, "asm"]
    - [0x2FD00, "asm"]
    - [0x30B60, "asm"]
    - [0x32480, "asm"]
    - [0x32640, "asm"]
    - [0x33190, "asm"]
    - [0x33370, "asm"]
    - [0x357D0, "asm"]
    - [0x35980, "asm"]
    - [0x37490, "asm"]
    - [0x37940, "asm"]
    - [0x3A690, "asm"]
    - [0x3AB10, "asm"]
    - [0x3CF30, "asm"]
    - [0x3D620, "asm"]
    - [0x3EA60, "asm"]
    - [0x3FCE0, "asm"]
    - [0x3FF50, "asm"]
    - [0x40D60, "asm"]
    - [0x41C80, "asm"]
    - [0x42BD0, "asm"]
    - [0x44AF0, "asm"]
    - [0x454A0, "asm"]
    - [0x456B0, "asm"]
    - [0x472A0, "asm"]
    - [0x48EE0, "asm"]
    - [0x49BB0, "asm"]
    - [0x4E880, "asm"]
    - [0x4FB10, "asm"]
    - [0x509E0, "asm"]
    - [0x50AB0, "asm"]
    - [0x52C20, "asm"]
    - [0x53B30, "asm"]
    - [0x543B0, "asm"]
    - [0x558A0, "asm"]
    - [0x56330, "asm"]
    - [0x56780, "asm"]
    - [0x56B90, "asm"]
    - [0x57240, "asm"]
    - [0x57940, "asm"]
    - [0x5C570, "asm"]
    - [0x5F350, "asm"]
    - [0x5F600, "asm"]
    - [0x67FD0, "asm"]
    - [0x68490, "asm"]
    - [0x69B10, "asm"]
    - [0x6AC60, "asm"]
    - [0x6BA50, "asm"]
    - [0x6C570, "asm"]
    - [0x6D5E0, "asm"]
    - [0x6DFD0, "asm"]
    - [0x6E280, "asm"]
    - [0x6E750, "asm"]
    - [0x732C0, "asm"]
    - [0x738E0, "asm"]
    - [0x7A8D0, "asm"]
    - [0x7BB40, "asm"]
    - [0x7C2D0, "asm"]
    - [0x83F80, "asm"]
    - [0x84150, "asm"]
    - [0x84630, "asm"]
    - [0x865B0, "asm"]
    - [0x866B0, "asm"]
    - [0x88790, "asm"]
    - [0x891E0, "asm"]
    - [0x89BA0, "asm"]
    - [0x8A2B0, "asm"]
    - [0x8AB30, "asm"]
    - [0x8B5C0, "asm"]
    - [0x8B890, "asm"]
    - [0x8DDE0, "asm"]
    - [0x8E920, "asm"]
    - [0x8FB20, "asm"]
    - [0x91EA0, "asm"]
    - [0x938A0, "asm"]
    - [0x94320, "asm"]
    - [0x994D0, "asm"]
    - [0x9DB90, "asm"]
    - [0x9E080, "asm"]
    - [0xA12F0, "asm"]
    - [0xA1B70, "asm"]
    - [0xA6150, "asm"]
    - [0xA6300, "asm"]
    - [0xAC690, "asm"]
    - [0xAE100, "asm"]
    - [0xAF280, "asm"]
    - [0xB1070, "asm"]
    - [0xB1250, "asm"]
    - [0xB2230, "asm"]
    - [0xB27F0, "asm"]
    - [0xB3EF0, "asm"]
    - [0xB4DB0, "asm"]
    - [0xB4DC0, "asm"]
    - [0xB4F10, "asm"]
    - [0xB4FB0, "asm"]
    - [0xB4FD0, "asm"]
    - [0xB5B80, "asm"]
    - [0xB6A80, "asm"]
    - [0xB8740, "asm"]
    - [0xB9190, "asm"]
    - [0xB9270, "asm"]
    - [0xB9470, "asm"]
    - [0xB9660, "asm"]
    - [0xB9B90, "asm"]
    - [0xB9C80, "asm"]
    - [0xB9DC0, "asm"]
    - [0xB9DD0, "asm"]
    - [0xB9E20, "asm"]
    - [0xB9E60, "asm"]
    - [0xBA260, "asm"]
    - [0xBA270, "asm"]
    - [0xBA280, "asm"]
    - [0xBA300, "asm"]
    - [0xBA390, "asm"]
    - [0xBA4C0, "asm"]
    - [0xBA780, "asm"]
    - [0xBAD70, "asm"]
    - [0xBADD0, "asm"]
    - [0xBB2F0, "asm"]
    - [0xBB680, "asm"]
    - [0xBBA00, "asm"]
    - [0xBBE30, "asm"]
    - [0xBC8C0, "asm"]
    - [0xBCBF0, "asm"]
    - [0xBCD90, "asm"]
    - [0xBD0B0, "asm"]
    - [0xBD380, "asm"]
    - [0xBDD80, "asm"]
    - [0xBE220, "asm"]
    - [0xBE3E0, "asm"]
    - [0xBE450, "asm"]
    - [0xBE500, "asm"]
    - [0xBE640, "asm"]
    - [0xBE910, "asm"]
    - [0xBECD0, "asm"]
    - [0xBF030, "asm"]
    - [0xBF1A0, "asm"]
    - [0xBF1C0, "asm"]
    - [0xBF2B0, "asm"]
    - [0xBF450, "asm"]
    - [0xBF580, "asm"]
    - [0xBF5D0, "asm"]
    - [0xBF780, "asm"]
    - [0xBF810, "asm"]
    - [0xC01B0, "asm"]
    - [0xC2250, "asm"]
    - [0xC2570, "asm"]
    - [0xC30F0, "asm"]
    - [0xC3190, "asm"]
    - [0xC31B0, "asm"]
    - [0xC3560, "asm"]
    - [0xC3600, "asm"]
    - [0xC36F0, "asm"]
    - [0xC3940, "asm"]
    - [0xC39B0, "asm"]
    - [0xC3A00, "asm"]
    - [0xC3AB0, "asm"]
    - [0xC3B20, "asm"]
    - [0xC3B70, "asm"]
    - [0xC3E70, "asm"]
    - [0xC3FA0, "asm"]
    - [0xC40C0, "asm"]
    - [0xC4180, "asm"]
    - [0xC4210, "asm"]
    - [0xC4450, "asm"]
    - [0xC44F0, "asm"]
    - [0xC46D0, "asm"]
    - [0xC5E60, "asm"]
    - [0xC8320, "asm"]
    - [0xC8490, "asm"]
    - [0xC84D0, "asm"]
    - [0xC8880, "asm"]
    - [0xC8950, "asm"]
    - [0xC89A0, "asm"]
    - [0xC8B10, "asm"]
    - [0xC8B50, "asm"]
    - [0xC8B90, "asm"]
    - [0xC8ED0, "asm"]
    - [0xC8F40, "asm"]
    - [0xC9000, "asm"]
    - [0xC9060, "asm"]
    - [0xC9220, "asm"]
    - [0xC9270, "asm"]
    - [0xC9580, "asm"]
    - [0xC9600, "asm"]
    - [0xC9630, "asm"]
    - [0xCAAE0, "asm"]
    - [0xCACA0, "asm"]
    - [0xCB290, "asm"]
    - [0xCB4E0, "asm"]
    - [0xCB9E0, "asm"]
    - [0xCBA30, "asm"]
    - [0xCC6F0, "asm"]
    - [0xCC840, "asm"]
    - [0xCC960, "asm"]
    - [0xCCCB0, "asm"]
    - [0xCD190, "asm"]
  - [0x0CDAB0, "bin"] # TODO: figure out the rest of the file
  - [0x2000000]

The above is from Version 1 of the ROM. There is also Version 0, which I believe exhibits the same behaviour.

$ python3 tools/n64splat/util/rominfo.py baserom.ver1.us.z64
Image name: AIDYN_CHRONICLES
Country code: E - North America
Libultra version: D
CRC1: 112051D2
CRC2: 68BEF8AC
CIC: 6102 / 7101
RAM entry point: 0x80000400

Header:

.section .header, "a"

.word 0x80371240 /* PI PSD Domain 1 register */
.word 0x0000000F /* Clockrate setting */
.word 0x80000400 /* Entrypoint address */
.word 0x00001444 /* Revision */
.word 0x112051D2 /* Checksum 1 */
.word 0x68BEF8AC /* Checksum 2 */
.word 0x00000000 /* Unknown 1 */
.word 0x00000000 /* Unknown 2 */
.ascii "AIDYN_CHRONICLES    " /* Internal ROM name */
.word 0x00000000 /* Unknown 3 */
.word 0x0000004E /* Cartridge */
.ascii "AY" /* Cartridge ID */
.ascii "E" /* Country code */
.byte 01 /* Version */

Per-file RAM symbols

In Mario Party, the ROM has a pretty detailed table of information about each overlay. I'm submitting this issue to document the additional symbols that would need to be available to C code in order to represent this data.

The struct in Mario Party for overlay data is as follows:

struct overlay_info {
    // ROM offsets:
    u32 rom_start;
    u32 rom_end;

    // RAM addresses:
    u32 ram_start; 
    u32 code_start;
    u32 code_end;
    u32 data_start;
    u32 data_end;
    u32 bss_start;
    u32 bss_end;
};

The existing ld symbols generated cover the first 3 above, but there are currently no symbols for the rest.

I guess basically what I'm proposing is start and end vram symbols for every file, perhaps as an option if it is too verbose for most games.

In the MP1 repo in a manual ld file, I created symbols as follows, as an example.

 .ov054
 {
    build/src/overlays/ov054/main.o(.text);
    build/src/overlays/ov054/dk_jungle_adventure.o(.text);
    __ov054_data_start = .;
    build/src/overlays/ov054/main.o(.data);
    build/src/overlays/ov054/dk_jungle_adventure.o(.data);
    build/src/overlays/ov054/dk_jungle_adventure.o(.rodata);
    __ov054_bss_start = .;
}

(Technically I got by with fewer symbols, observing things like code_end == data_start, but that's probably not the assumption we would want in n64splat.)

For bss, I created a NOLOAD section in order to capture the end of the bss region.

.ov054_bss __ov054_bss_start (NOLOAD) :
{
    build/src/overlays/ov054/dk_jungle_adventure.o(.bss);
    __ov054_bss_end = .;
}

Issues with ascii data in generated data files

I'm trying out the feature to generate *.data.s and *.rodata.s files, since it looks like it will be really helpful. Unfortunately, some .ascii data in the generated files does not lead to matching output with the original ROM when I try this feature.

I think the main/only issue I'm seeing is that a single backslash \ character will not be emitted back into the ROM. I think those aren't getting escaped and are lost.

While not technically an issue, I am also seeing that it considers a lot of unprintable characters to be ascii values. (Values like 0x01, 0x02 etc.) These appear to emit fine, but don't show up in an editor. I am seeing the following bytes being picked up as a string for example, which is entirely unprintable: 00 01 04 05 08 0A 0C 0E 0F 10 01 02 03 06 07 09 0D 0B 11 01. My two cents would be to restrict the use of ascii to a sequence of only basic printable characters.
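Both points can be sketched in a few lines. The helper names are hypothetical, and "basic printable" is assumed here to mean the byte range 0x20 to 0x7E.

```python
def is_printable_ascii(data: bytes) -> bool:
    """True only if data is non-empty and every byte is a basic printable character."""
    return len(data) > 0 and all(0x20 <= byte <= 0x7E for byte in data)


def ascii_directive(data: bytes) -> str:
    """Emit an .ascii directive, escaping backslashes and double quotes."""
    escaped = []
    for byte in data:
        char = chr(byte)
        if char == "\\":
            escaped.append("\\\\")  # a single backslash must be written as two
        elif char == '"':
            escaped.append('\\"')
        else:
            escaped.append(char)
    return '.ascii "' + "".join(escaped) + '"'


if __name__ == "__main__":
    unprintable = bytes([0x00, 0x01, 0x04, 0x05, 0x08, 0x0A])
    print(is_printable_ascii(unprintable))
    print(ascii_directive(b"C:\\data"))
```

With this check, the unprintable byte run quoted above would fall back to a non-string directive, and a lone backslash would survive reassembly.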

Parameterise some hardcoded configurations, e.g. undefined_syms.txt

As we may wish to support different versions/regions of ROMs in the same repo, we (may) require different symbol configurations.

Currently the following files are hardcoded in splat:

  • undefined_syms.txt (split.py: undefined_syms_path = os.path.join(repo_path, "undefined_syms.txt"))
  • symbol_addrs.txt (split.py: func_addrs_path = os.path.join(repo_path, "tools", "symbol_addrs.txt"))

As a bonus, it would be nice if the "src" directory was also configurable, as my code is not mature enough to all live inside src/

                if split_file["subtype"] == "c":
                    c_path = os.path.join(
                        base_path, "src", split_file["name"] + "." + self.get_ext(split_file["subtype"]))
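A sketch of how these paths might be read from the config with the current hardcoded values as defaults. The option names undefined_syms_path, symbol_addrs_path, and src_path are hypothetical, and the sketch assumes repo_path and base_path refer to the same root.

```python
import os
from typing import Dict

# Defaults mirror the values currently hardcoded in split.py.
DEFAULT_PATHS = {
    "undefined_syms_path": "undefined_syms.txt",
    "symbol_addrs_path": os.path.join("tools", "symbol_addrs.txt"),
    "src_path": "src",
}


def resolve_paths(repo_path: str, options: Dict[str, str]) -> Dict[str, str]:
    """Return each configurable path, falling back to the default when unset."""
    return {
        key: os.path.join(repo_path, options.get(key, default))
        for key, default in DEFAULT_PATHS.items()
    }


if __name__ == "__main__":
    # Example: a per-version symbol file and a non-default source directory.
    opts = {"symbol_addrs_path": "symbol_addrs.us.txt", "src_path": "wip"}
    for name, path in resolve_paths("repo", opts).items():
        print(name, path)
```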

all* types

Add all* types that produce automatic linker entries for .data, .rodata, and .bss.

Fix linker output for data/rodata sections.

Currently, splat outputs subsections of types data and rodata as .s files to the asm/data directory.

However, the linker script which gets automatically generated expects the files to be at BUILD_DIR/data and BUILD_DIR/rodata respectively.

It would be easier for the linker script to resolve this than having a specific edge case in Makefiles.

A simple solution for this issue would be making the linker script generator simply point to BUILD_DIR/asm/data for both data and rodata types.

Support custom segments

We should add the ability to define the path to a directory of "extension" (game-specific) segments in the config file.

Better error reporting

Logging has been improved, and we should make errors clearer and easier to understand so problems are easier to fix.

  • Store errors in segments and then check all segments for errors at the end of the split process, informing the user appropriately what failed and why

Branch Likely instructions cause some incorrect %hi/%lo pairs to be generated

It seems that during disassembly, branch likely instructions trip up the hi lo pair generation. In this case for example:

/* 48054 80047454 5040000F */  beql       $v0, $zero, .L80047494
/* 48058 80047458 3C04F0FF */   lui       $a0, %hi(D_F0FF0001)
/* 4805C 8004745C 90820001 */  lbu        $v0, %lo(D_F0FF0001)($a0)

the lui into $a0 is only run if the branch is taken, so it will never be run with the following lbu, meaning the two values are unrelated and should not be treated as a hi/lo pair.
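A simplified sketch of the rule: a lui sitting in the delay slot of a branch likely instruction is not a %hi candidate for instructions on the fall-through path. The instruction representation (mnemonic, operands) and the function name are assumptions for illustration, not splat's internal model.

```python
from typing import List, Sequence, Tuple

# MIPS branch likely mnemonics: the delay slot only executes when the branch is taken.
BRANCH_LIKELY = {
    "beql", "bnel", "blezl", "bgtzl", "bltzl", "bgezl",
    "bltzall", "bgezall", "bc1tl", "bc1fl",
}


def fallthrough_lui_indices(instructions: Sequence[Tuple[str, str]]) -> List[int]:
    """Indices of lui instructions that may pair with a later %lo on the fall-through path."""
    candidates = []
    for index, (mnemonic, _operands) in enumerate(instructions):
        if mnemonic != "lui":
            continue
        in_likely_delay_slot = index > 0 and instructions[index - 1][0] in BRANCH_LIKELY
        if not in_likely_delay_slot:
            candidates.append(index)
    return candidates


if __name__ == "__main__":
    code = [
        ("beql", "$v0, $zero, .L80047494"),
        ("lui", "$a0, 0xf0ff"),   # delay slot of beql: skipped
        ("lbu", "$v0, 1($a0)"),
    ]
    print(fallthrough_lui_indices(code))
```

A lui in the delay slot of an ordinary branch such as beqz is still kept, since that delay slot always executes.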

requirements.txt needed

As we have many dependencies on non-standard library python modules, we should be providing a requirements.txt.

Linker file issues with `shiftable: True`

With the BSS updates, I've been able to get a lot closer to being able to edit certain parts of the ROM and support shiftability. I'm running into a couple issues with setting shiftable: True in the splat.yaml though.

  1. The .ld file gives an error undefined symbol __romPos referenced in expression immediately, because the first line is . = __romPos; and __romPos has never been initialized. It could be initialized to zero at the top to fix this.
  2. When non-16 byte alignment is used in a section, the linker_entry sometimes will do _end_block + _begin_segment and create multiple segments. However, it doesn't do __romPos += SIZEOF(.the_previous_segment); between each of these. This leads to an invalid __romPos and some data getting overwritten.

I also don't think the . = __romPos; lines are necessary with shiftable: True, but that doesn't seem to be a functional issue.
