Code Monkey home page Code Monkey logo

ghidradostoolbox's Introduction

Ghidra Dos Toolbox

This package contains various scripts to help analyze old DOS programs.

For now this is a script (ResolveDosSyscalls) and an analyzer (DosSyscallAnalyzer) which will map software interrupts (INT) to a function name.

The mapping is contained in a file, and the standard file is x86_msdos6_interrupt_functions.

There is also a Ghidra Data Type (dos_vs6_16) for known DOS structures. This will be included into the program on use.

The function list will most likely never be complete and programs may chain/replace interrupts with functions of their own.

Special notes about DOS programs

Many of the functions uses the CPU Carry Flag (CF) to return error or not. The decompiler may choke on that, because it has no idea how to handle that.

In case of error AX will often contain an error code, regardless of return register. Most functions therefore have AX as a return value, for those situations.

Installation

This will install as a standard Ghidra extension. So download the zip file with the package matching your Ghidra version. In Ghidra choose File -> Install extensions, select the Plus at the top to add a new one and point to the downloaded file.

Experimental loader

I have created an experimental loader, which I think better match the way a DOS program works. You can always choose to use Ghidra's loader instead, but remember that once loaded you have to start over, to load with another loader.

The loader offers options to create fixed memory area Interrupt Vector Table and Bios Area. Those will simply be empty areas if created.

This is mostly meant for my own experiments with Ghidra functionality. This should be handled/added at a later stage, when it is "detected" that the program actually uses those areas.

Changes to loader

  • Create PSP segment before program (not populated)
  • Show the image header as with new PE format.
  • Change the way assumption about CS is done (local to each segment)
  • Don't try to fix segment, just because there is specific byte (RETF) in the first 16 bytes.
  • Fixed position of the ss segment. Commented code to create the stack block, did it correct.

KNOWN BUGS/CAVEAT

Rerunning/reanalyzing with a different/updated interrupt file will give problems as the functions already identified will be kept and reused even though its place move.

There may be problems with pointers. I have not yet fully tested that those are handled properly.

Unused entries in the interrupt list is not checked for errors in return value or parameters.

Wanna help?

If you want to help I will happily accept any contribution in making the interrupt list more complete.

You don't have to make pull requests, just post your updates. Remember any new structs.

Also if you have access to old compiler libraries, tools or other resource used in creating DOS programs, chances are that it would interesting.

Don'ts

  • Don't ask for specific interrupt/functions/subfunctions to be added.
  • Don't report missing interrupt/functions/subfunctions as bugs. It is a feature for you to expand on.
  • Don't comment on bad java code (in your opinion). No one is forcing you to use this and most likely I know already.

The interrupt function file

The interrupt file has the following format separated with space:

  • Interrupt no (in hex)
  • [Register/]Function no (in hex) or --
  • [Register/]Subfunction no (in hex) or -- or ??
  • Name of function
  • Return value of the function or void (no spaces)
  • Parameters to the function or void (no spaces)

The default register for the function code is AH, and subfunction is AL but adding Register/ in front will change it to that register.

The list must be sorted in decreasing order of interrupt, function and subfunction. This is to make allocation of arrays simple, as the program can just allocate the first number (highest number seen).

The value of ?? means that any (other) value will be mapped to this function. Normally used in subfunctions where a number of values map to the same function.

The program will only accept 4095 functions in a list. Should the file grow beyond that, the SYSCALL_SPACE_LENGTH must be increased.

Any function with Terminate in the name will be marked as non returning function.

Return value is defined as <Returntype>[:<Register>]+

As Ghidra only recognize one return value, in case of multiple register return, a struct must be created to contain the values. The order of fields should simply match the order of register in the list.

The parameter list consist of <Name>:<Datatype>[:<Register>]+ or void

Examples

21 5D 0B DosGetSwapableDataAreas R_DATASWAPAREAS:AX:DS:SI void

This is interrupt 21 (DOS functions), function 5D (passed in AH), subfunction 0B (passed in AL) which map to the function DosGetSwapableDataAreas which return a status code and a list

10 12 BL/35 VideoAltFuncSelDisplaySwitchInterface byte:AL new_state:byte:AL,save_area:byte[128]*:ES:DX

This in interrupt 10 (Video functions), function 12 (AH), subfunction 35 (BL) which map to VideoAltFuncSelDisplaySwitchInterface which return a status byte and takes new state to set, and a 128 byte area pointer passed in ES:DX register pair.

And a simple one

20 -- -- DosTerminate void void

Every instance of INT 20 will be directed at DosTerminate, regardless of registers.

Supporting other DOS/Bios versions

To support/handle other DOS versions simply create a file which maps to other functions.

Then copy the script and/or the analyzer and change the name passed to the class doing the work.

TODO:

  • Call fixup at INT locations, to remove swi() disassembly (waiting for new system in Ghidra)
  • Fixed memory regions like screen ram, ROMS and so on.
  • I/O ports. DOS has much more I/O than standard programs to day, even non device drivers.
  • Find common library function from various compilers without access to the compiler.
  • Find common library functions from game companies.
  • Make "assumption" about segment values (CS, DS, ES, SS)
  • Pointer handling together with the assumptions above.
  • DOS4/GW handling and extended memory EMM/EMS
  • Figure out how to make the interrupt file list "selectable".
  • Change the namespace?
  • Make the loader handle load of program in program (emulate DosExecute... for example or for overlays).

ghidradostoolbox's People

Contributors

gravelbones avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

ghidradostoolbox's Issues

Handling of CONCAT22 in syscall wrappers

Hi! This is a really awesome and very helpful toolbox, and has made my janky python script to insert 0x21 handlers totally unnecessary!

I was wondering if there was a way to automatically handle stack pointers to syscalls. It's entirely possible I'm doing something incorrect, but when looking at a syscall-wrapping function (which I assume comes from a stdlib of sorts for my executable) I am getting Ghidra disassembly that looks like:


/* WARNING: Unable to track spacebase fully for stack */

word __stdcall FUN_1c6f_3e99(void)

{
  word wVar1;
  byte in_CL;
  undefined2 unaff_BP;
  undefined2 in_SS;
  undefined2 in_DS;
  undefined in_CF;
  
  *(undefined2 *)((short)&stack0x00000000 + -2) = unaff_BP;
  swi(0x21);
  wVar1 = DosOpenFile((char *)CONCAT22(in_DS,*(undefined2 *)((short)&stack0x00000000 + 4)),0,in_CL);
  if ((bool)in_CF) {
    wVar1 = 0xffff;
  }
  return wVar1;
}

It's correctly getting the DosOpenFile reference, but it's unable to see that the second half of the pointer ((short)&stack0x00000000 + 4) is part of the function call. All of the storage for DosOpenFile seems correct, and it's definitely working with that, but somehow the function call isn't matching up and it's trying to turn it into a full char* rather than the custom storage one. (I think?)

I recognize it might be out of the purview of GhidraDosToolbox to address this, but is this an issue you've encountered before and might have an idea of how to solve? Thanks again for your hard work on this, and any tips you might have.

Incompatible with Ghidra 10.2.2

The current ghidra_10.0.4_PUBLIC_20211229_GhidraDosToolbox.zip is not compatible with the current Ghidra version 10.2.2.

What would be needed to create a new compatible version?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.