
macho-kit's Introduction

What is Mach-O Kit?

Mach-O Kit is an Objective-C framework for parsing Mach-O binaries used by Darwin platforms (macOS, iOS, tvOS, and watchOS). The project also includes a lightweight C library - libMachO - for parsing Mach-O images loaded in the current process.

Mach-O Kit is designed to be easy to use while still exposing all the details of the parsed Mach-O file (if you need them). It can serve as the foundation for anything that needs to read Mach-O files - from a one-off command line tool up to a fully featured interactive disassembler. Most importantly, Mach-O Kit is designed to be safe. Every read operation and its returned data is extensively error checked so that parsing a malformed Mach-O file (even a malicious one) does not crash your program.

Projects Using Mach-O Kit

Getting Started

Mach-O Kit supports macOS 10.10+, iOS 9.0+, and tvOS 9.0+ (and possibly older versions).

Obtaining Mach-O Kit

Use a recursive git clone.

git clone --recursive https://github.com/DeVaukz/MachO-Kit

Installation

  1. Clone the Mach-O Kit repository into your application's repository.
cd MyGreatApp
git clone --recursive https://github.com/DeVaukz/MachO-Kit
  2. Drag and drop MachOKit.xcodeproj into your application’s Xcode project or workspace.
  3. On the “General” tab of your application target’s settings, add MachOKit.framework to the “Embedded Binaries” section.

Using Mach-O Kit

Before Mach-O Kit can begin parsing a file, you must first create an MKMemoryMap for the file. The memory map is used by the rest of Mach-O Kit to safely read the file's contents. Alternatively, an MKMemoryMap can be instantiated with a task port to parse a Mach-O image loaded in a process whose task port you possess.

let memoryMap = try! MKMemoryMap(contentsOfFile: URL(fileURLWithPath: "/System/Library/Frameworks/Foundation.framework/Foundation"))
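To make the idea of a checked read concrete, here is a minimal sketch in plain Swift. It does not use or reproduce MKMemoryMap; `ReadError`, `CheckedBytes`, and `readUInt32LE` are hypothetical names invented for this illustration, and the real memory map may work quite differently.

```swift
/// Hypothetical error type for this sketch only.
enum ReadError: Error {
    case outOfBounds(offset: Int, length: Int)
}

/// A stand-in for the idea behind a memory map: every read is validated
/// against the mapped range before any bytes are touched.
struct CheckedBytes {
    let bytes: [UInt8]

    /// Returns `length` bytes starting at `offset`, or throws if the
    /// requested range does not lie entirely inside the buffer.
    func read(offset: Int, length: Int) throws -> ArraySlice<UInt8> {
        guard offset >= 0, length >= 0, offset <= bytes.count,
              length <= bytes.count - offset else {
            throw ReadError.outOfBounds(offset: offset, length: length)
        }
        return bytes[offset ..< offset + length]
    }

    /// Reads a little-endian UInt32 at `offset`.
    func readUInt32LE(at offset: Int) throws -> UInt32 {
        let slice = try read(offset: offset, length: 4)
        // Fold from the most significant byte (the last one) down to the first.
        return slice.reversed().reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
    }
}
```

A truncated or malformed file then surfaces as a thrown error rather than an out-of-bounds access, which is the property the memory map is described as providing.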

If the file is a FAT binary, Mach-O Kit provides the MKFatBinary class for parsing the FAT header.

let fatBinary = try! MKFatBinary(memoryMap: memoryMap)

// Retrieve the x86_64 slice
let slice64 = fatBinary.architectures.first { $0.cputype == CPU_TYPE_X86_64 }

// Retrieve the offset of the x86_64 slice within the file
let slice64FileOffset = slice64!.offset
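For readers curious about what the FAT header contains, the following sketch parses it by hand from a byte array. It is not how MKFatBinary is implemented; `FatSlice`, `readBE32`, and `parseFatSlices` are hypothetical names. The layout assumed here is the one in <mach-o/fat.h>: a big-endian `fat_header` (magic, nfat_arch) followed by 20-byte `fat_arch` entries (cputype, cpusubtype, offset, size, align).

```swift
// Values from <mach-o/fat.h> and <mach/machine.h>, redeclared so the sketch is self-contained.
let fatMagic: UInt32 = 0xcafe_babe       // FAT_MAGIC
let cpuTypeX86_64: UInt32 = 0x0100_0007  // CPU_TYPE_X86_64

/// The three fat_arch fields this sketch cares about (hypothetical type).
struct FatSlice {
    let cputype: UInt32
    let offset: UInt32
    let size: UInt32
}

/// Reads a big-endian UInt32, or returns nil if the read would leave the buffer.
func readBE32(_ bytes: [UInt8], at offset: Int) -> UInt32? {
    guard offset >= 0, offset <= bytes.count - 4 else { return nil }
    return bytes[offset ..< offset + 4].reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
}

/// Parses the fat header and returns its slices, or nil if anything is out of range.
func parseFatSlices(_ file: [UInt8]) -> [FatSlice]? {
    guard readBE32(file, at: 0) == fatMagic,
          let count = readBE32(file, at: 4) else { return nil }
    var slices: [FatSlice] = []
    for index in 0 ..< Int(count) {
        let entry = 8 + index * 20  // fat_header is 8 bytes; each fat_arch is 20
        guard let cputype = readBE32(file, at: entry),
              let offset = readBE32(file, at: entry + 8),
              let size = readBE32(file, at: entry + 12),
              // Reject slices that claim to extend past the end of the file.
              UInt64(offset) + UInt64(size) <= UInt64(file.count) else { return nil }
        slices.append(FatSlice(cputype: cputype, offset: offset, size: size))
    }
    return slices
}
```

With this sketch, `parseFatSlices(bytes)?.first { $0.cputype == cpuTypeX86_64 }?.offset` plays the same role as `slice64FileOffset` above.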

You can now instantiate an instance of MKMachOImage. This class is the top-level parser for a Mach-O binary. MKMachOImage requires a memory map and an offset in the provided memory map to begin parsing. For a FAT binary, this is the file offset of the slice you want to parse. For in-process parsing, this is the load address of the Mach-O image which you can retrieve using the dyld_* APIs.

let macho = try! MKMachOImage(name: "Foundation", flags: .init(rawValue: 0), atAddress: mk_vm_address_t(slice64FileOffset), inMapping: memoryMap)

Retrieving Load Commands

Load commands can be retrieved from the loadCommands property of MKMachOImage. Each load command is represented by an instance of an MKLoadCommand subclass.

let loadCommands = macho.loadCommands

print(loadCommands)

Most classes in Mach-O Kit print verbose debug descriptions. MKLoadCommand is no exception.

// The above code outputs:
[
   ...
<MKLCLoadDylib 0x7fa647b36a30; contextAddress = 0x1f38; size = 104> {
	name.offset = 24
	timestamp = 1970-01-01 00:00:02 +0000
	current version = 1.0.0
	compatibility version = 1.0.0
	name = <MKLoadCommandString 0x7fa647b49080; contextAddress = 0x1f50; size = 80> {
		offset = 24
		string = /System/Library/Frameworks/DiskArbitration.framework/Versions/A/DiskArbitration
	}
},
   ...
]
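As a rough picture of the data behind loadCommands, the sketch below walks the load commands of a 64-bit, little-endian Mach-O image held in a byte array. It is an illustration only and not Mach-O Kit's implementation; `readLE32` and `listLoadCommands` are hypothetical names. It assumes the <mach-o/loader.h> layout: a 32-byte `mach_header_64` (with `ncmds` at byte 16 and `sizeofcmds` at byte 20) followed by commands that each begin with `cmd` and `cmdsize`.

```swift
let machMagic64: UInt32 = 0xfeed_facf  // MH_MAGIC_64

/// Reads a little-endian UInt32, or returns nil if the read would leave the buffer.
func readLE32(_ bytes: [UInt8], at offset: Int) -> UInt32? {
    guard offset >= 0, offset <= bytes.count - 4 else { return nil }
    return bytes[offset ..< offset + 4].reversed().reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
}

/// Returns the type, size, and file offset of every load command,
/// or nil as soon as any field is inconsistent with the buffer.
func listLoadCommands(_ image: [UInt8]) -> [(cmd: UInt32, size: UInt32, offset: Int)]? {
    let headerSize = 32  // sizeof(struct mach_header_64)
    guard readLE32(image, at: 0) == machMagic64,
          let ncmds = readLE32(image, at: 16),
          let sizeofcmds = readLE32(image, at: 20),
          Int(sizeofcmds) <= image.count - headerSize else { return nil }

    let end = headerSize + Int(sizeofcmds)
    var cursor = headerSize
    var commands: [(cmd: UInt32, size: UInt32, offset: Int)] = []
    for _ in 0 ..< ncmds {
        guard let cmd = readLE32(image, at: cursor),
              let size = readLE32(image, at: cursor + 4),
              size >= 8,                       // must at least hold cmd and cmdsize
              Int(size) <= end - cursor else { // must stay inside the command area
            return nil
        }
        commands.append((cmd: cmd, size: size, offset: cursor))
        cursor += Int(size)
    }
    return commands
}
```

The `offset` and `size` values correspond to what the debug description above prints as `contextAddress` and `size` for each command.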

Dependent Libraries

If you just want to inspect the libraries that a Mach-O binary links against, MKMachOImage includes a dependentLibraries property that returns an array of MKDependentLibrary instances. MKDependentLibrary provides a slightly higher-level interface than inspecting the load commands directly.

// Prints the names of all the libraries that Foundation links against
for library in macho.dependentLibraries {
	print(library.value!.name)
}
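The `current version` and `compatibility version` fields shown in the MKLCLoadDylib dump earlier are stored on disk as packed 32-bit values in X.Y.Z form: 16 bits of major version, then 8 bits each of minor and patch. The helper below is a small sketch of that decoding; `formatDylibVersion` is a hypothetical name and not part of Mach-O Kit.

```swift
/// Decodes a packed dylib version (xxxx.yy.zz, as stored in a dylib load command)
/// into its dotted string form.
func formatDylibVersion(_ packed: UInt32) -> String {
    let major = packed >> 16
    let minor = (packed >> 8) & 0xff
    let patch = packed & 0xff
    return "\(major).\(minor).\(patch)"
}
```

For example, the packed value 0x00010000 decodes to the string 1.0.0, matching the dump above.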

Objective-C Metadata

Mach-O Kit has complete support for parsing Objective-C metadata. Here is how to print the names of all Objective-C classes in a Mach-O binary:

for (_, section) in macho.sections {
	// Mach-O Kit instantiates a specialized subclass of MKSection when it encounters a section containing Objective-C class list metadata
	guard let section = section as? MKObjCClassListSection else { continue }
	
	for clsPointer in section.elements {
		// The __objc_classlist and __objc_nlclslist sections are just lists of pointers to class structures in the data section
		guard let cls = clsPointer.pointee.value else { continue }
		// The pointer to the class name is stored in the class data structure
		guard let clsData = cls.classData.pointee.value else { continue }
		// Finally, the name is a pointer to a string in the strings section
		guard let clsName = clsData.name.pointee.value else { continue }
		
		print(clsName)
	}
}

Status

Mach-O Kit currently supports executables, dynamic shared libraries (dylibs and frameworks), and bundles. Parsing of the following is fully or partially implemented:

  • Containers
    • FAT Binary ✔
    • DYLD Shared Cache (needs further testing)
  • Mach-O
    • Header ✔
    • Load Commands ✔ except
      • LC_SYMSEG
      • LC_THREAD
      • LC_UNIXTHREAD
      • LC_LOADFVMLIB
      • LC_IDFVMLIB
      • LC_IDENT
      • LC_FVMFILE
      • LC_PREPAGE
      • LC_PREBOUND_DYLIB
      • LC_SUB_UMBRELLA
      • LC_LINKER_OPTIMIZATION_HINT
    • Segments and Sections ✔
      • Strings Section ✔
      • Pointer List Section ✔
      • Data Section ✔
      • Stubs Section ✔
      • Indirect Pointers Section ✔
    • Rebase Information ✔
      • Commands ✔
      • Fixups ✔
    • Bindings ✔
      • Standard ✔
      • Weak ✔
      • Lazy ✔
      • Threaded ✔ (needs further testing)
    • Exports Information ✔
    • Function Starts ✔
    • Segment Split Info
      • V1 ✔
    • Data in Code Entries ✔
    • Symbols ✔
      • STABS: All stabs can be parsed by Mach-O Kit (because all stabs are symbols). Specialized subclasses with refined API are only provided for the subset of stab types that are emitted by Apple's modern development tools.
      • Undefined Symbols ✔
      • Common Symbols ✔
      • Absolute Symbols ✔
      • Section Symbols ✔
      • Alias Symbols ✔
    • Indirect Symbols ✔
  • ObjC Metadata
    • Image Info ✔
    • Classes ✔
    • Protocols ✔
    • Methods ✔
    • Properties ✔
    • Instance Variables ✔
    • Categories ✔
    • ObjC-Specific Sections
      • __objc_imageinfo
      • __objc_selrefs
      • __objc_superrefs
      • __objc_protorefs
      • __objc_classrefs
      • __objc_classlist / __objc_nlclslist
      • __objc_catlist / __objc_nlcatlist
      • __objc_protolist
      • __objc_ivar
      • __objc_const
      • __objc_data
  • CF Data
    • CFString ✔
    • CF-Specific Sections
      • __cfstring

libMachO

libMachO is a lightweight C library for safely parsing Mach-O images loaded into a process. You can use libMachO to parse Mach-O images in your own process or in any process whose task port your process possesses.

As with Mach-O Kit, access to memory by libMachO is mediated by a memory map. All memory access is checked to prevent parsing a malformed Mach-O image from crashing the parser. Included are memory maps for reading from the current process or from a task port. Any differences between the target architecture of the Mach-O image and the process hosting libMachO are handled transparently.
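One of the differences that has to be handled is byte order. The sketch below shows the general technique, which is to look at the header magic as read in host byte order and swap later fields only when the magic appears reversed. It is written in Swift for consistency with the other examples and is not libMachO's code; `ImageByteOrder`, `detectByteOrder`, and `hostValue` are hypothetical names. The magic constants are the ones defined in <mach-o/loader.h>.

```swift
enum ImageByteOrder {
    case sameAsHost
    case swapped
}

/// Classifies an image by the magic value read from its first four bytes
/// in host byte order. Returns nil for anything that is not a Mach-O magic.
func detectByteOrder(magicReadInHostOrder magic: UInt32) -> ImageByteOrder? {
    switch magic {
    case 0xfeed_face, 0xfeed_facf:  // MH_MAGIC, MH_MAGIC_64
        return .sameAsHost
    case 0xcefa_edfe, 0xcffa_edfe:  // MH_CIGAM, MH_CIGAM_64
        return .swapped
    default:
        return nil
    }
}

/// Converts a 32-bit field read from the image into host byte order.
func hostValue(_ raw: UInt32, order: ImageByteOrder) -> UInt32 {
    return order == .swapped ? raw.byteSwapped : raw
}
```

Pointer width (32-bit versus 64-bit images) is a separate difference that this sketch does not cover.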

To keep the library lightweight, libMachO overlays itself atop the Mach-O image and provides a set of APIs for reading the underlying Mach-O data structures. libMachO does not build up its own independent representation of the Mach-O image, opting to continuously walk the Mach-O structures to access requested data. A consequence of this design is that libMachO generally expects well-formed Mach-O images.

libMachO does not perform any dynamic memory allocation. Clients are responsible for allocating buffers which are then initialized by the functions called in libMachO. Consequently, the lifetimes of these buffers must be managed by clients.

License

Mach-O Kit is released under the MIT license. See LICENSE.md.

macho-kit's People

Contributors

devaukz, k06a, kabiroberai, milend, ryandesign


macho-kit's Issues

__cfstring Section

Is the __DATA segment's __cfstring section not implemented? Please excuse me if I'm missing something obvious.

I'd like to parse this section's CFString elements.

Can not find MKDataModel+Layout_Internal.h

Hi, I just cloned the repo into my project and added the framework under General, but when I run it, Xcode reports that it cannot find MKDataModel+Layout_Internal.h, which is imported by MKNodeFieldCPUType.m. Could you tell me what I should do?
Is it because my project is not an app?
Error:
/Users/bytedance/Desktop/oc/MachO-Kit/MachOKit/Shared/Type/CPUType/MKNodeFieldCPUType.m:31:9: 'MKDataModel+Layout_Internal.h' file not found

Modernize the dyld shared cache parser

The dyld shared cache parser has not been given much attention the past few years. Ensure that it can still parse the modern caches. Modernize the implementations of -layout: for all DSC classes.

The node description system should be reworked further

Some problems with the current implementation:

  • It is difficult to (re)construct the C structure layout for a node. This is needed when implementing a disassembler, where you want to display structures overlaid on top of the raw binary data.

  • Node layouts are re-created with each call to -layout. Layouts should be cached. Many layouts are the same for all instances of a node type and could be defined at the class (vs instance) level.

  • The current mechanism for describing bitfields is less than ideal.

  • The type of an MKNodeField is attempting to describe both the semantic type of the field as well as the data type. These should be split.

  • MKNodeFieldOptions should be removed. It was added to simplify the implementation of Mach-O Explorer, with the justification that other viewer apps built with Mach-O Kit could also use it. But its purpose is too narrow. It does not belong in Mach-O Kit.

Can I get a category's class, such as its name?

I read the "__objc_catlist" section data and then used clsPointer.name; the value is the category name. I also want to get the category's class info.
For example:
// The source code is:

@interface UIImage (AFNetworkingSafeImageLoading)
+ (UIImage *)af_safeImageWithData:(NSData *)data;
@end

static NSLock* imageLock = nil;

@implementation UIImage (AFNetworkingSafeImageLoading)

+ (UIImage *)af_safeImageWithData:(NSData *)data {
    UIImage* image = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        imageLock = [[NSLock alloc] init];
    });
    
    [imageLock lock];
    image = [UIImage imageWithData:data];
    [imageLock unlock];
    return image;
}

@end

Using macho-kit, the result is AFNetworkingSafeImageLoading, but I also want to get the class UIImage. How can I get that info?

Handle Chained Fixups

MachO binaries built to target the latest betas (iOS 15/macOS 12/etc) – or, if built for arm64e, even some older versions – replace the old fixup format with a new, more compact, "chained" format. The new format is documented in <mach-o/fixup-chains.h>, and there's also a Medium article that summarises it.

This change, in conjunction with our discussion from #13, makes me think that it might be best if MachO-Kit provided an API to query an MKPointer and "resolve" it (either returning the resolved address if found, or otherwise metadata about the symbol) if it has an associated fixup, to abstract the implementation details of the fixups themselves.
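To give a sense of what "resolving" a chained pointer involves, here is a sketch that decodes one 64-bit chained pointer value. It is not a proposed Mach-O Kit API; `ChainedFixup` and `decodeChainedPtr64` are hypothetical names. The bit layout is assumed from the `dyld_chained_ptr_64_rebase` and `dyld_chained_ptr_64_bind` structures in <mach-o/fixup-chains.h> (the DYLD_CHAINED_PTR_64 format); arm64e images use other layouts that are not handled here.

```swift
/// Decoded form of a DYLD_CHAINED_PTR_64 value (hypothetical type for this sketch).
enum ChainedFixup: Equatable {
    /// A rebase: `target` is the low 36 bits, `high8` the top byte to restore.
    case rebase(target: UInt64, high8: UInt8, next: UInt16)
    /// A bind: `ordinal` indexes the imports table, `addend` is added to the symbol address.
    case bind(ordinal: UInt32, addend: UInt8, next: UInt16)
}

/// Splits a raw 64-bit chained pointer into its fields.
/// Assumed layout, from least to most significant bit:
///   rebase: target:36, high8:8, reserved:7,  next:12, bind:1
///   bind:   ordinal:24, addend:8, reserved:19, next:12, bind:1
func decodeChainedPtr64(_ raw: UInt64) -> ChainedFixup {
    // `next` is the distance to the following fixup in the chain, in 4-byte strides;
    // zero marks the end of the chain.
    let next = UInt16((raw >> 51) & 0xfff)
    let isBind = (raw >> 63) == 1
    if isBind {
        return .bind(ordinal: UInt32(raw & 0xff_ffff),
                     addend: UInt8((raw >> 24) & 0xff),
                     next: next)
    }
    return .rebase(target: raw & 0xf_ffff_ffff,
                   high8: UInt8((raw >> 36) & 0xff),
                   next: next)
}
```

A resolver along the lines suggested above would then turn a `.bind` into symbol metadata by looking the ordinal up in the imports table, and a `.rebase` into a target address.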

Example of symbols enumeration

Can you provide a short example of how to enumerate symbols from the export table?

I can't work out where to get all the arguments for mk_string_table_enumerate_strings.

Error on getting String Table

I'm trying to analyze a Mach-O file, but when I try to get the String Table info, I get an error saying "Image does not have a __LINKEDIT segment." The file clearly does have a __LINKEDIT segment.

Unclear installation instructions

I added the project to my current Xcode project and created a new workspace. I opened the workspace and can build my project. However, the first line of the example:

let memoryMap = try! MKMemoryMap(contentsOfFile: URL(fileURLWithPath: "/System/Library/Frameworks/Foundation.framework/Foundation"))

gives me an error:

ld: warning: Auto-Linking supplied '/Users/andermoran/Desktop/Detective-C/MachOKit.framework/MachOKit', file was built for arm64 which is not the architecture being linked (x86_64): /Users/andermoran/Desktop/Detective-C/MachOKit.framework/MachOKit
Undefined symbols for architecture x86_64:
  "_OBJC_CLASS_$_MKMemoryMap", referenced from:
      objc-class-ref in ViewController.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Could you clarify how this is supposed to be set up? An Objective-C code example would also be nice :)

Handle “Small” ObjC Method Lists

MachO files built to target the latest release OSes (iOS 14/macOS 11/etc) use a new "small" format for ObjC method lists, which contain relative 32-bit offsets to methods instead of absolute addresses. Implementing the relative pointer format could also help with Swift ABI support, since Swift does something similar. For implementation details, check out RelativePointer, method_t, and method_list_t in objc-runtime-new.mm.
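The core of the relative format can be sketched in a few lines: each field holds a signed 32-bit offset, and the target address is the address of the field itself plus that offset. The code below illustrates this under the assumption that a small method entry is three consecutive 32-bit relative offsets (name, types, imp) and that the small format is flagged by the top bit of the method list's `entsizeAndFlags`, as in objc-runtime-new. `resolveRelative`, `SmallMethod`, and `resolveSmallMethod` are hypothetical names, not Mach-O Kit API.

```swift
/// Flag bit assumed to mark a "small" method list in entsizeAndFlags.
let smallMethodListFlag: UInt32 = 0x8000_0000

/// Resolves a relative pointer: the target is the address of the field
/// holding the offset, plus the signed 32-bit offset stored there.
func resolveRelative(fieldAddress: UInt64, offset: Int32) -> UInt64 {
    return UInt64(bitPattern: Int64(bitPattern: fieldAddress) &+ Int64(offset))
}

/// Resolved addresses for one small method entry (hypothetical type).
struct SmallMethod {
    let name: UInt64
    let types: UInt64
    let imp: UInt64
}

/// Resolves the three relative fields of a 12-byte small method entry
/// located at `entryAddress`. Each field is relative to its own address.
func resolveSmallMethod(entryAddress: UInt64,
                        nameOffset: Int32,
                        typesOffset: Int32,
                        impOffset: Int32) -> SmallMethod {
    return SmallMethod(
        name: resolveRelative(fieldAddress: entryAddress, offset: nameOffset),
        types: resolveRelative(fieldAddress: entryAddress &+ 4, offset: typesOffset),
        imp: resolveRelative(fieldAddress: entryAddress &+ 8, offset: impOffset))
}
```

What each resolved address points at (for example, whether `name` refers to a selector or to a selector reference) is defined by the runtime source and is not modeled here.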

Missing include

It looks like context.h should include logging.h, since it uses mk_logger_c at line 50. Right now everything works only because of the current include order.

Improve testing

Most of Mach-O Kit's testing involves using Mach-O Kit to parse every framework on the current host and comparing the output against otool. This approach comes with a number of deficiencies:

  • Lack of test coverage against older (or newer, depending on the host) Mach-O binaries.
  • Lack of test coverage against iOS Mach-O binaries.
  • Lack of test coverage against malformed or malicious binaries.

A handful of binaries from all of the above categories should be selected for a 'core' test pass that can run in CI.

Infinite recursion in core.h when __cplusplus defined

It looks like line 290 in core.h recurses infinitely. The line produces the following warning:

libMachO/libMachO/Core/core.h:290:57: All paths through this function will call itself

core.h:290

uint8_t* swap (uint8_t *input, size_t length) const { return swap(input, length); }

Some wrong libMachO includes in MachOKit

Looks like here MachOKit/MachOKit/_MKFileMemoryMap.h:28:

#include <MachOKit/macho.h>

should be:

#include <libMachO/macho.h>

Also should be fixed in:
MachOKit/MachOKit/MachOKit.h:34
MachOKit/MachOKit/SharedTypes/MKBackedNode+Pointer.h:28

Replace @import with #import for support of .mm files

Hi, I created an empty project, added the MachO-Kit framework to it, and put #import <MachOKit/MachOKit.h> in my .m file.

Everything was fine until I changed the extension to .mm.
As I found out, there are problems with importing libraries via @import when using C++.
I tried a number of things, e.g. passing the flags -fmodules and -fcxx-modules, but nothing helped.

Later, I found that the fix is to replace every occurrence of @import Foundation with #import <Foundation/Foundation.h> in MachO-Kit.
After this change, everything works fine (it builds, and I'm still able to parse Mach-O files even with the .mm extension).

Will this change break anything?
If there are no problems with it, could you please apply the fix?
