
ruby-macho's Introduction

ruby-macho


A Ruby library for examining and modifying Mach-O files.

What is a Mach-O file?

The Mach-O file format is used by macOS and iOS (among others) as a general purpose binary format for object files, executables, dynamic libraries, and so forth.

Installation

ruby-macho can be installed via RubyGems:

$ gem install ruby-macho

Documentation

Full documentation is available on RubyDoc.

A quick example of what ruby-macho can do:

require 'macho'

file = MachO::MachOFile.new("/path/to/my/binary")

# get the file's type (object, dynamic lib, executable, etc)
file.filetype # => :execute

# get all load commands in the file and print their offsets:
file.load_commands.each do |lc|
  puts "#{lc.type}: offset #{lc.offset}, size: #{lc.cmdsize}"
end

# access a specific load command
lc_vers = file[:LC_VERSION_MIN_MACOSX].first
puts lc_vers.version_string # => "10.10.0"

What works?

  • Reading data from x86/x86_64/PPC Mach-O files
  • Changing the IDs of Mach-O and Fat dylibs
  • Changing install names in Mach-O and Fat files
  • Adding, deleting, and modifying rpaths.

What needs to be done?

  • Unit and performance testing.

Contributing, setting up overcommit and the linters

In order to keep the repo, docs and data tidy, we use a tool called overcommit to connect the git hooks to a set of quality checks. The fastest way to get set up is to run the following to make sure you have all the tools:

gem install overcommit bundler
bundle install
overcommit --install

Attribution

License

ruby-macho is licensed under the MIT License.

For the exact terms, see the license file.

ruby-macho's People

Contributors

apainintheneck, bo98, branchvincent, brewtestbot, carlocab, dependabot-preview[bot], dependabot-support, dependabot[bot], issyl0, jonchang, leonklingele, mikemcquaid, mistydemeo, moisan, p-linnane, rickmark, rylan12, uniqmartin, woodruffw

ruby-macho's Issues

MachOView needs to support lazy loading for content

As we start to parse the contents of various regions, it would be helpful not to hold @raw_data in memory, but to materialize it as needed in MachOView.

This can be accomplished by using the new file reference and seek / read to pull in the data lazily.

Changing install name can break the binary

MachOFile#change_install_name corrupts load commands in the sense that it doesn't preserve the type of the command when changing the name. This will happen rarely, but when it does, it usually changes semantics and thus breaks the binary. (Sorry I missed this earlier in review.)

Steps to reproduce:

$ echo "void foo() {}" > lib.c
$ clang -dynamiclib -o libtest.dylib -Wl,-reexport-lz lib.c
$ otool -l libtest.dylib > otool-old.log
$ ruby -Ilib -rmacho -e 'MachO::Tools.change_install_name("libtest.dylib", "/usr/lib/libz.1.dylib", "does_not_matter")'
$ otool -l libtest.dylib > otool-new.log

Notice how LC_REEXPORT_DYLIB has changed to LC_LOAD_DYLIB:

diff --git 1/otool-old.log 2/otool-new.log
index 55435787..62636422 100644
--- 1/otool-old.log
+++ 2/otool-new.log
@@ -109,9 +109,9 @@ Load command 8
   cmdsize 16
   version 0.0
 Load command 9
-          cmd LC_REEXPORT_DYLIB
-      cmdsize 48
-         name /usr/lib/libz.1.dylib (offset 24)
+          cmd LC_LOAD_DYLIB
+      cmdsize 40
+         name does_not_matter (offset 24)
    time stamp 2 Thu Jan  1 01:00:02 1970
       current version 1.2.5
 compatibility version 1.0.0

cc @woodruffw

1.1 release

Copied over from 1.0 (didn't make the cut):

  • Broader LC creation/serialization support (important)
  • More fully featured command-line utilities (nice-to-have)

Some new ideas:

  • I/O and object allocation optimization
  • Expansive to_hash/to_h support.
  • Refactoring MachOStructure to behave more like a DSL (no breaking changes) (see #70)
  • Fat file merging (similar to lipo(1)) (#68)
  • Aggressive delegation and documentation simplification (#69)

Fat parser should fail on mismatched CPU types

Currently, ruby-macho will happily parse a fat Mach-O whose fat_archs and internal slices have mismatching (i.e., not 1-to-1) CPU types and subtypes.

Reference this LLVM object: https://github.com/llvm-mirror/llvm/blob/master/test/Object/Inputs/macho-invalid-fat_cputype

Observed behavior:

>> MachO.open('macho-invalid-fat_cputype')
#<MachO::FatFile:0x000055776df7b0e0 @filename="macho-invalid-fat_cputype", @options={}, @raw_data="\xCA\xFE\xBA\xBE\x00\x00\x00\x01\x00\x00\x00\f\x00\x00\x00\x00\x00\x00\x00\x1C\x00\x00\x00\x1C\x00\x00\x00\x02\xCE\xFA\xED\xFE\a\x00\x00\x00\x03\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", @header=#<MachO::Headers::FatHeader:0x000055776df7af50 @magic=3405691582, @nfat_arch=1>, @fat_archs=[#<MachO::Headers::FatArch:0x000055776df7ae10 @cputype=12, @cpusubtype=0, @offset=28, @size=28, @align=2>], @machos=[#<MachO::MachOFile:0x000055776df7ad20 @filename=nil, @options={}, @raw_data="\xCE\xFA\xED\xFE\a\x00\x00\x00\x03\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", @endianness=:little, @header=#<MachO::Headers::MachHeader:0x000055776df7ab18 @magic=4277009102, @cputype=7, @cpusubtype=3, @filetype=1, @ncmds=0, @sizeofcmds=0, @flags=0>, @load_commands=[]>]>

Expected behavior:

An exception.
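
As a minimal sketch of the check being asked for (not ruby-macho's actual implementation), the fat_arch entries can be compared pairwise against the headers of the slices they point to. The struct names, the error class, and `check_fat_consistency!` are hypothetical, and a real check would likely need to account for capability bits in cpusubtype:

```ruby
# Hypothetical stand-ins for a parsed fat_arch entry and a slice's Mach-O header.
FatArchInfo = Struct.new(:cputype, :cpusubtype)
SliceHeader = Struct.new(:cputype, :cpusubtype)

class FatArchMismatchError < StandardError; end

# Raise if any fat_arch entry disagrees with the header of the slice it describes.
def check_fat_consistency!(fat_archs, slice_headers)
  fat_archs.zip(slice_headers).each_with_index do |(arch, header), i|
    next if header && arch.cputype == header.cputype && arch.cpusubtype == header.cpusubtype

    raise FatArchMismatchError,
          "slice #{i}: fat_arch says #{arch.cputype}/#{arch.cpusubtype}, " \
          "header says #{header&.cputype}/#{header&.cpusubtype}"
  end
  true
end

# Values taken from the inspected object above: fat_arch 12/0 vs. header 7/3.
begin
  check_fat_consistency!([FatArchInfo.new(12, 0)], [SliceHeader.new(7, 3)])
rescue FatArchMismatchError => e
  puts "rejected: #{e.message}"
end
```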

New load commands: LC_DYLD_EXPORTS_TRIE and LC_DYLD_CHAINED_FIXUPS

Apple snuck these in on us at some point. They're 0x33 and 0x34 (masked with LC_REQ_DYLD), respectively, and should be mapped to LinkeditDataCommand:

#define LC_DYLD_EXPORTS_TRIE (0x33 | LC_REQ_DYLD) /* used with linkedit_data_command, payload is trie */
#define LC_DYLD_CHAINED_FIXUPS (0x34 | LC_REQ_DYLD) /* used with linkedit_data_command */
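
For reference, the resulting command values can be computed as below. This is only an illustrative sketch: the lookup table and its symbol values are hypothetical and do not reflect how ruby-macho stores its mappings.

```ruby
# LC_REQ_DYLD marks load commands that dyld must understand.
LC_REQ_DYLD = 0x80000000

LC_DYLD_EXPORTS_TRIE   = 0x33 | LC_REQ_DYLD
LC_DYLD_CHAINED_FIXUPS = 0x34 | LC_REQ_DYLD

# Hypothetical lookup table; the class name is kept as a symbol for illustration.
NEW_LINKEDIT_COMMANDS = {
  LC_DYLD_EXPORTS_TRIE   => :LinkeditDataCommand,
  LC_DYLD_CHAINED_FIXUPS => :LinkeditDataCommand,
}.freeze

NEW_LINKEDIT_COMMANDS.each do |cmd, klass|
  puts format("0x%08x -> %s", cmd, klass)
end
```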

Dependabot can't resolve your Ruby dependency files

Dependabot can't resolve your Ruby dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

Bundler::VersionConflict with message: Bundler could not find compatible versions for gem "ruby":
  In Gemfile:
    ruby (~> 2.0.0.0)

    rubocop (<= 0.57.0, >= 0.56.0) was resolved to 0.57.0, which depends on
      ruby (>= 2.1.0)

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

You can mention @dependabot in the comments below to contact the Dependabot team.

`FatArch64`

It looks like Apple decided to introduce a new fat_arch without telling anybody:

/*
 * The support for the 64-bit fat file format described here is a work in
 * progress and not yet fully supported in all the Apple Developer Tools.
 *
 * When a slice is greater than 4mb or an offset to a slice is greater than 4mb
 * then the 64-bit fat file format is used.
 */
#define FAT_MAGIC_64	0xcafebabf
#define FAT_CIGAM_64	0xbfbafeca	/* NXSwapLong(FAT_MAGIC_64) */

struct fat_arch_64 {
	cpu_type_t	cputype;	/* cpu specifier (int) */
	cpu_subtype_t	cpusubtype;	/* machine specifier (int) */
	uint64_t	offset;		/* file offset to this object file */
	uint64_t	size;		/* size of this object file */
	uint32_t	align;		/* alignment as a power of 2 */
	uint32_t	reserved;	/* reserved */
};

Based on the comment, we'll probably need to support it soon.

Source: https://opensource.apple.com/source/cctools/cctools-895/include/mach-o/fat.h.auto.html
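
A rough sketch of how such an entry could be decoded with String#unpack, assuming the usual big-endian layout of fat headers. The `FatArch64` struct and `parse_fat_arch_64` helper here are illustrative names only, and the sample values are made up:

```ruby
FAT_MAGIC_64 = 0xcafebabf

# 2 x uint32, 2 x uint64, 2 x uint32, all big-endian (32 bytes in total).
FAT_ARCH_64_FORMAT = "N2Q>2N2"
FAT_ARCH_64_SIZE   = 32

FatArch64 = Struct.new(:cputype, :cpusubtype, :offset, :size, :align, :reserved)

# Decode one fat_arch_64 entry from a binary string.
def parse_fat_arch_64(bytes)
  raise ArgumentError, "need #{FAT_ARCH_64_SIZE} bytes" if bytes.bytesize < FAT_ARCH_64_SIZE

  FatArch64.new(*bytes.unpack(FAT_ARCH_64_FORMAT))
end

# Round-trip a made-up entry whose offset does not fit in 32 bits.
raw  = [0x01000007, 3, 0x1_0000_4000, 0x2000, 14, 0].pack(FAT_ARCH_64_FORMAT)
arch = parse_fat_arch_64(raw)
puts arch.offset > 0xffffffff
```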

Constant Reorganization

Right now, virtually all constants in ruby-macho are exposed directly in the MachO namespace.

This isn't an issue per se, but leaves the module's namespace awfully cluttered and doesn't properly isolate concerns between different components (i.e. header constants, load command constants, section constants, etc).

I have some proposals for approaching this:


  • Put all constants in one big Constants module.
    • Pros: Extremely simple. Virtually no changes to program structure.
    • Cons: Ugly. No isolation of concerns. Basically hides the problem in a slightly deeper namespace.

  • Consolidate only categories of constants into separate modules. Under this, magic numbers would go under HeaderConstants, load command constants would go under LoadCommandConstants, and so forth. Each module could then be included in relevant locations to avoid the clutter of LoadCommandConstants::LOAD_COMMANDS, etc.
    • Pros: Simple, and doesn't interfere too much with overall program structure. Most things stay the way they are.
    • Cons: Duplicates layout in a separate tree. Under this, the LoadCommand class won't have any direct namespace relationship to LoadCommandConstants. This might be unappealing.

  • Consolidate both constants and classes into modules. Under this, both magic numbers and headers would go under Headers. This could then be taken a step further and be made into Headers, and Headers::Constants (and so on for load commands, sections, etc.).
    • Pros: Maintains a close relationship between related constants and classes. Maximally explicit in structure. Closest to "standard" practice.
    • Cons: Time intensive, verbose. Would make constant access for users more difficult (this might be a pro). Requires significant internal and public-facing changes (e.g. DylibCommand becomes LoadCommands::DylibCommand).

  • Leave everything the way it is.
    • Pros: Extremely simple and works.
    • Cons: Clutters up the primary namespace, making interactive use (tab-complete in particular) a nightmare. Makes the generated documentation messy. Scorns the standard practice of keeping constants in the primary namespace to a minimum.

I am personally leaning towards the third option. It's the most work-intensive of all four and will likely require significant changes to the tests, but gives us a much nicer future layout to work with.

I'm sure I've missed one (or several) good alternatives as well, so give me your thoughts on these and tell me anything you come up with!

cc @UniqMartin

rpath deletion fails with duplicate rpaths

pdnsrec in Homebrew/core ships a binary with duplicate rpaths:

❯ otool -l pdns_recursor | rg -A2 LC_RPATH
          cmd LC_RPATH
      cmdsize 40
         path /usr/local/opt/boost/lib (offset 12)
--
          cmd LC_RPATH
      cmdsize 40
         path /usr/local/opt/boost/lib (offset 12)
--
          cmd LC_RPATH
      cmdsize 40
         path /usr/local/opt/boost/lib (offset 12)

Trying to delete this rpath fails with the following error:

[5] brew(main)> MachO::Tools.delete_rpath("pdns_recursor", "/usr/local/opt/boost/lib")
MachO::LoadCommandError: Unrecognized Mach-O load command: 0x00
from /usr/local/Homebrew/Library/Homebrew/vendor/bundle/ruby/2.6.0/gems/ruby-macho-2.5.0/lib/macho/macho_file.rb:530:in `block in populate_load_commands'

This is causing a bottling failure in Homebrew/core at Homebrew/homebrew-core#77263.

Serialization of artificially created load commands

One eventual goal of ruby-macho is to be able to add new load commands to a Mach-O file, not just modify the already present ones.

We want to make this sort of operation as painless as possible, so it might make sense to create a system where load commands and other structures can be constructed manually and serialized into binary strings. For example:

lc = VersionMinCommand.serialize(version, sdk) # => "\x24\x08..."
macho.add_command(lc) # or maybe 'add_lc'?

I'm open to other structure/layout ideas, that one's just a rough sketch.
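
To make the idea concrete, here is one possible shape for such a serializer, written as a standalone sketch. The helper names are hypothetical, the output assumes a little-endian target, and versions are encoded in the xxxx.yy.zz nibble layout used by version_min_command:

```ruby
LC_VERSION_MIN_MACOSX = 0x24

# Encode "X.Y.Z" as the 32-bit xxxx.yy.zz value.
def encode_version(str)
  major, minor, patch = str.split(".").map(&:to_i)
  (major << 16) | ((minor || 0) << 8) | (patch || 0)
end

# Hypothetical serializer: cmd, cmdsize, version, sdk as four little-endian uint32s.
def serialize_version_min(version, sdk)
  [LC_VERSION_MIN_MACOSX, 16, encode_version(version), encode_version(sdk)].pack("L<4")
end

lc = serialize_version_min("10.10.0", "10.11.0")
puts lc.bytesize
puts lc.unpack("L<4").map { |v| format("0x%08x", v) }.join(" ")
```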

cc @UniqMartin

Incremental reads for MachOFile and FatFile

Right now, MachOFile.new and FatFile.new read entire binaries into memory. This is efficient when manipulating their contents, but is unnecessarily expensive when testing the file's sanity (good magic, reasonable size, etc). As a result, testing large numbers of Mach-O files with exception handling is unnecessarily slow (when using MachOFile or FatFile directly).

#22 circumvents this problem when using the generic MachO.open method, but the Mach-O type classes should also do this individually.

I'm assigning this to myself, but it's not particularly high on the priority list (Homebrew uses MachO.open only).
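
A cheap sanity check along these lines could read just the first four bytes of the file. The following is a sketch under that assumption; `plausible_macho?` is a hypothetical helper, and a magic match alone does not prove the file is a valid Mach-O (Java class files, for instance, share 0xcafebabe):

```ruby
require "tempfile"

KNOWN_MAGICS = [
  0xfeedface, 0xcefaedfe, # MH_MAGIC, MH_CIGAM
  0xfeedfacf, 0xcffaedfe, # MH_MAGIC_64, MH_CIGAM_64
  0xcafebabe, 0xbebafeca, # FAT_MAGIC, FAT_CIGAM
].freeze

# Read only the first four bytes instead of the whole file.
def plausible_macho?(path)
  head = File.binread(path, 4)
  return false if head.nil? || head.bytesize < 4

  KNOWN_MAGICS.include?(head.unpack1("N"))
end

Tempfile.create("sample") do |f|
  f.binmode
  f.write("\xCA\xFE\xBA\xBE".b + "\x00" * 4)
  f.flush
  puts plausible_macho?(f.path)
end
```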

Crash after read bin-file from ipa

Hi. I'm reading the binary from an IPA build with
MachO::MachOFile.new("/path/to/Payload/BuildTest.app/BuildTest")
but it fails to initialize, raising this exception:
"MachO::FatBinaryError: Fat binaries must be loaded with MachO::FatFile"
I'm using Ubuntu 14 and Ruby 1.9.3/2.2.1.

Duplicated LC modification logic

Currently, MachOFile#change_rpath allows a duplicate rpath to be created, since it does not check the names of rpaths already present in the file.

This behavior isn't incorrect in the sense that it's also what install_name_tool does, but it's unintuitive and will likely confuse users who expect a given rpath to occur only once. More to the point, following install_name_tool's behavior will allow a user to create a binary that triggers another unintuitive behavior in install_name_tool (namely that rpath deletion only deletes the first matching rpath, not all matching rpaths).

ref #40 (comment)
cc @UniqMartin

Dependabot can't resolve your Ruby dependency files

Dependabot can't resolve your Ruby dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

Bundler::VersionConflict with message: Bundler could not find compatible versions for gem "ruby":
  In Gemfile:
    ruby (~> 2.1.10.0)

    rubocop (<= 0.58.1, >= 0.58.0) was resolved to 0.58.1, which depends on
      ruby (>= 2.2.0)

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

You can mention @dependabot in the comments below to contact the Dependabot team.

Mergeable Libraries are unsupported

Hello!

Apple has introduced mergeable libraries with Xcode 15.

As I understand them, they are dynamic frameworks but come with additional metadata allowing them to be statically merged.

It also comes with a new load command, LC_ATOM_INFO (code 0x36) https://github.com/llvm/llvm-project/blob/82c5d350d200ccc5365d40eac187b9ec967af727/llvm/include/llvm/BinaryFormat/MachO.def#L80

This library doesn't understand this code and fails parsing the binary.
I'm not sure what you could do with it or what you could expose; I just thought I'd let you know, as it breaks parsing altogether.

Have a nice day,
Arnaud

PS: I noticed this issue via CocoaPods, which still uses version 2.5 of this library. So, I don't think there's a need to rush this.

Fails to parse dynamic framework if it contains new arm64_32 arch slice

Apple added a new arm64_32 arch with the watchOS 5.0 SDK (Xcode 10.0). If a dynamic framework contains this arch, ruby-macho is not able to read it. This breaks CocoaPods projects, because CocoaPods then treats the framework as static rather than dynamic, even though it is a valid dynamic framework.

$> file openssl.framework/openssl

openssl.framework/openssl: Mach-O universal binary with 3 architectures: [i386:Mach-O dynamically linked shared library i386] [arm64_32_v8]
openssl.framework/openssl (for architecture i386):	Mach-O dynamically linked shared library i386
openssl.framework/openssl (for architecture armv7k):	Mach-O dynamically linked shared library arm_v7k
openssl.framework/openssl (for architecture cputype (33554444) cpusubtype (1)):	Mach-O dynamically linked shared library arm64_32_v8

$> lipo -info openssl.framework/openssl 

Architectures in the fat file: openssl.framework/openssl are: i386 armv7k arm64_32 

macho library fails to read this framework.

Test Script

require 'macho'
  
file = MachO.open("openssl.framework/openssl")
puts file.filetype

This throws an error.

~/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/macho_file.rb:457:in `check_cputype': Unrecognized CPU type: 0x0200000c (MachO::CPUTypeError)
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/macho_file.rb:429:in `populate_mach_header'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/macho_file.rb:234:in `populate_fields'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/macho_file.rb:55:in `initialize_from_bin'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/macho_file.rb:33:in `new_from_bin'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/fat_file.rb:324:in `block in populate_machos'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/fat_file.rb:323:in `each'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/fat_file.rb:323:in `populate_machos'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/fat_file.rb:124:in `populate_fields'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho/fat_file.rb:63:in `initialize'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho.rb:31:in `new'
	from /Users/shekhar.suman/.rvm/gems/ruby-2.4.0@global/gems/ruby-macho-1.1.0/lib/macho.rb:31:in `open'
	from temp.rb:4:in `<main>'

I have verified that if the framework does not include the arm64_32 slice, ruby-macho reads it successfully.
I have attached openssl.framework.zip so you can test on your end. I'm on macOS 10.13.6.
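
For context, the unrecognized value decomposes into the ARM base CPU type plus the 64-bit-with-32-bit-pointers ABI flag. The constant values below come from Apple's mach/machine.h; the snippet itself is just an illustrative decoding, not library code:

```ruby
CPU_ARCH_ABI64    = 0x01000000
CPU_ARCH_ABI64_32 = 0x02000000
CPU_ARCH_MASK     = 0xff000000
CPU_TYPE_ARM      = 12

CPU_TYPE_ARM64    = CPU_TYPE_ARM | CPU_ARCH_ABI64
CPU_TYPE_ARM64_32 = CPU_TYPE_ARM | CPU_ARCH_ABI64_32

raw = 0x0200000c # the value reported by check_cputype above

puts format("base type: %d, ABI bits: 0x%08x", raw & 0x00ffffff, raw & CPU_ARCH_MASK)
puts raw == CPU_TYPE_ARM64_32
puts raw == 33_554_444 # the decimal form printed by file(1) above
```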

Dependabot can't resolve your Ruby dependency files

Dependabot can't resolve your Ruby dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

Bundler::VersionConflict with message: Bundler could not find compatible versions for gem "url":
  In Gemfile:
    codecov was resolved to 0.1.14, which depends on
      url

Could not find gem 'url', which is required by gem 'codecov', in any of the sources.

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

You can mention @dependabot in the comments below to contact the Dependabot team.

Implement `LC_NOTE`

I'll probably do this tomorrow, but just to keep track:

struct note_command {
       uint32_t cmd;        // LC_NOTE
       uint32_t cmdsize;    // sizeof(struct note_command)
       char data_owner[16]; // owner name for this LC_NOTE
       uint64_t offset;     // file offset of this data
       uint64_t size;       // length of data region
};
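
As a sketch of what decoding this structure might look like in Ruby (assuming a little-endian file; `NoteCommand` and `parse_note_command` are illustrative names rather than the eventual ruby-macho API, and the sample owner string is made up):

```ruby
LC_NOTE         = 0x31
NOTE_CMD_FORMAT = "L<2a16Q<2" # cmd, cmdsize, data_owner[16], offset, size
NOTE_CMD_SIZE   = 40

NoteCommand = Struct.new(:cmd, :cmdsize, :data_owner, :offset, :size)

# Decode one note_command, stripping the NUL padding from data_owner.
def parse_note_command(bytes)
  cmd, cmdsize, owner, offset, size = bytes.unpack(NOTE_CMD_FORMAT)
  NoteCommand.new(cmd, cmdsize, owner.delete("\x00"), offset, size)
end

raw  = [LC_NOTE, NOTE_CMD_SIZE, "example owner", 0x4000, 128].pack(NOTE_CMD_FORMAT)
note = parse_note_command(raw)
puts note.data_owner
puts note.size
```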

Pure-Ruby Code Directory/Code signing parsing and manipulation

I'm going to use this issue as a dumping ground as I explore a pure-Ruby alternative to #260.

At a high level:

  • If a binary already contains an LC_CODE_SIGNATURE, we need to erase it and replace it with our own (ad-hoc) signature
  • If a binary doesn't contain an LC_CODE_SIGNATURE, we need to add a new load command containing one

That's not the end of things:

  • LC_CODE_SIGNATURE references the signing data, but doesn't actually contain it. It's actually hiding in the __LINKEDIT segment. That means that we'll need to rewrite (and probably resize) __LINKEDIT.

Add a RubyGems publishing workflow

I currently release new versions of the ruby-macho gem from my desktop. This isn't ideal, both security-wise and in terms of availability for other Homebrew maintainers. So, we should use GitHub Actions to automatically publish releases instead.

Some notes:

  • It looks like gem push can use GEM_HOST_API_KEY in the environment to get a RubyGems API key

Better stringification for load commands.

Right now, every load command (e.g., MachO::LoadCommand::DylibCommand) just returns its type when stringified:

dylib.to_s # => "LC_ID_DYLIB"

This is a good default for load commands that don't have a nice string representation, but we can do better for dylib commands, rpaths, versions, etc:

dylib.to_s # => "/usr/lib/whatever.dylib"
rpath.to_s # => "/var/lib/foo"
uuid.to_s # => "4ecc2bd4-4c76-4c94-98b3-a650f0e5af17"

This will also eliminate the need to do dylib.name.to_s and similar chains when extracting strings from LCStr instances within load commands.
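
The proposal amounts to a default to_s on the base class with overrides in subclasses. A simplified sketch follows, using hypothetical stand-in classes rather than ruby-macho's real load command hierarchy:

```ruby
# Hypothetical, simplified stand-ins for load command classes.
class SketchLoadCommand
  attr_reader :type

  def initialize(type)
    @type = type
  end

  # Default: fall back to the command's type.
  def to_s
    type.to_s
  end
end

class SketchDylibCommand < SketchLoadCommand
  attr_reader :name

  def initialize(type, name)
    super(type)
    @name = name
  end

  # Commands with a natural string form return it instead.
  def to_s
    name.to_s
  end
end

puts SketchLoadCommand.new(:LC_SYMTAB)
puts SketchDylibCommand.new(:LC_LOAD_DYLIB, "/usr/lib/whatever.dylib")
```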

This changes the current stringification behavior substantially (even if it doesn't break the CI), so it's probably best for a 2.x release.

Treatment of discrepancies in Mach-O slices of fat binaries

As has been raised in Homebrew/brew#592 (comment) and subsequent comments, test bots running ruby-macho fail to bottle alot from PR Homebrew/homebrew-core#1663:

$ brew bottle --verbose --debug --json alot
/opt/brewery/dummy/Library/Homebrew/brew.rb (Formulary::TapLoader): loading /opt/brewery/dummy/Library/Taps/homebrew/homebrew-core/Formula/alot.rb
==> Determining alot bottle revision...
==> Bottling alot-0.3.7.el_capitan.bottle.tar.gz...
Changing install name in /opt/brewery/dummy/Cellar/alot/0.3.7/libexec/lib/python2.7/site-packages/gpgme/_gpgme.so
  from /opt/brewery/dummy/opt/gpgme/lib/libgpgme.11.dylib
    to @@HOMEBREW_PREFIX@@/opt/gpgme/lib/libgpgme.11.dylib
Error: No such dylib name: /opt/brewery/dummy/opt/gpgme/lib/libgpgme.11.dylib
/opt/brewery/dummy/Library/Homebrew/vendor/macho/macho/macho_file.rb:240:in `change_install_name'
/opt/brewery/dummy/Library/Homebrew/vendor/macho/macho/fat_file.rb:171:in `block in change_install_name'
/opt/brewery/dummy/Library/Homebrew/vendor/macho/macho/fat_file.rb:170:in `each'
/opt/brewery/dummy/Library/Homebrew/vendor/macho/macho/fat_file.rb:170:in `change_install_name'
/opt/brewery/dummy/Library/Homebrew/vendor/macho/macho/tools.rb:33:in `change_install_name'
/opt/brewery/dummy/Library/Homebrew/os/mac/ruby_keg.rb:13:in `change_install_name'
[…snip…]

The reason turns out to be that the problematic file is a fat binary, but not all of its slices link to the mentioned dylib (most likely because the dylib in question only has a single architecture):

$ otool -arch all -L /opt/brewery/dummy/Cellar/alot/0.3.7/libexec/lib/python2.7/site-packages/gpgme/_gpgme.so
/opt/brewery/dummy/Cellar/alot/0.3.7/libexec/lib/python2.7/site-packages/gpgme/_gpgme.so (architecture i386):
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
/opt/brewery/dummy/Cellar/alot/0.3.7/libexec/lib/python2.7/site-packages/gpgme/_gpgme.so (architecture x86_64):
    /opt/brewery/dummy/opt/gpgme/lib/libgpgme.11.dylib (compatibility version 26.0.0, current version 26.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)

$ otool -arch all -L /opt/brewery/dummy/opt/gpgme/lib/libgpgme.11.dylib
/opt/brewery/dummy/opt/gpgme/lib/libgpgme.11.dylib:
    /opt/brewery/dummy/opt/gpgme/lib/libgpgme.11.dylib (compatibility version 26.0.0, current version 26.0.0)
    /opt/brewery/dummy/opt/libassuan/lib/libassuan.0.dylib (compatibility version 8.0.0, current version 8.3.0)
    /opt/brewery/dummy/opt/libgpg-error/lib/libgpg-error.0.dylib (compatibility version 20.0.0, current version 20.1.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)

@tdsmith This didn't happen before the formula in the PR was switched over to the virtualenv system. It strikes me as broken that this is creating extensions that pretend to be universal, but the 32-bit slice is probably dysfunctional. Can you take a closer look at the Python/virtualenv side of things and check what the correct behavior should be?

My conclusion for ruby-macho is that we need to handle these situations more resiliently. It's unusual, but perfectly legal, for different Mach-O slices of a fat binary to have a different set of dylibs they link to. install_name_tool seems to handle this correctly, thus we should do that, too. (Though we should continue being a bit more strict within reasonable bounds.)

TL;DR/practical suggestion: By default, consider a modification to a fat binary a success when it succeeds for one of its Mach-O slices. Optionally offer a strict mode that would require the change to succeed for all slices (current behavior), e.g. via:

file.change_install_name(old_path, new_path, :strict => true)
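
The lenient-by-default behavior could look roughly like the sketch below. Everything here is hypothetical (`SketchSlice`, `change_install_name_in_fat`, the error class, and the shortened paths); it only models which slices link to which dylibs and is not the actual FatFile implementation:

```ruby
class DylibMissingError < StandardError; end

# Hypothetical slice: tracks only the dylib names it links against.
SketchSlice = Struct.new(:arch, :dylibs) do
  def change_install_name(old_name, new_name)
    idx = dylibs.index(old_name)
    raise DylibMissingError, "No such dylib name: #{old_name}" if idx.nil?

    dylibs[idx] = new_name
  end
end

# Succeed if at least one slice has the dylib; in strict mode require all slices.
def change_install_name_in_fat(slices, old_name, new_name, strict: false)
  matching = slices.select { |s| s.dylibs.include?(old_name) }

  raise DylibMissingError, "No such dylib name: #{old_name}" if matching.empty?
  if strict && matching.size != slices.size
    raise DylibMissingError, "#{old_name} is missing from some slices"
  end

  matching.each { |s| s.change_install_name(old_name, new_name) }
  matching.map(&:arch)
end

slices = [
  SketchSlice.new(:i386, ["/usr/lib/libSystem.B.dylib"]),
  SketchSlice.new(:x86_64, ["/opt/gpgme/lib/libgpgme.11.dylib", "/usr/lib/libSystem.B.dylib"]),
]
p change_install_name_in_fat(slices, "/opt/gpgme/lib/libgpgme.11.dylib", "@@PREFIX@@/libgpgme.11.dylib")
```

Checking all slices before modifying any of them keeps a failed strict call from leaving the file half-changed.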

cc @woodruffw

Fat parser should fail on file with 0 architectures

Currently, ruby-macho will happily parse a fat Mach-O with no internal slices (i.e., nfat_arch == 0). This should be an error (potentially an ignorable one with permissive: true).

Reference this LLVM object: https://github.com/llvm-mirror/llvm/blob/master/test/Object/Inputs/macho-invalid-fat-header

Observed behavior:

>> MachO.open('macho-invalid-fat-header')
#<MachO::FatFile:0x000055776df6b2a8 @filename="macho-invalid-fat-header", @options={}, @raw_data="\xCA\xFE\xBA\xBE\x00\x00\x00\x00", @header=#<MachO::Headers::FatHeader:0x000055776df6b118 @magic=3405691582, @nfat_arch=0>, @fat_archs=[], @machos=[]>

Expected behavior:

An exception.
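
A minimal sketch of such a check, operating on the 8-byte fat header shown in the dump above. The `read_nfat_arch` helper, the error class, and the `permissive:` keyword as used here are assumptions for illustration only:

```ruby
FAT_MAGIC = 0xcafebabe

class EmptyFatError < StandardError; end

# Parse the 8-byte fat header (big-endian magic and nfat_arch) and reject empty files.
def read_nfat_arch(raw_data, permissive: false)
  raise ArgumentError, "truncated fat header" if raw_data.bytesize < 8

  magic, nfat_arch = raw_data.unpack("N2")
  raise ArgumentError, "not a fat header" unless magic == FAT_MAGIC
  raise EmptyFatError, "fat file has zero architectures" if nfat_arch.zero? && !permissive

  nfat_arch
end

raw = "\xCA\xFE\xBA\xBE\x00\x00\x00\x00".b # same bytes as @raw_data above
begin
  read_nfat_arch(raw)
rescue EmptyFatError => e
  puts e.message
end
```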

More permissive parsing of unknown load commands

ruby-macho currently throws an exception (LoadCommandError) when it can't resolve a load command type to a class.

To prevent the future addition of load commands by Apple from fouling up the parser, we could allow the user to toggle more permissive behavior:

  • If the load command ID corresponds to a known class, instantiate that class
  • If the load command doesn't correspond to a known class, instantiate the generic LoadCommand class

We already hard-code this behavior for LC_PREPAGE, since Apple doesn't provide a public structure for it. This behavior could be made generic fairly easily.
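
The two-step lookup described above can be sketched as follows. The class names, the `KNOWN_COMMANDS` table, and the `permissive:` keyword are placeholders for illustration, not the library's real structures:

```ruby
class SketchUnknownCommandError < StandardError; end

# Hypothetical command classes: one known type and one generic fallback.
GenericCommand    = Struct.new(:cmd, :cmdsize)
UUIDCommandSketch = Struct.new(:cmd, :cmdsize)

KNOWN_COMMANDS = { 0x1b => UUIDCommandSketch }.freeze # 0x1b is LC_UUID

# Instantiate the known class, or fall back to the generic one when permissive.
def build_command(cmd, cmdsize, permissive: false)
  klass = KNOWN_COMMANDS[cmd]
  if klass.nil?
    raise SketchUnknownCommandError, format("Unrecognized load command: 0x%02x", cmd) unless permissive

    klass = GenericCommand
  end
  klass.new(cmd, cmdsize)
end

p build_command(0x1b, 24)
p build_command(0x36, 16, permissive: true)
```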

Expanding Makefile/build system to create inconsistent fat binaries

I'm currently looking at the Makefile to see what our best approach is to creating inconsistent fat binaries (i.e., the kind that will trigger the new behavior in #55).

In order to lipo together two inconsistent binaries of different architectures, we need a fat-only target that can generically depend on two single-arch-only targets. That's not possible with the current layout of the Makefile, and probably won't be without some significant changes.

For the time being, it might be worthwhile to externalize this process to a shell script, one that's called by the makefile at the very end of the all: chain. With a few minor tweaks, I can provide that script with the information it needs to access the correct single-arch directories and drop the lipo'd results in the correct fat-arch directories.

cc @UniqMartin

`change_rpath` differs in behaviour from `install_name_tool` when handling duplicates

Suppose I compile a dylib with a duplicate RPATH:

❯ clang -xc /dev/null -shared -rpath dupe -rpath dupe -o libdupes.dylib
❯ otool -l libdupes.dylib | rg -A2 LC_RPATH
          cmd LC_RPATH
      cmdsize 24
         path dupe (offset 12)
--
          cmd LC_RPATH
      cmdsize 24
         path dupe (offset 12)

(Admittedly, I don't know why one would do this, but pdnsrec still does.)

change_rpath handles changing one RPATH just fine, but chokes on the second:

❯ brew ruby -e 'MachO::Tools.change_rpath("libdupes.dylib", "dupe", "notdupe")'
❯ otool -l libdupes.dylib | rg -A2 LC_RPATH
          cmd LC_RPATH
      cmdsize 24
         path notdupe (offset 12)
--
          cmd LC_RPATH
      cmdsize 24
         path dupe (offset 12)
❯ brew ruby -e 'MachO::Tools.change_rpath("libdupes.dylib", "dupe", "notdupe")'
/usr/local/Homebrew/Library/Homebrew/vendor/bundle/ruby/2.6.0/gems/ruby-macho-3.0.0/lib/macho/macho_file.rb:388:in `change_rpath': notdupe already exists (MachO::RpathExistsError)
        from /usr/local/Homebrew/Library/Homebrew/vendor/bundle/ruby/2.6.0/gems/ruby-macho-3.0.0/lib/macho/tools.rb:60:in `change_rpath'
        from -e:1:in `<main>'

install_name_tool chugs along just fine, though:

❯ install_name_tool -rpath dupe notdupe libdupes.dylib
❯ otool -l libdupes.dylib | rg -A2 LC_RPATH
          cmd LC_RPATH
      cmdsize 24
         path notdupe (offset 12)
--
          cmd LC_RPATH
      cmdsize 24
         path notdupe (offset 12)

This is slightly related to changes we made in #362 and #366, and I think I realised this shortly after, but this slipped off my radar.

It might be that this is intentional and nothing needs changing here, but I wanted to bring this up in case you want ruby-macho to behave more like install_name_tool here.
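
The difference between the two behaviors can be modeled on a plain list of rpath strings. This is only a sketch with hypothetical helper names; it does not touch a real binary and is not a statement of what either tool does internally:

```ruby
# Replace only the first matching rpath.
def change_first_rpath(rpaths, old, new)
  idx = rpaths.index(old)
  raise ArgumentError, "no such rpath: #{old}" if idx.nil?

  rpaths.dup.tap { |r| r[idx] = new }
end

# Replace every matching rpath, matching the install_name_tool output above.
def change_all_rpaths(rpaths, old, new)
  raise ArgumentError, "no such rpath: #{old}" unless rpaths.include?(old)

  rpaths.map { |r| r == old ? new : r }
end

# rpaths as they appear in libdupes.dylib above.
p change_first_rpath(%w[dupe dupe], "dupe", "notdupe")
p change_all_rpaths(%w[dupe dupe], "dupe", "notdupe")
```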

`MachO::MachOFile#delete_rpath` does not work

If you compile a dylib, say libfoo.dylib and open it using

file = MachO::MachOFile.new("libfoo.dylib")

and then use file.delete_rpath to delete an LC_RPATH entry, nothing happens. I do get a large amount of text in my terminal, but that's from delete_rpath dumping raw_data entries in my terminal.

On the other hand, using MachO::Tools.delete_rpath does seem to work.

I'll try to dig into why one seems to be working when the other doesn't.

Unrecognized Mach-O magic

I maintain a Homebrew tap for universal Python and need to keep a universal OpenSSL as well. Several times I have tried and failed to switch from lipo to MachO::Tools.merge_machos; this time I'm trying to figure out what I am doing wrong.

With lipo -create everything works as expected:

$ lipo -create /tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-i386/libcrypto.1.0.0.dylib /tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-x86_64/libcrypto.1.0.0.dylib -output /tmp/xyz.dylib
$ lipo -info /tmp/xyz.dylib
Architectures in the fat file: /tmp/xyz.dylib are: i386 x86_64
$ file /tmp/xyz.dylib
/tmp/xyz.dylib: Mach-O universal binary with 2 architectures: [i386:Mach-O dynamically linked shared library i386] [x86_64]
/tmp/xyz.dylib (for architecture i386):	Mach-O dynamically linked shared library i386
/tmp/xyz.dylib (for architecture x86_64):	Mach-O 64-bit dynamically linked shared library x86_64

and

$ file /tmp/zzz.a
/tmp/zzz.a: Mach-O universal binary with 2 architectures: [i386:current ar archive random library] [x86_64]
/tmp/zzz.a (for architecture i386):	current ar archive random library
/tmp/zzz.a (for architecture x86_64):	current ar archive random library
$ lipo -info /tmp/zzz.a
Architectures in the fat file: /tmp/zzz.a are: i386 x86_64

When I launch brew irb:

irb(main):004:0> libs = ["/tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-i386/libcrypto.1.0.0.dylib","/tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-x86_64/libcrypto.1.0.0.dylib"]
=> ["/tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-i386/libcrypto.1.0.0.dylib", "/tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-x86_64/libcrypto.1.0.0.dylib"]
irb(main):005:0> MachO::Tools.merge_machos("/tmp/xxx.dylib", *libs)
=> 3643072
$ lipo -info /tmp/xxx.dylib
fatal error: /Library/Developer/CommandLineTools/usr/bin/lipo: offset 1652032 of fat file /tmp/xxx.dylib (cputype (16777223) cpusubtype (3)) not aligned on its alignment (2^8)
$ file /tmp/xxx.dylib
/tmp/xxx.dylib: Mach-O universal binary with 2 architectures: [i386:Mach-O dynamically linked shared library i386] [x86_64]
/tmp/xxx.dylib (for architecture i386):	Mach-O dynamically linked shared library i386
/tmp/xxx.dylib (for architecture x86_64):	Mach-O 64-bit dynamically linked shared library x86_64

and

irb(main):012:0> static=["/tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-i386/libcrypto.a", "/tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-x86_64/libcrypto.a"]
=> ["/tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-i386/libcrypto.a", "/tmp/uopenssl-20180702-16887-vjpn5r/openssl-1.0.2o/build-x86_64/libcrypto.a"]
irb(main):013:0> MachO::Tools.merge_machos("/tmp/xxx.a", *static)
MachO::MagicError: Unrecognized Mach-O magic: 0x213c6172
	from /usr/local/Homebrew/Library/Homebrew/vendor/macho/macho.rb:35:in `open'
	from /usr/local/Homebrew/Library/Homebrew/vendor/macho/macho/tools.rb:95:in `block in merge_machos'
	from /usr/local/Homebrew/Library/Homebrew/vendor/macho/macho/tools.rb:94:in `map'
	from /usr/local/Homebrew/Library/Homebrew/vendor/macho/macho/tools.rb:94:in `merge_machos'
	from (irb):13
	from /usr/local/Homebrew/Library/Homebrew/brew.rb:100:in `<main>'

What am I doing wrong, or did I run into some kind of bug?
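Two details in the output above can be checked without ruby-macho. The sketch below is illustrative only: `aligned?`, `round_up`, and `ar_archive?` are hypothetical helper names, not part of the library. It assumes the alignment exponent of 8 that lipo reports, and it relies on 0x213c6172 being the big-endian reading of the bytes `!<ar`, which is how the `!<arch>\n` signature of a plain ar archive begins.

```ruby
# Hypothetical helpers for reasoning about the two failures above.

# True if offset is a multiple of 2**align_exp.
def aligned?(offset, align_exp)
  (offset % (1 << align_exp)).zero?
end

# Round offset up to the next multiple of 2**align_exp.
def round_up(offset, align_exp)
  alignment = 1 << align_exp
  (offset + alignment - 1) & ~(alignment - 1)
end

# Signature that starts every plain (thin) ar archive.
AR_MAGIC = "!<arch>\n".b.freeze

# True if the given bytes start with the ar archive signature.
def ar_archive?(bytes)
  bytes.b.start_with?(AR_MAGIC)
end

offset = 1_652_032 # slice offset from the lipo error message

puts aligned?(offset, 8)                    # => false
puts round_up(offset, 8)                    # => 1652224
puts format("0x%08x", "!<ar".unpack1("N"))  # => 0x213c6172
```

If this reading is right, the x86_64 slice in /tmp/xxx.dylib sits 64 bytes past a 256-byte boundary, and the MagicError comes from feeding thin static archives (which are ar files, not Mach-O files) to merge_machos.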

1.0 release

ruby-macho has matured a lot recently (and is on track to fulfil a lot of its original use cases), so it might be nice to cap off this summer's cumulative work with a symbolic 1.0 release. In my mind, that would mark the transition towards a stable API and a focus on more substantive unit tests.

Here's a shortlist of things that I'd like to include in a 1.0:

  • Removal of set_lc_str_in_cmd (#45, must-have)
  • Uniform LC deduplication (#41, important)
  • Proper isolation of constants (#65, important)
  • Improved encapsulation of roles (#57, important)
  • Broader LC creation/serialization support (important)
  • Performance tweaking/analysis (#63, important)
  • More fully featured command-line utilities (nice-to-have)
  • Cleaned up unit tests (#66, nice-to-have)
  • Cleaned up library code (lots of ugly bits currently floating around...) (#56, nice-to-have)
    • Remove @deprecated methods?

That list is by no means final, of course, and I'm very open to additional (or fewer!) goals for a 1.0. I have some even further goals out there for post-1.0 releases, which I can put in another issue if we think that's worth keeping track of at this point.

cc @UniqMartin

LCStr: Incorrect handling of null-terminated strings

The alot issue has also revealed another interesting, though completely unrelated, issue. Here's what I did with the alot-python binary (can provide it in full if it helps) after copying it from Cellar/alot/0.3.7/libexec/bin/python:

$ ruby -Ilib -rmacho -e 'puts MachO::Tools.dylibs("alot-python").sort' > dylibs-rb.log
$ otool -L alot-python | tail -n +2 | awk '{print $1}' | sort > dylibs-cc.log

The resulting diff:

diff --git 1/dylibs-cc.log 2/dylibs-rb.log
index 6286cda5..a0b685cc 100644
--- 1/dylibs-cc.log
+++ 2/dylibs-rb.log
@@ -1,3 +1,3 @@
 /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
 /usr/lib/libSystem.B.dylib
-@executable_path/../.Python
+@executable_path/../.Pythonython.framework/Versions/2.7/Python

The relevant section from the Mach-O file (the problematic load command starts at 0x13dc):

$ hexdump -vC alot-python
[…snip…]
000013c0  00 5e 00 00 28 00 00 80  18 00 00 00 07 0e 00 00  |.^..(...........|
000013d0  00 00 00 00 00 00 00 00  00 00 00 00 0c 00 00 00  |................|
000013e0  58 00 00 00 18 00 00 00  02 00 00 00 0a 07 02 00  |X...............|
000013f0  00 07 02 00 40 65 78 65  63 75 74 61 62 6c 65 5f  |....@executable_|
00001400  70 61 74 68 2f 2e 2e 2f  2e 50 79 74 68 6f 6e 00  |path/../.Python.|
00001410  79 74 68 6f 6e 2e 66 72  61 6d 65 77 6f 72 6b 2f  |ython.framework/|
00001420  56 65 72 73 69 6f 6e 73  2f 32 2e 37 2f 50 79 74  |Versions/2.7/Pyt|
00001430  68 6f 6e 00 0c 00 00 00  34 00 00 00 18 00 00 00  |hon.....4.......|
00001440  02 00 00 00 01 0a ca 04  00 00 01 00 2f 75 73 72  |............/usr|
00001450  2f 6c 69 62 2f 6c 69 62  53 79 73 74 65 6d 2e 42  |/lib/libSystem.B|
00001460  2e 64 79 6c 69 62 00 00  0c 00 00 00 68 00 00 00  |.dylib......h...|
00001470  18 00 00 00 02 00 00 00  00 0c e8 04 00 00 96 00  |................|
[…snip…]

The issue is a mix of LCStr improperly extracting the string from the binary and virtualenv doing really ugly and naive things in its mach_o_change function. We can't really do anything about virtualenv, but LCStr should really only fetch the null-terminated string at the given offset and not assume that the rest of the load command is zero-padded.
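To make the difference concrete, here is a small sketch that rebuilds the 0x58-byte load command from the hexdump above and extracts the name two ways. The helper names `naive_lc_str` and `terminated_lc_str` are hypothetical, and stripping every NUL byte is only an assumption about how the bad output could arise; it is not taken from the LCStr source.

```ruby
# Rebuild the load command at 0x13dc from the hexdump (little-endian fields):
# cmd, cmdsize, name offset, timestamp, current_version, compatibility_version
header = [0x0c, 0x58, 0x18, 0x02, 0x0002070a, 0x00020700].pack("V6")
payload = "@executable_path/../.Python\0ython.framework/Versions/2.7/Python\0".b
raw_lc = header + payload

_cmd, cmdsize, name_offset = raw_lc.unpack("V3")

# Assumed buggy behavior: take everything after the offset and drop all NULs.
def naive_lc_str(raw_lc, offset)
  raw_lc.byteslice(offset, raw_lc.bytesize - offset).delete("\x00")
end

# Suggested behavior: stop at the first NUL byte ("Z*" does exactly that).
def terminated_lc_str(raw_lc, offset)
  raw_lc.byteslice(offset, raw_lc.bytesize - offset).unpack1("Z*")
end

puts cmdsize == raw_lc.bytesize             # => true
puts naive_lc_str(raw_lc, name_offset)
# => @executable_path/../.Pythonython.framework/Versions/2.7/Python
puts terminated_lc_str(raw_lc, name_offset)
# => @executable_path/../.Python
```

The first result matches the line in dylibs-rb.log and the second matches what otool reports.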

cc @woodruffw

RFC: Change indentation?

I use tabs in almost all of my projects, but the ruby world has more or less settled on two-space indentation.

Would changing ruby-macho to use two-space indentation be appreciated? Currently, whenever I update the vendored version in Homebrew, I run a little script to change it. I could just run that script in the repo and have it done in a single commit.

(Feel free to shoot this down if it's pointless. I just remember seeing the indentation as a nit when originally getting ruby-macho merged in.)
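For reference, a conversion like the one described could look roughly like the sketch below. This is not the actual script; `retab` is a hypothetical name, and it only rewrites leading tabs so that tabs inside string literals or comments are left alone.

```ruby
# Replace each leading tab with `width` spaces, leaving other tabs untouched.
def retab(source, width = 2)
  source.each_line.map do |line|
    line.sub(/\A\t+/) { |tabs| " " * (width * tabs.size) }
  end.join
end

puts retab("def foo\n\tif bar\n\t\tbaz\n\tend\nend\n")
```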

cc: @MikeMcQuaid @UniqMartin

3.0.0

#424 necessitates a major bump, and #425 needs a release as well.

Leaving this open as a reminder to myself.

Dependabot can't resolve your Ruby dependency files

Dependabot can't resolve your Ruby dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

Bundler::VersionConflict with message: Bundler could not find compatible versions for gem "ruby":
  In Gemfile:
    ruby (~> 2.1.10.0)

    rubocop (<= 0.58.2, >= 0.58.1) was resolved to 0.58.2, which depends on
      ruby (>= 2.2.0)

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

You can mention @dependabot in the comments below to contact the Dependabot team.
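The conflict can be reproduced with RubyGems' own requirement classes. This is only a sketch of why no Ruby version satisfies both constraints; the candidate versions are arbitrary examples, not taken from the Dependabot log.

```ruby
require "rubygems"

gemfile_ruby = Gem::Requirement.new("~> 2.1.10.0") # from the Gemfile
rubocop_ruby = Gem::Requirement.new(">= 2.2.0")    # required by rubocop 0.58.2

# Arbitrary sample versions on both sides of the boundary.
candidates = %w[2.1.10 2.1.10.5 2.2.0 2.5.1].map { |v| Gem::Version.new(v) }

satisfying = candidates.select do |version|
  gemfile_ruby.satisfied_by?(version) && rubocop_ruby.satisfied_by?(version)
end

puts satisfying.empty? # => true
```

`~> 2.1.10.0` admits only versions from 2.1.10.0 up to (but excluding) 2.1.11, so it can never overlap with `>= 2.2.0`.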

Benchmark Ruby-macho vs otool?

First of all, good work here. 🙇

Now that this is included in Homebrew, I'm curious how the Ruby implementation performs compared to the old otool-based approach. How about adding a benchmark alongside the test suite?

Expand binaries in test suite.

The current test suite binaries do not contain load commands like LC_LOAD_UPWARD_DYLIB and LC_LAZY_LOAD_DYLIB (#3). These need to be added to improve coverage.

Goals:

  • Add LC_LOAD_UPWARD_DYLIB binary and tests.
  • Add LC_LAZY_LOAD_DYLIB binary and tests.

Tag releases?

Since you are cutting releases with actual version numbers as can be seen on rubygems.org, I think it would be good to have that reflected in this repository. How do you feel about retroactively tagging previous releases and making that also a habit for future releases?

New DSL for MachOStructure

This is just for collecting my thoughts and getting feedback on a DSL for MachOStructure (and, by inheritance, just about every Mach-O data-oriented class in ruby-macho).

The current "DSL" for a Mach-O structure looks something like this:

class SomeImportantStruct < MachOStructure
  attr_reader :something
  attr_reader :something_else

  FORMAT = "L=2" # something and something_else are "L", meaning int32s
  SIZEOF = 8 # 2 x sizeof(int32)

  def initialize(something, something_else)
    # ...
  end
end

This is incredibly repetitive, bug-prone, and difficult to read (there's no immediate way to determine the size of each field).

Something like this would be substantially better:

class SomeImportantStruct < MachOStructure
  field :int32, :something
  field :int32, :something_else

  # FORMAT, SIZEOF, and instantiation are automatically generated from the field definitions
end

Extremely early versions of ruby-macho actually had something like this, using CStruct from this repo, but I dropped it for reasons that I don't remember (probably having to do with me being unfamiliar with ruby DSL writing at the time).

This will probably require a large amount of refactoring and breakage, so I don't think it'll make it into 1.1 (or even 1.x, it might be destined for a major bump).
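As a rough illustration of how such a `field` macro could work, here is a minimal, self-contained sketch. Everything in it is an assumption made for the example: the class name `SketchStructure`, the `FIELD_TYPES` table, and the methods `format_string`, `bytesize`, and `new_from_bin` are not ruby-macho's API, and the pack directives are hard-coded as little-endian to keep the example runnable.

```ruby
# Hypothetical base class, not ruby-macho's MachOStructure.
class SketchStructure
  # type name => [pack directive, size in bytes]; little-endian assumed
  FIELD_TYPES = {
    int32: ["l<", 4],
    uint32: ["L<", 4],
    uint64: ["Q<", 8],
  }.freeze

  class << self
    # Ordered list of [type, name] pairs declared on this class.
    def fields
      @fields ||= []
    end

    # Declare a field and generate its reader.
    def field(type, name)
      raise ArgumentError, "unknown field type: #{type}" unless FIELD_TYPES.key?(type)

      fields << [type, name]
      attr_reader name
    end

    # Pack/unpack format derived from the declared fields.
    def format_string
      fields.map { |type, _name| FIELD_TYPES.fetch(type)[0] }.join
    end

    # Total size in bytes derived from the declared fields.
    def bytesize
      fields.sum { |type, _name| FIELD_TYPES.fetch(type)[1] }
    end

    # Build an instance from a binary string.
    def new_from_bin(bin)
      new(*bin.unpack(format_string))
    end
  end

  def initialize(*values)
    fields = self.class.fields
    unless values.size == fields.size
      raise ArgumentError, "expected #{fields.size} values, got #{values.size}"
    end

    fields.zip(values) do |(_type, name), value|
      instance_variable_set("@#{name}", value)
    end
  end
end

class SomeImportantStruct < SketchStructure
  field :int32, :something
  field :int32, :something_else
end

struct = SomeImportantStruct.new_from_bin([1, -2].pack("l<2"))
puts SomeImportantStruct.format_string # => l<l<
puts SomeImportantStruct.bytesize      # => 8
puts struct.something_else             # => -2
```

A real implementation would also need to handle endianness per file and fields inherited from parent structures, neither of which this sketch attempts.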
