Comments (9)

kostix commented on June 16, 2024

See issue #58, and in particular the last comment in its thread by sschuberth.

from git.

hvoigt commented on June 16, 2024

Getting rid of the hardlinked binaries is a feature request and not a bug. Please feel free to work on this. We will try to support you with enough information so you can implement it. Feel free to ask on the mailing list.

Since it's not a bug, I will close this issue.

sschuberth commented on June 16, 2024

@hvoigt Note that in the portable version we're not hard-linking any files. This is because the hard-linking is done by the installer, and for the portable version there is no installer (and the .7z archive does not support storing hardlinks). So the files under \libexec\git-core\ really are copies. I'm saying this just to set things straight, but still this is more a feature request than a bug.

sschuberth commented on June 16, 2024

Thinking about it, this is an issue tracker, not a bug tracker. As such, it's completely valid to post feature requests in here. I'm reopening this to track this feature request. However, there is no guarantee whatsoever that anyone will work on this any time soon.

dandv commented on June 16, 2024

Right, I didn't mean this as a bug. Thanks for reopening.

I don't know the internals of this, but I did see that the git.exe from the cmd directory was only 8k, so there must be ways to have some sort of smaller stub executables.

dscho commented on June 16, 2024

@dandv @sschuberth while this is not only a bug tracker but an issue tracker, it is still a tracker.

As such, I must ask whether either of you is willing to spend any time at all on this.

dscho commented on June 16, 2024

So here goes: if any developer worth their salt wants to tackle this issue, I will list the relevant information and suggest a recipe for addressing the problem properly (and no, do not ask me to work on this for you; declining to spend time yourself after I spent hours and hours on this would only serve as a public demonstration of how little you value my time and expertise and effort):

The first thing is to note why the hard-linked builtins are there in the first place. And actually, before that the term builtin needs to be explained. Therefore, let's establish some background:

  • historically, Git was just a hodge-podge of shell scripts with the occasional C program thrown in for performance.
  • since the occasional C programs shared a lot of code, that code was refactored into a static library, libgit.a.
  • as libgit.a grew larger, there was indeed "huge bloat". To solve that, the Git wrapper (what we call git.exe nowadays) was invented, wrapping multiple Git functions into a single executable. It determines what function it should perform by inspecting the name by which it was called (using hard-links to allow for multiple names).
  • The functions thusly included in the Git wrapper are called builtins.
  • eventually, it was determined that the number of Git commands would clutter bin/ too much, and the Git wrapper learned to be started by the name git and to interpret the first argument as subcommand name in that case. The subcommands would then be hidden away in libexec/git-core/ (including non-builtins).
  • for performance reasons, many scripts still called the dashed form (to avoid the extra exec() call needed by calling git as an intermediary that exec()s the real program). To do so, they had to source git-sh-setup -- which for that reason could not be hidden away in libexec/git-core/ but still needed to be available on the PATH. This git-sh-setup scriptlet would extend the PATH to include the complete libexec/git-core/.
  • when the Git wrapper is called in a non-dashed form (e.g. git commit) to perform a builtin function, it does not exec() the dashed form but instead hands off to the respective function (by convention, cmd_<name> where <name> is the subcommand name with dashes replaced by underscores).
  • it was considered a cute extensibility feature that the Git wrapper would pick up any executable with a git- prefix as Git subcommand.
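To make the dispatch rules above concrete, here is a minimal shell sketch. It is an illustration only: the real Git wrapper is written in C, and the function name dispatch as well as the two stand-in builtins are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-ins for builtins, following the cmd_<name> convention.
cmd_commit() { echo "builtin commit: $*"; }
cmd_merge_index() { echo "builtin merge-index: $*"; }

# dispatch NAME [ARGS...]: NAME plays the role of argv[0].
dispatch() {
    invoked=$(basename "$1")
    shift
    case "$invoked" in
    git-?*)
        # Dashed form (e.g. reached via a hard link): the name carries the subcommand.
        sub=${invoked#git-}
        ;;
    git)
        # Undashed form: the first argument is the subcommand.
        if [ $# -eq 0 ]; then
            echo "usage: git <command> [<args>]" >&2
            return 1
        fi
        sub=$1
        shift
        ;;
    *)
        echo "unexpected name: $invoked" >&2
        return 1
        ;;
    esac

    # Dashes in the subcommand name become underscores in the function name.
    func=cmd_$(printf '%s\n' "$sub" | sed 's/-/_/g')
    if command -v "$func" >/dev/null 2>&1; then
        "$func" "$@"    # builtin: hand off to the function, no exec() of a dashed program
    else
        echo "not a builtin: would look for an executable named git-$sub"
    fi
}
```

With this sketch, dispatch git-commit -m msg and dispatch git commit -m msg both end up in cmd_commit, which is the equivalence the hard links provide.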

Now, Git prides itself on being backwards-compatible (indeed quite often to a fault: inconsistencies such as using the term cache -- referencing the original name for Git: dircache -- as well as unhelpful defaults are often maintained well beyond what some would call an acceptable time frame), therefore even Git for Windows has to adhere to that principle. Anything else would lead to a maintenance nightmare, for which reason the maintainer (= me) would not accept any contribution breaking the backwards-compatibility.

Backwards-compatibility in this case means that shell scripts calling the dashed forms will need to work properly, even after we remove them from libexec/.

One way to go about that would be by teaching git-sh-setup to provide dashed shell functions for the builtins. That would work for shell scripts, but of course not for Perl scripts exec()ing dashed Git programs.
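A rough sketch of what such dashed shell functions could look like follows. This is an assumption-laden illustration, not code from git-sh-setup: the helper names and the short builtin list are hypothetical, and defining functions whose names contain dashes only works in shells that allow it (bash outside POSIX mode does; a strict POSIX sh does not).

```shell
# Print the source of a shell function named git-<builtin> that forwards
# all of its arguments to "git <builtin>".
dashed_wrapper_source() {
    printf 'git-%s() { git %s "$@"; }\n' "$1" "$1"
}

# Define wrappers for a hypothetical, deliberately incomplete list of builtins.
# Call this only from a shell that accepts dashes in function names.
define_dashed_builtins() {
    for builtin in rev-parse update-index merge-index; do
        eval "$(dashed_wrapper_source "$builtin")"
    done
}
```

After define_dashed_builtins has run, a script that sourced this file could keep calling git-rev-parse and would transparently run git rev-parse. A Perl script exec()ing git-rev-parse would still fail, because shell functions are invisible to exec().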

Therefore, the best way to go about it is most likely to aim for Git 2.0.0 and convince upstream git.git (in the person of Junio Hamano) to accept the backwards-incompatible change for that version. Rumors have it that Git 2.0.0 is even more around the corner now than a few years ago.

I would not have a problem, BTW, with maintaining these backwards-incompatible changes in Git for Windows earlier, iff they are accepted in upstream for 2.0.0.

So let's assume for now that we can get away with completely shunning the support for the dashed form of the builtins. Then the way to actually do it is as follows:

  1. if you haven't installed the net installer yet, it is high time to do so now.
  2. make your own fork of https://github.com/msysgit/git on GitHub.
  3. connect your /git/ to your GitHub repository by calling cd /git/ && git remote add -f <name> https://github.com/<name>/git, then make a branch by calling git checkout -b undash-builtins and connect it to your fork with git push --set-upstream <name> HEAD (where <name> is your GitHub account name).
  4. call git grep git- $(git ls-files \*.c \*.sh \*.perl) in /git/ to get an idea of what code needs changing to support undashing the builtins. Pretty much all of them should be replaced by undashed calls (notable exception: git-merge-one-file is passed as a single argument to git merge-index in git-merge-octopus.sh and git-merge-resolve.sh; see the discussion below on how to handle that).
  5. find the location in the Makefile that makes the hard links for builtins (look for the call to ln followed by the fall-back to ln -s in case hard linking is impossible). Disable it for builtins (see the note about git-remote-https.exe below for the discussion of why we cannot disable all hard linking).
  6. run the complete test suite. The most convenient way might be to call cd /git/ && make && /share/msysGit/run-tests.sh because that will fail early on compilation errors, but run through all the tests without stopping when one fails so that you have an overview of which tests you need to inspect (after the hours our test suite needs to run; it really shows that upstream git.git has no consideration for the performance problems incurred by over-using shell scripts the way they do).
  7. try to fix whatever you can fix easily, but do not hesitate at all to report back to the Git for Windows mailing list or to this issue when you get stuck. In that case, publish as much of your changes as you can (even if it is one monster Work-In-Progress commit in your fork) and explain what the symptoms are, with full logs.
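For orientation when searching the Makefile in step 5, the pattern to look for resembles the following shell sketch. It is a paraphrase of the idea (try a hard link, fall back to a symbolic link), not the Makefile's actual text; the function name link_or_copy and the final copy fall-back are assumptions made for this illustration.

```shell
# link_or_copy SOURCE TARGET
# Try a hard link first, then a symbolic link, then a plain copy.
link_or_copy() {
    ln "$1" "$2" 2>/dev/null ||
    ln -s "$1" "$2" 2>/dev/null ||
    cp "$1" "$2"
}
```

Disabling this for builtins means no longer running such a step once per builtin name, while still running it for the remote helpers.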

That should take even a moderately talented programmer no more than a day, so there is really little excuse for asking others to scratch your itch here.

Now, let's look at the pesky git-merge-one-file problem:

Two shell scripts (that are rarely used, but still, they need to be supported) call git merge-index with git-merge-one-file as parameter. Replacing that by the undashed form would make it two parameters, breaking the scripts. Even quoting the undashed form -- to make it a single parameter again -- would not fix it: git merge-index would then try to call a program called git merge-one-file -- which does not exist.
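The argument-count problem can be seen with a tiny stand-alone shell experiment (count_args is a hypothetical helper that only exists for this illustration):

```shell
# Print how many arguments were received.
count_args() { echo $#; }

count_args git-merge-one-file      # prints 1: what merge-index expects today
count_args git merge-one-file      # prints 2: the undashed form splits in two
count_args "git merge-one-file"    # prints 1, but names a program that does not exist
```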

There are two possible solutions:

  1. leave the git-merge-one-file parameter as-is: the command in question is not a builtin (indeed, it is a shell script!). That would work, and delay the proper resolution until the day when git-merge-one-file will be converted into a C builtin if that day ever comes, making it Someone Else's Problem.
  2. git merge-index would need to be changed so that it either accepts a special option, say --git, to know that the merge-one-file parameter refers not to an executable but to a Git subcommand, or so that it special-cases program names with a git- prefix by undashing them before calling exec().
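The second variant of solution 2 (special-casing the git- prefix) boils down to a small name transformation before the program is run. The following shell sketch shows only that transformation, under the assumption that it would be reimplemented in C inside git merge-index; the function name undash_program is hypothetical.

```shell
# undash_program NAME
# Print the command that would be run for the program name NAME:
# names with a git- prefix become "git <subcommand>", others are unchanged.
undash_program() {
    case "$1" in
    git-?*) printf 'git %s\n' "${1#git-}" ;;
    *)      printf '%s\n' "$1" ;;
    esac
}
```

For example, undash_program git-merge-one-file prints git merge-one-file, while a custom merge program such as /usr/local/bin/my-merge is passed through untouched.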

Note: some remote helpers (e.g. git-remote-https.exe) use the same hard-link trick to hide implementations for multiple protocols in a single, multiply hard-linked executable (e.g. http:// as well as https:// handling, via cURL).

I have no good idea how to remove the need for hard-links in that case, without resorting to really ugly solutions.

And no, making the remote helpers builtins is not a solution: the HTTP handling was refactored out of the Git wrapper because Linus could shave off a couple of nanoseconds from the startup time of the Git wrapper by not linking to cURL (and in his setup, he uses at least one script that starts up the Git wrapper like there is no tomorrow: git-am). So unfortunately, this solution is out of the question as it would never be accepted by upstream (the proper solution, of course, would be to turn git-am into a builtin already -- long overdue!!! -- but for some reason, upstream git.git became very reluctant to replace scripts (that are kept portable only by a tedious, ongoing effort) with proper, portable C versions).

Therefore, I suggest leaving those hard-linked remote helpers as they are.

I realize that this description comes across as a little too verbose at times, but I think it is important to keep all the background in mind when working on a solution.

@dandv so... with me spending quite a bit of time and expertise on explaining this issue, how about returning the favor?

dandv commented on June 16, 2024

Hi @dscho. Thank you very much for your help. My priorities have changed in the intervening year since I asked the question in this issue, and I hope another dev will put to good use the detailed background you've provided.

dscho commented on June 16, 2024

@dandv unfortunately, this does not surprise me. Let's close this issue for now because it is clear that nobody finds the issue offending enough to do more about it than to talk... so it cannot be all that bad! ;-)
