Comments (9)
See issue #58, and in particular the last comment in its thread by sschuberth.
from git.
Getting rid of the hardlinked binaries is a feature request and not a bug. Please feel free to work on this. We will try to support you with enough information so you can implement it. Feel free to ask on the mailing list.
Since it's not a bug, I will close this issue.
from git.
@hvoigt Note that in the portable version we're not hard-linking any files. This is because the hard-linking is done by the installer, and for the portable version there is no installer (and the .7z archive does not support storing hardlinks). So the files under \libexec\git-core\ really are copies. I'm saying this just to set things straight, but still this is more a feature request than a bug.
from git.
Thinking about it, this is an issue tracker, not a bug tracker. As such, it's completely valid to post feature requests in here. I'm reopening this to track this feature request. However, there is no guarantee whatsoever that anyone will work on this any time soon.
from git.
Right, I didn't mean this as a bug. Thanks for reopening.
I don't know the internals of this, but I did see that the git.exe from the cmd directory was only 8k, so there must be ways to have some sort of smaller stub executables.
from git.
@dandv @sschuberth while this is not only a bug tracker but an issue tracker, it is still a tracker.
As such, I must ask whether anybody between the two of you is willing to spend any time at all on this.
from git.
So here goes: if any developer worth their salt wants to tackle this issue, I will list the relevant information and suggest a recipe for how to address the problem properly (and no, do not ask me to work on this for you; declining to spend time yourself after I spent hours and hours on this would only serve as a public demonstration of how little you value my time, expertise and effort):
The first thing is to note why the hard-linked builtins are there in the first place. And actually, before that the term builtin needs to be explained. Therefore, let's establish some background:
- historically, Git was just a hodge-podge of shell scripts with the occasional C program thrown in for performance.
- since the occasional C programs shared a lot of code, that code was refactored into a static library, libgit.a.
- as libgit.a grew larger, there was indeed "huge bloat"; to solve that, the Git wrapper (what we call git.exe nowadays), wrapping multiple Git functions into a single executable, was invented; it determines what function it should perform by inspecting the name by which it was called (using hard-links to allow for multiple names).
- The functions thusly included in the Git wrapper are called builtins.
- eventually, it was determined that the number of Git commands would clutter bin/ too much, and the Git wrapper learned to be started by the name git and to interpret the first argument as subcommand name in that case. The subcommands would then be hidden away in libexec/git-core/ (including non-builtins).
- for performance reasons, many scripts still called the dashed form (to avoid the extra exec() call needed by calling git as an intermediary that exec()s the real program). To do so, they had to source git-sh-setup -- which for that reason could not be hidden away in libexec/git-core/ but still needed to be available on the PATH. This git-sh-setup scriptlet would extend the PATH to include the complete libexec/git-core/.
- when the Git wrapper is called in a non-dashed form (e.g. git commit) to perform a builtin function, it does not exec() the dashed form but instead hands off to the respective function (by convention, cmd_<name> where <name> is the subcommand name with dashes replaced by underscores).
- it was considered a cute extensibility feature that the Git wrapper would pick up any executable with a git- prefix as Git subcommand.
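To make the dispatch described in the list above concrete, here is a minimal shell sketch. It is not the wrapper's actual code (the real wrapper is written in C); the function names resolve_subcommand and builtin_function_name are made up for illustration. It only shows the two ideas: the subcommand comes either from the name the program was called by or from the first argument, and the builtin's function name is derived from it.

```shell
#!/bin/sh
# Hypothetical helper: given the invocation name ($1) followed by the
# command-line arguments, print the subcommand the wrapper would run.
resolve_subcommand() {
    invoked_as=$(basename "$1")
    invoked_as=${invoked_as%.exe}   # drop the Windows suffix, if present
    shift
    case "$invoked_as" in
    git-*) printf '%s\n' "${invoked_as#git-}" ;;  # dashed form: the name carries the subcommand
    git)   printf '%s\n' "${1-}" ;;               # undashed form: the first argument does
    *)     return 1 ;;                            # not a Git invocation name at all
    esac
}

# By the convention described above, the builtin for <name> is the function
# cmd_<name>, with dashes replaced by underscores.
builtin_function_name() {
    printf 'cmd_%s\n' "$(printf '%s' "$1" | tr '-' '_')"
}
```

With this sketch, both resolve_subcommand /libexec/git-core/git-commit.exe -m msg and resolve_subcommand /bin/git commit -m msg yield the same subcommand, which is exactly why the hard-linked names are redundant for callers that use the undashed form.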
Now, Git prides itself on being backwards-compatible (indeed quite often to a fault: inconsistencies such as using the term cache -- referencing the original name for Git: dircache -- as well as unhelpful defaults are often maintained well beyond what some would call an acceptable time frame), therefore even Git for Windows has to adhere to that principle; anything else would lead to a maintenance nightmare, for which reason the maintainer (= me) would not accept any contribution breaking the backwards-compatibility.
Backwards-compatibility in this case means that shell scripts calling the dashed forms will need to work properly, even after we remove them from libexec/.
One way to go about that would be by teaching git-sh-setup to provide dashed shell functions for the builtins. That would work for shell scripts, but of course not for Perl scripts exec()ing dashed Git programs.
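As a rough illustration of the "dashed shell functions" idea, the following sketch generates the text of one wrapper function per builtin. The generator name emit_dashed_wrappers and the choice to print the definitions rather than define them directly are assumptions made for this example, not something git-sh-setup does today.

```shell
#!/bin/sh
# Hypothetical generator: for every builtin name passed in, print a shell
# function definition that forwards the dashed name to the undashed call.
emit_dashed_wrappers() {
    for name in "$@"
    do
        printf 'git-%s () { git %s "$@"; }\n' "$name" "$name"
    done
}
```

A script could then activate the wrappers with eval "$(emit_dashed_wrappers commit add)". Note that function names containing dashes are not valid in strict POSIX shells (dash rejects them, and so does bash in POSIX mode), so this would only work where the shell running the scripts is bash in its default mode.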
Therefore, the best way to go about it is most likely to aim for Git 2.0.0 and convince upstream git.git (in the person of Junio Hamano) to accept the backwards-incompatible change for that version. Rumors have it that Git 2.0.0 is even more around the corner now than a few years ago.
I would not have a problem, BTW, to maintain these backwards-incompatible changes in Git for Windows earlier, iff they are accepted in upstream for 2.0.0.
So let's assume for now that we can get away with completely shunning the support for the dashed form of the builtins. Then the way to actually do it is as follows:
- if you haven't installed the net installer yet, it is high time to do so now.
- make your own fork of https://github.com/msysgit/git on GitHub.
- connect your /git/ to your GitHub repository by calling cd /git/ && git remote add -f <name> https://github.com/<name>/git, then make a branch by calling git checkout -b undash-builtins and connect it to your fork with git push --set-upstream <name> HEAD (where <name> is your GitHub account name).
- call git grep git- $(git ls-files \*.c \*.sh \*.perl) in /git/ to get an idea what code needs changing to support undashing the builtins. Pretty much all of them should be replaced by undashed calls (notable exception: git-merge-one-file is passed as a single argument to git merge-index in git-merge-octopus.sh and git-merge-resolve.sh; see the discussion below how to handle that).
- find the location in the Makefile that makes the hard links for builtins (look for the call to ln followed by the fall-back to ln -s in case hard linking is impossible). Disable it for builtins (see the note about git-remote-https.exe below for the discussion why we cannot disable all hard linking).
- run the complete test suite. The most convenient way might be to call cd /git/ && make && /share/msysGit/run-tests.sh because that will fail early on compilation errors, but run through all the tests without stopping when one fails so that you have an overview which tests you need to inspect (after the hours our test suite needs to run; it really shows that upstream git.git has no consideration for the performance problems incurred by over-using shell scripts the way they do).
- try to fix whatever you can fix easily, but do not hesitate at all to report back to the Git for Windows mailing list or to this issue when you get stuck. In that case, publish as much of your changes as you can (even if it is one monster Work-In-Progress commit in your fork) and explain what the symptoms are, with full logs.
That should take even a moderately talented programmer no more than a day, so there is really little excuse for asking others to scratch your itch here.
Now, let's look at the pesky git-merge-one-file problem:
Two shell scripts (that are rarely used, but still, they need to be supported) call git merge-index with git-merge-one-file as parameter. Replacing that by the undashed form would make it two parameters, breaking the scripts. Even quoting the undashed form -- to make it a single parameter again -- would not fix it: git merge-index would then try to call a program called git merge-one-file -- which does not exist.
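The argument-splitting problem can be seen with a tiny stand-in for git merge-index. The helper count_args below is hypothetical and simply reports how many arguments it was given:

```shell
#!/bin/sh
# Hypothetical stand-in for the receiving program: print the argument count.
count_args() {
    printf '%s\n' "$#"
}

count_args git-merge-one-file     # dashed form: a single argument
count_args git merge-one-file     # undashed form: split into two arguments
count_args "git merge-one-file"   # quoted: one argument again, but it now
                                  # names a program "git merge-one-file"
```

The quoted variant restores the argument count but not the meaning: the receiver would look for an executable whose file name contains a space, rather than running git with the subcommand merge-one-file.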
There are two possible solutions:
- leave the git-merge-one-file parameter as-is: the command in question is not a builtin (indeed, it is a shell script!). That would work, and delay the proper resolution until the day when git-merge-one-file will be converted into a C builtin if that day ever comes, making it Someone Else's Problem.
- git merge-index would need to be changed so that it either accepts a special option, say --git, to know that the merge-one-file parameter refers not to an executable but to a Git subcommand, or so that it special-cases program names with a git- prefix by undashing them before calling exec().
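The prefix special-casing from the second option could look roughly like the sketch below. The real change would have to be made in the C code of git merge-index; this shell version, with the made-up name undash_program, only illustrates the intended mapping.

```shell
#!/bin/sh
# Hypothetical sketch: map a program name with a git- prefix to the words
# "git" and "<subcommand>"; any other program name is passed through as-is.
undash_program() {
    case "$1" in
    git-?*) printf 'git %s\n' "${1#git-}" ;;
    *)      printf '%s\n' "$1" ;;
    esac
}
```

Under this scheme, undash_program git-merge-one-file produces git merge-one-file, while a custom merge program without the prefix is left alone, so existing callers would not need to change their arguments.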
Note: some remote helpers (e.g. git-remote-https.exe) use the same hard-link trick to hide implementations for multiple protocols in a single, multiply hard-linked executable (e.g. http:// as well as https:// handling, via cURL).
I have no good idea how to remove the need for hard-links in that case, without resorting to really ugly solutions.
And no, making the remote helpers builtins is not a solution: the HTTP handling was refactored out of the Git wrapper because Linus could shave off a couple of nanoseconds from the startup time of the Git wrapper by not linking to cURL (and in his setup, he uses at least one script that starts up the Git wrapper like there is no tomorrow: git-am). So unfortunately, this solution is out of the question as it would never be accepted by upstream (the proper solution, of course, would be to turn git-am into a builtin already -- long overdue!!! -- but for some reason, upstream git.git became very reluctant to replace scripts (that are kept portable only by a tedious, ongoing effort) with proper, portable C versions).
Therefore, I suggest leaving those hard-linked remote helpers as they are.
I realize that this description comes across as a little too verbose at times, but I think it is important to keep all the background in mind when working on a solution.
@dandv so... with me spending quite a bit of time and expertise on explaining this issue, how about returning the favor?
from git.
Hi @dscho. Thank you very much for your help. My priorities have changed in the intervening year since I asked the question in this issue, and I hope another dev will put to good use the detailed background you've provided.
from git.
@dandv unfortunately, this does not surprise me. Let's close this issue for now because it is clear that nobody finds the issue offending enough to do more about it than to talk... so it cannot be all that bad! ;-)
from git.