git-fat's People

Contributors

arnaudgelas, c00kiemon5ter, chhitz, chrismarinos, cspurk, halfvoxel, jedbrown, jmurty, mspacek, nkovacs, ottomata, thcipriani

git-fat's Issues

git-fat --version?

We would like to track the versions of git-fat, but I can't seem to find a version reference. There is one reference to an environment variable GIT_FAT_VERSION.

git fat pull from mac to linux

I am trying to run git fat pull to fetch git-fat objects from a Mac Pro to a Linux machine. It fails with the following error:

rsync: on remote machine: -sRe.LsfxC: unknown option
rsync error: syntax or usage error (code 1) at /BuildRoot/Library/Caches/com.apple.xbs/Sources/rsync/rsync-51/rsync/main.c(1337) [server=2.6.9]
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [Receiver=3.1.2]

Both machines have the same version of rsync.

Superfluous rsync options

Please consider removing superfluous rsync options set as default, like --progress and --ignore-existing.
Some workflows need to avoid the --progress and --ignore-existing options.
If those options are wanted, they can already be added by using rsync.options in .gitfat.

git-fat behaves unexpectedly after performing "cp -a" on the repository

If I run "cp -a" to make a copy of the entire git repository, and I'm using git-fat, some unexpected behaviour is displayed: If I run "git status" on the copy I have made, it reports a number of modified files that weren't modified in the original repository.

What seems to be happening is that git-fat re-evaluates any large files that are stored directly in git, e.g. legacy files from before I started using git-fat, and moves them into the fat store.

I understand from reading the docs that git-fat is supposed to consider only new or modified files for inclusion into the fat store, and leave existing files alone. It seems that copying the repository with "cp -a" makes git-fat think that the files are new.

The steps to reproduce this are as follows:-

  1. Create a git repository.
  2. Commit some .tar.gz files.
  3. Set up git-fat, with a valid .gitfat file, remote store, and a .gitattributes file containing *.tar.gz.
  4. Commit the .gitfat and .gitattributes files.
  5. Run "git status" to check that there are no uncommitted changes.
  6. Use "cp -a" to take a copy of the repository: cp -a repository repository.copy
  7. cd repository.copy
  8. "git status" says that all the .tar.gz files committed in step (2) are modified.

Simply renaming the repository doesn't cause the issue. It happens only when I copy the repository.

1 GB+ Files make shell unresponsive for several seconds

I don't know where else to put this, so I am just reporting this as an issue here. When git fat is used with files that are larger, the sha1 operation seems to dominate the CPU, causing the shell -- and possibly the system -- to become unresponsive for a while (10 sec - 1 min). A similar issue is noticed at the end of a git fat pull of files of this size. Perhaps changing the BLOCKSIZE would alleviate some of this?
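
A minimal sketch of block-wise hashing, assuming the file is read in fixed-size chunks. The names `sha1_stream` and `blocksize` and the 1 MiB default are illustrative, not git-fat's actual code. The block size changes memory use and the number of read calls, but SHA-1 still has to process every byte, so it is unclear whether tuning it alone would remove the stall.

```python
import hashlib
import io


def sha1_stream(stream, blocksize=1024 * 1024):
    """Hash a binary stream in fixed-size chunks; return (hexdigest, total_bytes)."""
    h = hashlib.sha1()
    total = 0
    # iter() with a sentinel stops at the first empty read (end of stream).
    for block in iter(lambda: stream.read(blocksize), b""):
        h.update(block)
        total += len(block)
    return h.hexdigest(), total


# Tiny in-memory example; a real file would be opened with open(path, "rb").
digest, size = sha1_stream(io.BytesIO(b"fat content a\n"), blocksize=4)
print(digest, size)
```

The digest does not depend on the block size, so different values can be timed against the same file without changing the stored object names.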

git-fat pull and git-fat init

Hi,

while using git-fat, I came across a small "issue":
If one doesn't execute "git fat init" after cloning a repo that uses git-fat, the pull just won't succeed (it pulls the files into .git/fat but doesn't replace them in the index).
Unfortunately, there's no warning or message indicating that the user forgot to run "git fat init"; it just "fails" silently.

Maybe it's possible to put a checkpoint in "git fat pull" to see if the init was already done. It cost me some hours to figure out what's going wrong (although it's mentioned in the README to run "git fat init", I just didn't see it)
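
One way such a checkpoint could look, as a hedged sketch: git maps the `filter=fat` attribute to the `filter.fat.clean` and `filter.fat.smudge` config keys, so a missing `filter.fat.clean` value suggests the filter was never set up. The function name `fat_filter_configured` is hypothetical, and the assumption that "git fat init" is what writes this key is not confirmed in this report.

```python
import subprocess


def fat_filter_configured(cwd="."):
    """Return True if `git config --get filter.fat.clean` yields a value in cwd."""
    try:
        proc = subprocess.run(
            ["git", "config", "--get", "filter.fat.clean"],
            cwd=cwd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # git is not installed, or cwd does not exist.
        return False
    return proc.returncode == 0 and proc.stdout.strip() != b""


if not fat_filter_configured():
    print("git-fat filter not configured; run 'git fat init' first")
```

A pull command could run this check up front and stop with the message instead of silently leaving placeholder files in the working tree.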

git fat pull: no response

Hi,
When I git clone a package and run git fat init and git fat pull, nothing is printed.
When I run the test.sh script, this is the output. Please help check whether or not it is a bug.
Thanks!

[root@AY130628135803343949Z git-fat]# ./test.sh

  • rm -fR fat-test fat-test2 /tmp/fat-store
  • git init fat-test
    Initialized empty Git repository in /var/www/html/kidsitgit/git-fat/fat-test/.git/
  • cd fat-test
  • git fat init
    Initialized git fat
  • cat -
  • echo '*.fat filter=fat -crlf'
  • git add .gitattributes .gitfat
  • git commit '-mInitial fat repository'
    [master (root-commit) dbda5a2] Initial fat repository
    2 files changed, 3 insertions(+)
    create mode 100644 .gitattributes
    create mode 100644 .gitfat
  • ln -s /oe/dss-oe/dss-add-ons-testing-build/deploy/licenses/common-licenses/GPL-3 c
  • git add c
  • git commit '-madd broken symlink'
    [master 6c91860] add broken symlink
    1 file changed, 1 insertion(+)
    create mode 120000 c
  • echo 'fat content a'
  • git add a.fat
  • git commit '-madd a.fat'
    [master d69db8b] add a.fat
    1 file changed, 1 insertion(+)
    create mode 100644 a.fat
  • echo 'fat content b'
  • git add b.fat
  • git commit '-madd b.fat'
    [master 6660369] add b.fat
    1 file changed, 1 insertion(+)
    create mode 100644 b.fat
  • echo 'revise fat content a'
  • git commit '-amrevise a.fat'
    [master b25e963] revise a.fat
    1 file changed, 1 insertion(+), 1 deletion(-)
  • git fat push

The above command was run on CentOS 6.3.

Argument parsing with argparse?

I didn't want to pollute the PR discussion of #10 further, so I'll ask this way:
Considering that the API/argument structure will probably grow in complexity, have you considered using argparse to parse the arguments?
I've used it in the past and found it pretty convenient, with automatic exiting if the arguments don't match the expected structure, semi-automatic help-file generation, etc.
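
A rough sketch of what an argparse-based front end might look like. The subcommand names come from the usage string quoted elsewhere on this page; the `size` argument for `find` and the `--all` flag for `pull` mirror invocations shown in other reports, and their help text and types are assumptions.

```python
import argparse

# Subcommand names taken from the "Usage: git fat [...]" line on this page.
SUBCOMMANDS = ["init", "status", "push", "pull", "gc", "verify",
               "checkout", "find", "index-filter"]


def build_parser():
    parser = argparse.ArgumentParser(prog="git-fat")
    sub = parser.add_subparsers(dest="command")
    for name in SUBCOMMANDS:
        p = sub.add_parser(name)
        if name == "find":
            # Assumed meaning: a size threshold in bytes.
            p.add_argument("size", type=int, help="size threshold in bytes")
        if name == "pull":
            p.add_argument("--all", action="store_true")
    return parser


args = build_parser().parse_args(["find", "1000000"])
print(args.command, args.size)
```

With this structure, an unknown subcommand or a non-integer size makes argparse print a usage message and exit, which is the automatic behaviour described above.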

Retroactively adding files does not work

Hi

I really love this script, but there is one repo where I needed to retroactively add files to git-fat. I saw that there was an (experimental) way to do that, so I tested it. Unfortunately, it seems the script is completely outdated. Even the test script (test-retroactive.sh) fails to run.
I have tried to fix the script, but so far I haven't managed to get it working (even though it does proceed further now).
Is there a possibility that this could be fixed?

Support files other than binary

Perhaps it's my own mistake, but I haven't been able to get git-fat to work on, for example, text files.

The reason I want to do this:

I work with large genomic sequence files. These files can be as big as ~3GB in my case. For now I'm going to just tar them and store them, but it would be nice if I didn't have to do that so that I could make my research very easy to reproduce.

My workflow would be as follows:

Git add source code ---> Git fat add sequence data --> commit

Boom! At any point, someone can reproduce the point in history at which I was working. For now, it is going to be as follows:

Git add source code ---> tar compress data ---> git fat add data ---> repeat for new data

The extra step seems unnecessary from my perspective. I tried git-annex but it was a bit complicated for my needs. I like how simple git-fat is, especially since I want other people to be able to easily reproduce the things I do.

pull full history and pull selected revision doesn't work

This is the output of git fat pull --all:

pulling:
[]
Pulling from remote_url_here
Executing: rsync --progress --ignore-existing --from0 --files-from=- remote_url_here .git/fat/objects/
receiving file list ... 
0 files to consider

sent 8 bytes  received 10 bytes  36.00 bytes/sec
total size is 0  speedup is 0.00

git fat doesn't properly handle running outside of a git repo

Hi, while testing git-fat, I ran:

git fat

outside of a git tree so that I could see the usage message, but I got this ugly exception traceback:

git fat
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Traceback (most recent call last):
  File "/usr/local/bin/git-fat", line 581, in <module>
    fat = GitFat()
  File "/usr/local/bin/git-fat", line 125, in __init__
    self.gitroot = subprocess.check_output('git rev-parse --show-toplevel'.split()).strip()
  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['git', 'rev-parse', '--show-toplevel']' returned non-zero exit status 128

I am not sure whether raising an exception is a good solution in this case.

regards
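
A hedged sketch of how the failure could be reported without a traceback: catch the error from `git rev-parse --show-toplevel` (the call shown in the traceback) and print a short message instead. The names `find_gitroot` and `main` and the message wording are illustrative, not git-fat's actual code.

```python
import subprocess
import sys


def find_gitroot(cwd="."):
    """Return the repository top-level path, or None when cwd is not in a git repo."""
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=cwd,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, OSError):
        # Non-zero exit (not a repository) or git missing from PATH.
        return None
    return out.decode().strip()


def main(cwd="."):
    root = find_gitroot(cwd)
    if root is None:
        sys.stderr.write("git-fat: not inside a git repository\n")
        return 1
    print(root)
    return 0
```

Returning an exit status from `main` lets the caller pass it to `sys.exit()`, so scripts still see a failure while the user sees one readable line.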

git fat pull reports rsync error

@jedbrown, hello,
After I rsync mp3 files to the server via "git fat push", then git clone the repo and run git fat pull to retrieve the mp3 files, it always reports an rsync error.
It seems that the file server and git-fat are inconsistent. I do not know whether or not it is a git-fat bug.
How can I locate the problem?
Thanks very much; it has bothered me for a day.

"rsync: link_stat "/home/gitfat/gitfatlibs/d6bff0d70b60442d004c0affe6c1ab890615046a" failed: No such file or directory (2)"
vagrant@homestead:~/kidsit$ git fat pull
[email protected]'s password:
receiving file list ...
rsync: link_stat "/home/gitfat/gitfatlibs/d6bff0d70b60442d004c0affe6c1ab890615046a" failed: No such file or directory (2)
rsync: link_stat "/home/gitfat/gitfatlibs/de51a8494180a6db074af2dee2383f0a363c5b08" failed: No such file or directory (2)
rsync: link_stat "/home/gitfat/gitfatlibs/8efe3f44b5143f09ff5cb033bac9aa0895b47364" failed: No such file or directory (2)
rsync: link_stat "/home/gitfat/gitfatlibs/f4b9a411943fe9d920c5d5a56975d12c35928172" failed: No such file or directory (2)
0 files to consider
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.1]

Put objects in tree (like git)

Git organizes its objects in a directory tree, which reduces the number of files per directory and allows a large number of objects to be stored. On some file systems, having too many files in a directory leads to performance problems. Will git-fat consider this?
Thanks,

  • Greg
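
The layout described above can be sketched as follows. Git stores loose objects under a directory named after the first two hex characters of the digest; the function `object_path` and its `fanout` parameter are hypothetical names for applying the same idea to a fat object store.

```python
import os


def object_path(objdir, digest, fanout=2):
    """Map a hex digest to objdir/<first `fanout` chars>/<remaining chars>."""
    return os.path.join(objdir, digest[:fanout], digest[fanout:])


# Digest copied from an rsync error message elsewhere on this page.
print(object_path(".git/fat/objects", "d6bff0d70b60442d004c0affe6c1ab890615046a"))
```

With two hex characters the objects spread over at most 256 subdirectories. Any such change would also affect the remote store layout, which this sketch does not address.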

Bare Repository

Hi,

I am following this guide on updating a website:
http://toroid.org/ams/git-website-howto

Everything works great except when I issue the command git fat pull which then issues the following command:
rsync --progress --ignore-existing --from0 --files-from=- fat.antispaceman.com:/var/repository/ ./fat/objects/
I can ssh into that host and have checked that the git objects are there. However, it always says that there are no objects and nothing to update. When I run the command directly, it just hangs. git fat status seems to be aware that there are files that need downloading, but for some reason rsync doesn't agree. I temporarily updated git fat on my server to do:
rsync --progress --ignore-existing --recursive fat.antispaceman.com:/var/repository/ ./fat/objects/
Then git fat works as expected. I don't want to be running a custom git fat. I run git fat elsewhere and don't have a problem with it.

Can anyone enlighten me as to the cause of the problem?

Regards,
Matt

git fat init after clone - already configured?

Hi!
I am currently evaluating git-fat.

I followed your README and was able to add files via git-fat to my remote rsync location with git fat push.

I then cloned the repository:
git clone /home/jkr/src/testing_git
Cloning into 'testing_git'...
done.
Usage: git fat [init|status|push|pull|gc|verify|checkout|find|index-filter]
Usage: git fat [init|status|push|pull|gc|verify|checkout|find|index-filter]

First question: something triggered git fat actions, but what?
Second question: if I now run git fat init, it says:
Git fat already configured, check configuration in .git/config

But .git/config does not seem to have any git fat entries.
.git/fat/objects is empty

The fat files (two JPEG files) have size 0 (I expected file content with the hash?).
git fat status shows:
Orphan objects:
f07b6a730094a122e32677b4c2b187020ed4c491
64dfd7dfa562b4e0b673d538cfb43c4d0bad36fe
These are exactly the files I am missing.
git fat pull says:
receiving file list ...
0 files to consider

It seems something in git clone went totally wrong.

The original git repository has the required entries in .git/fat/objects.

Any help welcome :-)

latest commit breaks git-fat find

$python git-fat find 1000000
Traceback (most recent call last):
  File "git-fat", line 616, in <module>
    fat.cmd_find(sys.argv[2:])
  File "git-fat", line 540, in cmd_find
    for path, sizes in sorted(pathsizes.items(), key=lambda p,s: max(s), reverse=True):
TypeError: <lambda>() takes exactly 2 arguments (1 given)
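
The error arises because `sorted` calls the `key` function with a single argument, here one `(path, sizes)` tuple, while the lambda in the traceback declares two parameters. A minimal sketch of a key function that keeps the same intent (order by largest size), using made-up sample data:

```python
# Hypothetical sample data: path -> list of blob sizes seen in history.
pathsizes = {"a.bin": [10, 300], "b.bin": [2000], "c.bin": [50, 60, 70]}

# `key` receives one (path, sizes) tuple, so index into it inside the function.
ordered = sorted(pathsizes.items(), key=lambda item: max(item[1]), reverse=True)

for path, sizes in ordered:
    print(path, max(sizes), len(sizes))
```

This form works on both Python 2 and Python 3, unlike tuple-unpacking lambda parameters, which Python 3 rejects as a syntax error.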

Reason for not merging Win32 fixes?

Just curious, any reason the python3-win32-compat branch wasn't merged into master at some point?

I've had to make some other win32 fixes for cwrsync compatibility, but I did all my work against master. I've considered submitting a pull request, but I'm unsure where it should fit in, since the python3-win32-compat branch also contains some win32-related fixes.

Expand environment variables in .gitfat

When the git repository is shared with collaborators, the git-fat remote might be present in different locations. For example, if the remote is a smb mount, we can't assume that all collaborators will mount the share in the same spot.

I worked around that by using return os.path.expandvars(output) rather than return output in the gitconfig_get function. This way, the base address of the remote could be specified by exporting an environment variable, e.g.:

[rsync]
remote = $SMB_ARCHIVE/gitfat/myrepo

Clearly this change would expand all fields, not just "rsync.remote", so you might prefer a different kind of fix.
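
A narrower variant could expand variables only for selected keys. The sketch below uses the standard `os.path.expandvars`; the helper `expand_config_value`, the `EXPANDABLE_KEYS` set, and the `/mnt/archive` mount point are hypothetical.

```python
import os

# Only these config keys get environment-variable expansion (assumed whitelist).
EXPANDABLE_KEYS = {"rsync.remote"}


def expand_config_value(key, value):
    """Expand $VARS only for whitelisted keys; other values pass through untouched."""
    return os.path.expandvars(value) if key in EXPANDABLE_KEYS else value


os.environ["SMB_ARCHIVE"] = "/mnt/archive"  # hypothetical mount point
print(expand_config_value("rsync.remote", "$SMB_ARCHIVE/gitfat/myrepo"))
print(expand_config_value("rsync.sshuser", "$SMB_ARCHIVE"))
```

Note that `os.path.expandvars` leaves references to unset variables unchanged rather than raising an error, so a missing export would still reach rsync as a literal `$NAME`.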

Support filter-clean filter-smudge, pull, push for large file size in Windows v.s. S3

With PR #28 we can use S3 as a backend.

But we run into a 4GB file size limit on Windows when running filter-clean and filter-smudge. Here is a workaround: PersonifyInc@7a1ec14

Then we run into another issue when uploading large files to S3, due to the S3 limit on single-part uploads. Here is a solution: PersonifyInc@c026fcf

It would be great if we could add S3 support and then those two commits, so that git-fat supports S3 better.

Ran out of memory when running git fat pull on a very large file

The following is a snippet of a git fat pull process that ran out of memory. Is there anything I can do to increase the memory? Thanks.

06:15:52 + git fat pull
06:16:39
receiving file list ...
06:16:39 2 files to consider
06:16:39 e0b0c99922873f2e442fc0ee71ec96a3672fcd79
...
...
...
06:32:28 sent 49 bytes received 24723052959 bytes 26010576.55 bytes/sec
06:32:28 total size is 24720035178 speedup is 1.0
06:35:28 fatal: Out of memory, realloc failed
06:35:29 Restoring e4d6644ce07a86924e5b73a639d71c68dc883511 -> EmpHydroCyclone.sim
06:35:29 Traceback (most recent call last):
06:35:29 File "/home/star/mirror/git/latest/linux-x86_64/bin/git-fat", line 530, in
06:35:29 fat.cmd_pull(sys.argv[2:])
06:35:29 File "/home/star/mirror/git/latest/linux-x86_64/bin/git-fat", line 380, in cmd_pull
06:35:29 self.checkout()
06:35:29 File "/home/star/mirror/git/latest/linux-x86_64/bin/git-fat", line 360, in checkout
06:35:29 subprocess.check_call(['git', 'checkout-index', '--index', '--force', fname])
06:35:29 File "/home/test/hudson/tool/linux-x86_64/lib/python2.7/subprocess.py", line 504, in check_call
06:35:29 raise CalledProcessError(retcode, cmd)
06:35:29 subprocess.CalledProcessError: Command '['git', 'checkout-index', '--index', '--force', 'EmpHydroCyclone.sim']' returned non-zero exit status 128

repo dirty after git fat pull

I have an environment that uses git fat.
After I check out the environment and run git fat pull on the folder, the file appears as "Modified" and I cannot do any checkout or rebase.
When I delete the file and do a new checkout, the status remains the same. I'm stuck on the current commit, unable to stash or do anything.

Problem with 0 length files

Hi

I have a problem with any file that has an extension that I have marked as being filtered by git-fat, but is zero length. It seems to be clean filtered when it is added to the repository, and smudge filtered when it is checked out, so all correct so far. But then git thinks that the file is dirty, so git status shows it as an unstaged change, and git diff shows a difference where the "old" state is the "#$# git-fat" token and the "new" state is the 0 length file.

Maybe git has some optimization where it decides not to run the smudge filter in some cases when the file is 0 length?

I have a workaround, which is to modify git-fat to do nothing in the clean filter when the file is zero length. Does this seem like a reasonable change?

diff --git a/git-fat b/git-fat
index e62f99b..0d8021a 100755
--- a/git-fat
+++ b/git-fat
@@ -244,7 +244,14 @@ class GitFat(object):
                     os.rename(tmpname, objfile)
                     self.verbose('git-fat filter-clean: caching to %s' % objfile)
                 cached = True
-                outstreamclean.write(self.encode(digest, bytes))
+                # Only write hash if the original file is not 0 length. If it
+                # is 0 length, just leave a 0 length file. This works around
+                # an apparent git problem where it thinks the working tree is
+                # dirty if the smudge filter generates a 0 length file.
+                if bytes != 0:
+                    outstreamclean.write(self.encode(digest, bytes))
+                else:
+                    self.verbose('git-fat filter-clean: not modifying 0 length file')
         finally:
             if not cached:
                 os.remove(tmpname)

git fat share object store must be read-write

I checked out 2 large repositories (70GB) twice on the same machine, but under different user accounts.

I'd like to share the git-fat object store between the two accounts, but only one of them should have write access to the objects. Using a symbolic link, as suggested here, does not work:
https://github.com/jedbrown/git-fat#implementation-notes

Is there a way to add some kind of git-fat-ref link to the shared store, while using a local writable folder for the clean/smudge operations?

git status:
error: cannot feed the input to external filter git-fat filter-clean
error: external filter git-fat filter-clean failed 1
error: external filter git-fat filter-clean failed
Traceback (most recent call last):
File "/home/debugger/bin/git-fat", line 515, in
fat.cmd_filter_clean()
File "/home/debugger/bin/git-fat", line 255, in cmd_filter_clean
self.filter_clean(sys.stdin, sys.stdout)
File "/home/debugger/bin/git-fat", line 214, in filter_clean
fd, tmpname = tempfile.mkstemp(dir=self.objdir)
File "/usr/lib64/python2.7/tempfile.py", line 304, in mkstemp
return _mkstemp_inner(dir, prefix, suffix, flags)
File "/usr/lib64/python2.7/tempfile.py", line 239, in _mkstemp_inner
fd = _os.open(file, flags, 0600)

I need git fat to be able to use a special username configured in ~/.gitconfig

Hello.
In my case, the user name I am registered with on the remote repository where I store my binary files is different from the user name on my machine. Because others use this repository as well, I cannot push a .gitfat file with my changes to the repo.
Therefore I need a way to configure the user name that git-fat uses when accessing the remote repository.

git-fat pull performance

Our git repository has almost 31K commits spanning ten years of development (yes, we imported from another system. :).

git fat pull can take an extraordinarily long time to execute before it gets to rsync-ing the files. It can take several hours on a brand new repository. An already updated repository can take 10-20 minutes to discover that it has nothing to do.

I'd like to encourage discussion about how we can make this process faster.

Perhaps some method of not investigating every single commit, or of keeping tabs on where the last scan succeeded. These are just suggestions; I haven't dived into the code yet.

use git for fat object store

Instead of writing our own objects, we can use git to store the objects independently, tagged by sha1 of the content (using a lightweight tag). This is an eventual scalability problem because git does not support large numbers of refs well -- it ends up slowing everything down. Further discussion here:

http://thread.gmane.org/gmane.comp.version-control.git/182158

Instead, the objects could be packed up into commits, but due to the arbitrary subset problem, talking to remotes would not be as simple as push/pull of some refs. Instead, we'll probably need to create a new commit to pack up exactly what is requested.

Exception when pulling into a repository with symlink

Hi,

git fat pull seems to explode if there is a symlink committed in the working directory.

rserve@rserve-devel:~/tomas/clone$ toh/608-gitfat-experimentalmerge * = git fat pull
receiving file list ... 
0 files to consider

sent 4 bytes  received 6 bytes  20.00 bytes/sec
total size is 0  speedup is 0.00
Restoring 6957a37c7a41455df2ced63fd6fb2e5202b83737 -> data/wikidi_digital-camera/2012-10-prvni-data/better.params
Traceback (most recent call last):
  File "/home/rserve/usr/git-fat/git-fat", line 467, in <module>
    fat.cmd_pull(sys.argv[2:])
  File "/home/rserve/usr/git-fat/git-fat", line 338, in cmd_pull
    self.checkout()
  File "/home/rserve/usr/git-fat/git-fat", line 306, in checkout
    for digest, fname in self.orphan_files():
  File "/home/rserve/usr/git-fat/git-fat", line 268, in orphan_files
    digest = self.decode_file(fname)[0]
  File "/home/rserve/usr/git-fat/git-fat", line 161, in decode_file
    stat = os.stat(fname)
OSError: [Errno 2] No such file or directory: 'data/wikidi_digital-camera/2012-10-prvni-data/crawl'

(data/wikidi_digital-camera/2012-10-prvni-data/crawl is a symlink)

sorry i suck at github, see the #9

Questions about .gitfat and .gitattributes files.

Hi,

First, I would like to express my appreciation for your work. I am currently using git-fat at my company: we use GitHub Enterprise rather happily, though we recently experienced some issues with repos storing large binary files (in our case .fla).
We are attempting to integrate git-fat into our workflow. We first stumbled over the issue that rsync always prompts for a password, but we got that fixed.

Now, I would like to ask you two questions:

  1. The .gitfat properties file specifies the remote location where 'the fat' is stored, but also the sshuser to connect as when accessing the remote repos. The .gitfat file lives within the project and hence is committed and pushed to the normal git repos. I think this setup is not ideal, as it mixes two different concepts: the location and the identity used to access it. The first is common to every user working on the project, and it makes perfect sense to share it (it defines where the project stores its "fat"). The sshuser, on the other hand, is a pseudo-private piece of data that varies according to who works on the project; it should not be committed or pushed to the repos.
    Looking into the git-fat script (with a non-Python coder's eye), I could spot and modify where the sshuser is fetched and replace it with 'os.getlogin()'. That suits my needs, as the value returned is the same as the one I use as sshuser on the remote location, but it is not ideal either, since the two could differ. Another way would be to obtain the sshuser from a global .gitfat config file living in the home directory, which, once again, is not perfect, since there could be several projects using git-fat that push to different locations using different sshusers. So, to conclude, my question is a mix of: what would you recommend doing? Do you plan on doing something about this issue?
  2. The .gitattributes file lists the filters, i.e. the file types to consider "fatty". I think it is well suited for this file to live where it does. I was wondering, though, about the notion of path. As a use case, let's say my project has some png files living in different folders and I would like to filter them by location: I want to "git-fat" all pngs living in the design folder, but not the ones living in the production folder. Do you reckon it would make sense to make .gitattributes a bit more flexible?

Thanks for your attention; looking forward to hearing from you.
Cheers,

The safety of fat-store directory using git-fat

How secure is it to use git-fat with other people to manage the repository? In particular, it seems that everybody must have write access to the same fat-store directory. So is anyone able to delete something from the fat-store? Is there something like a pull request that lets each person have their own fat-store directory, yet allows pulling changes from others' fat-store directories?

git fat push : 0 files to consider

@jedbrown: Hello, I have set everything up according to your README, and I use rsync.
In .gitfat:
[rsync]
remote = 112.124.37.169:/home/gitfat/gitfatlibs
sshuser = gitfat

After I git add xxx.jpg and commit it, everything seems OK.
.git/fat/objects has one hashed file (I assume this is the binary image file).
When I run git fat push,
it always says:
[email protected]'s password:
building file list ...
0 files to consider
Can you help figure out what is happening here? Is it a bug in git-fat itself?
Thanks!

git-fat filter-clean: cache already exists

Hi,

I'm using a git repo with git-fat for my large files. When I run "git status" the following message appears

git-fat filter-clean: cache already exists .git/fat/objects/da39a3ee5e6b4b0d3255bfef95601890afd80709

What does this mean and how can this be resolved? Commit, push and pull work without any problems.
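
One observation that may help: the digest in the message is the SHA-1 of empty input, so the object in question appears to be a zero-length file. This can be checked with Python's standard `hashlib`:

```python
import hashlib

# SHA-1 of zero bytes of input.
EMPTY_SHA1 = hashlib.sha1(b"").hexdigest()
print(EMPTY_SHA1)  # da39a3ee5e6b4b0d3255bfef95601890afd80709
```

If so, the message may be related to how zero-length files pass through the clean filter. That link is a guess and is not confirmed anywhere in this report.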

Environment variables in .gitfat not working correctly

I'm trying to use an environment variable in the .gitfat file, like
-> cat .gitfat
[rsync]
remote = localhost:$GIT_FAT_DIR

and

-> echo $GIT_FAT_DIR
/tmp/gitfat

But it gives the result

-> git fat pull
Pulling from localhost:$GIT_FAT_DIR
Executing: rsync --progress --ignore-existing --from0 --files-from=- localhost:$GIT_FAT_DIR/ /tmp/repo/.git/fat/objects/
GIT_FAT_DIR: Undefined variable.
rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [receiver=3.0.6]

Is it possible to use a variable there, and if so, how do you write it?

fat pull/push in git hooks?

Is there any reason why the "git fat push" and "git fat pull" commands have not been put into git hooks and are executed automatically?

E.g. execute git fat push before any git push command and execute git fat pull after each checkout?

Enhancement: allow select files to go directly to git

git-fat is a wonderfully simple system, but I find one feature to be missing: I would like to keep some binary files directly under git. The current .gitattributes approach looks like an all-or-nothing rule for putting a file type under git-fat. It would be nice to have this capability.

Cannot stage binary file on Windows

After solving my Python version problem, git fat push and git fat pull execute smoothly. However, when trying to stage my first binary file, I get the following error:

Stage 1 files
error: cannot feed the input to external filter git-fat filter-clean
error: external filter git-fat filter-clean failed
Done

Looks like something is still misconfigured? I executed git fat init, configured .gitfat and .gitattributes correctly. What else could I have done wrong?

Is git-fat Windows-ready at the moment?

rsync exits in Windows mingw

I've cobbled together a windows/git/git-fat environment that generally seems to work. However, rsync aborts immediately after it is started by git-fat.

rsync: read error: Connection reset by peer (104)
rsync error: unexplained error (code 255) at /usr/src/rsync/rsync-3.0.8/io.c(760) [Receiver=3.0.8]

The cobbled-together environment is set up as follows:

  1. Install msysgit using option 2 (all executables available from Windows).
  2. Install mingw
  3. Install rsync using mingw-get install msys-rsync
  4. Install Python for windows (Python 2.7)
  5. Update the Windows system path to include C:\Python27.
  6. Add git-fat to a directory on the path (i.e. /bin).

I can run all the pieces separately. Whenever git-fat executes rsync, rsync exits.

I can run the steps manually:

  • git fat init
  • git fat status > fat-files.txt
  • edit fat-files.txt in order to feed into rsync

  • rsync --progress --ignore-existing --files-from=- "--rsh=ssh -l user" path:fat-store/repo/ .git\fat\objects/ < fat-files.txt
  • git fat pull

The last command shows rsync aborting, but the python script continues and fixes up all the files.

It seems that many have gone through this situation. I've managed to update for binary files. How can I get rsync to not exit?

Cannot execute "git fat init" on Windows

After following the installation guide, I've tried to execute "git fat init" on my repository, however, I only get the following error:

user@pc ~/Desktop/git-fat/BareTestRepoClone1 (master)
$ git fat init
  File "c:\Users\user\Desktop\git-fat\git-fat-master\git-fat", line 526
    for path, sizes in sorted(pathsizes.items(), cmp=lambda (p1,s1),(p2,s2): cmp(max(s1),max(s2)), reverse=True):
                                                            ^
SyntaxError: invalid syntax

What's wrong? Do I have to use Python 2.x instead of 3.x?

git fat pull hangs on linux

I'm using git-fat on linux with git version 1.8.5.5 and python version 2.6.6.

git fat pull hangs.

My preliminary debugging indicates that filter_gitfat_candidates never completes.

All three data processing threads start (cut_sha1hash, filter_gitfat_candidates, and metadata processing). cut_sha1hash ends, but filter_gitfat_candidates never completes, hanging the program.

I am using the latest git-fat code (is git-fat versioned yet?)

I wanted to report this as soon as possible. I will continue debugging.

Hints to a possible problem would be great.

git fat pull doesn't work in subdirectories

If I'm in a subdirectory of my git repository and I try to git fat pull, the proper files will not be pulled and linked up properly, but no errors are reported. Is this intended behavior? git works the same in any subdirectory of a git repository, always finding the closest .git from . upward; it would make sense to me that git-fat should do the same.
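
The upward search described above can be sketched in plain Python. The function `find_repo_root` is a hypothetical helper; git itself exposes the same answer through `git rev-parse --show-toplevel`.

```python
import os


def find_repo_root(start="."):
    """Walk upward from `start` until a directory containing `.git` is found."""
    path = os.path.abspath(start)
    while True:
        # `.git` can be a directory or, for worktrees and submodules, a file.
        if os.path.exists(os.path.join(path, ".git")):
            return path
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            return None
        path = parent


print(find_repo_root())
```

A command could call `os.chdir()` on the returned root before resolving repository-relative paths, so that it behaves the same from any subdirectory. Whether this is the cause of the behaviour reported here has not been verified.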

Python 3.3 incompatibility

git-fat seems not to work under python 3.3 (at least on x64 windows 7). Most of the issues were around determining string encoding. After a bit of tweaking, it seems to work with the attached changes. They're a bit of a hack, but they seem to get the job done.

diff --git a/git-fat b/git-fat
index 799a05a..bb79874 100755
--- a/git-fat
+++ b/git-fat
@@ -90,13 +90,13 @@ class GitFat(object):
         self.verbose = verbose_stderr if os.environ.get('GIT_FAT_VERBOSE') else verbose_ignore
         self.gitroot = subprocess.check_output('git rev-parse --show-toplevel'.split()).strip()
         self.gitdir = subprocess.check_output('git rev-parse --git-dir'.split()).strip()
-        self.objdir = os.path.join(self.gitdir, 'fat', 'objects')
+        self.objdir = os.path.join(self.gitdir.decode('utf-8'), "fat", "objects")
         if os.environ.get('GIT_FAT_VERSION') == '1':
             self.encode = self.encode_v1
         else:
             self.encode = self.encode_v2
         def magiclen(enc):
-            return len(enc(hashlib.sha1('dummy').hexdigest(), 5))
+            return len(enc(hashlib.sha1("dummy".encode('utf-8')).hexdigest(), 5))
         self.magiclen = magiclen(self.encode) # Current version
         self.magiclens = [magiclen(enc) for enc in [self.encode_v1, self.encode_v2]] # All prior versions
     def setup(self):
@@ -193,7 +193,7 @@ class GitFat(object):
                             ishanging = True              # Working tree version is verbatim from repository (not smudged)
                             outstream = outstreamclean
                         firstblock = False
-                    h.update(block)
+                    h.update(block.encode('utf-8'))
                     bytes += len(block)
                     outstream.write(block)
                 outstream.flush()
@@ -399,7 +399,7 @@ class GitFat(object):
         time1 = time.time()
         self.verbose('Found %d paths in %.3f s' % (len(pathsizes), time1-time0))
         maxlen = max(map(len,pathsizes)) if pathsizes else 0
-        for path, sizes in sorted(pathsizes.items(), cmp=lambda (p1,s1),(p2,s2): cmp(max(s1),max(s2)), reverse=True):
+        for path, sizes in sorted(pathsizes.items(), key=lambda item: item[1], reverse=True):
             print('%-*s filter=fat -text # %10d %d' % (maxlen, path,max(sizes),len(sizes)))
         revlist.wait()
         difftree.wait()

Where should `.gitattributes` be put?

The README says the following.

Edit .gitattributes to regard any desired extensions as fat files.

However, it is not clear where this .gitattributes should be put. Is it in the git repository on which I want to use git-fat? Or is it in my home directory? Can somebody update the README to make it clear? Thanks.
