Comments (18)
I like the idea in principle, but I think this repository has enough followers that it may cause harm to drop the old content.
In practice, the repository is quite small even at 11MiB.
If you decide to do it, you could keep a copy of the old repository around at llir/llvm-legacy, analogously to https://github.com/go-gl-legacy/gl. That way, if anyone does need the old content (e.g., they were depending on a specific git hash), at least it still exists for the purposes of figuring out what the git hash is in the new repository.
Maybe it's possible to keep the old history around in a separate git ref which doesn't get cloned by default. But in that case I guess the content would be harder to discover.
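A minimal sketch of the separate-ref idea, assuming a remote named `origin` and a hypothetical ref name `refs/legacy/master` (neither comes from this thread). A default clone does not fetch refs outside `refs/heads/` and `refs/tags/`, so history reachable only from such a ref is skipped unless someone asks for it explicitly.

```shell
# Publish the pre-rewrite history under a ref outside refs/heads/ and
# refs/tags/. "refs/legacy/master" is a hypothetical name for this sketch.
archive_old_history() {
    old_commit=$1   # commit or branch that holds the old history
    git push --quiet origin "$old_commit:refs/legacy/master"
}

# Fetch the archived history on demand into a local branch.
fetch_old_history() {
    git fetch --quiet origin refs/legacy/master:refs/heads/legacy-master
}
```

The discoverability concern still applies: nothing in a normal clone hints that the extra ref exists, so it would need to be documented somewhere visible, such as the README.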
from llvm.
@pwaller why matplotlib? gonum/plot is so much better :P
from llvm.
Thanks for creating the issue, I agree, it's easier to keep the discussion here.
> First, can I clarify the question - are you asking how to remove lots of old large assets from the history of the repository?
Exactly!
On the other hand, I did a git clone just now to check the size of the repo, and it wasn't as bad as I had thought. Perhaps we don't need to do this after all.
[u@x1 ~]$ time git clone https://github.com/llir/llvm
Cloning into 'llvm'...
remote: Enumerating objects: 20, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 10760 (delta 10), reused 15 (delta 6), pack-reused 10740
Receiving objects: 100% (10760/10760), 9.03 MiB | 1.47 MiB/s, done.
Resolving deltas: 100% (6437/6437), done.
real 0m7.715s
user 0m1.810s
sys 0m0.521s
[u@x1 ~]$ du -hs llvm
11M llvm
I think we could prune the repo down to 1 MB or so instead of 11 MB, but the question is if it's worth it, given that it requires a force push.
However, should we decide to do this, then doing it just before the v0.3.0 release seems like the perfect time.
from llvm.
> I like the idea in principle, but I think this repository has enough followers that it may cause harm to drop the old content.
> In practice, the repository is quite small even at 11MiB.
Agreed. Had the repo been at 100 MB, then we probably would have done it, but at this size it does not seem worth the potential harm to users. (The idea with shrinking the repo was of course to make the initial download easier for users, especially those who happen to be on a slow Internet connection, as may be common in parts of Asia, etc.)
So, for now, I'm fine with keeping it as it is, and just being careful when adding large content in the future. Closing this issue for now. We can always refer back and re-open at a later point.
from llvm.
> Maybe it's possible to keep the old history around in a separate git ref which doesn't get cloned by default. But in that case I guess the content would be harder to discover.
Also, if Go ever does shallow Git clones, this issue would be resolved, I think. (upstream issue golang/go#13078)
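For reference, a shallow clone with plain Git looks like the sketch below. It fetches only the most recent commit, so large blobs that exist only in older history are not downloaded. The helper name `shallow_clone` is hypothetical, and this says nothing about what the Go tooling does or will do.

```shell
# Clone only the latest commit of the default branch (history depth 1).
shallow_clone() {
    url=$1
    dir=$2
    git clone --quiet --depth 1 "$url" "$dir"
}

# Example: shallow_clone https://github.com/llir/llvm llvm
```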
from llvm.
I just learned that Go did this to their repository recently, the discussion in there and how they went about it is pretty interesting:
I think it probably doesn't change anything with respect to what we might do to this repository.
from llvm.
> I just learned that Go did this to their repository recently, the discussion in there and how they went about it is pretty interesting:
Thanks for the link! It was an interesting read to see how they resolved it.
> I think it probably doesn't change anything with respect to what we might do to this repository.
Most likely not. If we end up doing a pruning, then I'd suggest we use bfg as suggested on the GitHub link you posted. Also, if we do this, then perhaps in the next few weeks, as the intention is to have v0.3.0 released some time in early December.
I'm still a bit on the fence. I don't think we need the rewrite. However, should we ever do one, now is basically the perfect time, as we move from v0.2 to v0.3, since users will have to make manual changes to get the latest release anyway (updating to the latest API, etc.).
from llvm.
Until we decide for sure, I'll re-open the issue. This may also help get input from other users of the repo who may be affected. I'll also rename the issue to include a mention of the Git history rewrite.
from llvm.
Some large paths:
git rev-list --objects --all | git cat-file --batch-check='%(objectsize:disk) %(objectname) %(objecttype) %(rest)' | grep ' blob ' | awk '{print $4" "$1}' | awk '{
arr[$1]+=$2
}
END {
for (key in arr) printf("%s\t%s\n", arr[key], key)
}' | sort -nr | awk '{print $2"\t"$1}' | column -t -s$'\t' | head
old/asm/internal/testdata/sqlite/sqlite3.ll 3404085
old/asm/internal/testdata/sqlite/sqlite3.c 1726782
asm/internal/parser/actiontable.go 472010
old/asm/internal/parser/actiontable.go 246534
asm/internal/parser/gototable.go 149186
old/asm/internal/parser/gototable.go 67735
asm/testdata/DebugInfo/COFF/big-type.ll 61648
asm/internal/ll.bnf 59883
asm/ll/ll.tm 59610
asm/testdata/c4.ll 55555
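The one-liner above sums on-disk blob sizes per path across all history. The same aggregation, split into two steps for readability, might look like the sketch below. The function names are made up for illustration, the output puts the total before the path, and paths containing spaces would need extra handling, as in the original.

```shell
# Emit "<path> <disk-size>" for every blob reachable from any ref.
list_blob_sizes() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objectsize:disk) %(objectname) %(objecttype) %(rest)' |
        awk '$3 == "blob" { print $4, $1 }'
}

# Sum sizes per path and print "<total>\t<path>", largest first.
sum_by_path() {
    awk '{ total[$1] += $2 }
         END { for (p in total) printf "%d\t%s\n", total[p], p }' |
        sort -nr
}

# Usage: list_blob_sizes | sum_by_path | head
```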
This graph shows how much space will be saved, assuming you eliminate large file paths:
[graph not preserved]
from llvm.
The current intention is to clone llir/llvm into llir/llvm-legacy, to preserve the complete history. Then, to start clean, we will keep any file currently in HEAD, and its entire history at that path. Since we need to do a force push anyway, this seems to be the time to really get the size of the repo down.
If anyone currently using the repo has some input or feedback, feel welcome to contribute your thoughts.
from llvm.
@mewmew and I propose to run the following:
$ du --apparent-size -sch .git
9.5M .git
9.5M total
# Kill objects at and before v0.2.1
git rev-list --objects v0.2.1 | awk '{print $1}' > killset.txt
# Kill unwanted objects - testdata, textmapper and other experimental code.
git rev-list --objects --all | git cat-file --batch-check='%(objectname) %(rest)' | egrep '(/testdata/| l/|\.tm$)' | awk '{print $1}' >> killset.txt
java -jar ~/Downloads/bfg-1.13.0.jar -bi killset.txt
git repack -a && git reflog expire --expire=now --all && git gc --prune=now --aggressive
$ du --apparent-size -sch .git
800K .git
800K total
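The path filter in the second step can be checked on its own before anything is deleted. A sketch, using `grep -E` (the non-obsolescent spelling of `egrep`) and a hypothetical helper name. As written, the ` l/` alternative relies on the space that separates the object name from a path starting with `l/`.

```shell
# Read "<objectname> <path>" lines and print the object names whose path
# is under a testdata directory, starts with l/, or ends in .tm.
select_unwanted() {
    grep -E '(/testdata/| l/|\.tm$)' | awk '{ print $1 }'
}

# Usage (dry run, nothing is rewritten):
#   git rev-list --objects --all |
#       git cat-file --batch-check='%(objectname) %(rest)' |
#       select_unwanted | sort -u | wc -l
```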
from llvm.
See https://github.com/llir/llvm-clean for the new repository. The intent is to force push the HEAD of that repository into llir/llvm at some point (or to redo the above commands against this repository assuming development continues here for now).
from llvm.
Here is a list of known users of llir/llvm:
- https://github.com/elz-lang/elz (merged dannypsnl/elz#115)
- https://github.com/NateGraff/blessedvirginmary (upstream PR NateGraff/blessedvirginmary#2)
- https://github.com/niklaskorz/nklang (niklaskorz/nklang#7 merged)
- https://github.com/rai-project/plini (rai-project/plini#1 merged)
- https://github.com/scottshotgg/express_old (upstream PR scottshotgg/express_old#4)
- https://github.com/scottshotgg/llvmIRTest (upstream PR scottshotgg/llvmIRTest#1)
- https://github.com/zegl/tre (merged zegl/tre#93)
- https://github.com/geode-lang/geode (geode-lang/geode#24 merged)
We can try to be good open source citizens and send PRs to update their usage to v0.3.x :) Once this is done, we update llir/llvm with the cleaned version of the repo (currently living at https://github.com/llir/llvm-clean).
Edit: the llir/llvm-clean repository has now been removed, as it's been integrated back into llir/llvm.
from llvm.
https://github.com/reedkotler/scala-llc doesn't seem to contain any Go code?
from llvm.
> https://github.com/reedkotler/scala-llc doesn't seem to contain any Go code?
Oh, the code match was from the BNF https://github.com/reedkotler/scala-llc/blob/ff3578b14171a5332e1c7f972c0c40b32f7a9e4c/ll.bnf#L187
<< import (
"github.com/llir/llvm/asm/internal/ast"
"github.com/llir/llvm/asm/internal/astx"
) >>
We can remove it from the list.
from llvm.
I'd like to trim the llir/llvm repo size today, using the approach outlined by @pwaller in #38 (comment). Essentially, the earlier we do this the better, so we can keep Git history intact going forward.
from llvm.
On the 30th of November we pruned the repository using BFG to reduce its initial download size. The following commands were run at the old revision d3f412d.
$ du --apparent-size -sch .git
9.6M .git
9.6M total
# Kill objects at and before v0.2.1
git rev-list --objects 7a17b32c1767cfeb5287d164e92865adb98985c8 | awk '{print $1}' > killset.txt
# Kill unwanted objects - testdata, textmapper and other experimental code.
git rev-list --objects --all | git cat-file --batch-check='%(objectname) %(rest)' | egrep '(/testdata/| l/|\.tm$)' | awk '{print $1}' >> killset.txt
bfg -bi killset.txt
git repack -a && git reflog expire --expire=now --all && git gc --prune=now --aggressive
$ du --apparent-size -sch .git
934K .git
934K total
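For anyone who pinned a pre-rewrite hash, a sketch of mapping it to its rewritten counterpart. It assumes a two-column `<old-id> <new-id>` text file; to my understanding BFG writes one named `object-id-map.old-new.txt` into its report directory, but treat that file name and location as an assumption and check the BFG output. The helper name is hypothetical.

```shell
# Print the new ID recorded for an old ID in a "<old> <new>" map file.
# The old ID may be abbreviated (e.g. d3f412d); the first prefix match wins.
# Exits non-zero when no entry matches.
lookup_new_id() {
    map_file=$1
    old_id=$2
    awk -v old="$old_id" '
        index($1, old) == 1 { print $2; found = 1; exit }
        END { exit !found }
    ' "$map_file"
}

# Usage: lookup_new_id path/to/object-id-map.old-new.txt d3f412d
```

If no such map is available, the old hash can still be inspected in the preserved llir/llvm-legacy copy and matched against the new history by commit message and date.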
from llvm.