
ast-merging-evaluation's People

Contributors

benedikt-schesch, mernst, phdn, ryanfeatherman

ast-merging-evaluation's Issues

Run time difference in `make small-test`

I ran
make small-test
and got the error
ValueError: test/small-goal-files/result.csv and results-small/result.csv are not equal

The problem was a difference in the run time. Is such a failure desirable?
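One way to keep the check while tolerating timing noise would be to compare the two CSV files with the timing column left out. The sketch below is only an illustration: the column name `run_time` and the sample rows are assumptions, not the actual layout of result.csv.

```python
import csv
import io


def rows_without(csv_text, ignored_columns):
    """Parse CSV text and drop the ignored columns from every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {key: value for key, value in row.items() if key not in ignored_columns}
        for row in reader
    ]


def csv_equal_ignoring(expected_text, actual_text, ignored_columns=("run_time",)):
    """Return True when both CSV texts agree on every non-ignored column."""
    return rows_without(expected_text, ignored_columns) == rows_without(
        actual_text, ignored_columns
    )


# Made-up rows that differ only in the (hypothetical) run_time column.
goal = "merge,tool,result,run_time\nabc,gitmerge,pass,1.20\n"
actual = "merge,tool,result,run_time\nabc,gitmerge,pass,1.35\n"
print(csv_equal_ignoring(goal, actual))
```

With a comparison like this, the test would still fail on a changed merge result but not on a run that was merely slower or faster.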

Add index to FindMergeCommits

Could we add an index to the output of FindMergeCommits? This would allow me to preserve that ordering throughout the pipeline.

FindMergeCommits indexing race condition

The output of FindMergeCommits suffers from a race condition in the idx column: an index that belongs to one repo can end up in the output of a completely different repo.
A parallel run yielded Algorithms.csv:

idx,branch_name,merge_commit,parent_1,parent_2,notes
1,refs/heads/master,44ce61beeb8bb5c5c1289db02f1da53f6bff6d8a,e121b3943958ba1558d8828db6f13c632e99f298,58129e828567fac52e821ca147feeab87a95d780,trivial merge
2,refs/heads/master,e866ae7916d6b763f7d32f72f54edd735b5ea6bc,ae5dac889d19edd6c773e60420f00d4e7177f31f,4342b16e18da9082b32d5b8e9b5b6e3378461ce4,
3,refs/heads/master,ae5dac889d19edd6c773e60420f00d4e7177f31f,38ec459e6f163991c6f88210720dcda4fecc64e8,eb3260404e596031bc9ee037b29bd1fcbb87f625,trivial merge
4,refs/heads/master,38ec459e6f163991c6f88210720dcda4fecc64e8,60d87e9a120e330eb34c9d558d0f00e52b5ce438,aa882141f26ef82107658c214ec18a799403bb57,trivial merge
5,refs/heads/master,d7e0b1c03fa7cacf40b32dcfde30871e8947445a,2c4e69eb243a81ba05d24456b3cad7e8a696f630,8a1a1a7fb53732edaf17ef9876545c4bca568e05,
6,refs/heads/master,60d87e9a120e330eb34c9d558d0f00e52b5ce438,bceffabe0d850ea9c67c93a9fb578d2ca8600022,9aa3611952f1715d8f5cf6404a3dad0e95bf175f,trivial merge
7,refs/heads/master,bceffabe0d850ea9c67c93a9fb578d2ca8600022,247b66cb2c7792336c5d1c2c1502a55d5222c823,67360ca7159643f3a9fc20f30e164b36a4e4f859,
8,refs/heads/master,247b66cb2c7792336c5d1c2c1502a55d5222c823,c7c288a43e324748f1f25ee72f9639d808c70b7e,7f89b8c5fbe85848e96a0cdaffb83664588c8474,trivial merge
9,refs/heads/master,c7c288a43e324748f1f25ee72f9639d808c70b7e,ffa4a6ccd332fad6dc1c5a269b3854a5a2bfff91,296f6b09446b63e8b036e4840879c2d890ae925d,trivial merge
10,refs/heads/master,216529baa98d8646b192b1c6897b01d3181f2ac6,ffa4a6ccd332fad6dc1c5a269b3854a5a2bfff91,deaa316e2165940e50e4e00e83e18d27417fb642,trivial merge
11,refs/heads/master,ffa4a6ccd332fad6dc1c5a269b3854a5a2bfff91,128ccf0f25f401f2218c03166a6f4ebc5ce21385,03d7c1872d5b17434fd460f7b8a0b8c9c90849f7,trivial merge
12,refs/heads/master,128ccf0f25f401f2218c03166a6f4ebc5ce21385,1f8f61c50d4e36ab8677b271a58648e90a2f1b28,22ee3ce03f74349ed3b0f36bfdf1dc119e2be70b,trivial merge
13,refs/heads/master,c8604d6f6d23a0cd5d3f9fac472dbf467fe11b8e,7ebdadbc657a68da22826ce3cc402e3635e80d4c,fd87cbc2cdbe813564f0a2c7e8070964220cdcc6,trivial merge
14,refs/remotes/origin/pull/27,fdd92255570bf37534ba36f632773923d60e32d2,8edac208c376e88626e88a720c9166fef9588853,1a71e9f95183a089b2b72a0cb42337681594e699,
15,refs/remotes/origin/pull/37,6b917cc6a33eb87fec2dcdafc845177f470ca374,8ef7da30196fba8e55947acf6168d8cd4a82f2b8,68f656ff10751259284d083e8fdf8e82f7ada15f,
16,refs/remotes/origin/pull/37,8ef7da30196fba8e55947acf6168d8cd4a82f2b8,37e80741a853a6626a77aed38caa07f4e44d85b2,8b2e3b06185e4a111ae987db7fa01901974036cd,
17,refs/remotes/origin/pull/37,680f2603dc41bb7c5843b0fa342e829210611d9d,1477ce25eaad76986f50aa5b740e2c023ff31de2,879fb60972e4e83fdb7b5337db66c8afbfb95ffc,
18,refs/remotes/origin/pull/37,d1279d1b59ecba7f01de8b2c1995562f92b2e255,b636ef254a97ae830baa6d783121de7890818785,9497e2f66d7af3feedd3106faa6e8b0aa6118b41,
19,refs/remotes/origin/pull/37,104eb9ed88cde145576ea1df50460915af488e74,6baa2ed219e18f9763bb2f2de74c5fb800d15899,99a155f67dfe68de440feb560a7e4c4ee5ae355c,
15,refs/remotes/origin/pull/73,9c8740d0e38ba93f08bfb462d16cab448a0197f1,7ebdadbc657a68da22826ce3cc402e3635e80d4c,4e2214112de23d1319ce77736547fa8900384e7c,trivial merge

And in JSCover.csv

idx,branch_name,merge_commit,parent_1,parent_2,notes
1,refs/heads/master,31361bbe2317967a9c47cfc4437e2ce706f79056,fe64f8c1071a356027e49247fd506edc23566c4d,ab9b573de54d1014b5d9822adc9d9e5cf43ba8c0,
2,refs/heads/master,fe64f8c1071a356027e49247fd506edc23566c4d,b9ead12fd4a28a9fd7074ad4e97128943b072828,f5cb9ece38b7bd9ae97d7a55490476b84a2336df,trivial merge
3,refs/heads/master,d6f0d1998a8db93117ba9c261d3b35d03633dc59,0eca5983951641b5d0884e6ce8fd162185d0cf05,d686e46aa0582b7dbee1f0d8664d6cb4b56600ec,trivial merge
4,refs/heads/master,f7a4cc17c3b97f600f5a982522100d2b0200a981,f3876ebe83363de9387bba9ec14d436e2d74d2b3,f59722bd1d3dcbf9fc183a49ee0e28ea3c42edf8,trivial merge
5,refs/heads/master,526eaed4a82ee4ec9bb018e6d541c848d2bfdd0d,1adf11c82653576753af0502bfc1f63900d85780,e562b7ed71187c95e310f85e99d1325dbd49e331,trivial merge
6,refs/heads/master,d780b34088c77fa4a5c7645b84d744f33ed0329f,428706a429ed1a3dca09588e989597fd9a9cb673,3a24d6bb388dfcf2696f71833f299e051e88cf1c,trivial merge
7,refs/heads/master,5dc4fe25315e31fb1d6c3f58d802283634b6e37b,428706a429ed1a3dca09588e989597fd9a9cb673,26335cfc92a5aaa2df829c7ba52ac834ad1c4080,trivial merge
8,refs/heads/master,9572ad5cfd2f4c20700e27f16c9e906e009d7137,c92c401a4335465b77e3bf95c47a3297fa633b5c,62d33c1ccc686ed687d04c8e874f0c36bbcfc43e,trivial merge
9,refs/heads/master,9ed0dc2b5892b2b2a782435b0a5dd42f90737b97,4ba5c58dfb6bf266259b80ab2fe19057910b6dba,7d6f336981e2d3c0a3e81fc663b6e4590f8aab0b,trivial merge
10,refs/heads/master,c5bb794c4ca20c492818cdc13d4c72ad57961835,d709ecd47d92508d22f60e288fbe49b36f4b5246,9539280c2e350f6104b01c366ad8557e174f46da,trivial merge
11,refs/heads/master,58557bba03e3cd911b334f451d7bf20c84d54045,0eee54986010067c05972426937c3459b979a840,a4fb3645a032633128f78786d779101b46a16581,trivial merge
12,refs/heads/master,5c0f3bc432afa70ae19ced28fd4655effffa643f,bf98af8c6a8559d82a24d8f6f11d5fc23d29523e,f04fe03a282ca2520de93a3f13fe4c06146e5823,trivial merge
13,refs/heads/master,885ba1e858932e9715e9b27a722a129e2611caab,c5f2600a6830cb2fc14647c98a9038832d20677f,48ce23c6b5857dd37a5658eb544fad390a4e3d44,trivial merge
14,refs/heads/master,ba5decc5f42a1fd11825415612eaf104b8879c9a,b387036e5d7bba871c330469f2143c242f12a0d5,95d545f377cf84ccd52b54dc0c644d18daa447a6,trivial merge
16,refs/heads/master,89ffb42fa4b9efa275282cd90369be4423441a94,5a3b21b0db2cbc45f1dccbeae832654c4b5c812e,b8d19bf433f60c02f28d046702a1ea86a157ecf6,trivial merge
17,refs/heads/master,c4db946b2d372f8bde079792b1cee93ac36921d2,3d76e008055d175af8fdbb90d6c5a0ced62faf0d,4534140b2526b36b39bcdf6d417d977db7a5cb1a,trivial merge
18,refs/heads/master,ea0851ac5e27d2a1a13acd2110094c46c44a849e,c0603903e4b5caeb83d2c93ceabba1e9ea59af19,e405a7f2926be3103a388e990701af1f260bc905,
19,refs/heads/master,2a9f0eb4e7a59587ef0fb8a9312ba5edf0078145,92d6114663b80b24d1f9bb6d282138cf6094857e,b9a2ee7526cf561aa2ff511fe9872792bd315e91,
20,refs/heads/master,75e03285fddc0e2510c4f20c56ac607702fc3c9a,56dc97e7850da6b5a904b57c5e288cbb89fde81d,46b585e31b43ec0d1740395b38d15fd4e4f8c60f,
21,refs/heads/master,10f916b7a0fd5953ae6f684cadd28fa87e9124e4,0fb55976f6e30ddcc8ae7af393ee0dd8d14e20e9,0256cc6b3b9f9470458ff418e6648c635c20e787,
22,refs/heads/master,0fb55976f6e30ddcc8ae7af393ee0dd8d14e20e9,b9f47e216af0bc46830e0cb0c6964218511bbb3f,974ce1770dc602051b320619e240603cc5358d64,
23,refs/heads/master,00c10a4e7f42a484209a443934c833c5970f462f,8e0ad29ac0d484db3eccc4bee791b72cc6222e4c,530d40378d2d17276f71b18f871637fea2937ecf,
24,refs/heads/master,adf6fd024df05daf3236ce42e8cd90b5e4448131,dca3e3596b0b880e9eb4406490488ce93bc9116a,0575f5e918deec26aba2eb2a18f0e87af679d81b,trivial merge
25,refs/heads/master,3ae921c45ac9a68844bbdc1d53a8a7ca262fb6b7,8fb1d13ac9c28ccf9723a4328141816f224337b9,aec694fe670b35eb827c473f5fbaf205cd59a0ba,
26,refs/heads/master,ac65c8f97187f99331093624c7829d3688a50297,fb4312ddbbf301484ff0e1243cfdf12d6a2fb840,c8937ebc236d924201af98021a50091ab0ee013a,
27,refs/remotes/origin/gh-pages,9217160ed01a2bb7e57cd07001fc0da20135ad21,49868943e60fdb40de33c4416aae6a8517ef3d7f,bd0c5960622a73573c9fa5fb0f356ba575c3676c,trivial merge
28,refs/remotes/origin/pull/94,c905e6a5da47ddf684d6cb6b6c52a6d5cb0d2e3b,390eb1dbfc709f2b75889697a9296ffeb06424a4,2e483d3cd3abbb5b6af28ea519278f597ccded96,

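The index 15 is the one that misbehaves: in Algorithms.csv it appears twice (the last row should have index 20), and in JSCover.csv it is missing (the index jumps from 14 to 16). Removing the parallel execution (as done in branch v2-update) instead yields: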
Algorithms.csv:

idx,branch_name,merge_commit,parent_1,parent_2,notes
1,refs/heads/master,44ce61beeb8bb5c5c1289db02f1da53f6bff6d8a,e121b3943958ba1558d8828db6f13c632e99f298,58129e828567fac52e821ca147feeab87a95d780,trivial merge
2,refs/heads/master,e866ae7916d6b763f7d32f72f54edd735b5ea6bc,ae5dac889d19edd6c773e60420f00d4e7177f31f,4342b16e18da9082b32d5b8e9b5b6e3378461ce4,
3,refs/heads/master,ae5dac889d19edd6c773e60420f00d4e7177f31f,38ec459e6f163991c6f88210720dcda4fecc64e8,eb3260404e596031bc9ee037b29bd1fcbb87f625,trivial merge
4,refs/heads/master,38ec459e6f163991c6f88210720dcda4fecc64e8,60d87e9a120e330eb34c9d558d0f00e52b5ce438,aa882141f26ef82107658c214ec18a799403bb57,trivial merge
5,refs/heads/master,d7e0b1c03fa7cacf40b32dcfde30871e8947445a,2c4e69eb243a81ba05d24456b3cad7e8a696f630,8a1a1a7fb53732edaf17ef9876545c4bca568e05,
6,refs/heads/master,60d87e9a120e330eb34c9d558d0f00e52b5ce438,bceffabe0d850ea9c67c93a9fb578d2ca8600022,9aa3611952f1715d8f5cf6404a3dad0e95bf175f,trivial merge
7,refs/heads/master,bceffabe0d850ea9c67c93a9fb578d2ca8600022,247b66cb2c7792336c5d1c2c1502a55d5222c823,67360ca7159643f3a9fc20f30e164b36a4e4f859,
8,refs/heads/master,247b66cb2c7792336c5d1c2c1502a55d5222c823,c7c288a43e324748f1f25ee72f9639d808c70b7e,7f89b8c5fbe85848e96a0cdaffb83664588c8474,trivial merge
9,refs/heads/master,c7c288a43e324748f1f25ee72f9639d808c70b7e,ffa4a6ccd332fad6dc1c5a269b3854a5a2bfff91,296f6b09446b63e8b036e4840879c2d890ae925d,trivial merge
10,refs/heads/master,216529baa98d8646b192b1c6897b01d3181f2ac6,ffa4a6ccd332fad6dc1c5a269b3854a5a2bfff91,deaa316e2165940e50e4e00e83e18d27417fb642,trivial merge
11,refs/heads/master,ffa4a6ccd332fad6dc1c5a269b3854a5a2bfff91,128ccf0f25f401f2218c03166a6f4ebc5ce21385,03d7c1872d5b17434fd460f7b8a0b8c9c90849f7,trivial merge
12,refs/heads/master,128ccf0f25f401f2218c03166a6f4ebc5ce21385,1f8f61c50d4e36ab8677b271a58648e90a2f1b28,22ee3ce03f74349ed3b0f36bfdf1dc119e2be70b,trivial merge
13,refs/heads/master,c8604d6f6d23a0cd5d3f9fac472dbf467fe11b8e,7ebdadbc657a68da22826ce3cc402e3635e80d4c,fd87cbc2cdbe813564f0a2c7e8070964220cdcc6,trivial merge
14,refs/remotes/origin/pull/27,fdd92255570bf37534ba36f632773923d60e32d2,8edac208c376e88626e88a720c9166fef9588853,1a71e9f95183a089b2b72a0cb42337681594e699,
15,refs/remotes/origin/pull/37,6b917cc6a33eb87fec2dcdafc845177f470ca374,8ef7da30196fba8e55947acf6168d8cd4a82f2b8,68f656ff10751259284d083e8fdf8e82f7ada15f,
16,refs/remotes/origin/pull/37,8ef7da30196fba8e55947acf6168d8cd4a82f2b8,37e80741a853a6626a77aed38caa07f4e44d85b2,8b2e3b06185e4a111ae987db7fa01901974036cd,
17,refs/remotes/origin/pull/37,680f2603dc41bb7c5843b0fa342e829210611d9d,1477ce25eaad76986f50aa5b740e2c023ff31de2,879fb60972e4e83fdb7b5337db66c8afbfb95ffc,
18,refs/remotes/origin/pull/37,d1279d1b59ecba7f01de8b2c1995562f92b2e255,b636ef254a97ae830baa6d783121de7890818785,9497e2f66d7af3feedd3106faa6e8b0aa6118b41,
19,refs/remotes/origin/pull/37,104eb9ed88cde145576ea1df50460915af488e74,6baa2ed219e18f9763bb2f2de74c5fb800d15899,99a155f67dfe68de440feb560a7e4c4ee5ae355c,
20,refs/remotes/origin/pull/73,9c8740d0e38ba93f08bfb462d16cab448a0197f1,7ebdadbc657a68da22826ce3cc402e3635e80d4c,4e2214112de23d1319ce77736547fa8900384e7c,trivial merge

In JSCover.csv:

idx,branch_name,merge_commit,parent_1,parent_2,notes
1,refs/heads/master,31361bbe2317967a9c47cfc4437e2ce706f79056,fe64f8c1071a356027e49247fd506edc23566c4d,ab9b573de54d1014b5d9822adc9d9e5cf43ba8c0,
2,refs/heads/master,fe64f8c1071a356027e49247fd506edc23566c4d,b9ead12fd4a28a9fd7074ad4e97128943b072828,f5cb9ece38b7bd9ae97d7a55490476b84a2336df,trivial merge
3,refs/heads/master,d6f0d1998a8db93117ba9c261d3b35d03633dc59,0eca5983951641b5d0884e6ce8fd162185d0cf05,d686e46aa0582b7dbee1f0d8664d6cb4b56600ec,trivial merge
4,refs/heads/master,f7a4cc17c3b97f600f5a982522100d2b0200a981,f3876ebe83363de9387bba9ec14d436e2d74d2b3,f59722bd1d3dcbf9fc183a49ee0e28ea3c42edf8,trivial merge
5,refs/heads/master,526eaed4a82ee4ec9bb018e6d541c848d2bfdd0d,1adf11c82653576753af0502bfc1f63900d85780,e562b7ed71187c95e310f85e99d1325dbd49e331,trivial merge
6,refs/heads/master,d780b34088c77fa4a5c7645b84d744f33ed0329f,428706a429ed1a3dca09588e989597fd9a9cb673,3a24d6bb388dfcf2696f71833f299e051e88cf1c,trivial merge
7,refs/heads/master,5dc4fe25315e31fb1d6c3f58d802283634b6e37b,428706a429ed1a3dca09588e989597fd9a9cb673,26335cfc92a5aaa2df829c7ba52ac834ad1c4080,trivial merge
8,refs/heads/master,9572ad5cfd2f4c20700e27f16c9e906e009d7137,c92c401a4335465b77e3bf95c47a3297fa633b5c,62d33c1ccc686ed687d04c8e874f0c36bbcfc43e,trivial merge
9,refs/heads/master,9ed0dc2b5892b2b2a782435b0a5dd42f90737b97,4ba5c58dfb6bf266259b80ab2fe19057910b6dba,7d6f336981e2d3c0a3e81fc663b6e4590f8aab0b,trivial merge
10,refs/heads/master,c5bb794c4ca20c492818cdc13d4c72ad57961835,d709ecd47d92508d22f60e288fbe49b36f4b5246,9539280c2e350f6104b01c366ad8557e174f46da,trivial merge
11,refs/heads/master,58557bba03e3cd911b334f451d7bf20c84d54045,0eee54986010067c05972426937c3459b979a840,a4fb3645a032633128f78786d779101b46a16581,trivial merge
12,refs/heads/master,5c0f3bc432afa70ae19ced28fd4655effffa643f,bf98af8c6a8559d82a24d8f6f11d5fc23d29523e,f04fe03a282ca2520de93a3f13fe4c06146e5823,trivial merge
13,refs/heads/master,885ba1e858932e9715e9b27a722a129e2611caab,c5f2600a6830cb2fc14647c98a9038832d20677f,48ce23c6b5857dd37a5658eb544fad390a4e3d44,trivial merge
14,refs/heads/master,ba5decc5f42a1fd11825415612eaf104b8879c9a,b387036e5d7bba871c330469f2143c242f12a0d5,95d545f377cf84ccd52b54dc0c644d18daa447a6,trivial merge
15,refs/heads/master,89ffb42fa4b9efa275282cd90369be4423441a94,5a3b21b0db2cbc45f1dccbeae832654c4b5c812e,b8d19bf433f60c02f28d046702a1ea86a157ecf6,trivial merge
16,refs/heads/master,c4db946b2d372f8bde079792b1cee93ac36921d2,3d76e008055d175af8fdbb90d6c5a0ced62faf0d,4534140b2526b36b39bcdf6d417d977db7a5cb1a,trivial merge
17,refs/heads/master,ea0851ac5e27d2a1a13acd2110094c46c44a849e,c0603903e4b5caeb83d2c93ceabba1e9ea59af19,e405a7f2926be3103a388e990701af1f260bc905,
18,refs/heads/master,2a9f0eb4e7a59587ef0fb8a9312ba5edf0078145,92d6114663b80b24d1f9bb6d282138cf6094857e,b9a2ee7526cf561aa2ff511fe9872792bd315e91,
19,refs/heads/master,75e03285fddc0e2510c4f20c56ac607702fc3c9a,56dc97e7850da6b5a904b57c5e288cbb89fde81d,46b585e31b43ec0d1740395b38d15fd4e4f8c60f,
20,refs/heads/master,10f916b7a0fd5953ae6f684cadd28fa87e9124e4,0fb55976f6e30ddcc8ae7af393ee0dd8d14e20e9,0256cc6b3b9f9470458ff418e6648c635c20e787,
21,refs/heads/master,0fb55976f6e30ddcc8ae7af393ee0dd8d14e20e9,b9f47e216af0bc46830e0cb0c6964218511bbb3f,974ce1770dc602051b320619e240603cc5358d64,
22,refs/heads/master,00c10a4e7f42a484209a443934c833c5970f462f,8e0ad29ac0d484db3eccc4bee791b72cc6222e4c,530d40378d2d17276f71b18f871637fea2937ecf,
23,refs/heads/master,adf6fd024df05daf3236ce42e8cd90b5e4448131,dca3e3596b0b880e9eb4406490488ce93bc9116a,0575f5e918deec26aba2eb2a18f0e87af679d81b,trivial merge
24,refs/heads/master,3ae921c45ac9a68844bbdc1d53a8a7ca262fb6b7,8fb1d13ac9c28ccf9723a4328141816f224337b9,aec694fe670b35eb827c473f5fbaf205cd59a0ba,
25,refs/heads/master,ac65c8f97187f99331093624c7829d3688a50297,fb4312ddbbf301484ff0e1243cfdf12d6a2fb840,c8937ebc236d924201af98021a50091ab0ee013a,
26,refs/remotes/origin/gh-pages,9217160ed01a2bb7e57cd07001fc0da20135ad21,49868943e60fdb40de33c4416aae6a8517ef3d7f,bd0c5960622a73573c9fa5fb0f356ba575c3676c,trivial merge
27,refs/remotes/origin/pull/94,c905e6a5da47ddf684d6cb6b6c52a6d5cb0d2e3b,390eb1dbfc709f2b75889697a9296ffeb06424a4,2e483d3cd3abbb5b6af28ea519278f597ccded96,

I was thinking of using AtomicIntegers, but I feel like this integer shouldn't be shared between the parallel workers in the first place. For now I have removed the parallelism in v2-update.
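The idea of not sharing the counter can be sketched as follows. FindMergeCommits is a Java program, so this Python version is only an analogue of the design, with made-up repository data and a hypothetical function name `index_merges`: each worker numbers the merges of its own repository with a local counter, so no index can leak into another repository's output.

```python
from concurrent.futures import ThreadPoolExecutor


def index_merges(repo_name, merge_commits):
    """Number one repository's merges with a counter local to this call."""
    rows = []
    for idx, commit in enumerate(merge_commits, start=1):
        rows.append((idx, commit))
    return repo_name, rows


# Made-up data standing in for the merge commits found in each repository.
repos = {"Algorithms": ["a1", "a2", "a3"], "JSCover": ["j1", "j2"]}

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [
        pool.submit(index_merges, name, commits) for name, commits in repos.items()
    ]
    indexed = dict(future.result() for future in futures)

print(indexed["JSCover"])
```

Because the counter lives inside the per-repository call, the repositories can still be processed in parallel without any atomic or synchronized state.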

`repos_small.csv` column headers

Please add a header for the first column of repos_small.csv.
Why is this column different from the first column of repos_small_with_hashes.csv?

FindMergeCommits takes a huge amount of time

Currently, on the run-cf branch, src/main/java/astmergeevaluation/FindMergeCommits.java takes a huge amount of time. It has been running for more than 12 hours and still has not finished. Three questions:

  1. Is there a way to make it work more efficiently for one repo?
  2. Is there a way to parallelize the code when we have multiple repos?
  3. Is there a way to have some progress bar or progress report, so we can have some ETA estimate?

Bug in find_merge_commits.sh in the run_cf branch

The following problem is printed when running ./src/scripts/find_merge_commits.sh cf/local_repos_cf.csv cf/merges_cf:

./src/scripts/find_merge_commits.sh: line 153: [[: 1
1: syntax error in expression (error token is "1")
./src/scripts/find_merge_commits.sh: line 153: [[: 1
1
1: syntax error in expression (error token is "1
1")
./src/scripts/find_merge_commits.sh: line 153: [[: 1
1: syntax error in expression (error token is "1")

rerere for merging

Here is another merging configuration:

[rerere]
	enabled = true

This requires running

  rerere-train.sh HEAD

before running the merge. Users would only have to do this once, but we will have to do it every time. In our run-time comparisons, we should not count the cost of this command against the merge tool.
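A minimal sketch of how the measurement could exclude the training step: the preparation runs first, and the timer only brackets the merge itself. The function name `timed_merge` and the stand-in callables are hypothetical; a real run would invoke the training script and the merge tool instead.

```python
import time


def timed_merge(prepare, merge):
    """Run prepare() untimed, then return (merge result, seconds spent in merge())."""
    prepare()  # e.g. the rerere training step; deliberately outside the timer
    start = time.perf_counter()
    result = merge()
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in callables for illustration only.
result, seconds = timed_merge(lambda: None, lambda: "merged")
print(result)
```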

FindMergeCommits bug

There is currently a bug in FindMergeCommits on the main branch. The following error is raised when running run_full.sh:

Exception in thread "main" java.lang.Error: java.lang.Error: java.lang.Error: Histories are unrelated for getMergeBaseCommit(Repository[/tmp/scheschb/ast-merge-eval-data/jlinn__quartz-redis-jobstore], commit d215c1856fe1b0dd7cba46f4cece5e1c20fc1d5e 1406072839 -----sp, commit 191d82f0448bf895e370618354d696000e66c4c9 1406072846 -----sp)
Finding merge commits for AlbinTheander/MeQanTT             DONE
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
        at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:562)
        at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:591)
        at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:689)
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
        at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:765)
        at astmergeevaluation.FindMergeCommits.writeMergeCommitsForRepos(FindMergeCommits.java:192)
        at astmergeevaluation.FindMergeCommits.main(FindMergeCommits.java:105)
Caused by: java.lang.Error: java.lang.Error: Histories are unrelated for getMergeBaseCommit(Repository[/tmp/scheschb/ast-merge-eval-data/jlinn__quartz-redis-jobstore], commit d215c1856fe1b0dd7cba46f4cece5e1c20fc1d5e 1406072839 -----sp, commit 191d82f0448bf895e370618354d696000e66c4c9 1406072846 -----sp)
        at astmergeevaluation.FindMergeCommits.writeMergeCommitsForRepo(FindMergeCommits.java:210)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
        at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
Caused by: java.lang.Error: Histories are unrelated for getMergeBaseCommit(Repository[/tmp/scheschb/ast-merge-eval-data/jlinn__quartz-redis-jobstore], commit d215c1856fe1b0dd7cba46f4cece5e1c20fc1d5e 1406072839 -----sp, commit 191d82f0448bf895e370618354d696000e66c4c9 1406072846 -----sp)

gitmerge_resolve adds random strings to conflict markers

It seems like gitmerge_resolve adds a temporary file name to its conflict markers, and this name is not consistent between runs. This interferes with the tree fingerprint, so it would be great if we could overwrite these labels with a standard marker that is the same on every run. Here is an example of a diff that causes a different tree fingerprint even though the two merges are otherwise identical.


(base) scheschb@scheschb-System-Product-Name:~/Git/AST-Merging-Evaluation$ diff -r .workdir/43e711ac535d41109165021337be2a19/Algorithms/src .workdir/37762a63e51a4d41ad784636bf37d3c3/Algorithms/src
diff -r .workdir/43e711ac535d41109165021337be2a19/Algorithms/src/main/java/com/github/pedrovgs/problem8/SplitArray.java .workdir/37762a63e51a4d41ad784636bf37d3c3/Algorithms/src/main/java/com/github/pedrovgs/problem8/SplitArray.java
55c55
< <<<<<<< .merge_file_GQVSWL
---
> <<<<<<< .merge_file_l8fV4M
79c79
< >>>>>>> .merge_file_J95AQ3
---
> >>>>>>> .merge_file_F676Ey
100c100
< <<<<<<< .merge_file_GQVSWL
---
> <<<<<<< .merge_file_l8fV4M
145c145
< >>>>>>> .merge_file_J95AQ3
---
> >>>>>>> .merge_file_F676Ey
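One possible way to get a stable fingerprint is to rewrite the marker labels before hashing. The sketch below assumes the labels always look like `.merge_file_` followed by word characters, which is inferred only from the diff above; the replacement label `MERGE_FILE` and the function name are arbitrary choices.

```python
import re

# Matches a conflict-marker line whose label is a temporary file name such as
# ".merge_file_GQVSWL". The label pattern is an assumption based on the diff.
MARKER = re.compile(r"^(<{7}|\|{7}|>{7}) \.merge_file_\w+$", re.MULTILINE)


def normalize_markers(text, label="MERGE_FILE"):
    """Replace run-specific conflict-marker labels with one fixed label."""
    return MARKER.sub(lambda match: f"{match.group(1)} {label}", text)


run_a = "<<<<<<< .merge_file_GQVSWL\nours\n=======\ntheirs\n>>>>>>> .merge_file_J95AQ3\n"
run_b = "<<<<<<< .merge_file_l8fV4M\nours\n=======\ntheirs\n>>>>>>> .merge_file_F676Ey\n"
print(normalize_markers(run_a) == normalize_markers(run_b))
```

Only the label after the marker is touched; the conflict content and the `=======` separator stay as they are.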

Warnings when running `make small-test`

I get these warnings when I run make small-test:

merge_tester: Done
Number of inconsistent entries: 0
Number of trivial merges: 3
Number of failed trivial merges (will be ignored): 0
/home/mernst/.venv/lib/python3.11/site-packages/seaborn/matrix.py:202: RuntimeWarning: All-NaN slice encountered
  vmin = np.nanmin(calc_data)
/home/mernst/.venv/lib/python3.11/site-packages/seaborn/matrix.py:207: RuntimeWarning: All-NaN slice encountered
  vmax = np.nanmax(calc_data)

Could you please ensure that make small-test does not yield any warnings?

Binary for maven 3.9.2 doesn't exist

I only see 3.8.8 and 3.9.3 on the website. The current link to 3.9.2 doesn't exist, so the make command that downloads Maven fails. We should probably update this to 3.9.3, and maybe think of a way to future-proof it.

Don't require setting PATH

When I run the "small-test" locally:

./run_small.sh --include_trivial_merges

I get this error:

git-hires-merge is not on the PATH

Since that is in this repository, I think the Makefile (or some other part of the system) should set the PATH appropriately rather than requiring the user to do that.
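As an illustration of what "set the PATH appropriately" could mean on the Python side, a script can build the environment for its subprocesses itself. The directory passed in below is a placeholder, since the issue does not say where git-hires-merge lives in the repository.

```python
import os


def env_with_tool_dir(tool_dir, base_env=None):
    """Return a copy of the environment with tool_dir prepended to PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PATH", "")
    env["PATH"] = os.pathsep.join([str(tool_dir), existing]).rstrip(os.pathsep)
    return env


# Placeholder directory; the real location of the tool is not given here.
env = env_with_tool_dir("/path/to/repo/tools", {"PATH": "/usr/bin"})
print(env["PATH"].split(os.pathsep)[0])
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run`, so the user's shell configuration does not need to change.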

FindMergeCommits fills up /tmp

/tmp is full again :(. The script crashed due to no storage space remaining. The script leaves many files in scheschb/ast-merge-eval-data/

[scheschb@tricycle tmp]$ find scheschb/ast-merge-eval-data/ -user scheschb | wc -l
281415
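If the per-repository files do not need to survive between runs (an assumption; they may be kept on purpose as a cache), one option is to do the work inside a scratch directory that is removed automatically, even when the work fails. The function name `process_repo` and the directory prefix are hypothetical.

```python
import tempfile
from pathlib import Path


def process_repo(repo_name, work):
    """Run work(path) inside a scratch directory that is deleted afterwards."""
    with tempfile.TemporaryDirectory(prefix=f"{repo_name}-") as scratch:
        return work(Path(scratch))


def count_files(path):
    """Stand-in for the real work: create one file and count the entries."""
    (path / "example.txt").write_text("data")
    return len(list(path.iterdir()))


print(process_repo("tntim96__JSCover", count_files))
```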

Return values

Why do repo_test and head_passes_test return a string rather than a boolean?
Also, the possible string values that these routines return are not documented.
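If there are more than two outcomes (for example a timeout, which a boolean cannot express), an enumeration would document the possible values in one place. The member names and values below are guesses for illustration, not the strings the routines actually return.

```python
from enum import Enum


class CommitStatus(Enum):
    """Hypothetical outcomes of testing one commit."""

    PASSED = "passed"
    FAILED = "failed"
    TIMEOUT = "timeout"


def passed(status: CommitStatus) -> bool:
    """Collapse the status to a boolean where only a yes/no answer is needed."""
    return status is CommitStatus.PASSED


print(passed(CommitStatus("timeout")))
```

Constructing the enum from a stored string, as in `CommitStatus("timeout")`, raises `ValueError` for an unknown value, which would also catch typos in cached results.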

run_full.sh crashes on clean checkout

If you run these commands:

mkdir /tmp/${USER}
cd /tmp/${USER}
rm -rf AST-Merging-Evaluation
git clone --depth 1 https://github.com/benedikt-schesch/AST-Merging-Evaluation.git
cd AST-Merging-Evaluation

Then you get the following crash:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.11/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/tmp/mernst/AST-Merging-Evaluation/src/python/validate_repos.py", line 233, in head_passes_tests
    status = commit_pass_test(
             ^^^^^^^^^^^^^^^^^
  File "/tmp/mernst/AST-Merging-Evaluation/src/python/validate_repos.py", line 171, in commit_pass_test
    status, _ = read_cache(target_file)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/mernst/AST-Merging-Evaluation/src/python/validate_repos.py", line 136, in read_cache
    with open(cache_file + "_explanation.txt", "r") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'cache/test_result/fernandohur_rabbit-pusher_c9a5016a9c7d91820f764fda15f9db8085755c6f_explanation.txt'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/tmp/mernst/AST-Merging-Evaluation/src/python/validate_repos.py", line 268, in <module>
    return_value = result.get(2 * TIMEOUT_TESTING)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/pool.py", line 774, in get
    raise self._value
FileNotFoundError: [Errno 2] No such file or directory: 'cache/test_result/fernandohur_rabbit-pusher_c9a5016a9c7d91820f764fda15f9db8085755c6f_explanation.txt'

Bug in FindMergeCommits

When executing run_full.sh currently on the main branch, the following error is raised:

Exception in thread "main" java.io.FileNotFoundException: /tmp/scheschb/ast-merge-eval-data/tntim96__JSCover/config (Is a directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at org.eclipse.jgit.util.io.SilentFileInputStream.<init>(SilentFileInputStream.java:31)
        at org.eclipse.jgit.util.IO.readFully(IO.java:101)
        at org.eclipse.jgit.util.IO.readFully(IO.java:48)
        at org.eclipse.jgit.storage.file.FileBasedConfig.load(FileBasedConfig.java:127)
        at org.eclipse.jgit.lib.BaseRepositoryBuilder.loadConfig(BaseRepositoryBuilder.java:732)
        at org.eclipse.jgit.lib.BaseRepositoryBuilder.getConfig(BaseRepositoryBuilder.java:709)
        at org.eclipse.jgit.lib.BaseRepositoryBuilder.guessWorkTreeOrFail(BaseRepositoryBuilder.java:744)
        at org.eclipse.jgit.lib.BaseRepositoryBuilder.setupWorkTree(BaseRepositoryBuilder.java:675)
        at org.eclipse.jgit.lib.BaseRepositoryBuilder.setup(BaseRepositoryBuilder.java:602)
        at org.eclipse.jgit.internal.storage.file.FileRepository.<init>(FileRepository.java:131)
        at astmergeevaluation.FindMergeCommits.writeMergeCommits(FindMergeCommits.java:252)
        at astmergeevaluation.FindMergeCommits.writeMergeCommitsForRepos(FindMergeCommits.java:196)
        at astmergeevaluation.FindMergeCommits.main(FindMergeCommits.java:104)

`FindMergeCommits` Performance

It seems like FindMergeCommits struggles to list all possible merges for certain repos. It is probably not useful to sample more than 10k merges anyway, but it would be great if the program either terminated on its own or were fast enough to process everything.
To reproduce, check out the branch merge_analyzer and run sh run_xp.sh. It gets stuck in FindMergeCommits because the repo enonic/xp contains too many merges.
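A cap on the number of merges can be sketched with a lazy iterator: the enumeration stops as soon as the limit is reached, however many merges the repository has. This is a Python illustration of the idea only; the limit of 10,000 comes from the text above, and the function name is hypothetical.

```python
from itertools import count, islice


def first_merges(merge_iter, limit=10_000):
    """Take at most `limit` items from a possibly very long merge iterator."""
    return list(islice(merge_iter, limit))


# An endless counter stands in for a repository with too many merges.
sample = first_merges(count(start=1), limit=5)
print(sample)
```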

NUM_OF_PARENTS sometimes is not a number

Run the following, where the code is taken from find_merge_commits.sh:

ORG_AND_REPO=typetools/checker-framework
PR_NUMBER=5582
GH_RES=$(gh api \
    -H "Accept: application/vnd.github+json" \
    --method GET \
    --paginate \
    "/repos/$ORG_AND_REPO/pulls/$PR_NUMBER/commits")
i=0
NUM_OF_PARENTS=$(echo "$GH_RES" | jq --arg i "$i" '.[($i | tonumber)].parents | length')
echo "NUM_OF_PARENTS=\"$NUM_OF_PARENTS\""

The final output is:

NUM_OF_PARENTS="1
1"

I suspect that NUM_OF_PARENTS is supposed to be a single number rather than a list of multiple numbers.
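One explanation consistent with this output is that the response holds more than one JSON document (for example, one array per page), so the jq filter is evaluated once per document. That is an assumption and has not been checked against gh. Under that assumption, the sketch below shows how the documents could be flattened into a single list of commits before indexing; the sample payload is made up.

```python
import json


def parse_concatenated_json(text):
    """Yield each top-level JSON value from text that holds several back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between documents; raw_decode does not accept it.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


def all_commits(text):
    """Flatten every page (a JSON array) into one list of commit objects."""
    commits = []
    for page in parse_concatenated_json(text):
        commits.extend(page)
    return commits


# Made-up payload: two pages, each a JSON array with one commit.
pages = '[{"parents": [{"sha": "a"}]}]\n[{"parents": [{"sha": "b"}, {"sha": "c"}]}]'
commits = all_commits(pages)
print([len(commit["parents"]) for commit in commits])
```

After flattening, indexing commit `i` gives exactly one parent count, which is what the shell comparison expects.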

Bug in find_merge_commits.sh

When running find_merge_commits.sh the following issue is raised:


benediktschesch@student-net-hg-dock-1-b-431 AST-Merging-Evaluation % sh src/scripts/find_merge_commits.sh small/valid_repos.csv small/merges_small

mangstadt/ez-vcard
pedrovgs/Algorithms
mangstadt/ez-vcard
ez-vcard
small/merges_small/ez-vcard.csv
Cloning into 'ez-vcard'...
remote: Enumerating objects: 39484, done.
remote: Counting objects: 100% (2867/2867), done.
remote: Compressing objects: 100% (478/478), done.
remote: Total 39484 (delta 2344), reused 2855 (delta 2335), pack-reused 36617
Receiving objects: 100% (39484/39484), 56.95 MiB | 14.99 MiB/s, done.
Resolving deltas: 100% (31916/31916), done.
branch '0.7.3' set up to track 'origin/0.7.3'.
Switched to a new branch '0.7.3'
Updating files: 100% (23301/23301), done.
branch 'gh-pages' set up to track 'origin/gh-pages'.
Switched to a new branch 'gh-pages'
Updating files: 100% (23470/23470), done.
branch 'scribes' set up to track 'origin/scribes'.
Switched to a new branch 'scribes'
branch 'vCard4' set up to track 'origin/vCard4'.
Switched to a new branch 'vCard4'
branch 'wiki' set up to track 'origin/wiki'.
Switched to a new branch 'wiki'
pedrovgs/Algorithms
Algorithms
small/merges_small/Algorithms.csv
Cloning into 'Algorithms'...
remote: Enumerating objects: 4996, done.
remote: Total 4996 (delta 0), reused 0 (delta 0), pack-reused 4996
Receiving objects: 100% (4996/4996), 488.68 KiB | 8.14 MiB/s, done.
Resolving deltas: 100% (1363/1363), done.
branch 'dependabot/maven/junit-junit-4.13.1' set up to track 'origin/dependabot/maven/junit-junit-4.13.1'.
Switched to a new branch 'dependabot/maven/junit-junit-4.13.1'
remote: Enumerating objects: 1, done.
remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 1
Unpacking objects: 100% (1/1), 616 bytes | 308.00 KiB/s, done.
From https://github.com/pedrovgs/Algorithms
 * [new ref]         refs/pull/73/head -> 73
Switched to and reset branch '73'
src/scripts/find_merge_commits.sh: line 168: MERGE_PARENTS[1]: unbound variable

FindMergeCommits stops early

It seems like FindMergeCommits terminates early. When I run rm -rf results/merges and then sh run_full.sh, I only get 124 repos with merges out of the ~850 that should be there. Running the script about 5 times eventually gives the full result, but then I cannot tell whether the output of the earlier runs is complete or only partial.
