ualberta-smr / merganser
Merganser is a scalable and extendable tool for analyzing merge scenarios in git repositories
Home Page: https://ualberta-smr.github.io/merganser/
License: MIT License
It would be good if there's a parameter to indicate the starting date of the mining (i.e., mine commits after date X). In that case, we should check if the repo is already cloned and if it is, we shouldn't clone again.
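A minimal sketch of what this request could look like, assuming the mining step shells out to git. The function names, the `since_date` parameter, and the exact `git log` arguments are hypothetical and not taken from Merganser's code; only `git clone`, `git -C`, and `git log --since` are standard git.

```python
import os


def is_already_cloned(repo_dir):
    """Return True if repo_dir already contains a git working copy."""
    return os.path.isdir(os.path.join(repo_dir, ".git"))


def plan_commands(repo_url, repo_dir, since_date=None):
    """Build the git commands to run; skip the clone if the repo exists.

    since_date is a hypothetical parameter (e.g. "2018-01-01") that limits
    mining to commits after that date via git's --since option.
    """
    commands = []
    if not is_already_cloned(repo_dir):
        commands.append(["git", "clone", repo_url, repo_dir])
    log_cmd = ["git", "-C", repo_dir, "log", "--merges", "--pretty=format:%H"]
    if since_date:
        log_cmd.append("--since=" + since_date)
    commands.append(log_cmd)
    return commands
```

Returning the command lists instead of running them keeps the clone check easy to test without network access.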
I would suggest that you encourage people to use Issues for bugs or even questions rather than to email us, since this allows better tracking of things.
I would also suggest adding a Contributors list, where you can have both our names (with yours first as the main author) and contact information.
Maybe change to "Temp variables for reading username and repo"
The current code is using setup.py, which helps us to install the package by pip. Another option would be using docker.
Do we need to have a docker file too?
Document reason for or (mentioning the weird scenario you ran to if you can look it up)
When running the tests, a project may be using maven but have no test suites. Such projects should be clearly marked, e.g., -1 as the value of the passing tests or some other field marked for the project.
If you could not run the build or test for any reason, or the repo doesn't support mvn, then store -2 (and update the documentation in the ReadMe or data schema description) and log a failure.
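The two sentinel values suggested above could be encoded roughly as follows. This is only a sketch: the constant names, the function, its arguments, and the logger name are all assumptions for illustration, not Merganser's actual schema or code.

```python
import logging

# Hypothetical sentinel values, following the two suggestions above.
NO_TEST_SUITE = -1              # maven project that has no test suites
BUILD_OR_TEST_UNAVAILABLE = -2  # build/test could not run, or no mvn support

logger = logging.getLogger("merganser.sketch")  # assumed logger name


def passed_tests_field(supports_maven, ran_ok, test_count, passed_count):
    """Return the value to store in the 'passing tests' field."""
    if not supports_maven or not ran_ok:
        logger.warning("build/test could not be run; storing %d",
                       BUILD_OR_TEST_UNAVAILABLE)
        return BUILD_OR_TEST_UNAVAILABLE
    if test_count == 0:
        return NO_TEST_SUITE
    return passed_count
```

Keeping the sentinels negative means they cannot be confused with a real count of passing tests, which is always zero or more.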
For the -q flag for the search module, what if you have multiple search terms, should you use quotes? If so, clarify in instructions/example.
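As background for the question, the snippet below shows how a POSIX shell tokenizes quoted and unquoted terms, using Python's standard `shlex` module. It says nothing about how the search module itself interprets `-q`; the search terms are made up for illustration.

```python
import shlex

# With quotes, the two words reach the program as one argument.
quoted = shlex.split('-q "merge conflict"')

# Without quotes, the shell passes them as two separate arguments.
unquoted = shlex.split('-q merge conflict')

print(quoted)    # ['-q', 'merge conflict']
print(unquoted)  # ['-q', 'merge', 'conflict']
```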
There is an encoding error while parsing the output of git to extract the conflicting regions, since some commit messages use non-ASCII characters.
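One common way to avoid this class of error is to read git's output as bytes and decode it explicitly as UTF-8 with a replacement policy. The sketch below assumes the output is captured through `subprocess`; the helper names are hypothetical and this is not necessarily how Merganser reads git output.

```python
import subprocess


def decode_git_output(raw_bytes):
    """Decode git output as UTF-8, replacing any undecodable bytes.

    Invalid sequences become U+FFFD instead of raising UnicodeDecodeError.
    """
    return raw_bytes.decode("utf-8", errors="replace")


def run_git(args, cwd=None):
    """Run a git command and return its stdout as text (hypothetical helper)."""
    completed = subprocess.run(
        ["git"] + list(args),
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    return decode_git_output(completed.stdout)
```

Replacing bad bytes loses a character in the commit message but keeps the conflict markers intact, which is usually what matters when extracting conflicting regions.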
What are these:
| PREDICTION_CSV_PATH | The directory path for storing the data of conflict prediction |
|---|---|
| PREDICTION_CSV_DATA_NAME | The file name of the data of conflict prediction |
| PREDICTION_CSV_LABEL_NAME | The file name of the labels of conflict prediction |
They are not described anywhere?
I'm assuming that if none of the max parameters (e.g., max number of days, merge scenarios etc.) are set, all the merge scenarios in a given repo are analyzed. Is that correct?
The wiki page says "The directory path for reading the list of repositories, as *.txt files", which suggests that you may have multiple input list files for the set of repos to analyze. The ReadMe instructions, however, suggest that there is only one list.
I think the command may return multiple branches and you only consider the first one. It would be good to verify whether it does indeed return multiple branches and whether, for our purposes, considering the first branch is enough.
validation_repository_name --> validate_repository_name
Please check all code.
Maybe change to get_build_status
The checksumdir Python library reports that it cannot find the directory and returns a Directory Not Found error.
There are several warnings when inserting data into the Merge_Related_Commit Table.
In the parameter page, add the default values for the parameters you indicate don't necessarily need to be changed?