mona-actions / gh-repo-stats
GH CLI extension to pull statistics on repository metadata used in GitHub migrations
License: MIT License
If more changes are introduced to the current script, the stable script can break. Although an extensive pull request review can prevent that, it is hard to know what works and what does not without an extensive guideline and testing.
There should be an additional CI pipeline and associated scripts to automate the code testing process.
Close #70, close #73, close #76.
BREAKING: this replaces the `installation_id` and `repository` inputs with `installation_retrieval_mode` and `installation_retrieval_payload` to also support organization and user installation.
It also:
`repositories` input to scope the created token to a subset of repositories.
`post` script.
Originally posted by @tibdex in tibdex/github-app-token#84
Strangely that didn't happen to me. I wonder if it was using my `GITHUB_TOKEN` that is in my environment?
Anyhow, could `gh auth` allow this to use a GitHub App for authentication so we get 15k requests per hour?
Is the `gh` tool even the right spot to make that request?
Originally posted by @Chocrates in #26 (comment)
I'm trying to use this against GHE Server 2.22:
gh repo-stats -H company_ghe_server --org company_org
######################################################
######################################################
############# GitHub repo list and sizer #############
######################################################
######################################################
GitHub Enterprise Server v2.22.24
------------------------------------------------------
Getting repositories for org: company_org
gh: Rate limiting is not enabled. (HTTP 404)
API rate limiting is not enabled.
gh: Field 'discussions' doesn't exist on type 'Repository'
Error getting Repos for Org: company_org
--repo-page-size might need to be set to a lower value
######################################################
The script has completed
Results file:[company_org-all_repos-202311101323.csv]
######################################################
Originally posted by @ju2wheels in #75
When the code checks for rate limit (https://github.com/mona-actions/gh-repo-stats/blob/main/gh-repo-stats#L651)
API_REMAINING_MESSAGE=$(echo "${API_REMAINING_REQUEST}" \
| jq -r '.message' 2>&1)
It is trying to extract the `message` field, but the Rate Limit API endpoint does not have a `message` field (https://docs.github.com/en/enterprise-cloud@latest/rest/rate-limit#get-rate-limit-status-for-the-authenticated-user), meaning that this will always come out as null.
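The behaviour described above can be reproduced with a small sketch, assuming `jq` is installed. The two JSON payloads are hand-written samples (not captured output), and the helper name `rate_limit_status` is hypothetical and not part of `gh-repo-stats`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise a rate-limit API response passed as $1.
# Prints "error: <message>" when the payload has a top-level .message,
# otherwise prints the remaining core REST calls.
rate_limit_status() {
  local response="$1" message remaining
  # '// empty' turns a missing field into an empty string instead of "null".
  message=$(printf '%s' "${response}" | jq -r '.message // empty')
  if [ -n "${message}" ]; then
    echo "error: ${message}"
    return 1
  fi
  remaining=$(printf '%s' "${response}" | jq -r '.resources.core.remaining')
  echo "remaining: ${remaining}"
}

# Sample payloads, hand-written for illustration.
OK_RESPONSE='{"resources":{"core":{"limit":5000,"remaining":4999,"reset":1700000000}}}'
ERR_RESPONSE='{"message":"Rate limiting is not enabled."}'

# A successful response has no .message, so a plain '.message' prints "null".
printf '%s' "${OK_RESPONSE}" | jq -r '.message'
rate_limit_status "${OK_RESPONSE}"
rate_limit_status "${ERR_RESPONSE}" || true
```

Reading `.message` only as an error indicator, and taking the counters from `.resources`, keeps the two cases apart.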
This is a message from the GitHub CLI team, maintainers of `gh`, writing to inform you that the most recent release of `gh` contains changes which may affect your extension. The latest release introduces the feature of storing authentication tokens in the system keyring (encrypted storage) instead of in a plain text file.
The keyrings that are supported are:
Keychain on macOS
GNOME Keyring on Linux (Secret Service dbus interface)
Wincred on Windows
This has huge security benefits for the users of our tool and was one of our oldest outstanding issues. Unfortunately, this change has the potential to break extensions that rely on the user's authentication token to work.
In order to have continued compatibility with `gh`, there are some actions you, as an extension author, need to take. These actions will depend on the implementation of your extension.
Upgrade your `go-gh` version to v1.2.1, the latest version.
go get github.com/cli/go-gh@v1.2.1
Verify that in your extension retrieval of the user authentication token is done using the `auth.TokenForHost` function.
Verify that in your extension retrieval of the user authentication token is done by shelling out to the `gh auth token` command.
If the token is retrieved with the `gh config get` command, by reading the configuration file directly, or by any other method, it will no longer work.
As of right now, storing the authentication token in the system keyring is an opt-in feature, but in the near future it will be required, and at that point, if the changes above are not made, your extension will be broken for all users. If you have any questions or concerns about this change, please feel free to open a discussion in the gh repo.
Thanks,
The GitHub CLI Team
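For a script extension, the advice about shelling out to `gh auth token` might look like the sketch below. The helper name `resolve_token` and the wording of the error message are illustrative assumptions and are not taken from `gh-repo-stats`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the token gh holds for the host given as $1
# (default github.com). gh reads the keyring, so the script never touches
# the configuration file directly.
resolve_token() {
  local host="${1:-github.com}" token
  if ! token=$(gh auth token --hostname "${host}" 2>/dev/null); then
    echo "No gh credentials for ${host}; run: gh auth login --hostname ${host}" >&2
    return 1
  fi
  printf '%s\n' "${token}"
}

# Usage (requires an authenticated gh):
#   TOKEN=$(resolve_token github.com) || exit 1
```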
How do I correct the vs code extension error: Extensions: Skipped synchronizing extension because the extension is not found?
The sync doesn't seem to migrate my extensions from one device to another. I have also tried copying them to c:/users/name/.vscode/extensions folder without success.
Any assistance you can provide is greatly appreciated.
Thank you.
Originally posted by @UTexas80 in Visual-Studio-Code/.github#43
While working on a customer engagement, I recommended using the extension with a token created from a GitHub App. It turns out that `gh-repo-stats` fails because it is designed around the idea that the PAT belongs to a user.
The error revolves around getting the user identity and checking admin rights in https://github.com/mona-actions/gh-repo-stats/blob/main/gh-repo-stats#L537
I think there are use cases where this extension needs higher rate limits than PATs, so I'd love to see it get updated to include information and changes necessary to use a GitHub App token instead.
Ya ya
Originally posted by @JSON2MAZA in #25 (comment)
`CSV` option
To understand the need for a code of conduct for `mona-actions/gh-repo-stats` and adopting one.
One of the helpful aspects of GitHub CLI extensions is their ability to delegate authorization to `gh auth`. This allows interpreted and precompiled extensions to hand off this responsibility and focus on their core functionality.
When I first ran the `gh-repo-stats` extension, it prompted for and required a personal access token (PAT) even though authentication was already set up via standard GH CLI methods.
$ gh repo-stats --org $(whoami)
######################################################
######################################################
############# GitHub repo list and sizer #############
######################################################
######################################################
------------------------------------------------------
Please create a GitHub Personal Access Token used to gather
information from your Organization, with a scope of 'repo',
followed by [ENTER]:
(note: your input will NOT be displayed)
It would be nice if the extension were enhanced so that it did not need my PAT explicitly. Looking at the underlying interpreted script, I think this might be doable if all of the `curl` calls instead used `gh api` or other `gh` subcommands.
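As a rough illustration of that idea, a thin wrapper could route REST calls through `gh api` so that `gh` supplies the stored credentials. The wrapper name `api_get` is hypothetical; this is a sketch, not how the extension is actually implemented.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: GET a REST endpoint through gh instead of curl.
# $1 = hostname, $2 = endpoint path, remaining args = extra gh api flags.
api_get() {
  local host="$1" endpoint="$2"
  shift 2
  # gh attaches the credentials from 'gh auth', so no PAT prompt is needed.
  gh api --hostname "${host}" "$@" "${endpoint}"
}

# Usage (requires an authenticated gh):
#   api_get github.com user --jq .login
#   api_get github.com rate_limit
```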
It looks like the linter is complaining that we should double-quote to prevent globbing. If we write the expressions as string-based expressions, I think we should be fine.
Originally posted by @ssulei7 in #22 (review)
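The warning mentioned in the review reads like ShellCheck's SC2086 ("Double quote to prevent globbing and word splitting"); that attribution is an assumption, since the linter output itself is not shown. The sketch below shows the difference quoting makes, using a throwaway temporary directory.

```shell
#!/usr/bin/env bash
# Report how many arguments a command actually receives.
count_args() { echo "$#"; }

demo_dir=$(mktemp -d)
touch "${demo_dir}/a.txt" "${demo_dir}/b.txt" "${demo_dir}/c.txt"

pattern="${demo_dir}/*.txt"

# Unquoted: the shell expands the glob in the value into three file names.
# shellcheck disable=SC2086
unquoted=$(count_args ${pattern})

# Quoted: the value is passed through as one literal argument.
quoted=$(count_args "${pattern}")

echo "unquoted=${unquoted} quoted=${quoted}"
rm -rf "${demo_dir}"
```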
Is there a reason why :
Is `PR_Review_Comment_Count` the same as comments on a PR?
I'm working with a customer leveraging the `gh-repo-stats` extension because they find the final evaluation of whether there might be issues helpful for migration planning. Thank you for building and maturing this extension! 🎉 🙇
@saharora and I ran into an issue after ~67 repositories were assessed. By a rough estimate, about 100 API calls are being made for each repository, assuming no other activity against the PAT was going on at the time. I need to look at the underlying code again to see how much of the information comes from GraphQL and whether there might be room for optimization.
(env) % gh repo-stats --org XXXXXXXX --repo-page-size 10
######################################################
######################################################
############# GitHub repo list and sizer #############
######################################################
######################################################
------------------------------------------------------
Please create a GitHub Personal Access Token used to gather
information from your Organization, with a scope of 'repo',
followed by [ENTER]:
(note: your input will NOT be displayed)
Creating file header...
------------------------------------------------------
Getting repositories for org: XXXXXXXXXXX
[5000] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[4048] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Error getting more Pull Requests for Repo: XXXXXXXXXX
{
"data": null,
"errors":[
{
"message":"Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `267E:58B0:29D30:361FB:62B205EF` when reporting this issue."
}
]
}
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[3615] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[3390] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[3165] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[2525] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
[1468] API attempts remaining...
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
Analyzing Repo: XXXXXXXXXX
ERROR --- Errors occurred while retrieving pull requests for repo: XXXXXXXXXX
{
"type": "RATE_LIMITED",
"message": "API rate limit exceeded for user ID 60168593."
}
jq: error (at <stdin>:0): Cannot iterate over null (null)
######################################################
ERROR! Failed response back from GitHub!
Please validate your PAT, Organization, and access levels!
######################################################
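One way to avoid running into the `RATE_LIMITED` error shown above is to pause until the limit resets once the remaining budget drops below what one repository needs. The sketch below assumes the caller has already read the remaining count and reset time (epoch seconds) from the rate-limit API. The helper name and the default threshold of 100 calls (taken from the rough per-repository estimate in this report) are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: seconds to pause before analysing the next repo.
# $1 = remaining calls, $2 = reset time (epoch seconds),
# $3 = current time (epoch seconds), $4 = minimum calls wanted (default 100).
seconds_to_wait() {
  local remaining="$1" reset_epoch="$2" now_epoch="$3" floor="${4:-100}"
  local delay
  if [ "${remaining}" -ge "${floor}" ]; then
    echo 0
    return 0
  fi
  delay=$(( reset_epoch - now_epoch ))
  # A reset time in the past means the budget is already refreshed.
  if [ "${delay}" -lt 0 ]; then
    delay=0
  fi
  echo "${delay}"
}

# Usage inside a loop over repositories:
#   sleep "$(seconds_to_wait "${remaining}" "${reset}" "$(date +%s)")"
```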
For some reason, the command is not working for me.
Windows 11
Organization is GHE and my authenticated account is an Organization Admin.
$ gh repo-stats -H github.com -o org-name
######################################################
######################################################
############# GitHub repo list and sizer #############
######################################################
######################################################
invalid API endpoint: "C:/Program Files/Git/user". Your shell might be rewriting URL paths as filesystem paths. To avoid this, omit the leading slash from the endpoint argument
Error getting user
invalid API endpoint: "C:/Program Files/Git/". Your shell might be rewriting URL paths as filesystem paths. To avoid this, omit the leading slash from the endpoint argument
invalid API endpoint: "C:/Program Files/Git/meta". Your shell might be rewriting URL paths as filesystem paths. To avoid this, omit the leading slash from the endpoint argument
Error getting GHE version
find: ‘./AppData/Local/Microsoft/Windows/INetCache/Low/Content.IE5’: Permission denied
find: ‘./AppData/Local/Razer/RazerAxon/WallpaperSource/RazerAxonWallPapers’: Permission denied
invalid API endpoint: "C:/Program Files/Git/orgs/us-southou-demo/memberships/". Your shell might be rewriting URL paths as filesystem paths. To avoid this, omit the leading slash from the endpoint argument
Error getting Membership for Org: org-name
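The error text itself suggests the workaround: omit the leading slash from the endpoint. A minimal sketch of that normalisation follows; the helper name `normalize_endpoint` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop one leading slash so Git Bash on Windows does
# not rewrite "/user" into "C:/Program Files/Git/user".
normalize_endpoint() {
  printf '%s\n' "${1#/}"
}

normalize_endpoint "/user"
normalize_endpoint "/orgs/org-name/memberships/"
normalize_endpoint "meta"
```

Setting `MSYS_NO_PATHCONV=1` in Git for Windows is another commonly used way to switch off the path rewriting, although stripping the slash also works in shells that do not know that variable.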
While scanning an entire organization is nice, it can be quite painful when an organization has hundreds of repositories. My organization is migrating in chunks, so we would like to do analysis only on specific repositories when we're ready to migrate them.
Perhaps add a new parameter for a list of repositories and/or a new parameter for a file of listed repositories (or enhance the logic around the `input` parameter to recognize repositories)?
I would expect that if the repositories were from different organizations, or if both a list of repositories and an organization were provided, the tool could error out.
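The validation suggested above could be sketched as follows, assuming a list file with one `org/repo` entry per line. That file format and the helper name `single_org_of` are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical check: every non-empty line of the file must be "org/repo"
# and all entries must share one org. Prints that org, or fails.
single_org_of() {
  local list_file="$1" line org="" seen
  while IFS= read -r line || [ -n "${line}" ]; do
    if [ -z "${line}" ]; then
      continue
    fi
    case "${line}" in
      */*) seen="${line%%/*}" ;;
      *)   echo "Not in org/repo form: ${line}" >&2; return 1 ;;
    esac
    if [ -z "${org}" ]; then
      org="${seen}"
    elif [ "${org}" != "${seen}" ]; then
      echo "Mixed organizations: ${org} and ${seen}" >&2
      return 1
    fi
  done < "${list_file}"
  printf '%s\n' "${org}"
}

# Usage:
#   org=$(single_org_of ./repo-list.txt) || exit 1
```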
The ultimate output from `gh-repo-stats` is whether there is a migration issue, which currently is if any of the following conditions occur:
Unfortunately, adopters of `gh-repo-stats` have no explanation of what to do if they encounter repositories that face this issue such as:
A couple minor improvement suggestions. Great tooling btw! 🏆
When using --repo-list, it would be nice to know which repository is being processed by showing that in the output. Currently there's only a rate limit callout with wording stating "getting repositories for org". For repos with thousands of records it can take a very long time to run, so knowing which repo it's on is helpful. Note that running against the entire org, without --repo-list, does show the repo being processed, so it seems to be limited to --repo-list only. See output below for example.
When using --repo-list, any repos included that don't exist are not identified in the output as an issue. They are not included in the .csv either, which I guess is expected, but if you have a large list you're left wondering why the count of the results is off from the number of repos in the repo-list file.
Possible Solutions:
`status`, or something like that, in the .csv and include a status of 'completed', 'not found', or 'error'. If not found, then all stats could be 0 with the status 'not found'. If some of the stats were retrieved but not all, due to an API limit or some other issue, having a status of 'error' would let us know we're missing data.
Sample of testing around the issues mentioned. Repo name and org were swapped to generic in the output below.
/github/migration/stats$ cat ./repo-list.txt
ActiveRepo
NonExistentRepo
/github/migration/stats$ gh repo-stats --org myorg --repo-list ./repo-list.txt
######################################################
######################################################
############# GitHub repo list and sizer #############
######################################################
######################################################
User provided a repo list file. Mapping contents...
------------------------------------------------------
Getting repositories for org: myorg
Rate limits remaining: 4978 GraphQL points 4962 REST calls
Rate limits remaining: 4977 GraphQL points 4960 REST calls
Rate limits remaining: 4975 GraphQL points 4958 REST calls
Rate limits remaining: 4974 GraphQL points 4957 REST calls
Gathered all repositories for org: myorg
######################################################
The script has completed
Results file:[myorg-all_repos-202305031643.csv]
######################################################
/github/migration/stats$ cat ./myorg-all_repos-202305031643.csv
Org_Name,Repo_Name,Is_Empty,Last_Push,Last_Update,isFork,Repo_Size(mb),Record_Count,Collaborator_Count,Protected_Branch_Count,PR_Review_Count,Milestone_Count,Issue_Count,PR_Count,PR_Review_Comment_Count,Commit_Comment_Count,Issue_Comment_Count,Issue_Event_Count,Release_Count,Project_Count,Branch_Count,Tag_Count,Discussion_Count,Has_Wiki,Full_URL,Migration_Issue
myorg,activerepo,false,2023-05-03T15:56:07Z,2023-04-27T21:10:23Z,false,56,56,19,2,0,0,2,11,0,0,1,21,0,0,7,0,0,true,https://github.com/myorg/activerepo,FALSE
Originally posted by @batmancahoon in #69
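Until the tool reports missing repositories itself, the gap between a repo list and the results file can be checked after the run. The sketch below assumes the CSV layout shown in the sample (header row, repository name in the second column) and compares names case-insensitively, because the sample CSV lower-cases `ActiveRepo`. The helper name `missing_repos` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical post-check: print repo-list entries absent from the CSV.
# $1 = repo list file (one name per line), $2 = results CSV with a header.
missing_repos() {
  local list_file="$1" csv_file="$2" name lowered found
  # Column 2 of every data row, lower-cased for comparison.
  found=$(tail -n +2 "${csv_file}" | cut -d, -f2 | tr '[:upper:]' '[:lower:]')
  while IFS= read -r name || [ -n "${name}" ]; do
    if [ -z "${name}" ]; then
      continue
    fi
    lowered=$(printf '%s' "${name}" | tr '[:upper:]' '[:lower:]')
    # -x matches whole lines only, -F treats the name as a fixed string.
    if ! printf '%s\n' "${found}" | grep -qxF "${lowered}"; then
      printf '%s\n' "${name}"
    fi
  done < "${list_file}"
}

# Usage:
#   missing_repos ./repo-list.txt ./myorg-all_repos-202305031643.csv
```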