TwitterOSS Metrics

General

This is the README for the TwitterOSS Metrics repo, which generates periodic reports based on the health of Twitter Open Source projects.

For more info, see twitter.github.io/metrics

Dependencies

  • CHAOSS Augur: Used to retrieve metrics such as Aggregate Summary, Bus Factor, and Repo Commits.
  • GitHub Actions: Runs a weekly cron job that runs the scripts to fetch data and generate reports.
  • GraphQL: Used directly to fetch metrics from the GitHub GraphQL API.
  • Twitter Service: Used indirectly, for the personal access token environment variable.
  • Metrics Dashboard: Contains all reports for the repositories listed in repos-to-include.md.
  • Slack Reports Repo: Runs a cron job and posts a message to Slack with daily project activity based on the metrics repo.
  • Year In Review: Weekly-updating, sliding-window overview of the past 12 months of activity on Twitter's Open Source projects.

Service Outage Impact

If the service experiences problems:

  • Year in Review, Metrics Dashboard, and Slack Reports Repo will be unable to update.

Build

Environment Setup

  1. Clone Repo
    $ git clone https://github.com/twitter/metrics.git  
    $ cd ./metrics

Tracking new repositories and orgs

Edit repos-to-include.md

To track an org and all of its repositories hosted at github.com/<org_name>, add <org_name>/* as a new line in repos-to-include.md. To track only some repositories of an org, add one <org_name>/<repo_name> line per public repo in repos-to-include.md.
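The two entry forms above can be told apart with a few lines of Python. This is only a sketch of the file format as described here, not the parsing code used by the scripts; the function name `parse_include_lines` and the sample entries are hypothetical.

```python
def parse_include_lines(lines):
    """Split entries into whole-org names and individual (owner, repo) pairs."""
    orgs, repos = [], []
    for raw in lines:
        entry = raw.strip()
        if not entry or "/" not in entry:
            continue  # skip blank or malformed lines
        owner, _, name = entry.partition("/")
        if name == "*":
            orgs.append(owner)           # <org_name>/* tracks the whole org
        else:
            repos.append((owner, name))  # <org_name>/<repo_name> tracks one repo
    return orgs, repos


# Sample entries for illustration only.
orgs, repos = parse_include_lines(["twitter/*", "pantsbuild/pants", ""])
print(orgs, repos)
```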

Run The Scripts

$ python scripts/fetch_all_metrics.py

  • Reads all the repositories and orgs listed in repos-to-include.md
  • Requests GitHub GraphQL API
  • Creates one JSON file for each repository with format METRICS-YYYY-MM-DD.json
  • Saves the file inside _data/<owner>/<repo>/
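The naming scheme in the last two steps can be sketched as follows. The helper `metrics_path` is hypothetical and only illustrates the file layout described above; the owner and repo values are sample inputs.

```python
from datetime import date
from pathlib import PurePosixPath


def metrics_path(owner, repo, day):
    """Build _data/<owner>/<repo>/METRICS-YYYY-MM-DD.json for a given date."""
    filename = f"METRICS-{day.isoformat()}.json"
    return PurePosixPath("_data") / owner / repo / filename


print(metrics_path("twitter", "finagle", date(2018, 8, 15)))
# prints _data/twitter/finagle/METRICS-2018-08-15.json
```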

$ python scripts/fetch_year_in_review.py

  • Hits aggregate_summary endpoint
  • Creates one JSON file that includes the metrics from the endpoint (watchers, stars, counts, merged PRs, committers, commits)
  • Saves the file inside _metadata/augur/

$ python scripts/gen_weekly_report.py

  • Iterates over every project listed inside _data
  • Picks the latest two Metrics files which are at least 6 days apart
  • Generates a Report based on these two Metrics files
  • Saves the JSON inside the _data directory corresponding to each project, with format WEEKLY-YYYY-MM-DD.json
  • Creates a _post for this report with some specific variables and the layout version
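One way to read the "latest two Metrics files at least 6 days apart" step is: take the newest file, then walk backwards to the most recent older file that is at least 6 days older. The sketch below implements that reading; it is an assumption about the selection rule, not the logic in gen_weekly_report.py, and `pick_report_pair` is a hypothetical name.

```python
from datetime import date, timedelta


def pick_report_pair(filenames, min_gap_days=6):
    """Return (older, latest) METRICS file names at least min_gap_days apart."""
    # Parse the YYYY-MM-DD part out of METRICS-YYYY-MM-DD.json and sort by date.
    dated = sorted(
        (date.fromisoformat(name[len("METRICS-"):-len(".json")]), name)
        for name in filenames
    )
    if not dated:
        return None
    latest_day, latest_name = dated[-1]
    # Walk backwards from the second-newest file until the gap is large enough.
    for day, name in reversed(dated[:-1]):
        if latest_day - day >= timedelta(days=min_gap_days):
            return name, latest_name
    return None  # no pair is far enough apart yet


files = [
    "METRICS-2018-08-01.json",
    "METRICS-2018-08-08.json",
    "METRICS-2018-08-12.json",
    "METRICS-2018-08-15.json",
]
print(pick_report_pair(files))
```

With the sample names above, the file from 2018-08-12 is skipped because it is only 3 days older than the latest one.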

Additional Notes

  • GitHub Actions Config

    • Environment variables

      • OAUTH_TOKEN: Personal Access Token with repo access of a GitHub account.
      • GH_USERNAME: Username of the GitHub account.
  • Use Python 3.

  • _data contains all the data files

  • Files in _posts leverage _layouts and _data and generate HTML files

  • Don't change HTML files inside _layouts. Create new layouts with a new version.

  • Maintain versions of the metrics layouts: see METRICS_VERSION inside the script that generates reports, and create a new _layout for each metrics version. If you add more data, the new posts should use a new version so that previous pages do not break.

  • Use the repos-to-include.md and repos-to-exclude.md files to add or exclude an org/repository, respectively.

  • Prepend {{ site.url }}{{ site.baseurl }} and use relative URLs

    • e.g. {{ site.url }}{{ site.baseurl }}/css/main.css
  • Execute all the scripts from the root of the repository, e.g. python3 scripts/fetch_all_metrics.py
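When running the scripts locally, it can help to check up front that the two environment variables listed above are set. This is a minimal sketch, assuming the scripts read OAUTH_TOKEN and GH_USERNAME from the environment; `load_credentials` is a hypothetical helper and is not part of the repo.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, token), or raise if either variable is missing."""
    required = ("OAUTH_TOKEN", "GH_USERNAME")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["GH_USERNAME"], env["OAUTH_TOKEN"]


# Usage with an explicit mapping instead of the real environment:
print(load_credentials({"OAUTH_TOKEN": "example-token", "GH_USERNAME": "example-user"}))
```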

metrics's Issues

Reduce build time to speed up development

This project generates a lot of static web pages. Currently, there are 12969 markdown files which Jekyll converts into HTML pages, and a full build takes 246.02 seconds. Around 800 new pages are created every month. The build is presumably somewhat faster on better machines.

My concern is, when someone is working on this project, they constantly regenerate the pages to find out the effects of the changes they are making. It used to take around 30 seconds last summer even with incremental build enabled. This is a huge drawback of Jekyll.

I have been experimenting with Hugo, a very popular static site generator written in Go. It is very easy to install and use. The reason Jekyll should be replaced with Hugo for this project is that Hugo is incredibly fast: it claims to take <1 ms per page to build. That would reduce the build time from ~250 s to ~13 s, which is awesome!
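The estimate follows directly from the numbers quoted in this issue. A quick sketch of the arithmetic, taking Hugo's claimed 1 ms per page as an upper bound (a claim, not a measurement on this project):

```python
pages = 12969            # markdown files, from this issue
jekyll_seconds = 246.02  # measured Jekyll build time, from this issue
hugo_ms_per_page = 1.0   # upper bound claimed by Hugo, not measured here

# Average cost per page under Jekyll, in milliseconds.
jekyll_ms_per_page = jekyll_seconds / pages * 1000

# Projected Hugo build time in seconds if the claim holds.
hugo_seconds = pages * hugo_ms_per_page / 1000

print(f"Jekyll: {jekyll_ms_per_page:.1f} ms/page, Hugo estimate: {hugo_seconds:.1f} s")
```

So Jekyll currently averages roughly 19 ms per page, and the ~13 s figure is simply 12969 pages at 1 ms each.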

The programming language is not a concern for us, since we do not use anything Ruby-specific for Jekyll, nor will we use anything Go-specific for Hugo. Both have YAML-based configs, which work pretty neatly.

Interface with osshealth/Augur

The current API is hosted at http://twitter.augurlabs.io/api/unstable/

  • = used in reports
  • = not used

API Status as of August 15, 2018

  • /:owner/:repo/timeseries/issue_comments
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Does not exist. Follow chaoss/augur#148

  • /:owner/:repo/timeseries/watchers
    • Status : OKAY
    • If used, on which pages? :
    • Comments: Gives timeseries of new watchers. Related chaoss/augur#152

  • /:owner/:repo/bus_factor
    • Status : OKAY
    • Used on twitter/metrics : YES
    • If used, on which pages? : WEEKLY - PROJECTS and MONTHLY - PROJECTS
    • Comments : Response time is acceptable, but requests sometimes time out

  • /:owner/:repo/timeseries/commits/comments
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Incomplete data

  • /:owner/:repo/timeseries/commits100
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Incomplete, outdated data

  • /:owner/:repo/committer_locations
    • Status : GOOD
    • If used, on which pages? :
    • Comments :

  • /:owner/:repo/timeseries/community_age
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Does not exist. Follow chaoss/augur#148

  • /:owner/:repo/timeseries/community_engagement
    • Status : GOOD
    • If used, on which pages? :
    • Comments :

  • /:owner/:repo/timeseries/contributions
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Does not exist. Follow chaoss/augur#148

  • /:owner/:repo/dependencies
    • Status : OKAY
    • If used, on which pages? :
    • Comments : Gives good results for twitter/* repos. Internal Server Error for pantsbuild/pants, but good for twitter/pants

  • /:owner/:repo/dependency_stats
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Does not return any result

  • /:owner/:repo/dependents
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Does not return any result

  • /:owner/:repo/timeseries/downloads
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : See chaoss/augur#149, does not return any results for most repos

  • /:owner/:repo/timeseries/fakes
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Not sure what the metric represents

  • /:owner/:repo/timeseries/issues/activity
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Already covered by the metrics currently scraped from GitHub

  • /git/lines_changed/:git_repo_url
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Works only if repo is downloaded at the server - chaoss/augur#150

  • /:owner/:repo/linking_websites

  • /:owner/:repo/timeseries/tags/major
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Returns an empty result for finagle even though it uses tags, possibly not useful

  • /:owner/:repo/timeseries/project_age
    • Status : NOT OKAY
    • Used on twitter/metrics :
    • If used, on which pages? :
    • Comments : Does not exist. Follow chaoss/augur#148

  • /:owner/:repo/timeseries/total_committers
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Looks useful!


  • /:owner/:repo/timeseries/issues/closed
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Looks good!

  • /:owner/:repo/timeseries/commits?group_by=:group_by
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Looks good!

  • /:owner/:repo/timeseries/code_review_iteration
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Gives no results for twitter projects.

  • /:owner/:repo/timeseries/contributing_github_organizations
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Does not exist

  • /:owner/:repo/timeseries/contribution_acceptance
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Don't know what "null" means, needs more information on the number

  • /:owner/:repo/timeseries/issues/response_time
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Contains lots of null values

  • /:owner/:repo/timeseries/forks?group_by=:group_by
    • Status : GOOD
    • If used, on which pages? :
    • Comments :

  • /:owner/:repo/issue_close_time
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Does not exist

  • /:owner/:repo/timeseries/lines_changed
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Maybe relates to chaoss/augur#150

  • /:owner/:repo/timeseries/pulls/maintainer_response_time
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Doesn't work on twitter projects, gives results for rails/rails

  • /:owner/:repo/pulls/new_contributing_github_organizations
    • Status : NOT OKAY
    • If used, on which pages? :
    • Comments : Does not exist

  • /:owner/:repo/timeseries/issues
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Looks good!

  • /:owner/:repo/timeseries/pulls/comments?group_by=:group_by
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Does not give any results for twitter projects

  • /:owner/:repo/timeseries/pulls
    • Status : GOOD
    • If used, on which pages? :
    • Comments : Result not ordered by date
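To re-check an endpoint from the list above, its path template has to be filled in and joined with the API base. The sketch below only builds the URL and makes no request; `endpoint_url` is a hypothetical helper, the owner and repo are sample values, and whether the host still responds is not something this issue can confirm.

```python
BASE_URL = "http://twitter.augurlabs.io/api/unstable"  # base quoted in this issue


def endpoint_url(template, owner, repo, base=BASE_URL):
    """Fill :owner and :repo in a path template and join it with the base URL."""
    path = template.replace(":owner", owner).replace(":repo", repo)
    return base.rstrip("/") + path


print(endpoint_url("/:owner/:repo/bus_factor", "twitter", "finagle"))
# prints http://twitter.augurlabs.io/api/unstable/twitter/finagle/bus_factor
```

Templates with extra placeholders, such as ?group_by=:group_by, would need that value substituted separately.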
