githut's Introduction

GitHut

GitHut (http://githut.info) is an attempt to visualize and explore the complexity of the universe of programming languages used across the repositories hosted on GitHub.

Programming languages are not simply the tools developers use to create programs or express algorithms; they are also instruments for encoding and decoding creativity. By observing the history of languages we can follow humankind's quest for better ways to solve problems, to facilitate collaboration between people, and to reuse the effort of others.

GitHub is the largest code host in the world, with 3.5 million users. It is where the open-source development community publishes most of its projects. By analyzing how languages are used on GitHub, it is possible to gauge the popularity of programming languages among developers and to discover the unique characteristics of each language.

The site combines two types of visualization: a Parallel Coordinates chart and a Small Multiples visualization.

Data is from GitHub Archive (http://www.githubarchive.org/).

Web Site

GitHut is published at http://githut.info

Queries

GitHub Archive data is also available on Google BigQuery. Below are the two queries used to collect the data for the Parallel Coordinates and Small Multiples visualizations:

Parallel Coordinates

Several metrics grouped by language and event type for a given quarter

SELECT 
  repository_language,
  type,
  COUNT(distinct(repository_url)) AS active_repos_by_url,
  COUNT(repository_language) AS events,
  YEAR(created_at) AS year,
  QUARTER(created_at) AS quarter
FROM [githubarchive:github.timeline]
WHERE
    (
      type = 'PushEvent'
      OR type = 'ForkEvent'
      OR (type = 'IssuesEvent' AND (payload_action="opened" OR payload_action="reopened"))
      OR (type = 'CreateEvent' AND payload_ref_type="repository")
      OR type = 'WatchEvent'
    )
    AND repository_language !=''
    AND repository_url != ''
    AND YEAR(created_at)= 2014
    AND QUARTER(created_at)=1
GROUP BY 
  repository_language,
  type,
  year,
  quarter
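
The query above returns one row per language and event type, while a Parallel Coordinates chart needs one record per language with one value per axis. The rows therefore have to be pivoted before plotting. The following is a minimal Python sketch of that step, not GitHut's actual code: the row layout (dicts keyed by the query's column names), the function name `pivot_by_language`, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict


def pivot_by_language(rows):
    """Turn (language, event type) rows into one record per language.

    `rows` is assumed to be an iterable of dicts keyed by the column
    names of the query above; this layout is hypothetical.
    """
    pivoted = defaultdict(dict)
    for row in rows:
        language = row["repository_language"]
        # One chart axis per event type, holding that quarter's event count.
        pivoted[language][row["type"]] = row["events"]
    return dict(pivoted)


# Made-up sample rows, for illustration only.
sample_rows = [
    {"repository_language": "Python", "type": "PushEvent", "events": 120},
    {"repository_language": "Python", "type": "ForkEvent", "events": 30},
    {"repository_language": "Go", "type": "PushEvent", "events": 45},
]

records = pivot_by_language(sample_rows)
print(records)
```

Each resulting record maps event types to counts, so a language that has no rows for an event type simply lacks that key; a chart would need to decide whether to treat the gap as zero.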

Small Multiples

Count of active repositories by quarter

SELECT
  repository_language,
  COUNT(distinct(repository_url)) AS active_repos_by_url,
  YEAR(created_at) AS year,
  QUARTER(created_at) AS quarter
FROM [githubarchive:github.timeline]
WHERE
    type="PushEvent"
GROUP BY
  repository_language,
  year,
  quarter
ORDER BY
  repository_language,
  year DESC,
  quarter DESC
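
A Small Multiples view draws one small chart per language, so the flat result of the query above has to be split into one time series per language. Below is a minimal Python sketch of that grouping, offered as an illustration and not as GitHut's own implementation. The row layout (dicts keyed by the query's column names), the function name `series_by_language`, and the sample numbers are assumptions.

```python
from collections import defaultdict


def series_by_language(rows):
    """Group rows into one chronological series per language.

    `rows` is assumed to be an iterable of dicts keyed by the column
    names of the query above; this layout is hypothetical.
    """
    series = defaultdict(list)
    for row in rows:
        point = (row["year"], row["quarter"], row["active_repos_by_url"])
        series[row["repository_language"]].append(point)
    # The query orders newest first; sorting the (year, quarter, count)
    # tuples puts each series in oldest-first order for plotting.
    return {language: sorted(points) for language, points in series.items()}


# Made-up sample rows, for illustration only.
sample_rows = [
    {"repository_language": "Go", "year": 2014, "quarter": 1, "active_repos_by_url": 50},
    {"repository_language": "Go", "year": 2013, "quarter": 4, "active_repos_by_url": 40},
    {"repository_language": "C", "year": 2014, "quarter": 1, "active_repos_by_url": 90},
]

series = series_by_language(sample_rows)
print(series["Go"])
```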

License

The content of this project itself is licensed under the Creative Commons Attribution 4.0 license, and the underlying source code used to format and display that content is licensed under the MIT license.

githut's People

Contributors

akzhan, climbsrocks, littleark

githut's Issues

[Improvement] Dark Theme

Improvement idea: Add a dark theme so late night programmers / lurkers don't burn their retinas.

I think Bootstrap makes it possible to swap between dark and light theme CSS with a simple toggle switch.

Quite a few sites are adopting this now, even MSDN (Microsoft Developer Network).

Contributors per repository

Thanks for the interesting visualization!
I think that the average number of contributors per repository could be an interesting metric to add. New Forks per Repository is similar, but some users fork repositories simply because they find them interesting, not as a means of actually contributing to them.

compare any language to popular languages

Currently, only popular languages are included in the analysis. I would really like to see how some new or less-used languages compare to the popular ones. Maybe you could add a dropdown menu that lets users choose one arbitrary language (out of all languages recognized by GitHub) to be added to the diagrams.

Rust please?

Rust has been gaining a lot of popularity recently. I wanted to check the stats for Rust, but GitHut doesn't have them :(

Make appeared in 1977, not 1970

GitHut reports the Makefile language appeared in 1970. According to the Make Wikipedia article, the first release of the make utility was in 1977.

Logic dictates that the Makefile language appeared after a shell language, since Makefiles depend on a shell language.

Rewrite githut to use Google BigQuery as a data source

I just saw this tweet and was amazed that this data hasn't been updated since 2014.

After reading the other issues, I realize that a change on GitHub's side made it impossible to generate data after December 2014.

What if we update this repo to use Google's BigQuery and its GitHub dataset?

Here is a Medium article with all the cool things people have already done.

It might be a bit troublesome to get historical data, but if we start building our own history now, we'll have nice graphs in the future. I'd rather see up-to-date data than old historical data. :)

Let me know what you all think.

[REQUEST] Growth metrics

Love the visualisation, but I think showing trends and being able to rank on trends would be very interesting too, especially for identifying up-and-coming languages instead of mainly seeing the juggernauts.

Even something like a percentage difference over the previous month, with the ability to sort on it, would be really valuable I think.

Is that something you'd be interested in having in GitHut? And do you have any high-level ideas about how you'd achieve it, so that someone in the community might be able to take it up?
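
To make the percentage-difference idea concrete, here is a minimal Python sketch of how such a ranking could be computed. It is only an illustration under stated assumptions, not part of GitHut: the function names `percent_change` and `rank_by_growth`, the input shape (a mapping from language to a pair of counts for the previous and current period), and the sample numbers are all hypothetical.

```python
def percent_change(previous, current):
    """Percentage difference of `current` relative to `previous`.

    Returns None when `previous` is zero, since the change is undefined.
    """
    if previous == 0:
        return None
    return (current - previous) * 100.0 / previous


def rank_by_growth(counts):
    """Rank languages by percentage growth, highest first.

    `counts` is assumed to map language -> (previous, current);
    languages with no previous activity are left out of the ranking.
    """
    ranked = []
    for language, (previous, current) in counts.items():
        change = percent_change(previous, current)
        if change is not None:
            ranked.append((language, change))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Made-up counts of active repositories, for illustration only.
sample_counts = {
    "Java": (1000, 1050),
    "Rust": (100, 150),
    "Brand-new": (0, 10),
}

ranking = rank_by_growth(sample_counts)
print(ranking)
```

Ranking by relative change lets a small but fast-growing language surface above a large, stable one, which is the point of the request. Languages that start from zero need separate handling, because their percentage growth is undefined.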
