elasticsearch-river-github

Elasticsearch river for GitHub data. Fetches events, issues, pull requests, and milestones for a given GitHub repo.

Works for private repos as well if you provide authentication.

##Easy install

Assuming you have elasticsearch's bin folder in your PATH:

plugin -i com.ubervu/elasticsearch-river-github/1.6.3

Otherwise, you have to find the directory yourself. It should be /usr/share/elasticsearch/bin on Ubuntu.

##Adding the river

curl -XPUT localhost:9200/_river/gh_river/_meta -d '{
    "type": "github",
    "github": {
        "owner": "gabrielfalcao",
        "repository": "lettuce",
        "interval": 3600,
        "authentication": {
            "username": "MYUSER",
            "password": "MYPASSWORD"
        }
    }
}'

The interval is given in seconds and controls how often the river checks for new data.

The authentication block is optional. It raises the GitHub API rate limit and is needed to access private data. You can use your own GitHub credentials or a token. When using a token, fill in the token as the username and x-oauth-basic as the password, as described in the GitHub API documentation.
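
For scripted setups, the request body can be generated instead of edited by hand. The following is a minimal Python sketch, not part of the plugin: `build_river_meta` is a hypothetical helper that produces the same JSON document as the curl example above and applies the token-as-username convention when a token is passed. It only prints the body; sending it is left to curl or any HTTP client.

```python
import json


def build_river_meta(owner, repository, interval=3600,
                     username=None, password=None, token=None):
    """Build the _meta document for the GitHub river (hypothetical helper).

    Field names mirror the curl example; nothing here is plugin API.
    """
    github = {
        "owner": owner,
        "repository": repository,
        "interval": interval,  # seconds between checks for new data
    }
    if token is not None:
        # Token auth: the token goes in "username" and the literal
        # string "x-oauth-basic" goes in "password".
        github["authentication"] = {
            "username": token,
            "password": "x-oauth-basic",
        }
    elif username is not None:
        github["authentication"] = {
            "username": username,
            "password": password,
        }
    # With neither credentials nor a token, the optional block is omitted.
    return {"type": "github", "github": github}


if __name__ == "__main__":
    meta = build_river_meta("gabrielfalcao", "lettuce", token="MYTOKEN")
    print(json.dumps(meta, indent=4))
```

The printed JSON can be saved to a file and passed to curl with `-d @river.json` in place of the inline body.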

##Deleting the river

curl -XDELETE localhost:9200/_river/gh_river

##Indexes and types

The data will be stored in an index named "%s&%s" % (owner, repo), e.g. gabrielfalcao&lettuce.

For every GitHub API event type, there will be an Elasticsearch type of the same name, e.g. ForkEvent.

Issue data will be stored with the IssueData type. Pull request data will be stored with the PullRequestData type. Milestone data will be stored with the MilestoneData type.
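
Because the index name contains `&`, it has to be quoted in a shell and is safest percent-encoded in a URL. As an illustration, the Python sketch below derives the index name and a search URL for one type. `index_name` and `search_url` are hypothetical helpers, and both the `localhost:9200` host and the `/index/type/_search` path are assumptions based on the curl examples above and the usual Elasticsearch search endpoint.

```python
from urllib.parse import quote


def index_name(owner, repo):
    """Index name as described above: owner and repo joined by '&'."""
    return "%s&%s" % (owner, repo)


def search_url(owner, repo, doc_type, host="localhost:9200"):
    """Search URL for one type (e.g. ForkEvent, IssueData) in the repo's index.

    The host default and URL layout are assumptions, not plugin API.
    """
    # safe="" makes quote() percent-encode '&' (as %26) and '/'.
    index = quote(index_name(owner, repo), safe="")
    return "http://%s/%s/%s/_search" % (host, index, quote(doc_type, safe=""))


if __name__ == "__main__":
    print(index_name("gabrielfalcao", "lettuce"))
    # gabrielfalcao&lettuce
    print(search_url("gabrielfalcao", "lettuce", "IssueData"))
    # http://localhost:9200/gabrielfalcao%26lettuce/IssueData/_search
```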
