
ros_gh's People

Contributors

dependabot[bot], elbraulio, pestefo


Forkers

pestefo

ros_gh's Issues

device rank

With small doubles the rank evaluates to NaN, so it should be replaced with 0:

Double.isNaN(rank) ? 0d : rank;

Also, Aspirant returns the wrong ranking; it must be:

this.ka*0.75 + this.da*0.25;
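A minimal sketch of how the two fixes above might fit together. The class name `AspirantRankSketch` and the helper `orZero` are hypothetical, the meaning given to `ka` and `da` is assumed, and guarding each component separately (instead of only the final rank) is an assumption too:

```java
/** Hypothetical sketch; only the two expressions come from the issue. */
final class AspirantRankSketch {
    private final double ka; // assumed: knowledge-sharing activity score
    private final double da; // assumed: development activity score

    AspirantRankSketch(double ka, double da) {
        this.ka = orZero(ka);
        this.da = orZero(da);
    }

    /** Replaces NaN with 0, as proposed in the issue. */
    private static double orZero(double rank) {
        return Double.isNaN(rank) ? 0d : rank;
    }

    /** Weighted rank: 75% ka, 25% da. */
    double rank() {
        return this.ka * 0.75 + this.da * 0.25;
    }
}
```

With this layout a NaN in one component no longer wipes out the other one.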

RuuKA math error

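The correct math is:

    /**
     * Similarity between users i and j: the dot product of their rows
     * in rut, divided by the sum of row i.
     * Returns NaN when row i is all zeros (0/0).
     */
    private double vectorSpace(int i, int j, double[][] rut) {
        double sum = 0d;
        double length = 0d;
        for (int k = 0; k < rut[i].length; k++) {
            sum += rut[i][k] * rut[j][k];
            length += rut[i][k];
        }
        return sum/length;
    }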
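A usage sketch on a made-up matrix, with the method above copied as a static method. The class name `VectorSpaceUsage` and the sample values are hypothetical. The all-zero row shows one way this method can produce NaN:

```java
final class VectorSpaceUsage {
    // Same arithmetic as the method above, made static for the example.
    static double vectorSpace(int i, int j, double[][] rut) {
        double sum = 0d;
        double length = 0d;
        for (int k = 0; k < rut[i].length; k++) {
            sum += rut[i][k] * rut[j][k];
            length += rut[i][k];
        }
        return sum / length;
    }

    public static void main(String[] args) {
        double[][] rut = {
            {1, 0, 2}, // user 0
            {0, 1, 2}, // user 1
            {0, 0, 0}  // user 2 has no tags at all
        };
        // (1*0 + 0*1 + 2*2) / (1 + 0 + 2) = 4 / 3
        System.out.println(vectorSpace(0, 1, rut));
        // 0 / 0 is NaN in floating-point arithmetic
        System.out.println(Double.isNaN(vectorSpace(2, 0, rut)));
    }
}
```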

devrec implementations as example

We want to replicate Devrec as described in this paper. It will be implemented on the examples branch using the tools committed on the master branch. There is one important difference between the implementation by Zhang et al. and ours: we are looking for someone to answer a question rather than someone to participate in a project.

All the following quotes were extracted from the original paper.

Data extraction

  • We use the data already extracted with ros_gh, available here.

Developer Recommendation Based on Social Coding Activities

  • UP Connector: This part is to create the association matrix of users and projects based on the activities in GitHub. Here we get a two-value matrix Ru−p, where 1 stands for participation and 0 stands for the opposite.

  • User Connector: This part is to calculate the association between users based on the user project association matrix using Jaccard algorithm.

  • Match Engine: In this part, we calculate the association between users and projects according to the user association matrix Ru−u. If we use UAp⟨u1,u2,...,un⟩ to represent users that have already participated in the target project p, we can obtain the match score of each user towards project p using:

[Image: screenshot of the match score formula, 2018-10-27 13:51:44]
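For the User Connector step, a minimal Jaccard sketch over two rows of the 0/1 user-project matrix. The class and method names are hypothetical, both rows are assumed to have the same length, and returning 0 for two users with no projects is an assumption that does not come from the paper:

```java
final class JaccardSketch {
    /** Jaccard similarity between two rows of a 0/1 user-project matrix. */
    static double jaccard(int[] a, int[] b) {
        int intersection = 0;
        int union = 0;
        for (int k = 0; k < a.length; k++) {
            if (a[k] == 1 && b[k] == 1) intersection++; // both participate
            if (a[k] == 1 || b[k] == 1) union++;        // at least one does
        }
        // Assumption: two users without any project get similarity 0.
        return union == 0 ? 0d : (double) intersection / union;
    }
}
```

For example, `jaccard(new int[]{1, 1, 0, 0}, new int[]{1, 0, 1, 0})` shares one project out of three in total, which gives 1/3.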

Developer Recommendation Based on Knowledge Sharing Activities

  • Relation Creator: In this part, we calculate the user tag association matrix. Here we use TF-IDF method. If we use U{u1,u2,...,un} to represent users in StackOverflow, Tu = {t1,t2,...,tn} to represent the tags that related to user u, and C(t,u) to represent the number of times tag t relates to user u. Then we can calculate user tag association matrix using

[Image: screenshot of the user tag association formula, 2018-10-29 18:23:32]

  • User Connector: After obtaining the user tag association matrix Ru−t, we calculate the association of users using Vector Space Similarity algorithm.

  • Match Engine: The same as the match engine part in DA-based approach.
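For the Relation Creator step, a sketch of a textbook TF-IDF weighting over a user-by-tag count matrix, where `counts[u][t]` plays the role of C(t,u). This is an assumption: the exact formula in the paper is in the screenshot and may differ. The class and method names are hypothetical:

```java
final class TfIdfSketch {
    /** Textbook TF-IDF over counts[u][t]; may differ from the paper's variant. */
    static double[][] tfIdf(int[][] counts) {
        int users = counts.length;
        int tags = users == 0 ? 0 : counts[0].length;

        // Number of users related to each tag at least once.
        int[] usersWithTag = new int[tags];
        for (int[] row : counts) {
            for (int t = 0; t < tags; t++) {
                if (row[t] > 0) usersWithTag[t]++;
            }
        }

        double[][] result = new double[users][tags];
        for (int u = 0; u < users; u++) {
            int total = 0;
            for (int c : counts[u]) total += c;
            for (int t = 0; t < tags; t++) {
                // Leave 0 when the user or the tag has no data at all.
                if (total == 0 || usersWithTag[t] == 0) continue;
                double tf = (double) counts[u][t] / total;
                double idf = Math.log((double) users / usersWithTag[t]);
                result[u][t] = tf * idf;
            }
        }
        return result;
    }
}
```

A tag related to every user gets weight 0 under this variant, because its inverse document frequency is log(1).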

accuracy light

Check the first aspirant who matches at least half of the tags from the question.

Also rename DefaultAccuracy to StricAcuracy
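One possible reading of the "at least half of the tags" check, as a sketch. The class name, the method name and the representation of each aspirant as a set of tag names are hypothetical:

```java
import java.util.List;
import java.util.Set;

final class AccuracyLightSketch {
    /**
     * Returns the position of the first aspirant, in ranking order, whose
     * tags cover at least half of the question tags, or -1 if none does.
     */
    static int firstMatch(List<Set<String>> rankedAspirantTags,
                          Set<String> questionTags) {
        for (int i = 0; i < rankedAspirantTags.size(); i++) {
            int matched = 0;
            for (String tag : questionTags) {
                if (rankedAspirantTags.get(i).contains(tag)) matched++;
            }
            // matched >= size / 2, written without integer division.
            if (matched * 2 >= questionTags.size()) return i;
        }
        return -1;
    }
}
```

Note that a question without tags is matched by the first aspirant under this reading.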

wrong tag counting in devrec

Counting how many times a given tag is related to all users gives a different result when it is queried per user with FetchTagCount than when it is computed with a single query:

select sum(count) as count
from ros_user_tag
where ros_user_tag.ros_user_id in (select ros_user_id from linked_users) and
      ros_user_tag.ros_tag_id = ?;

Some tag names in 'ros_tag' table are cut

There are 1277 tags that look cut, e.g.:

id    name
5     turtlebot_dash...
9     turtlebot_cali...
206   message_genera...
263   installation_e...
288   camera_calibra...
305   sicktoolbox_wr...
306   xv_11_laser_dr...
348   trajectory_fil...

You can get the complete list with this query:

select *
from ros_tag
where name like '%...';

release 0.1-beta.1

Fix bugs and the Devrec implementation.

  • create issue release.
  • create new branch
  • update pom version.
  • make a pull request.
  • close milestone when pr is accepted.
  • make a release on github with pom version.
  • check Travis and JitPack build results.

examples must be excluded from the project scope

Examples should not live inside the project because they have no unit tests and they increase the project's complexity. The tools must be provided by the project, but without examples hidden inside ignored tests. These packages must be excluded from the main project and can be included on an examples branch:

  • launcher

  • GithubInfoTest

  • ignored test from FetchUsersPageListTest

  • FetchAnswersTest

  • IteratePagedContentTest

  • ParticipantsTest

Logs for data extraction

As with builds, we all want to know whether the extraction process succeeded or failed. Logs are important for identifying errors and cases of possibly missing data.
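A minimal sketch of what such logging could look like with `java.util.logging`. The class name, the `run` method and the idea of each extraction step returning the number of extracted items are hypothetical; this is not how ros_gh currently works:

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

final class ExtractionLogSketch {
    private static final Logger LOG =
        Logger.getLogger(ExtractionLogSketch.class.getName());

    /**
     * Runs one extraction step and logs whether it succeeded.
     * The step is assumed to return how many items it extracted.
     */
    static boolean run(String name, Callable<Integer> step) {
        try {
            int items = step.call();
            LOG.info(name + ": extracted " + items + " items");
            return true;
        } catch (Exception e) {
            // Keep the stack trace so missing data can be traced back.
            LOG.log(Level.SEVERE, name + ": extraction failed", e);
            return false;
        }
    }
}
```

A caller could wrap each fetch in `run("users", ...)` and use the returned flags to report which steps left data missing.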

maven example on Readme is wrong

It should be:

<dependencies>
    <dependency>
        <groupId>com.elbraulio</groupId>
        <artifactId>ros_gh</artifactId>
        <version>{version}</version>
    </dependency>
</dependencies>

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

move Devrec example into artifact domain

Currently Devrec lives outside the artifact domain com.elbraulio.rosgh. That is confusing because you import it as

import examples.Devrec;

when it should be

import com.elbraulio.rosgh.example.Devrec;

Is it possible to run the scraper to get an updated sample and also more data from the ROS user (e.g. karma, last_seen_at, etc.)?

I need more data about ROS Answers users, specifically:

  1. karma

  2. joined_at

  3. last_seen_at

  4. location

  5. has_avatar (or the url to the avatar and NULL if it has the default)

  6. description

  7. real name (if it exists)

  8. age

  9. badges (and its count)

I'm particularly interested in the first 4, so if getting the rest of the list requires more effort, I'd already be happy with having just those four (karma, joined_at, last_seen_at, location).
