openskill.py's Introduction

A faster and open license asymmetric multi-team, multiplayer rating system comparable to TrueSkill.

openskill.py's People

Contributors

allcontributors[bot], bstummer, calcolson, dependabot[bot], erotemic, github-actions[bot], jack-mcivor, philihp, stephenbartos, vivekjoshy


openskill.py's Issues

predict function

hello,

congrats on your work! I was wondering if there is a predict function for the rankings?

greetings

mu=0 results in mu=25

Describe the bug
mu=0 results in mu=25. Same goes for sigma and potentially other optional parameters that I did not investigate.

To Reproduce
To reproduce simply do:

model = PlackettLuce()
player = model.rating(mu=0,sigma=1)
print(player.mu) # prints 25.0 but expected is 0.0

Expected behavior
When mu is not None, take whatever value the user provides.


Additional context
This can lead to unexpected behaviour AND wrong predictions. The issue stems from the incorrect initialization of Rating objects.

# Replace this:
return self.PlackettLuceRating(mu or self.mu, sigma or self.sigma, name)

# With something more like this, so an explicit 0.0 is not discarded:
mu = self.mu if mu is None else mu
sigma = self.sigma if sigma is None else sigma
return self.PlackettLuceRating(mu, sigma, name)
# (and the same for any other optional parameters)

Software Paper Review: Suggestions for Clarity and Completeness

Please consider the following in drafting the software manuscript:

  • Provide more contextual information about the problem you are solving in the paper and software, targeting software engineers and researchers who may not have specialized knowledge in the domain.
  • There is a typo in the summary: know --> known.
  • For claims about the software being "Faster" and "accurate", please offer supporting evidence like benchmarks, examples, or descriptions of the steps taken to achieve these qualities.
  • Some of the limitations of the software are mentioned in the FAQ section of the documentation. It would be beneficial to dedicate a discussion section in the paper that covers what the package can and cannot do, as well as future development plans.
  • The paper should give appropriate credit to the OpenSkills.js package.
  • Please incorporate at least one example in the paper that demonstrates how to use the package, enabling readers to start using it quickly.

This issue is related to this submission: openjournals/joss-reviews#5901

Clearer statement of need in documentation

Raising as part of JOSS review openjournals/joss-reviews#5901

The documentation / top-level README need a clearer statement of need / what problems the software is designed to solve and who the intended audience is.

The current summary at the beginning of the README

A faster and open license asymmetric multi-team, multiplayer rating system comparable to TrueSkill.

assumes knowledge of what a multiplayer rating system is and what TrueSkill is. There is also no specific mention of online gaming communities, which, reading between the lines, seems to be the primary target audience.

The summary in the documentation index page

This advanced rating system is faster, accurate, and flexible with an open license. It supports multiple teams, factions, and predicts outcomes. Elevate your gaming experience with OpenSkill - the superior alternative to TrueSkill.

similarly needs a bit more context and explanation.

Possibility for parameter for how ratings and win chances adjust for uneven teams

I'm using openskill for a game where we sometimes have uneven teams, for example 6 vs 7. When making teams, we put the better players on the team with fewer players. OpenSkill's estimates are way off when dealing with uneven teams; it seems to value the extra players much more than the specific game I'm using it for does.

Does anyone have any insight into how to tune a parameter so that team disparity matters less?

Thanks!
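One knob that may be worth experimenting with is the beta parameter on the model constructor (assuming the installed version exposes it); it controls the assumed performance noise, and larger values push all predictions closer to even. Whether it specifically compensates for team-size disparity is an open question, but a minimal sketch of that kind of experiment, with made-up numbers, would be:

from openskill.models import PlackettLuce

# Six stronger players vs. seven weaker players; the mus are invented
# purely for illustration.
default_model = PlackettLuce()
flat_model = PlackettLuce(beta=10.0)  # larger beta => less confident predictions

team_small = [default_model.rating(mu=28) for _ in range(6)]
team_large = [default_model.rating(mu=24) for _ in range(7)]

print(default_model.predict_win([team_small, team_large]))
print(flat_model.predict_win([team_small, team_large]))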

Tournament Interface

Is your feature request related to a problem? Please describe.
Creating models of tournaments is hard since you have to parse the data with another library (depending on the format) and then pass everything into rate and predict manually. It takes a lot of effort to predict the entire outcome of, say, the 2022 FIFA World Cup.

Describe the solution you'd like
It would be nice if there were a tournament class of some kind that let us pass in rounds, which themselves contain matches. Then, using an exhaustive approach, it could predict winners and move them along each bracket/round. Especially now that #74 has landed, it would be easier to predict whole matches and in turn tournaments.

The classes should be customizable to allow our own logic. For instance, allow using the munkres algorithm and other such methods.
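Purely as an illustration of the kind of interface meant here (nothing like this exists in openskill.py today, and the class and method names are hypothetical), a single-elimination predictor built on predict_win could look like:

class SingleEliminationTournament:
    """Hypothetical helper: advance predicted winners through a bracket."""

    def __init__(self, model, teams):
        self.model = model
        self.teams = list(teams)  # each team is a list of rating objects

    def predict_champion(self):
        remaining = self.teams
        while len(remaining) > 1:
            next_round = []
            # Pair adjacent teams; an odd team out gets a bye.
            for a, b in zip(remaining[0::2], remaining[1::2]):
                p_a, p_b = self.model.predict_win([a, b])
                next_round.append(a if p_a >= p_b else b)
            if len(remaining) % 2:
                next_round.append(remaining[-1])
            remaining = next_round
        return remaining[0]

A real version would presumably also let callers plug in their own pairing logic (such as the munkres algorithm mentioned above) instead of fixed adjacent pairing.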

Describe alternatives you've considered
I don't know any other libraries that do this already.

Improve win predictions for 1v1 teams

First of all, congrats and thanks for the great repo!

In a scenario where Player A has twice the rating of Player B, the predicted win probability is 60% vs 40%. This seems strange.

from openskill import Rating, predict_win

players = [[Rating(50)], [Rating(25)]]
predict_win(teams=players)
# [0.6002914159316424, 0.39970858406835763]

If I use this function implementation, I get 97% vs 3% which sounds more reasonable to me.
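For comparison, the ~97% figure can be reproduced with a plain pairwise normal-CDF calculation using the default constants (this is only a sanity check of that formula, not a claim about what predict_win should return):

from statistics import NormalDist

mu_a, sigma_a = 50, 25 / 3  # Rating(50) with the default sigma
mu_b, sigma_b = 25, 25 / 3  # Rating(25) with the default sigma
beta = 25 / 6               # default performance noise

p_a_wins = NormalDist().cdf(
    (mu_a - mu_b) / (2 * beta**2 + sigma_a**2 + sigma_b**2) ** 0.5
)
print(p_a_wins)  # roughly 0.97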

Maybe the predict_win function has some flaw?

predict_win and predict_rank do not work on 2v2 and larger games

Describe the bug
predict_win and predict_rank do not work properly on 3v3v3 games

To Reproduce
Step 1:

from openskill.models import PlackettLuce

model = PlackettLuce()

p1 = model.rating(mu=34, sigma=0.25)
p2 = model.rating(mu=34, sigma=0.25)
p3 = model.rating(mu=34, sigma=0.25)

p4 = model.rating(mu=32, sigma=0.5)
p5 = model.rating(mu=32, sigma=0.5)
p6 = model.rating(mu=32, sigma=0.5)

p7 = model.rating(mu=30, sigma=1)
p8 = model.rating(mu=30, sigma=1)
p9 = model.rating(mu=30, sigma=1)

team1, team2, team3 = [p1, p2, p3], [p4, p5, p6], [p7, p8, p9]

r = model.predict_win([team1, team2, team3])
print(r)

Results in:
[0.439077174955099, 0.3330210112526078, 0.2279018137922932]

Step 2, change p9 mu to 40:

from openskill.models import PlackettLuce

model = PlackettLuce()

p1 = model.rating(mu=34, sigma=0.25)
p2 = model.rating(mu=34, sigma=0.25)
p3 = model.rating(mu=34, sigma=0.25)

p4 = model.rating(mu=32, sigma=0.5)
p5 = model.rating(mu=32, sigma=0.5)
p6 = model.rating(mu=32, sigma=0.5)

p7 = model.rating(mu=30, sigma=1)
p8 = model.rating(mu=30, sigma=1)
p9 = model.rating(mu=40, sigma=1)

team1, team2, team3 = [p1, p2, p3], [p4, p5, p6], [p7, p8, p9]

print([team1, team2, team3])
r = model.predict_win([team1, team2, team3])
print(r)

Results are the same:
[0.439077174955099, 0.3330210112526078, 0.2279018137922932]

Expected behavior
After increasing p9's mu, team3 is expected to have a bigger chance of victory.

Platform Information

  • openskill.py Version: 5.1.0

Additional context

I have no idea what is going on here, or why it only uses the rating of the first player, but it just does not work as intended.

Fully Vectorized

This is obviously a very difficult problem that relies on a few parts being successful.

| No. | Dependency Changes | Strict Typing | Implementation | OS | Performance Gains | Implementation Difficulty |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | None | Possible | CPython, PyPy | Windows, Ubuntu, macOS | Insignificant | Easy |
| 2 | NumPy | Partial | CPython | Windows, Ubuntu, macOS | Significant | Difficult |
| 3 | SciPy | Not Possible | CPython | Windows, Ubuntu, macOS | Significant | Normal |
| 4 | Conditional NumPy | Partial | CPython, PyPy | Windows, Ubuntu, macOS | Significant | Very Difficult |

Option 4 is ideal for best compatibility and performance but is a huge undertaking at the end of which strict typing may still end up being not possible.
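For concreteness, a minimal sketch of the "Conditional Numpy" idea in option 4, with hypothetical helper names, falling back to pure Python when NumPy is absent:

# Illustrative only; the helper below is not part of the library.
try:
    import numpy as np
    HAS_NUMPY = True
except ImportError:
    HAS_NUMPY = False

def team_sigma_sq_sum(sigmas):
    """Sum of squared sigmas for a team, vectorized when NumPy is available."""
    if HAS_NUMPY:
        return float(np.sum(np.square(np.asarray(sigmas, dtype=float))))
    return float(sum(s * s for s in sigmas))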

Regardless of which option is being pursued, these tasks need to be completed first:

Documenting how to access data for benchmarking

Raising as part of JOSS review openjournals/joss-reviews/issues/5901

As the data files are stored on Git LFS and the free LFS quota for this account seems to be regularly exceeded (see openjournals/joss-reviews#5901 (comment)), it would be useful to document an alternative approach for accessing the data, ideally one which uses an open data repository that doesn't require an account to download. While the datasets have been made available on Kaggle (openjournals/joss-reviews#5901 (comment)), this is not currently documented in this repository, and a Kaggle account is required to download. An open research data repository / archive like Zenodo would seem to be a better fit with the JOSS requirement that the software be stored in a repository that can be cloned without registration. While I don't think this strictly extends to data associated with the software, from a FAIR data and reproducibility perspective a service like Zenodo is much better than Kaggle.

A potentially even nicer approach would be to use a tool like pooch to automate getting the data from a remote repository as part of running the benchmarks.
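For what it's worth, a minimal sketch of the pooch approach, assuming the data were archived somewhere public (the URL and hash below are placeholders, not real records):

import pooch

# Placeholder URL; point this at the real archive record once it exists.
data_file = pooch.retrieve(
    url="https://zenodo.org/record/XXXXXXX/files/benchmark_matches.csv",
    known_hash=None,  # pin a sha256 here so downloads are verified
)
print(data_file)  # local cached path to the downloaded file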

Rate function: "score" and "rank" interchangeable ?

Apologies for being a noob - it seems that the score margin doesn't have any effect on how the ratings are updated, and it's effectively the same as the rank option, just that a higher score is better. If that is true, is there a way to consider the score margin for games where it is important?

Model Agnostic API

The Rating objects can currently be mixed and used between models. This may or may not make sense depending on the models under consideration, so disallowing it errs on the side of caution. It also allows different models to use different values (instead of mu and sigma), for example Glicko or standard Elo.

Proposed API:

Note: Added a spoiler to avoid biasing readers toward my recommendation.

New API
Similar to the TrueSkill API, we can have a new class called `OpenSkill` which is initialized with the default `PlackettLuce` model.

Example code:

from openskill.models import BradleyTerryFull


# Initialize Rating System
system = BradleyTerryFull(tau=0.3,  constrain_sigma=True)

# Team 1
a1 = system.Rating()
a2 = system.Rating(mu=32.444, sigma=5.123)

# Team 2
b1 = system.Rating(43.381, 2.421)
b2 = system.Rating(mu=25.188, sigma=6.211)

# Rate with BradleyTerryFull
[[x1, x2], [y1, y2]] = system.rate([[a1, a2], [b1, b2]])  # No need to pass tau and so on again.

All functions that can be converted will become methods on the model class. All constants in the methods can be manually overridden as normal. A variable called custom will be set to True if models are mixed or constants are changed within a system after ratings have taken place.

Rating objects will contain a public attribute (Rating.model) that references the model they were created with. So, if the user tries to use a rating in a function belonging to a different model, it will produce an error.
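A rough sketch of how such a guard might behave (names are illustrative, not the eventual implementation):

class MixedModelError(TypeError):
    """Raised when a rating is used with a model it was not created by."""

def check_same_model(system, teams):
    # Reject ratings whose .model attribute is not this system.
    for team in teams:
        for rating in team:
            if getattr(rating, "model", None) is not system:
                raise MixedModelError(f"{rating!r} was not created by {system!r}")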


If there are no active objections from users or any other implementation developers by the time I get to this issue in the Project Release Board (which should be a while still), then it will be shipped in the next major release.

If someone has another API idea, you are also free to suggest it in this issue.

Mentions: @philihp

Relevant Issues: philihp/openskill.js#231

Automatic Test Generation and Parameterization

Is your feature request related to a problem? Please describe.
When a model is rewritten or improved, due to changes internally, the expected API outputs will change significantly. Re-entering the correct values into the test suite to verify determinism is wasted effort on the developer's part long term.

Describe the solution you'd like
Use Hypothesis to generate tests and pytest parameterization.
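A rough sketch of what this could look like, combining Hypothesis with pytest parameterization (the property below is deliberately weak and purely illustrative):

import math

import pytest
from hypothesis import given, strategies as st

from openskill.models import BradleyTerryFull, PlackettLuce

mus = st.floats(min_value=-100, max_value=100, allow_nan=False, allow_infinity=False)
sigmas = st.floats(min_value=0.5, max_value=25, allow_nan=False, allow_infinity=False)

@pytest.mark.parametrize("model_cls", [PlackettLuce, BradleyTerryFull])
@given(mu_a=mus, sigma_a=sigmas, mu_b=mus, sigma_b=sigmas)
def test_rate_outputs_are_sane(model_cls, mu_a, sigma_a, mu_b, sigma_b):
    model = model_cls()
    a = model.rating(mu=mu_a, sigma=sigma_a)
    b = model.rating(mu=mu_b, sigma=sigma_b)
    [[a_new], [b_new]] = model.rate([[a], [b]])
    for r in (a_new, b_new):
        assert math.isfinite(r.mu)
        assert r.sigma > 0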

Tasks:

  • Decouple the benchmarks into their own top-level package for loading different kinds of data.
  • Create a command-line utility to run benchmarks and regenerate tests.
  • Import the benchmarks package to load data for testing purposes.

let score difference be reflected in rating

When you enter scores into rate(), the difference between the scores has no effect on the rating - meaning rate([team1, team2], score=(1, 0)) == rate([team1, team2], score=(100, 0)) is true. They have exactly the same rating effect on team1 and team2.

I don't know whether it is mathematically possible or what it would look like, but it would be great if the difference could somehow be factored into the calculation, as it is (if your game has a score) quite an important data point for skill evaluation.

Issue: mu and sigma can't be set to zero

Describe the bug
Player rating parameters mu and sigma can't be set to 0.0; they are overwritten by the default values 25 and 8.333.
The issue is in the file openskill/rate.py, on lines 28-29:

self.mu = mu if mu else default_mu(**options)
self.sigma = sigma if sigma else default_sigma(**options)

The conditions mu if mu and sigma if sigma evaluate to False when mu or sigma is set to 0.
Also, one cosmetic thing: you have the wrong typing in openskill/constants.py for the functions z and mu. When the default z or mu value is used, you return 3 or 25 (int) instead of float.

To Reproduce

from openskill import Rating

Rating(mu=0.0, sigma=5)
Rating(mu=25, sigma=0.0)
Rating(mu=0.0, sigma=0.0)

Expected behavior
It should be possible to set them to a value of 0.0.


Possible solution

if isinstance(mu, (int, float)):
    self.mu = mu
else:
    self.mu = default_mu(**options)

Platform Information

  • OS: [macOS]
  • Python Version: [3.8.13]
  • openskill.py Version: [2.5.0]


Unable to install on Google Colab

Describe the bug
I can't install openskill on Google Colab via pip.
What should I do?

Screenshots
[Screenshot: 2022-03-31 23:14:18]

Platform Information

  • Google Colab
  • Python Version: 3.7.13

Improve Documentation

  • Make the statistical theory accessible to absolute beginners.
  • Use LaTeX where possible.
  • Docstrings everywhere.

"What do you mean, 'everywhere'?"

Everywhere

Community guidelines for reporting issues and support queries

Raising as part of JOSS review openjournals/joss-reviews/issues/5901

Ideally you should have some clear and easily findable guidelines for how to report issues and seek support with the software.

This section in the user manual page in the documentation

If you're struggling with any of the concepts, please search the discussions section to see if your question has already been answered.
If you can't find an answer, please open a new discussion and we'll try to help you out.
You can also get help from the official `Discord Server <https://discord.com/invite/4JNDeHMYkM>`_.

already partially fits the bill, but

  1. It took me a little while to find - I would put it somewhere more prominent, for example in a dedicated top-level documentation page or in the README.
  2. The reference to the 'discussions section' is not very clear - I assume this means GitHub Discussions, but from the documentation website it wouldn't necessarily be clear how to get there, so adding a link would be useful.
  3. A brief description of and pointer to the GitHub issue tracker as (presumably) the correct place to report issues with the code would also help, and perhaps some explanation of the different issue templates / categories.

Are `predict_win` and `predict_draw` functions accidentally using Thurstone-Mosteller specific calculations?

If I understand it correctly, those two functions seem to perform calculations using the equations numbered (65) in the paper. However, those equations seem to be specific to the Thurstone-Mosteller model, and as far as I can tell, the proper way to calculate probabilities for the Bradley-Terry model would be to use equations (48) and (51) (also seen as p_iq in equation (49)). Is this intended? Or am I misunderstanding either the paper or the code of these functions?
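For reference, my reading of the two pairwise forms being contrasted (the notation below is mine; $c_{iq}$ is the combined standard deviation):

$$
P_{\mathrm{TM}}(i \succ q) \approx \Phi\!\left(\frac{\mu_i - \mu_q}{c_{iq}}\right),
\qquad
P_{\mathrm{BT}}(i \succ q) \approx \frac{e^{\mu_i / c_{iq}}}{e^{\mu_i / c_{iq}} + e^{\mu_q / c_{iq}}},
\qquad
c_{iq} = \sqrt{\sigma_i^2 + \sigma_q^2 + 2\beta^2}.
$$

If the code uses the normal-CDF form for every model, that would match the behaviour described above.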
