
interleaving's People

Contributors

jkawamoto, tmanabe


interleaving's Issues

Probabilistic._compute_scores IS WRONG

I think the original PM paper is misleading.

When we compute P(a|l,q) by using P(l|a,q), it is necessary to use P(l|q) for normalization.
More precisely,

P(a|l,q) = P(l|a,q)P(a|q)/P(l|q)

Since P(l|q) = sum_a(P(l|a,q)P(a|q)) and P(a|q) = 1/(# rankers), the correct equation should be

P(a|l,q) = P(l|a,q)/sum_a(P(l|a,q))

Therefore, Algorithm 2 in the original paper must normalize p' before computing the expected outcome.
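A minimal sketch of the normalization being argued for, assuming a hypothetical `ranker_posteriors` helper (not the library's actual `_compute_scores`):

```python
def ranker_posteriors(likelihoods):
    # Normalize the per-ranker likelihoods P(l|a,q). Since P(a|q) is
    # uniform over rankers, the posterior reduces to
    # P(a|l,q) = P(l|a,q) / sum_a P(l|a,q).
    total = sum(likelihoods)
    return [p / total for p in likelihoods]

posteriors = ranker_posteriors([0.2, 0.1, 0.1])
```

Without this normalization, the values P(l|a,q) would be treated as probabilities over rankers even though they do not sum to one.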

SystemError: Parent module '' not loaded

>python ./tests/test_balanced.py
Traceback (most recent call last):
  File "./tests/test_balanced.py", line 4, in <module>
    from .test_methods import TestMethods
SystemError: Parent module '' not loaded, cannot perform relative import

Random team selection in Probabilistic.multileave

I wonder whether the current implementation of the random team selection in Probabilistic.multileave is correct.

The current code randomizes the order of the team list, uses all of the teams in that order, and re-randomizes the list once all the teams have been used.

In "Probabilistic Multileave for Online Retrieval Evaluation, SIGIR 2015", on the other hand, Line 6 in Algorithm 1 may suggest that one team is drawn from the team list every time.

What do you think?
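For illustration, the two behaviors under discussion can be contrasted with a small sketch (`shuffle_and_cycle` and `draw_each_time` are hypothetical names, not the library's code):

```python
import random

def shuffle_and_cycle(teams, length, rng):
    """Current behavior: shuffle the team list, consume it in order,
    and reshuffle once every team has been used."""
    order, out = [], []
    while len(out) < length:
        if not order:
            order = list(teams)
            rng.shuffle(order)
        out.append(order.pop(0))
    return out

def draw_each_time(teams, length, rng):
    """Alternative reading of Algorithm 1, Line 6: draw one team
    independently at every step."""
    return [rng.choice(teams) for _ in range(length)]

rng = random.Random(0)
a = shuffle_and_cycle(["A", "B", "C"], 6, rng)
b = draw_each_time(["A", "B", "C"], 6, rng)
```

The first strategy guarantees that every block of len(teams) selections contains each team exactly once; the second does not.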

Prefix constraint sampling

In [Schuth+, CIKM 2014], the return value L of the prefix constraint sampling is a set.
Therefore, each prefix should be distinct. However, the current implementation of
Optimized._sample_rankings (inherited from InterleavingMethod) seems to allow
generating the same prefix multiple times. Don't we have to modify the method?

Use `dict.fromkeys` instead of `set` for deterministic behavior

Some methods use sets to ensure the uniqueness of items and then rely on the iteration order of the set. Python's set order is not deterministic across interpreter runs, which makes it hard to control, especially in tests, since there is no seed that controls set ordering.

Using dict.fromkeys instead of set achieves the same uniqueness while guaranteeing a deterministic (insertion) order of items.

The following methods are affected:

  • PairwisePreference._current_candidates
  • Optimized._sample_rankings
  • Optimized._sample
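For illustration, dict.fromkeys deduplicates while preserving insertion order (guaranteed for dicts since Python 3.7):

```python
# A set would deduplicate but its iteration order may differ between
# interpreter runs; dict.fromkeys keeps first-seen order deterministically.
items = ["b", "a", "c", "a", "b"]
unique_ordered = list(dict.fromkeys(items))
```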

Simulation

Implement the simulation with LETOR data.

Proposal for revising MultileavingMethod

The current implementation generates an interleaved ranking when two rankings are given.
This may not be convenient when

  1. the method takes much time, which could be saved by caching, and
  2. the distribution of rankings is necessary (e.g. storing the results in a file for later use).

A proposal here is to receive two or more rankings through __init__, compute the distribution during initialization, and return a ranking whenever the interleave method is called (without any argument).
Moreover, we could return a dict or a list of tuples that includes all the possible rankings with their probabilities.

If computing the probabilities or generating all the possible rankings is not feasible, we can simply receive two or more rankings in the initialization and generate a ranking when the interleave method is called.
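A minimal sketch of the proposed interface, with illustrative names (MultileavingMethodProposal, _compute_distribution) that are not part of the current library; the placeholder distribution is just for demonstration:

```python
import random

class MultileavingMethodProposal:
    """Sketch: receive rankings up front, precompute the distribution
    once, and sample from it on each interleave() call."""

    def __init__(self, *lists):
        self._lists = lists
        # Precompute the (possibly expensive) ranking distribution once.
        self._distribution = self._compute_distribution()

    def _compute_distribution(self):
        # Placeholder: a real method would enumerate candidate
        # multileaved rankings with their probabilities.
        return {tuple(l): 1.0 / len(self._lists) for l in self._lists}

    def interleave(self):
        # Return one ranking, drawn according to the distribution.
        rankings = list(self._distribution)
        weights = list(self._distribution.values())
        return random.choices(rankings, weights=weights, k=1)[0]

    def distribution(self):
        # All possible rankings with their probabilities.
        return dict(self._distribution)
```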

A tiny bugfix

I forgot to remove a call to print from the test code for Probabilistic.

setup.py does not install the simulation subpackage

>python setup.py build
...
>python setup.py install
...
>python ./tests/test_methods.py
Traceback (most recent call last):
  File "./tests/test_methods.py", line 1, in <module>
    import interleaving as il
  File "C:\Users\owner\AppData\Local\Programs\Python\Python35\lib\site-packages\interleaving-0.0.1-py3.5.egg\interleaving\__init__.py", line 6, in <module>
ImportError: cannot import name 'simulation'
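A likely fix, assuming setuptools, is to discover subpackages with find_packages() rather than listing packages manually; a sketch (not the project's actual setup.py):

```python
# setup.py sketch: find_packages() picks up interleaving.simulation
# automatically, as long as the subpackage contains an __init__.py.
from setuptools import setup, find_packages

setup(
    name='interleaving',
    version='0.0.1',
    packages=find_packages(exclude=['tests']),
)
```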

The choice of rankers in Probabilistic Interleaving and Multileaving

It seems that the choice of rankers differs between Probabilistic Interleaving and Multileaving.

In Probabilistic Interleaving, the authors said:

After constructing s1 and s2, l is generated similarly to the team draft method. However, instead of randomizing the ranker to contribute the next document per pair, one of the softmax functions is randomly selected at each rank. Doing so is mathematically convenient, as the only component that changes at each rank is the distribution over documents.

"one of the softmax functions is randomly selected at each rank" may mean that it randomly selects a ranker every time it decides a document to be put in the combined ranking.

In Probabilistic Multileaving, on the other hand, a ranker is picked at random without replacement, as you explained.

The conclusion might be one of the following:

  1. My understanding is wrong: they use the same method for selecting a ranker.
  2. Probabilistic Interleaving and Multileaving use different strategies for selecting a ranker, so we need two types of implementation.
  3. Probabilistic Interleaving and Multileaving use different strategies for selecting a ranker, yet we should select either for simplicity.

Dumping Ranking instances

I have been thinking about how to pass the interleaving results to a production environment.

Probably, an instance of InterleavingMethod can have a method dump_rankings(self, file), which dumps sampled rankings into a file. Rankings, their probabilities, and some additional information (e.g. teams or credits) must be stored in the dump file. It is probably convenient to assign a unique ID to each ranking by hashing it.

The format of the dump file should be something readable by the other programs, such as json or xml.

A production system can load the dump file, show the rankings to users, observe user clicks, and record the number of impressions and rank of clicks for each ranking.

InterleavingMethod.load_rankings(cls, file) can then load the rankings from the dump file and somehow obtain the number of impressions and rank of clicks. InterleavingMethod.evaluate(cls, ranking, clicks) can be used for deciding the ranking preference based on the loaded information.

In summary, my proposal is to implement:

  1. InterleavingMethod.dump_rankings(self, file) for dumping sampled rankings
  2. InterleavingMethod.load_rankings(cls, file) for loading the dumped rankings
  3. Ranking.dumps(self) and Ranking.loads(self, s)
  4. The format of the dump file can be
{ 
    ranking_id: {
        'probability': float,
        'ranking': {
            'ranking_list': list,
            'teams': { ranker_index: list }
        }
    }
}
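A sketch of the proposed round trip in JSON (dump_rankings and load_rankings are the proposed, not yet existing, functions; the MD5-based ID is just one way to assign a unique hash per ranking):

```python
import hashlib
import json

def dump_rankings(rankings, fp):
    """Sketch of the proposed InterleavingMethod.dump_rankings: keys are
    hash-based ranking IDs; values hold probability, list, and teams."""
    data = {}
    for probability, ranking_list, teams in rankings:
        ranking_id = hashlib.md5(
            json.dumps([ranking_list, teams], sort_keys=True).encode()
        ).hexdigest()
        data[ranking_id] = {
            'probability': probability,
            'ranking': {'ranking_list': ranking_list, 'teams': teams},
        }
    json.dump(data, fp)

def load_rankings(fp):
    """Sketch of the proposed InterleavingMethod.load_rankings."""
    return json.load(fp)
```

Note that JSON forces dict keys to strings, so ranker indices in `teams` would round-trip as strings unless converted back on load.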

Needs a fix for Ranking.__hash__ from TeamDraft

Suppose A = [1, 2, 3] and B = [2, 3, 1]. Two rankings can be obtained by TeamDraft: [1, 2, 3] and [2, 1, 3]. However, there must be four distinct (ranking, teams) combinations, i.e.

  • [1, 2, 3] with team A = [1, 3] and team B = [2],
  • [1, 2, 3] with team A = [1] and team B = [2, 3],
  • [2, 1, 3] with team A = [1, 3] and team B = [2], and
  • [2, 1, 3] with team A = [1] and team B = [2, 3].

The current implementation of Ranking.__hash__ only takes into account the equality of the ranking but not that of the teams.
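One possible fix, sketched with a hypothetical ranking_key helper (not the current Ranking.__hash__), is to hash the team assignment together with the document list, so all four cases above are distinguished:

```python
def ranking_key(ranking_list, teams):
    """Key covering both the document order and the team assignment, so
    rankings with equal lists but different teams hash differently."""
    return (
        tuple(ranking_list),
        tuple(sorted((ranker, tuple(docs)) for ranker, docs in teams.items())),
    )

# The four (ranking, teams) combinations from the example above:
cases = [
    ([1, 2, 3], {'A': [1, 3], 'B': [2]}),
    ([1, 2, 3], {'A': [1], 'B': [2, 3]}),
    ([2, 1, 3], {'A': [1, 3], 'B': [2]}),
    ([2, 1, 3], {'A': [1], 'B': [2, 3]}),
]
keys = {ranking_key(l, t) for l, t in cases}
```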

Deprecation Warning

There are two deprecation warnings:

/Users/kato/.local/share/virtualenvs/interleaving-M68ZqYhg/lib/python3.6/site-packages/pulp/solvers.py:71: DeprecationWarning: The SafeConfigParser class has been renamed to ConfigParser in Python 3.2. This alias will be removed in future versions. Use ConfigParser directly instead.
  'os':operating_system, 'arch':arch})

/Users/kato/dev/interleaving/interleaving/probabilistic.py:215: DeprecationWarning: `logsumexp` is deprecated!
Importing `logsumexp` from scipy.misc is deprecated in scipy 1.0.0. Use `scipy.special.logsumexp` instead.
  p_all = np.exp(p_log_all - misc.logsumexp(p_log_all))

Pairwise Preference Multileaving

Harrie Oosterhuis, and Maarten de Rijke. "Sensitive and Scalable Online Evaluation with Theoretical Guarantees." Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. ACM, 2017.

Optimized's infinite loop

The current Optimized._sample_rankings samples rankings until the number of sampled rankings reaches sample_num.
However, this sometimes never ends: the number of distinct possible rankings can be less than sample_num.

So the loop condition must be "until sampling has been attempted sample_num times", not "until the number of distinct sampled rankings reaches sample_num".
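A sketch of the proposed fix, assuming a sample_one callable that draws a single ranking (hypothetical names, not the library's code):

```python
def sample_rankings(sample_one, sample_num):
    """Attempt sampling exactly sample_num times and deduplicate, instead
    of looping until sample_num distinct rankings appear (which can hang
    when fewer distinct rankings exist)."""
    rankings = {}
    for _ in range(sample_num):
        r = sample_one()
        rankings[tuple(r)] = r  # dedupe; dict keeps first-seen order
    return list(rankings.values())

# With only two possible rankings, the loop still terminates:
import itertools
cycle = itertools.cycle([[1, 2], [2, 1]])
sampled = sample_rankings(lambda: next(cycle), 100)
```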

Too many samplings

Some tests conduct too many samplings, which makes the tests quite slow.

Output of InterleavingMethod.evaluate

Is it OK if the output of InterleavingMethod.evaluate is changed to a list where each element (i,j) indicates i won j?

Current:

Return one of the following tuples:
- (1, 0): Ranking 'a' won
- (0, 1): Ranking 'b' won
- (0, 0): Tie

Expected:

Return a list of pairs of ranker indices in which an element (i, j) indicates that ranker i beat ranker j.
e.g. a result [(1, 0), (2, 1), (2, 0)] indicates that ranker 1 beat ranker 0, and ranker 2 beat both ranker 0 and ranker 1.
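The proposed output could be derived from per-ranker credit scores; a sketch with a hypothetical pairwise_outcomes helper:

```python
def pairwise_outcomes(scores):
    """Return a list of pairs (i, j) meaning ranker i beat ranker j,
    given a credit score per ranker index."""
    return [
        (i, j)
        for i in scores
        for j in scores
        if i != j and scores[i] > scores[j]
    ]

# Ranker 2 has the highest credit, ranker 0 the lowest.
result = pairwise_outcomes({0: 1, 1: 3, 2: 5})
```

Ties simply produce no pair for that pair of rankers, which generalizes the current (0, 0) outcome.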

Discussion on the arguments in InterleavingMethod

Sorry for trying to change the arguments in InterleavingMethod again.

The current arguments of InterleavingMethod are:

def __init__(self, max_length, sample_num, *lists)

This could be improved since keyword arguments with default values cannot be used with a variable-length argument.

There are some options:

def __init__(self, max_length, lists, sample_num=None, some_keyword_argument=3)

(lists is no longer a variable-length argument)

def __init__(self, max_length, *lists, **kwargs)

(kwargs is a dict for keyword arguments)
e.g.

im = InterleavingMethod(10, [1, 2], [3, 4], hoge=3)
# kwargs = {"hoge": 3}
# If "foo" is missing in this dict, we can fall back to a default explicitly:
kwargs.setdefault("foo", FOO_DEFAULT_VALUE)

Which do you prefer?
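For what it's worth, Python 3 does allow keyword arguments with defaults after a variable-length argument, as keyword-only parameters; a sketch (illustrative class, not the library's actual signature):

```python
class InterleavingMethod:
    # Keyword-only parameters (declared after *lists) keep the
    # variable-length argument while still providing defaults;
    # callers must pass them by name.
    def __init__(self, max_length, *lists, sample_num=None, hoge=3):
        self.max_length = max_length
        self.lists = lists
        self.sample_num = sample_num
        self.hoge = hoge

im = InterleavingMethod(10, [1, 2], [3, 4], hoge=5)
```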

pypi package

This library is great. Would it be possible to make a pypi package out of it?

If you need help in doing so, I'd love to help out.
