Comments (2)

fedor57 avatar fedor57 commented on June 19, 2024

Just to let you know: I was once involved, at one of the search giants, in computing a kind of freshness PageRank over a constantly changing web graph. The algorithm, roughly, accumulated each node's weight difference and distributed it to the node's peers once the accumulated weight exceeded some threshold. There were also some heuristics to intensify processing near new nodes with large weights.
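To make the idea concrete, here is a minimal sketch of a generic push-style residual scheme. It is not the search engine's actual algorithm, only an illustration of "accumulate the diff, distribute it once it passes a threshold". The function name `push_scores`, the adjacency-dict graph format, and the default `damping` and `threshold` values are all assumptions, and the freshness heuristics are left out.

```python
from collections import deque


def push_scores(graph, damping=0.85, threshold=1e-6):
    """Approximate un-normalized PageRank-like scores by pushing residuals.

    graph maps each node to a list of its out-neighbors; every neighbor
    must also appear as a key. All names here are hypothetical.
    """
    score = {node: 0.0 for node in graph}
    # Accumulated, not yet distributed weight difference per node.
    residual = {node: 1.0 - damping for node in graph}
    queue = deque(graph)
    queued = set(graph)
    while queue:
        node = queue.popleft()
        queued.discard(node)
        pending = residual[node]
        if pending <= threshold:
            continue  # too small to be worth distributing yet
        score[node] += pending
        residual[node] = 0.0
        out = graph[node]
        if not out:
            continue  # dangling node: its mass is dropped in this sketch
        share = damping * pending / len(out)
        for peer in out:
            residual[peer] += share
            if residual[peer] > threshold and peer not in queued:
                queue.append(peer)
                queued.add(peer)
    return score


# Two leaves pointing at a hub: the hub collects most of the weight.
scores = push_scores({"hub": [], "x": ["hub"], "y": ["hub"]})
```

When the graph changes, one would add the corresponding weight difference to the affected residuals and re-queue only those nodes, so work stays local to the change.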

Regarding convergence in the incremental scenario: perhaps we can back up the values from previous steps and, when a value changes by a large amount, mark the peers of that worker / task with a flag "include in the next partial iteration". Then run partial iterations, with a full one after every 5 partial ones. I believe that such a technique could produce a VERY fast Dawid-Skene algorithm implementation. ;) Especially for the incremental scenario.
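A rough sketch of that schedule, written for a generic fixed-point update rather than for Dawid-Skene itself. Everything here is an assumption made for illustration: the names `mixed_sweeps`, `peers`, and `update`, the tolerance `tol`, and the stopping rule (stop when a full sweep changes nothing by more than `tol`).

```python
def mixed_sweeps(values, peers, update, tol=1e-6,
                 partials_per_full=5, max_sweeps=1000):
    """Interleave partial sweeps (flagged items only) with full sweeps.

    values: dict item -> float, updated in place.
    peers:  dict item -> items whose update depends on that item.
    update: function (item, values) -> new value for item.
    Returns the list of sweep kinds that were run.
    """
    flagged = set()
    partial_run = partials_per_full  # forces the first sweep to be full
    log = []
    for _ in range(max_sweeps):
        full = partial_run >= partials_per_full or not flagged
        targets = list(values) if full else list(flagged)
        flagged = set()
        changed = False
        for item in targets:
            new_value = update(item, values)
            if abs(new_value - values[item]) > tol:
                # Big change: include the peers in the next partial sweep.
                flagged.update(peers[item])
                changed = True
            values[item] = new_value
        log.append("full" if full else "partial")
        if full and not changed:
            break  # a full sweep changed nothing: treat as converged
        partial_run = 0 if full else partial_run + 1
    return log


def toy_update(item, values):
    # Toy chain: item 0 is pinned to 1.0, each later item is half its predecessor.
    return 1.0 if item == 0 else 0.5 * values[item - 1]


toy_values = {0: 0.0, 1: 0.0, 2: 0.0}
toy_log = mixed_sweeps(toy_values, {0: [1], 1: [2], 2: []}, toy_update)
```

In a Dawid-Skene setting the items would be workers and tasks, and `update` would recompute a worker's error rates or a task's label posterior; whether this converges to the same answer as plain EM would need to be checked empirically.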

from fast-dawid-skene.

vbsinha avatar vbsinha commented on June 19, 2024

Hi,

One way to achieve the first two points would be to use an online algorithm.

  • One could use an online algorithm as described in the paper. The idea here would be to first estimate the true labels for all the questions that are available at the beginning. Then you could save the current state (class_marginals, error_rates, question_classes, counts). When you get a new batch of responses, you could load the saved variables, and do an EM pass on the new batch to estimate its correct labels. This would use the state learnt from previous batches and so will help for the current batch. You can then save the new set of parameters and repeat. This has not yet been implemented in this code.
  • To obtain the correct response of each question, we first calculate a term proportional to the probability (confidence) that a particular label would be correct and then choose the label which has the highest probability. So you can print question_classes[i, :] before this line to view the confidence that each label is correct for the ith question.
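The two points above can be sketched as follows. This is not the repository's code (the online variant is not implemented there); it is a simplified pure-Python illustration with one EM pass per batch. The helper names (`new_state`, `params_from_counts`, `e_step`, `process_batch`), the JSON-serializable state layout, and the initial pseudo-counts (which assume workers are somewhat better than chance) are all assumptions.

```python
import json


def new_state(workers, n_classes):
    """Initial state with pseudo-counts favouring correct answers (an assumption)."""
    return {
        "class_counts": [1.0] * n_classes,
        "confusion_counts": {
            w: [[2.0 if k == l else 1.0 for l in range(n_classes)]
                for k in range(n_classes)]
            for w in workers
        },
    }


def params_from_counts(state):
    """Turn accumulated soft counts into class marginals and error rates."""
    total = sum(state["class_counts"])
    class_marginals = [c / total for c in state["class_counts"]]
    error_rates = {
        w: [[c / sum(row) for c in row] for row in rows]
        for w, rows in state["confusion_counts"].items()
    }
    return class_marginals, error_rates


def e_step(responses, class_marginals, error_rates):
    """Per-question confidence that each class is the true label."""
    posteriors = {}
    for question, answers in responses.items():
        scores = []
        for k, prior in enumerate(class_marginals):
            p = prior
            for worker, label in answers:
                p *= error_rates[worker][k][label]
            scores.append(p)
        total = sum(scores)
        n = len(scores)
        posteriors[question] = [s / total if total > 0 else 1.0 / n for s in scores]
    return posteriors


def process_batch(state, responses):
    """One EM pass on a new batch: estimate labels, then fold them into the state."""
    class_marginals, error_rates = params_from_counts(state)
    posteriors = e_step(responses, class_marginals, error_rates)
    for question, answers in responses.items():
        for k, confidence in enumerate(posteriors[question]):
            state["class_counts"][k] += confidence
            for worker, label in answers:
                state["confusion_counts"][worker][k][label] += confidence
    return posteriors


state = new_state(["w1", "w2", "w3"], n_classes=2)
first = process_batch(state, {"q1": [("w1", 0), ("w2", 0), ("w3", 1)]})
saved = json.dumps(state)    # save the state after the first batch
state = json.loads(saved)    # load it when the next batch arrives
second = process_batch(state, {"q2": [("w3", 1), ("w1", 1)]})
best_label = max(range(2), key=lambda k: second["q2"][k])
```

Here `second["q2"]` plays the role of `question_classes[i, :]`: it holds the confidence for each label, and `best_label` is the label with the highest confidence.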

from fast-dawid-skene.
