
primo's Introduction

PRIMO - PRobabilistic Inference MOdules

This project is a (partial) reimplementation of the original probabilistic inference modules (see branch primo-legacy). The reimplementation follows the same general idea, but restructures and unifies the underlying datatypes to allow for a more concise API and more efficient manipulation, e.g. by the inference algorithms. In turn, the inference algorithms have been rewritten and partly extended. For most, if not all, use cases this implementation should be easier to use and more performant than the original.

primo's People

Contributors

hbuschme, jpoeppel, manuelbaum, maxkoch

Forkers

jpoeppel hbuschme

primo's Issues

Add warnings when incorrect CPTs are set

Currently we silently allow invalid CPTs, i.e. we do not check whether the probabilities sum to 1.
It might be better to at least issue a warning in these cases, so that the user is aware that the results might be unexpected.
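
A minimal sketch of what such a check could look like, assuming CPTs are handed around as numpy arrays with the variable's own states on the first axis (the helper name and array layout are illustrative, not primo's actual API):

```python
import warnings

import numpy as np


def check_cpt(cpt, tolerance=1e-8):
    """Warn if the conditional distributions of a CPT do not sum to 1.

    Assumes the first axis of ``cpt`` indexes the states of the variable
    itself, so every conditional distribution should sum to 1 along axis 0.
    """
    sums = np.asarray(cpt).sum(axis=0)
    if not np.allclose(sums, 1.0, atol=tolerance):
        warnings.warn(
            "CPT columns do not sum to 1 (max deviation {:.3g}); "
            "inference results may be unexpected.".format(
                float(np.max(np.abs(sums - 1.0)))))


# Binary variable with one binary parent; the second column is invalid.
check_cpt(np.array([[0.2, 0.5],
                    [0.8, 0.4]]))
```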

SlackTest

Just a quick test for the Slack integration.

Consider dropping networkx dependency

We hardly use networkx. It is only used as the underlying graph representation for the networks and FactorTrees, and it provides a useful helper for sampling.

Functionality we would need to implement in order to drop networkx:

  • Representing graph structure (potentially with variables for FactorTree) (used in network.py and inference/exact.py)
  • Topological sort for sampling (used in network.py)
  • Directed to undirected graph for FactorTree (used in inference/order.py and inference/exact.py)

Our own implementation could either be general (e.g. take the required parts from networkx almost as they are, but simplified for our needs), or specialized and optimized for our use case (e.g. flag nodes without ancestors specially for the topological sort; see the sketch at the end of this issue).

Did I forget any other functions we use?
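
For the topological sort in particular, a self-contained replacement along the lines of Kahn's algorithm would be fairly small. The sketch below flags nodes without ancestors up front, as suggested above; the function and parameter names are illustrative and not the actual primo interface:

```python
from collections import deque


def topological_sort(nodes, parents):
    """Return the nodes in a topological order (Kahn's algorithm).

    ``nodes`` is a list of node names, ``parents`` maps each node to the
    list of its parents. Raises ValueError if the graph contains a cycle.
    """
    children = {n: [] for n in nodes}
    in_degree = {n: 0 for n in nodes}
    for node in nodes:
        for parent in parents.get(node, []):
            children[parent].append(node)
            in_degree[node] += 1

    # Nodes without ancestors are known immediately and seed the queue.
    queue = deque(n for n in nodes if in_degree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)

    if len(order) != len(nodes):
        raise ValueError("Graph contains a cycle.")
    return order


# Example: A -> B, A -> C, B -> C
print(topological_sort(["A", "B", "C"], {"B": ["A"], "C": ["A", "B"]}))
```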

FactorTree: Implement get_evidence_probability

The marginals function already references the get_evidence_probability function, but that one is not implemented yet.
The function should return the probability of the specified evidence. This will most likely require us to store the currently set evidence in the FactorTree, which is not done so far.
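
As an illustration of the quantity this function should return (using a plain joint table rather than primo's data structures; the names below are purely illustrative): the evidence probability is obtained by fixing the observed states and summing out all remaining variables. In a calibrated factor tree the same number is essentially the normalization constant of the potentials once the evidence has been entered, which is why the currently set evidence would need to be stored.

```python
import numpy as np


def evidence_probability(joint, variables, evidence):
    """P(evidence) from a joint table: select observed states, sum the rest.

    ``joint`` is a numpy array with one axis per variable (in the order
    given by ``variables``); ``evidence`` maps variable names to the index
    of their observed state.
    """
    index = tuple(evidence.get(var, slice(None)) for var in variables)
    return float(joint[index].sum())


# Joint distribution over two binary variables A and B.
joint = np.array([[0.1, 0.2],
                  [0.3, 0.4]])
print(evidence_probability(joint, ["A", "B"], {"A": 1}))  # P(A=1) = 0.7
```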

Document DBN spec format

The DBN spec format is currently undocumented, and the ordering of transition pairs is slightly confusing.

Write documentation

While #18 is a good start, we should seriously consider writing proper documentation for the entire package. There are a couple of examples, but I am not sure they are sufficient.

Refactor inference methods to not expose the Factor class

Returning Factors when one expects probabilities might be confusing (especially since the actual distribution needs to be queried with get_potentials, not get_probability). Adding functions that imply the factor contains probabilities is problematic, since a factor cannot always know what kind of "probability" it represents (joint or conditional), so we cannot even normalize the potentials in order to get a valid probability in every case.

One solution might be to treat the Factor class as strictly internal to the computations and change the interfaces of all inference methods to return different objects. One could consider returning low-level np.arrays, or alternatively a wrapper object, structurally similar to the current Factor class, but exposing an interface that implies actual probabilities and offers convenience functions.
The upside of this approach would be that naive users can understand the returned results more easily, plus the option to specialize factors further for computational efficiency.
The downside is the loss of flexibility when returning low-level np.arrays, or the complexity added by yet another class that needs to be created (which should be fine, since this only happens once per query, at the end).
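
A minimal sketch of the wrapper-object variant (class and method names are hypothetical, not an existing primo interface): the potentials are normalized once at construction time, and the object only speaks in terms of probabilities afterwards.

```python
import numpy as np


class ProbabilityTable:
    """Hypothetical result object returned by inference methods.

    Structurally similar to a Factor, but the stored values are proper
    probabilities over the query variables.
    """

    def __init__(self, variables, values, potentials):
        self.variables = list(variables)  # one array axis per variable
        self.values = {var: list(vals) for var, vals in values.items()}
        potentials = np.asarray(potentials, dtype=float)
        self._probabilities = potentials / potentials.sum()

    def get_probability(self, **assignment):
        """Return P(assignment), summing out any unassigned variables."""
        index = tuple(
            self.values[var].index(assignment[var]) if var in assignment
            else slice(None)
            for var in self.variables)
        return float(self._probabilities[index].sum())


# Example: a single binary query variable with unnormalized potentials.
table = ProbabilityTable(["Rain"], {"Rain": ["yes", "no"]}, [3.0, 1.0])
print(table.get_probability(Rain="yes"))  # 0.75
```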

Decide on how we handle docstrings in subclasses

PR #5 made it apparent that we have not yet decided how we want to handle docstrings in subclasses (especially those derived from abstract classes). More or less copying them does not appear reasonable, as it makes it easy to end up with inconsistent docstrings when we need to change something.
Just referencing the superclass in the docstring of the subclass has the problem of not being introspection-friendly in IDEs.
Furthermore, in the case where additional parameters are possible, it is not clear where to put their description.
We should decide on one approach to use throughout the codebase.

Options that I see currently:

  1. Superclass holds the precise docstrings, containing information about all parameters and what the function is supposed to do. The subclasses then reference that docstring while making clear whether they extend or override the original function, as well as any special behaviour that differs from what is specified in the superclass.
  2. Superclasses (especially abstract ones) will only contain minimal docstrings, stating the general idea of the functions and maybe the types of the parameters. Only the subclasses provide the concrete descriptions (will most likely not work well for non-abstract superclasses as they should have proper docstrings on their own).
  3. Both super- and subclasses have full docstrings explaining everything relevant to their current implementation. This requires a lot of duplication of docstrings though.

I guess I would currently favour option 1, perhaps while still providing information about the parameters in the subclasses to aid introspection in IDEs.
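
To make option 1 concrete, here is a small hypothetical example of the convention (class and method names are made up for illustration): the abstract superclass carries the full parameter description, and the subclass docstring merely references it and documents deviations.

```python
import abc


class InferenceMethod(abc.ABC):
    """Hypothetical abstract base class, used only to illustrate option 1."""

    @abc.abstractmethod
    def marginals(self, variables, evidence=None):
        """Compute the joint marginal distribution of the query variables.

        Parameters
        ----------
        variables : list of str
            Names of the query variables.
        evidence : dict, optional
            Mapping from variable names to observed values.

        Returns
        -------
        The marginal distribution over the query variables.
        """


class SomeConcreteInference(InferenceMethod):

    def marginals(self, variables, evidence=None):
        """Overrides InferenceMethod.marginals.

        See the superclass docstring for the parameters; only behaviour
        that differs from the superclass contract is documented here.
        """
        raise NotImplementedError  # implementation omitted in this sketch
```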

Unify variable naming in the codebase

Currently some variables use camelCase, while others use PEP8-conformant underscore separation (snake_case).
We should decide for one and use it consistently throughout the codebase.

Should we just follow PEP8 on this?
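
For illustration, the two styles side by side (variable names are made up):

```python
# camelCase, as currently found in parts of the codebase:
evidenceProbability = 0.25
queryVariables = ["Rain", "Sprinkler"]

# PEP8-conformant snake_case:
evidence_probability = 0.25
query_variables = ["Rain", "Sprinkler"]
```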

Integrate support for Bayesian Decision Networks

Either migrate the implementation from the primo-legacy branch or integrate the results of M. Holland's Master's thesis (“Dynamische Decision Netzwerke für kooperative Mensch-Maschine-Interaktion”, i.e. dynamic decision networks for cooperative human-machine interaction), which was implemented as part of PRIMO and is therefore LGPL-v3 licensed (through the license's copyleft terms).
