pcit's Introduction

Predictive Conditional Independence Testing (PCIT)

with applications in graphical model structure learning

Currently under review! See issues

Description

This package implements a multivariate conditional independence test and an algorithm for learning directed graphs from data based on the PCIT.

arXiv preprint

Developers

If you would like to contribute, read our contribution guide.

Code Example

For details, see the Examples or the Manual

There are three main components:

  • MetaEstimator: Estimator class used for independence testing
  • PCIT: Multivariate Conditional Independence Test
  • find_neighbours: Undirected graph skeleton learning algorithm

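For the following, X, Y and Z can be univariate or multivariate. Note that load_boston was removed in scikit-learn 1.2, so the data loading lines below require an older version or a substitute numeric dataset.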

from sklearn.datasets import load_boston
data = load_boston()['data']
X = data[:,1:2]
Y = data[:,2:4]
Z = data[:,4:10]

Testing if X is independent of Y at a 0.01 confidence level:

PCIT(X, Y, confidence = 0.01)

The direction of the prediction is X -> Y, and as such the p-values correspond to the hypothesis that adding X does not improve the prediction of Y (one for each dimension in Y). If the parameter 'symmetric' is set to True (default), both directions are tested.
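
The idea in the paragraph above can be illustrated with a minimal, self-contained sketch. It is not the package's implementation: the helper names (`fit_line`, `sign_test_p`, `predictive_independence_p`), the simple least-squares model, the mean baseline and the one-sided sign test are all assumptions chosen for illustration. The sketch fits a predictor of Y from X on a training split, compares its held-out squared errors with those of a baseline that ignores X, and returns a p-value for the hypothesis that adding X does not improve the prediction of Y.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (univariate)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def sign_test_p(wins, n):
    """One-sided sign test: P(at least `wins` successes in n fair coin flips)."""
    return sum(math.comb(n, k) for k in range(wins, n + 1)) / 2 ** n


def predictive_independence_p(xs, ys, train_frac=0.5):
    """p-value for H0: using x does not improve the prediction of y."""
    n_train = int(len(xs) * train_frac)
    a, b = fit_line(xs[:n_train], ys[:n_train])
    baseline = sum(ys[:n_train]) / n_train  # prediction that ignores x
    wins, n = 0, 0
    for x, y in zip(xs[n_train:], ys[n_train:]):
        loss_model = (y - (a + b * x)) ** 2
        loss_base = (y - baseline) ** 2
        if loss_model != loss_base:  # ties carry no information
            n += 1
            wins += loss_model < loss_base
    return sign_test_p(wins, n) if n else 1.0


# Illustrative synthetic data (assumption, not from the package).
random.seed(0)
x = [random.gauss(0, 1) for _ in range(400)]
y_dep = [2 * xi + random.gauss(0, 1) for xi in x]  # depends on x
y_ind = [random.gauss(0, 1) for _ in x]            # independent of x

p_dep = predictive_independence_p(x, y_dep)
p_ind = predictive_independence_p(x, y_ind)
print(p_dep, p_ind)
```

A small p-value (as expected for `y_dep`) indicates that X carries predictive information about Y; the package itself multiplexes over sklearn estimators rather than a single hand-written regression.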

Testing if X is independent of Y, conditional on Z:

PCIT(X, Y, z = Z)

Testing if X is independent of Y, conditional on Z, using a custom MetaEstimator, multiplexing over a manually chosen set of estimators:

from sklearn.linear_model import (RidgeCV, LassoCV,
                    SGDClassifier, LogisticRegression)

regressors = [RidgeCV(), LassoCV()]
classifiers = [SGDClassifier(), LogisticRegression()]
custom_estim = MetaEstimator(method = 'multiplexing',
                estimators = (regressors, classifiers))

PCIT(X, Y, z = Z, estimator = custom_estim)

Learning the undirected graph skeleton of X:

X = load_boston()['data']
find_neighbours(X)
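
A skeleton learning loop of this kind can be sketched as follows. This is an illustration under stated assumptions, not the code of `find_neighbours`: the function `learn_skeleton`, the callback `ci_p_value`, the toy oracle and the plain per-test threshold are all hypothetical. For every pair of variables, the callback is asked for a p-value of independence given all remaining variables, and an edge is kept when that p-value falls below the threshold.

```python
from itertools import combinations


def learn_skeleton(n_vars, ci_p_value, alpha=0.05):
    """Return the set of undirected edges {i, j} for which conditional
    independence (given all other variables) is rejected at level alpha."""
    edges = set()
    for i, j in combinations(range(n_vars), 2):
        rest = [k for k in range(n_vars) if k not in (i, j)]
        if ci_p_value(i, j, rest) < alpha:
            edges.add(frozenset((i, j)))
    return edges


# Toy oracle standing in for a real test: a chain 0 - 1 - 2.
TRUE_EDGES = {frozenset((0, 1)), frozenset((1, 2))}


def toy_p_value(i, j, rest):
    # Small p-value for true edges, large otherwise (hypothetical values).
    return 0.001 if frozenset((i, j)) in TRUE_EDGES else 0.8


skeleton = learn_skeleton(3, toy_p_value)
print(sorted(tuple(sorted(e)) for e in skeleton))  # [(0, 1), (1, 2)]
```

In practice the callback would wrap a conditional independence test such as the PCIT, and the raw threshold would typically be replaced by a multiple-testing correction.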

Motivation

Conditional as well as multivariate independence testing are difficult problems lacking a straightforward, scalable and easy-to-use solution. This project connects the classical independence testing task to the supervised learning workflow. This has the following advantages:

  • The link to the highly researched supervised learning workflow allows classical independence testing to grow its power as a side effect of the improvement in supervised learning methodology
  • Well-established hyperparameter tuning techniques from supervised prediction remove the need for the manual tuning and ad hoc choices prevalent in current independence testing methodology
  • As a wrapper for the sklearn package, the PCIT is easy to use and adjust

Installation

The package can be installed through pip:

pip install pcit

The dependencies are:

Tests

Three tests can be run:

Test_PCIT_Power: Tests the power for increasing sample sizes on a difficult v-structured problem. Matlab code for same problem to compare with the "Kernel Conditional Independence Test" can be found here

Test_PCIT_Consistency: Here the consistency under perturbations in the data is assessed.

Test_Structure: Here the power and false-discovery rate control of the graphical model structure learning algorithm are assessed.
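
For reference, the two quantities assessed in the structure test can be computed from a true and an estimated edge set as in the following sketch. The function name `fdr_and_power` and the toy edge sets are illustrative assumptions and are not taken from the package's test code.

```python
def fdr_and_power(true_edges, found_edges):
    """False-discovery rate and power of an estimated undirected edge set."""
    true_edges = {frozenset(e) for e in true_edges}
    found_edges = {frozenset(e) for e in found_edges}
    true_pos = len(true_edges & found_edges)
    false_pos = len(found_edges - true_edges)
    # FDR: share of reported edges that are wrong (0 if nothing is reported).
    fdr = false_pos / len(found_edges) if found_edges else 0.0
    # Power: share of true edges that were recovered.
    power = true_pos / len(true_edges) if true_edges else 1.0
    return fdr, power


truth = [(0, 1), (1, 2), (2, 3), (3, 4)]
estimate = [(1, 0), (1, 2), (0, 4)]  # two correct edges, one false discovery
fdr, power = fdr_and_power(truth, estimate)
print(fdr, power)
```

Edges are stored as frozensets so that (0, 1) and (1, 0) count as the same undirected edge.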

License

MIT License

pcit's People

Contributors

samburkart

pcit's Issues

Test throws unexpected results

For some settings of the conditional independence test, the test results are wrong.

from numpy.random import normal
from pcit import MetaEstimator, StructureEstimation, IndependenceTest

# Note: `estimator` is not defined in the snippet as reported
for i in range(10):
    z = normal(size=(2000,1))
    x = normal(size=(2000,1))
    y = normal(size=(2000,1))

    print(IndependenceTest.PCIT(x=x,y=y, z = z, estimator = estimator)[1])
