Currently under review! See issues
This package implements a multivariate conditional independence test and an algorithm for learning directed graphs from data, based on predictive conditional independence testing (PCIT).
- Samuel Burkart: designated point of contact
- Franz Kiraly
If you would like to contribute, please read our contribution guide.
There are three main components:
- MetaEstimator: Estimator class used for independence testing
- PCIT: Multivariate Conditional Independence Test
- find_neighbours: Undirected graph skeleton learning algorithm
In the following examples, X, Y and Z can be univariate or multivariate.
```python
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2;
                                          # any numeric dataset works here

data = load_boston()['data']
X = data[:, 1:2]
Y = data[:, 2:4]
Z = data[:, 4:10]

PCIT(X, Y, confidence=0.01)
```
The direction of the prediction is X -> Y; the p-values therefore correspond to the hypothesis that adding X does not improve the prediction of Y (one p-value for each dimension of Y). If the parameter 'symmetric' is set to True (the default), both directions are tested.
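The principle behind the test can be sketched with plain scikit-learn (an illustrative toy, not the package's actual implementation): if X is informative about Y, a predictor that sees X should beat an uninformed baseline out of sample.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1))
Y = 2 * X[:, 0] + rng.normal(size=500)  # Y clearly depends on X

X_tr, X_te, y_tr, y_te = train_test_split(X, Y, random_state=0)

# Uninformed baseline: predict Y by its training mean
baseline_mse = np.mean((y_te - y_tr.mean()) ** 2)

# Informed predictor: regress Y on X
informed_mse = np.mean(
    (y_te - LinearRegression().fit(X_tr, y_tr).predict(X_te)) ** 2)

# A clear drop in out-of-sample error is evidence against independence;
# PCIT turns this comparison into a formal hypothesis test with p-values.
print(informed_mse < baseline_mse)
```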
Testing whether X is independent of Y, conditional on Z:

```python
PCIT(X, Y, z=Z)
```
Testing if X is independent of Y, conditional on Z, using a custom MetaEstimator, multiplexing over a manually chosen set of estimators:
```python
from sklearn.linear_model import (RidgeCV, LassoCV,
                                  SGDClassifier, LogisticRegression)

regressors = [RidgeCV(), LassoCV()]
classifiers = [SGDClassifier(), LogisticRegression()]

custom_estim = MetaEstimator(method='multiplexing',
                             estimators=(regressors, classifiers))

PCIT(X, Y, z=Z, estimator=custom_estim)
```
Learning the undirected graph skeleton over all variables:

```python
X = load_boston()['data']
find_neighbours(X)
```
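Conceptually, the skeleton learner can be imagined along these lines (a rough sketch with a hypothetical 10% error-reduction threshold standing in for PCIT's formal significance test): connect two variables whenever one helps predict the other beyond what the remaining variables already explain.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 1000
a = rng.normal(size=n)
b = 2 * a + rng.normal(size=n)   # a and b are linked
c = rng.normal(size=n)           # c is independent of both
data = np.column_stack([a, b, c])

def cv_mse(features, target):
    """Cross-validated mean squared error of a linear predictor."""
    scores = cross_val_score(LinearRegression(), features, target,
                             scoring='neg_mean_squared_error', cv=5)
    return -scores.mean()

edges = set()
p = data.shape[1]
for i in range(p):
    for j in range(i + 1, p):
        rest = [k for k in range(p) if k not in (i, j)]
        Z = data[:, rest]
        # Does adding X_i clearly improve the prediction of X_j
        # beyond the remaining variables? (Hypothetical threshold
        # in place of PCIT's formal test.)
        with_i = np.column_stack([data[:, i], Z])
        if cv_mse(with_i, data[:, j]) < 0.9 * cv_mse(Z, data[:, j]):
            edges.add((i, j))

print(sorted(edges))  # the a-b edge should be recovered
```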
Conditional as well as multivariate independence testing are difficult problems that lack a straightforward, scalable and easy-to-use solution. This project connects the classical independence testing task to the supervised learning workflow, which has the following advantages:
- By linking to the heavily researched supervised learning workflow, classical independence testing gains power as a side effect of every improvement in supervised learning methodology
- Mature, largely automated hyperparameter tuning from supervised prediction removes the manual tuning and ad-hoc choices prevalent in current independence-testing methodology
- As a wrapper around the sklearn package, PCIT is easy to use and to adjust
The package can be installed through pip:

```
pip install pcit
```
The dependencies are:
Three tests can be run:
- Test_PCIT_Power: tests the power for increasing sample sizes on a difficult v-structured problem. Matlab code for the same problem, for comparison with the Kernel Conditional Independence Test, can be found here
- Test_PCIT_Consistency: assesses the consistency of the test under perturbations of the data
- Test_Structure: assesses the power and false-discovery-rate control of the graphical model structure learning algorithm
MIT License