There are two demo scripts in this repository which showcase the practical performance of the graph estimator proposed in our paper "An approximate Bayesian approach to covariate dependent graphical modeling". The script discrete_covariate_demo.R considers the case of discrete covariates. The script cont_covariate_attempt2.R considers the toy example presented in the paper with continuous covariates. cov_vsvb (in cov_vsvb.R) and ELBO_calculator are functions called by the demo scripts. Specifically, the function cov_vsvb updates the variational parameters and returns the converged estimates for a single graph.
The demos can be run as is; they produce example output and visual summaries in the form of a heatmap and histograms.
In discrete_covariate_demo.R, it is assumed that there are 2 discrete covariate levels. As an example, the data are generated from two different covariance matrices, controlled by a parameter. Depending on the value of this parameter, we have either the covariate independent model or the covariate dependent model. Set the number of subjects in the study to be n and the number of variables to be p+1.
#1. Covariate independent model
#2. Covariate dependent model
The precision matrix . Similarly,
.
Let
, and
.
We generate n/2 samples from each of the two populations to form our dataset.
We generate a covariate matrix with entries equal to -0.1 for each of the p variables for the first population, and entries equal to 0.1 for each of the p variables for the second population. Thus, for each variable, the covariate value for an individual is univariate. As an example, the covariate attached to the FOXC2 protein expression of patient 1 is the univariate FOXC2 RNA expression for the same patient. If instead we used both the RNA expression and the CNV expression of the FOXC2 gene for the same patient, we would have a two-dimensional covariate attached to the data.
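The data generation described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the values of n and p, the two precision matrices `Omega1` and `Omega2`, the n x p shape of the covariate matrix, and the helper `sample_from_precision` are all hypothetical stand-ins and are not taken from the demo script.

```r
# Sketch: two-population data generation for the discrete covariate case.
set.seed(1)
n <- 100      # number of subjects (assumed value)
p <- 4        # there are p + 1 variables in total (assumed value)
d <- p + 1

# Hypothetical precision matrices: identity plus one edge in different places.
Omega1 <- diag(d); Omega1[1, 2] <- Omega1[2, 1] <- 0.5
Omega2 <- diag(d); Omega2[4, 5] <- Omega2[5, 4] <- 0.5

# Draw m zero-mean Gaussian rows whose covariance is the inverse of Omega.
sample_from_precision <- function(m, Omega) {
  Sigma <- solve(Omega)
  R <- chol(Sigma)   # upper triangular factor, t(R) %*% R equals Sigma
  matrix(rnorm(m * ncol(Omega)), nrow = m) %*% R
}

# n/2 samples from each population, stacked into one dataset.
X <- rbind(sample_from_precision(n / 2, Omega1),
           sample_from_precision(n / 2, Omega2))

# Covariate matrix: -0.1 for the first population, 0.1 for the second,
# repeated for each of the p variables (n x p shape is an assumption).
Z <- rbind(matrix(-0.1, nrow = n / 2, ncol = p),
           matrix( 0.1, nrow = n / 2, ncol = p))
```

Sampling through the Cholesky factor keeps the sketch free of package dependencies; a multivariate normal sampler from a package would serve equally well.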
- Fix a variable j as response, and the remaining p variables as predictors. (Recall there are p+1 variables in total.)
- From the covariate matrix, define an n x n weight matrix where the i th row describes the weight vector associated with the n subjects relative to subject i. The weights for this model are chosen with an ad-hoc bandwidth value of 0.1. Technically, one can perform a density estimation on the covariate space, but since it is basically discrete, we choose a small value of 0.1.
- Choose the hyperparameter values, following Carbonetto and Stephens for some and searching over a grid for the rest.
- Call the cov_vsvb function to update the following variational parameters: the matrices alpha, mu and S_sq, where the i th row corresponds to the inclusion probability of the p-1 predictor variables, mean and standard deviation for the i th subject.
- Loop over the p variables as response to get the inclusion probability matrices corresponding to each of the n subjects in the study.
- Assume that the diagonal elements in the inclusion probability matrix for each individual are 0, and apply the post-processing step to symmetrize the matrix.
- Set the dependence graph from the resulting symmetrized matrix.
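The weighting and symmetrization steps in the list above can be illustrated with a short sketch. Several details here are assumptions rather than the demo's actual choices: the Gaussian kernel, the normalization of each row to sum to n, the toy inclusion probability matrix `incl`, and the averaging rule used to symmetrize it. Only the bandwidth of 0.1 and the zero diagonal come from the description above.

```r
# Sketch of the weight matrix and the symmetrization step.
set.seed(1)
n <- 6
z <- rep(c(-0.1, 0.1), each = n / 2)   # one covariate value per subject
tau <- 0.1                             # ad-hoc bandwidth from the text

# Assumed kernel: Gaussian density of the covariate distance, with each row
# rescaled so that the weights relative to subject i sum to n.
W <- outer(z, z, function(a, b) dnorm(abs(a - b), mean = 0, sd = tau))
W <- n * W / rowSums(W)

# Hypothetical inclusion probability matrix for one subject: entry [j, k] is
# the inclusion probability of variable k when variable j is the response.
d <- 5
incl <- matrix(runif(d * d), d, d)
diag(incl) <- 0                        # diagonal set to 0, as in the text

# Assumed post-processing rule: average each entry with its transpose.
incl_sym <- (incl + t(incl)) / 2
```

With a bandwidth this small relative to the gap between the two covariate levels, subjects sharing a covariate level receive most of the weight, which is the behaviour one would want when the covariate is essentially discrete.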
The code calls the cov_vsvb function, which updates the variational parameters and returns the final estimates for a single graph, that is, the graph of one fixed individual in the study. Looping over individuals yields the graph estimates for every individual in the study. Since the updates for different individuals are independent, this loop can be parallelized; however, the parallelization is yet to be implemented. The cov_vsvb function itself calls the ELBO_calculator function, which calculates the ELBO at the current values of the variational parameters for the graph of a single individual. Note that every individual in the study contributes to the ELBO of the parameters of a single individual, which facilitates the borrowing of information.
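Because the per-individual updates are independent, the loop could in principle be distributed with the base `parallel` package. The sketch below is only an illustration of that idea: `fit_one_subject` is a hypothetical stand-in for the per-individual fit (the arguments of cov_vsvb are not reproduced here), and the toy weight matrix is made up.

```r
library(parallel)

# Hypothetical stand-in for the per-subject fit; in the demos this role is
# played by cov_vsvb. It returns a toy summary of subject i's weight vector.
fit_one_subject <- function(i, weights) {
  sum(weights[i, ]^2)
}

n <- 8
weights <- matrix(seq_len(n * n) / (n * n), n, n)

# Forking is unavailable on Windows, so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Serial loop and its forked counterpart; both return one result per subject.
serial_fits   <- lapply(seq_len(n), fit_one_subject, weights = weights)
parallel_fits <- mclapply(seq_len(n), fit_one_subject, weights = weights,
                          mc.cores = n_cores)
```

`mclapply` is a drop-in replacement for `lapply` on Unix-like systems; on Windows a socket cluster with `parLapply` would be the usual alternative.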
In cont_covariate_attempt2.R, instead of discrete covariate values, we have three clusters of covariate values, and every individual in the study has a covariate value belonging to one of the three clusters.
We set n=180 and p=4, i.e. there are 5 variables in this toy example. We have a univariate covariate associated with every individual:
Z ~ Uniform (-1,-0.3) U (-0.23, 0.33) U (0.43, 1)
corresponding to three well-separated clusters.
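A minimal sketch of drawing such covariates is given below. It assumes that the three clusters are equally sized (n/3 subjects each), which is not stated above; the variable name `cluster` is likewise only illustrative.

```r
set.seed(1)
n <- 180

# Assumption: equal cluster sizes, n / 3 subjects per interval.
Z <- c(runif(n / 3, min = -1,    max = -0.3),
       runif(n / 3, min = -0.23, max =  0.33),
       runif(n / 3, min =  0.43, max =  1))

# Cluster label of each subject, useful for checking or plotting the draws.
cluster <- rep(1:3, each = n / 3)
```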
The precision matrix is defined as a function of the covariate Z through the var_cont function.
The rest of the steps are identical to the discrete covariate case.