jfortin1 / neuroCombat
Harmonization of multi-site imaging data with ComBat (Python)
License: MIT License
I'm trying to use NeuroCombat for segmentation harmonisation. I have GM, WM, and CSF segmentations for many subjects and sites, and am using the individual voxel values as features. As expected, some voxels end up being zero across all sites and subjects even after skull-stripping: in the CSF maps, for example, the voxel values in the central WM regions are zero for all images.
My understanding is that this poses problems for the NeuroCombat model, as these features will have zero mean and variance. I can of course simply remove the voxels that exhibit this behaviour, but I wanted to check that this is the best course of action before doing so.
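Removing the all-zero voxels before harmonization (and restoring them afterwards) can be sketched in plain NumPy; the array contents below are made up for illustration:

```python
import numpy as np

# dat has shape (features, samples), as neuroCombat expects;
# the first voxel is zero for every subject (e.g. CSF values inside WM)
dat = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [1.2, 1.1, 0.9, 1.0],
    [0.5, 0.7, 0.6, 0.4],
])

# keep only voxels with non-zero variance across subjects/sites
keep = dat.var(axis=1) > 0
dat_filtered = dat[keep]          # safe to pass to ComBat

# after harmonization, re-insert the removed voxels as zeros
restored = np.zeros_like(dat)
restored[keep] = dat_filtered
```

The same `keep` mask lets you map the harmonized features back to their original voxel positions.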
neuroCombat/neuroCombat/neuroCombat.py
Line 107 in cab1853
neuroCombat/neuroCombat/neuroCombat.py
Line 95 in cab1853
The following line solves the bug:
ref_indices = np.argwhere(covars[:, batch_col] == ref_batch).squeeze()
From @FinLouarn at ncullen93/neuroCombat#6
The first example, "Correcting from Numpy Array as Data", works smoothly, but the second example, "Correcting from pandas.DataFrame as Data", fails with the shape-not-aligned error below, unless you feed neuroCombat data.T instead of data.
python 3.7.7
pandas 0.25.3
ValueError Traceback (most recent call last)
in
15 batch_col=batch_col,
16 discrete_cols=discrete_cols,
---> 17 continuous_cols=continuous_cols)
/neuroCombat/neuroCombat/neuroCombat.py in neuroCombat(data, covars, batch_col, discrete_cols, continuous_cols)
97 # standardize data across features
98 print('Standardizing data across features..')
---> 99 s_data, s_mean, v_pool = standardize_across_features(data, design, info_dict)
100
101 # fit L/S models and find priors
/neuroCombat/neuroCombat/neuroCombat.py in standardize_across_features(X, design, info_dict)
159 sample_per_batch = info_dict['sample_per_batch']
160
--> 161 B_hat = np.dot(np.dot(la.inv(np.dot(design.T, design)), design.T), X.T)
162 grand_mean = np.dot((sample_per_batch/ float(n_sample)).T, B_hat[:n_batch,:])
163 var_pooled = np.dot(((X - np.dot(design, B_hat).T)**2), np.ones((n_sample, 1)) / float(n_sample))
<array_function internals> in dot(*args, **kwargs)
ValueError: shapes (8,57) and (22283,57) not aligned: 57 (dim 1) != 22283 (dim 0)
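Given the shapes in the traceback (57 samples, 22283 features), the reported workaround is just an orientation fix: dat must be (features, samples). A minimal sketch, with a random DataFrame standing in for the tutorial data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# rows = samples, columns = features, as pandas data usually comes
data = pd.DataFrame(rng.normal(size=(57, 22283)))

# neuroCombat expects dat of shape (features, samples),
# hence the data.T workaround reported above
dat = data.T.to_numpy()
```

With `dat` oriented this way, the design-matrix products inside `standardize_across_features` align.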
Warning message:
In xtfrm.data.frame(x) : cannot xtfrm data frames
How can I fix this warning message?
Hello,
We are trying to harmonize DTI maps of subjects scanned at 6 different sites. Since most of the participants are patients, to avoid any bias due to the disease we would like to estimate the harmonization using only the healthy controls and then, for each site, apply it to all the other subjects.
Looking at the code, we saw that the function "adjust_data_final" uses the parameters returned in the "estimates" dictionary; however, it does not take the original data as input but "s_data". So, to solve our problem, we thought of doing the following:
Is this approach correct or is there a more straightforward way to do it?
Thank you for the help.
Best,
Ilaria
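The estimate-on-controls / apply-to-everyone pattern described above can be illustrated with a simplified per-site location/scale model in plain NumPy (this is not ComBat itself, and all sizes below are hypothetical; the package's neuroCombatFromTraining function, discussed in another issue on this page, implements the idea for the full model):

```python
import numpy as np

rng = np.random.default_rng(42)
n_feat = 10

# healthy controls from one site, with that site's shift and scale
controls = rng.normal(loc=2.0, scale=1.5, size=(n_feat, 30))

# estimate location/scale parameters on the controls only
loc = controls.mean(axis=1, keepdims=True)
scale = controls.std(axis=1, keepdims=True)

# apply the same estimated transform to every subject of that site,
# patients included (patients simulated from the same site model here)
patients = rng.normal(loc=2.0, scale=1.5, size=(n_feat, 50))
controls_adj = (controls - loc) / scale
patients_adj = (patients - loc) / scale
```

The key point is that the parameters are frozen after fitting on controls; patients are transformed with those same parameters rather than re-estimated ones.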
Hello, we are trying to run ComBat with "raw" BOLD images (one 4D image per subject). The datasets are from 2 different sites: one used a TR of 2 s and the other a TR of 0.8 s, which means that, over the same time range, the 0.8 s TR dataset has more volumes. One dataset has 177 volumes (2 s TR) and the other has 450 volumes (0.8 s TR). In summary, is it possible to use a non-square matrix to run ComBat for these two datasets? The variable in question is the "dat" variable with a (p x n) composition.
Dear Jean-Phillippe,
I was wondering whether I can also use neuroCombat when my ref_batch has a smaller number of subjects than the batch I want to harmonise. My reference batch has 94 subjects and the batch I want to harmonise has 112 subjects, both with 105 radiomic features. How do I implement this in the main code? Thank you for your time!
Greetings, Lieke
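Assuming the current neuroCombat signature with a ref_batch argument (the batch labels below are invented, and the array sizes are chosen to match the numbers in the question), unequal batch sizes are not a problem; you pass the label of the reference batch:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# 105 radiomic features; 94 reference subjects plus 112 to harmonize
dat = rng.normal(size=(105, 94 + 112))
covars = pd.DataFrame({"batch": [1] * 94 + [2] * 112})

try:
    from neuroCombat import neuroCombat
    # batch 1 (94 subjects) is the reference; batch sizes need not match
    out = neuroCombat(dat=dat, covars=covars, batch_col="batch", ref_batch=1)
    harmonized = out["data"]
except ImportError:
    harmonized = None  # neuroCombat not installed in this environment
```

With ref_batch set, the reference batch is left (approximately) unchanged and the other batch is mapped onto it.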
I am a new user of NeuroCombat. We have multi-site DWI images; before ComBat the DTI matrices showed a site effect, and ComBat worked great to remove it. For the same data set we also have a measurement from MRtrix3; this matrix did not show a site effect before ComBat, but I thought I would simply apply the same approach to all the matrices. For this MRtrix3 matrix we found some negative values after ComBat, which does not seem plausible. How could this happen, and is there a way to keep the values from going negative?
Jian
Hello, I was going through the source code and found the neuroCombatFromTraining function. It currently says it is under development, but it seemed to work fine when I tested it on a couple of examples.
Is the under-development tag there because it has not been fully validated, or are there parts that still need to be added? It looks pretty complete from my cursory look at it.
I see in the documentation of the R implementation that the demographic information will be protected. I am not sure whether the same holds for the Python implementation.
Hi,
I would like to try ComBat for harmonizing clinical data from quite a lot of scanners. However, I am struggling with the right inputs. Are sample or dummy data available? Or could you elaborate a bit more, as in the short MATLAB tutorial script?
Best,
Falk
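A self-contained toy example can be built with random numbers (all covariate names and sizes below are invented; note that older releases used discrete_cols instead of categorical_cols, so check your installed version):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
# 200 features (e.g. ROI measures) for 40 subjects from 2 scanners
dat = rng.normal(size=(200, 40))
covars = pd.DataFrame({
    "scanner": [1] * 20 + [2] * 20,    # batch variable: which scanner
    "sex": rng.integers(0, 2, 40),     # categorical covariate to preserve
    "age": rng.uniform(20, 80, 40),    # continuous covariate to preserve
})

try:
    from neuroCombat import neuroCombat
    out = neuroCombat(
        dat=dat,                        # shape (features, samples)
        covars=covars,                  # one row per sample
        batch_col="scanner",
        categorical_cols=["sex"],
        continuous_cols=["age"],
    )
    harmonized = out["data"]
except ImportError:
    harmonized = None  # neuroCombat not installed here
```

The essential inputs are a (features x samples) array and a covariates table with one row per sample, including the batch column.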
Please let me know what the scanner/batch covariate is. Thanks
Dear Jean-Philippe,
I was wondering what effect adding variables to be preserved has on the outcome.
I am working with volumetric data in a disorder. So far, I have preserved sex and disorder stage as categorical variables, as well as age and eTIV as continuous variables. As my model appears to change when leaving out, e.g., disorder stage or eTIV as a preserved variable, I was wondering whether there are criteria for deciding which variables should be preserved in the model.
As I regress out eTIV, age, and sex later on in an ANCOVA anyway, I have trouble understanding whether and why they should also be preserved in ComBat.
Any comments on that are appreciated!
Best, Melissa
Dear neuroCombat expert,
I have three fMRI datasets preprocessed with fMRIPrep that have different numbers of volumes due to different TRs: 1) 91x109x91x192 (TR=2510, slice thickness=2.51); 2) 91x109x91x570 (TR=720, slice thickness=0.75); 3) 91x109x91x240 (TR=2000, slice thickness=2). When I convert each 4D image to a vector, I get different vector lengths. Given that neuroCombat requires the same vector length, I wonder how I can reslice/resample the data to obtain the same vector length. Any suggestions are appreciated!
Best wishes
Joyce
Hi there,
Thank you for this amazing project. I was using the ADHD200 dataset to remove site effects. I computed Pearson coefficients for AAL functional connectivities as inputs, and some sites provided biological features such as gender, handedness, and IQ. However, when calling neuroCombat, I got the following error in the function get_beta_with_nan:
LinAlgError: Singular matrix
neuroCombat/neuroCombat/neuroCombat.py
Lines 211 to 221 in ac82a06
I am no expert in maths, but according to this solution it can be attributed to identical rows in the matrix, so the inverse does not exist. I did not fully understand this part of the code, but I guess some patients may share exactly the same biological features, and that caused the issue (an assumption, though). After I changed la.inv to la.pinv, everything works fine now. I was wondering whether you have better insight into this and whether the modification is sound.
Thanks for the help :)
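The singular-matrix situation is easy to reproduce: when two columns of the design matrix are identical (e.g. a covariate encoded redundantly), X^T X has no inverse, while the pseudo-inverse still yields a least-squares solution. A toy illustration, not the package's actual design matrix:

```python
import numpy as np
import numpy.linalg as la

# design matrix with two identical columns -> X^T X is singular
X = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [1., 1., 2.]])
y = np.array([1., 2., 3.])

xtx = X.T @ X
# la.inv(xtx) would raise LinAlgError: Singular matrix here;
# la.pinv returns the Moore-Penrose pseudo-inverse instead
beta = la.pinv(xtx) @ X.T @ y

# the pseudo-inverse solution still reproduces the fitted values
fitted = X @ beta
```

The caveat is that the pseudo-inverse silently picks the minimum-norm solution among infinitely many, so individual coefficients of the redundant columns are not identifiable even though the fit itself is fine.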
Hey Jean-Philippe,
thanks for the nice and helpful software! :) I have successfully used neuroCombat for Desikan-Killiany ROIs already, but I am trying to extend the analysis to voxel-wise data. However, the docs say that dat has to be of shape (features, samples), e.g. cortical thickness measurements or image voxels. How do I put 3D voxel information into one dimension? How can I input voxel-wise data per subject into neuroCombat?
Best, Melissa
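Flattening is just a reshape: stack each subject's 3D volume as one column of dat, and reshape back afterwards. A sketch with a made-up volume size:

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects = 5
vol_shape = (4, 4, 3)                 # stand-in for e.g. 91 x 109 x 91

# one 3D volume per subject
imgs = rng.normal(size=(n_subjects, *vol_shape))

# flatten each volume to a 1D feature vector, stack as (features, samples)
dat = imgs.reshape(n_subjects, -1).T  # shape (voxels, subjects)

# after harmonization, each column reshapes back into a 3D volume
first_vol = dat[:, 0].reshape(vol_shape)
```

Combined with a zero-variance mask (as discussed in the all-zero-voxel issue above), this gives a (voxels, subjects) matrix that neuroCombat accepts directly.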
Hi @Jfortin1!
Happy to greet you! First of all, thank you very much for your contributions in the field of harmonization!
I'm trying to use neuroHarmonize, which runs on top of ComBat, but I get an error that I have seen you solve before:
I've tried to modify the combat.py code but I can't get the whole implementation to work.
Could you indicate me which lines of code should I modify please?
The problem is in line 190 in harmonizationLearn.py
Thank you very much in advance!!