Code and data related to
Kaufmann et al. (2019) Common brain disorders are associated with heritable patterns of apparent aging of the brain. Nature Neuroscience. https://doi.org/10.1038/s41593-019-0471-7
We provide pre-trained xgboost models for estimating brain age from a set of 1118 features. When you compile your features, please make sure their order matches the order in the file "feature-names.csv". The brain age prediction model takes a matrix as input, where each row holds the data from one individual and each column holds one of the 1118 features. The relevant features can be extracted using the atlas introduced in Glasser et al. (2016), Nature, and FreeSurfer's asegstats2table.
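As a minimal sketch of the ordering requirement, the snippet below reorders the columns of a feature table to match a reference list of names. The three feature names and the values are hypothetical placeholders; in practice the reference names would come from "feature-names.csv", whose exact layout is not described here and would need to be checked before reading it in.

```r
# Hypothetical stand-in for the names listed in "feature-names.csv"
expected_names <- c("feat_a", "feat_b", "feat_c")

# Hypothetical feature table whose columns are in a different order
features <- data.frame(
  feat_c = c(3.1, 3.3),
  feat_a = c(2.5, 2.7),
  feat_b = c(4.0, 4.2)
)

# Stop early if any expected feature is missing from the table
missing_names <- setdiff(expected_names, colnames(features))
if (length(missing_names) > 0) {
  stop("Missing features: ", paste(missing_names, collapse = ", "))
}

# Reorder the columns to the expected order and convert to a numeric matrix
features_ordered <- as.matrix(features[, expected_names, drop = FALSE])
```

Selecting columns by name rather than by position means that extra or shuffled columns in the input do not silently end up in the wrong place.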
Feature sets should be split into males and females, and brain age can be estimated along the following lines:
brainAge_females <- predict(mdl_agepred_female, as.matrix(features_females))
brainAge_males <- predict(mdl_agepred_male, as.matrix(features_males))
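If the features for both sexes are stored in one table, they can be split before prediction. The sketch below assumes a hypothetical column named "sex" coded as "F" and "M", and two placeholder feature columns; the real table would carry the 1118 features in the required order.

```r
# Hypothetical combined table: a 'sex' column plus feature columns
all_data <- data.frame(
  sex    = c("F", "M", "F", "M"),
  feat_a = c(2.5, 2.7, 3.0, 2.9),
  feat_b = c(4.0, 4.2, 3.8, 4.1)
)

# Keep only the feature columns, in their existing order
feature_cols <- setdiff(colnames(all_data), "sex")

# One table per sex, without the 'sex' column itself
features_females <- all_data[all_data$sex == "F", feature_cols, drop = FALSE]
features_males   <- all_data[all_data$sex == "M", feature_cols, drop = FALSE]
```

The resulting tables can then be passed through as.matrix() to the corresponding model, as in the two predict() calls above.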
! Note that several publicly available samples have been used to train these models. Only data not used in model training should be used when estimating brain age using these models !
First install xgboost if it is not already installed: install.packages("xgboost").
I then simulated data for 30 female participants with 1118 columns per participant to give a datapoint for each feature. The thickness values (first 360 columns) were potentially realistic (values between 2 and 5, representing thickness in mm). Most values in the other columns are not remotely realistic, but the simulation was only meant to make sure the R code ran. Values from this fake data file were then saved in a new Excel file without any code.
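A comparable test matrix can also be generated directly in R. This is only a sketch of one possible approach, not the procedure used above: the uniform distribution for the thickness columns and the standard normal values for the remaining columns are assumptions chosen for convenience.

```r
set.seed(1)  # make the fake data reproducible

n_participants <- 30
n_features     <- 1118
n_thickness    <- 360  # the first 360 columns hold thickness values

# Thickness-like values between 2 and 5 (mm)
thickness <- matrix(
  runif(n_participants * n_thickness, min = 2, max = 5),
  nrow = n_participants
)

# Arbitrary filler for the remaining columns (not realistic)
other <- matrix(
  rnorm(n_participants * (n_features - n_thickness)),
  nrow = n_participants
)

# One row per participant, one column per feature
features_females <- cbind(thickness, other)
```

Such data only confirm that the code runs; the predicted ages have no meaning.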
I then opened the brainageModels.RData file in RStudio and imported the sample data.
library(readxl)
features_females <- read_excel("new_test_females.xlsx", col_names = FALSE)
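Before running the prediction it can help to confirm that the imported table has the expected shape. The helper below is a hypothetical convenience function, not part of this repository; it checks the column count and that all values are numeric, and it warns about missing values.

```r
# Hypothetical helper: basic sanity checks on an imported feature table
check_features <- function(features, n_expected = 1118) {
  m <- as.matrix(features)
  if (ncol(m) != n_expected) {
    stop(sprintf("Expected %d columns, found %d", n_expected, ncol(m)))
  }
  if (!is.numeric(m)) {
    stop("Non-numeric values found in the feature table")
  }
  if (anyNA(m)) {
    warning("Missing values found in the feature table")
  }
  invisible(m)
}

# Toy stand-in for imported data: 3 participants x 1118 numeric columns
toy <- as.data.frame(matrix(runif(3 * 1118, min = 2, max = 5), nrow = 3))
m <- check_features(toy)
```

A text cell or a stray header row in the spreadsheet would make the matrix non-numeric, which this check catches before predict() is called.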
Then I ran the prediction models.
brainAge_females <- predict(mdl_agepred_female, as.matrix(features_females))
Here’s the example output
32.58700 33.52778 33.59647 36.02596 35.52674 35.84229 35.94399 32.85972 36.30823 34.95153 32.71458 37.07165 35.91000 34.21924 35.15902 35.61190 31.90588 40.65570 42.47514 38.92538 37.72836 32.12014 40.32568 35.16042 31.54249 35.46812 32.06247 29.09416 30.68690 34.22859
When using a newer version of xgboost, the following warning (not an error) is shown:
Warning message: In value[3L] : The model had been generated by XGBoost version 1.0.0 or earlier and was loaded from a RDS file. We strongly ADVISE AGAINST using saveRDS() function, to ensure that your model can be read in current and upcoming XGBoost releases. Please use xgb.save() instead to preserve models for the long term. For more details and explanation, see https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html
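The warning recommends xgb.save() for long-term storage. The sketch below shows the save and reload round trip on a small toy model trained on random data; the toy model, the file name, and the choice of the JSON format are assumptions for illustration. Whether the models in brainageModels.RData can be re-saved this way depends on the installed xgboost version and has not been verified here.

```r
library(xgboost)

set.seed(1)

# Toy regression data: 40 rows, 5 random features, age-like labels
x <- matrix(rnorm(200), nrow = 40)
y <- rnorm(40, mean = 40, sd = 10)

# Train a very small model purely for demonstration
dtrain <- xgb.DMatrix(data = x, label = y)
toy_model <- xgb.train(
  params  = list(objective = "reg:squarederror"),
  data    = dtrain,
  nrounds = 5
)

# Save in xgboost's own format instead of relying on save()/saveRDS()
model_path <- file.path(tempdir(), "toy_model.json")
xgb.save(toy_model, model_path)

# Reload and compare predictions from the original and reloaded model
reloaded <- xgb.load(model_path)
pred_original <- predict(toy_model, x)
pred_reloaded <- predict(reloaded, x)
```

Comparing the two prediction vectors (for example with all.equal()) is a quick way to confirm that nothing was lost in the round trip.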
The summary statistics for a GWAS on brain age have been split into six parts using 7-zip due to file size. Select all six file parts and extract them; this will create a single text file called "Brainage_GWAS_sumstat_final".