predixcan's Introduction

Deprecation notice

This repository contains the original reference implementation of the PrediXcan method. It is now considered deprecated and exists only for reference purposes.

Active development now takes place in the MetaXcan repository. A tutorial for the new version is available here.

PrediXcan

PrediXcan is a gene-based association test that prioritizes genes that are likely to be causal for the phenotype.

Do you have only summary results? Try MetaXcan, a new extension of PrediXcan that uses only summary statistics. No individual level data necessary.

Mailing List

Please join this Google Group for news on releases, features, etc. For support and feature requests, you can use this repository's issue tracker.

Reference

  • Gamazon ER†, Wheeler HE†, Shah KP†, Mozaffari SV, Aquino-Michaels K, Carroll RJ, Eyler AE, Denny JC, Nicolae DL, Cox NJ, Im HK. (2015) A gene-based association method for mapping traits using reference transcriptome data. Nat Genet. doi:10.1038/ng.3367. (Link to paper, Link to Preprint on BioRxiv)

    †:equal contribution

    *:correspondence haky at uchicago dot edu

  • Alvaro Barbeira, Kaanan P Shah, Jason M Torres, Heather E Wheeler, Eric S Torstenson, Todd Edwards, Tzintzuni Garcia, Graeme I Bell, Dan Nicolae, Nancy J Cox, Hae Kyung Im. (2016) MetaXcan: Summary Statistics Based Gene-Level Association Method Infers Accurate PrediXcan Results. (Link to preprint)

  • Heather E Wheeler, Kaanan P Shah, Jonathon Brenner, Tzintzuni Garcia, Keston Aquino-Michaels, GTEx Consortium, Nancy J Cox, Dan L Nicolae, Hae Kyung Im. (2016) Survey of the Heritability and Sparsity of Gene Expression Traits Across Human Tissues. (Link to preprint)

Software

Python version

  • Download software from this link

PredictDB

PredictDB hosts genetic prediction models of transcriptome levels to be used with PrediXcan. See our wiki for a report of a recent update of the prediction models.
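
As a rough illustration of how such a prediction model is applied, the sketch below computes predicted expression for one gene as a weighted sum of SNP dosages. The function name, SNP ids, weights, and dosages are all made up for the example, and allele alignment between the model and the genotype data is ignored; this is not the code in PrediXcan.py.

```python
def predict_expression(weights, dosages):
    """Return predicted expression per sample for a single gene.

    weights: dict mapping SNP id -> model weight (hypothetical values)
    dosages: dict mapping SNP id -> list of per-sample dosages in [0, 2]
    """
    n_samples = len(next(iter(dosages.values())))
    predicted = [0.0] * n_samples
    for snp, weight in weights.items():
        row = dosages.get(snp)
        if row is None:
            continue  # SNP is in the model but absent from the genotype data
        for i, dosage in enumerate(row):
            predicted[i] += weight * dosage
    return predicted


# Toy model with two SNPs and three samples (all values invented)
weights = {"rs1": 0.5, "rs2": -0.25}
dosages = {"rs1": [0.0, 1.0, 2.0], "rs2": [2.0, 1.0, 0.0]}
print(predict_expression(weights, dosages))  # [-0.5, 0.25, 1.0]
```

The predicted values are then tested for association with the phenotype, which is the gene-based test described above.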

Gene2Pheno database of results

G2Pdb, Gene to Phenotype database, hosts the results of PrediXcan applied to a variety of phenotypes. Link to prototype.

Genetic Architecture of Gene Expression Traits

  • Heather E Wheeler, Kaanan P Shah, Jonathon Brenner, Tzintzuni Garcia, Keston Aquino-Michaels, GTEx Consortium, Nancy J Cox, Dan L Nicolae, and Hae Kyung Im (2016) Survey of the Heritability and Sparsity of Gene Expression Traits Across Human Tissues Link to Preprint; correspondence hwheeler at luc dot edu and haky at uchicago dot edu
  • Database of heritability estimates: link, older link, or older link

Acknowledgements

GTEx data

Data downloaded from dbGaP link

DGN RNA-seq data

Data downloaded from NIMH Repository and Genomics Resource

Battle, A., Mostafavi, S., Zhu, X., Potash, J.B., Weissman, M.M., McCormick, C., Haudenschild, C.D., Beckman, K.B., Shi, J., Mei, R., et al. (2014). Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Research 24, 14–24.


predixcan's People

Contributors

hainswor, hakyim, heroico, hriordan, hwheeler01, jiamaozheng, joseffrank, scottpdickinson, scottritchie73, sritchie73, themechanic23

predixcan's Issues

Full DB for missing genes

Hi, I found that the gene I am interested in is missing from PredictDB.
I think this gene was removed by the FDR correction, as I saw a related question in the FAQ.
Can I get the full prediction model without missing genes? Maybe an older version contains all genes.

Thank you!
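
For anyone wanting to confirm whether a gene is present in a downloaded model file, here is a minimal sketch using Python's sqlite3 module. The table name `weights` and column name `gene` are assumptions about the database layout, so `list_tables` is included to inspect the real schema first; the toy database and gene ids are invented for the example.

```python
import sqlite3


def list_tables(conn):
    """Return the table names in the database, so the schema can be inspected."""
    query = "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    return [row[0] for row in conn.execute(query)]


def has_gene(conn, gene_id, table="weights", column="gene"):
    """Check whether gene_id appears in the given table.

    The default table and column names are assumptions, not confirmed schema.
    """
    query = 'SELECT 1 FROM "{}" WHERE "{}" = ? LIMIT 1'.format(table, column)
    return conn.execute(query, (gene_id,)).fetchone() is not None


# Toy in-memory database standing in for a real .db file
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weights (rsid TEXT, gene TEXT, weight REAL)")
conn.execute("INSERT INTO weights VALUES ('rs1', 'ENSG00000000001.1', 0.12)")

print(list_tables(conn))
print(has_gene(conn, "ENSG00000000001.1"))
print(has_gene(conn, "ENSG00000227203.3"))
```

With a real model file, replace `":memory:"` with the path to the .db file and skip the two toy `execute` calls.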

Give an example of gene list?

I am trying to provide a gene list but I am getting the following error:

 ./PrediXcan.py --predict --weights weights/gtex_v7_Whole_Blood_imputed_europeans_tw_0.5_signif.db --dosages genotype --samples samples.txt --genelist gene_inv8.txt --output_prefix output/inv8_001
2019-01-11 17:42:01.299286 Preloading weights...
2019-01-11 17:42:02.075148 Processing chr8.dosage.txt.gz
Traceback (most recent call last):
  File ".../PrediXcan.py", line 230, in <module>
    main()
  File ".../PrediXcan.py", line 211, in main
    transcription_matrix.update(gene, weight, ref_allele, allele, dosage_row)
  File "...PrediXcan.py", line 99, in update
    self.gene_list = self.get_gene_list()
  File ".../PrediXcan.py", line 93, in get_gene_list
    return list(sorted([line.strip().split()[-1] for line in open(self.gene_file)]))
IndexError: list index out of range

I am not sure what causes the error. I tried the following two gene list formats:

chr8 ENSG00000227203.3
...

8 ENSG00000227203.3
...

Neither of them worked.

Could you provide an example of a gene list so I can see the expected format?

I obtained the Ensembl_id_version name by using biomart.
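
The expression shown in the traceback takes the last whitespace-separated token of every line in the gene file, so any blank line (for example a trailing empty line) produces an empty list and raises this IndexError. Whether that is what happened here is a guess. The sketch below reproduces the failure and shows a tolerant reader; `read_gene_list` is a hypothetical helper, not part of PrediXcan.

```python
def read_gene_list(lines):
    """Take the last whitespace-separated token of each non-blank line."""
    genes = []
    for line in lines:
        fields = line.strip().split()
        if not fields:
            continue  # a blank line would make fields[-1] raise IndexError
        genes.append(fields[-1])
    return sorted(genes)


# A gene list with one entry followed by a blank line
lines = ["chr8 ENSG00000227203.3\n", "\n"]

# The expression from the traceback fails on the blank line
try:
    sorted(line.strip().split()[-1] for line in lines)
except IndexError as err:
    print("IndexError:", err)

print(read_gene_list(lines))  # ['ENSG00000227203.3']
```

If this is the cause, removing empty lines from the gene list file should be enough; both formats tried above yield the same gene id as the last token.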

PredictDB 1KG Data Availability

Dear PrediXcan Team,

I am trying to access the GTEx V6p 1000 genomes prediction databases from PredictDB. Whenever I try to download any of the .db or text files, I get an XML page with an error that says "access denied". Do you know what could be preventing me from accessing this data? Thank you in advance for your time! I can provide additional details if necessary.

Best Regards,
Zane

PrediXcan.py --predict issue

Hi,

I used the "--predict" function of PrediXcan.py to generate predicted gene expression. The first five runs were successful, but on the sixth run, the log information displayed only '2023-10-16 10:59:09.466391 Preloading weights...'. The program appeared to be running without progressing to subsequent steps, and no warning or error messages were generated. I would appreciate any suggestions or insights regarding this issue.

Best regards,
Zhenyao

Gene_list Attribute Error

I am trying to predict expression levels using the whole blood DB and got an AttributeError.
I have my files in the right place and have no idea what is wrong with my command.

Can anyone help me with this?

yjyang@HGCNT45:~/OFC/PrediXcan$ python PrediXcan.py --predict --dosages /home/yjyang/OFC/StudyData/GenotypeFiles/New_Impute/dosage --dosages_prefix cleft.chr --samples sample.txt --weights /home/yjyang/OFC/PrediXcan/GTEx-V6p-HapMap-2016-09-08/TW_Whole_Blood_0.5.db --output_prefix /home/yjyang/OFC/PrediXcan/Output/WB
2018-01-16 13:07:21.964476 Preloading weights...
Traceback (most recent call last):
  File "PrediXcan.py", line 230, in <module>
    main()
  File "PrediXcan.py", line 212, in main
    transcription_matrix.save(PRED_EXP_FILE)
  File "PrediXcan.py", line 117, in save
    outfile.write('FID\t' + 'IID\t' + '\t'.join(self.gene_list) + '\n') # Nb. this lists the names of rows, not of columns
AttributeError: TranscriptionMatrix instance has no attribute 'gene_list'
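
Note that the log above contains no "Processing ..." line, which may mean that no dosage file was picked up at all. A quick check, assuming files are selected by a name prefix and a `.gz` ending (an assumption about the script, not confirmed here), could look like the sketch below. The helper name and the file names are hypothetical.

```python
import os
import tempfile


def matching_dosage_files(dosage_dir, prefix, suffix=".gz"):
    """List files in dosage_dir whose names start with prefix and end with suffix."""
    return sorted(
        name for name in os.listdir(dosage_dir)
        if name.startswith(prefix) and name.endswith(suffix)
    )


# Toy directory with invented file names
with tempfile.TemporaryDirectory() as d:
    for name in ("study.chr1.txt.gz", "study.chr2.txt.gz", "notes.txt"):
        open(os.path.join(d, name), "w").close()
    hits = matching_dosage_files(d, "study.chr")
    misses = matching_dosage_files(d, "cohort.chr")

print(hits)    # ['study.chr1.txt.gz', 'study.chr2.txt.gz']
print(misses)  # []
```

An empty result for the directory and prefix passed on the command line would be consistent with the missing "Processing" line.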

convert_plink_to_dosage.py

I've obtained this error:

AttributeError: GzipFile instance has no attribute '__exit__'

I don't know what is happening...
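
One possible cause, assuming the attribute named in the error is `__exit__` (double underscores are easily lost when a page is rendered): gzip file objects only gained support for the `with` statement in Python 2.7, so a `with gzip.open(...)` block fails this way on older interpreters. A minimal sketch of a workaround that runs on both old and new versions uses `contextlib.closing`; the file name and contents are invented.

```python
import contextlib
import gzip
import os
import tempfile

# Hypothetical output path for the example
path = os.path.join(tempfile.mkdtemp(), "example.txt.gz")

# closing() supplies the context-manager behaviour and calls close() on exit
with contextlib.closing(gzip.open(path, "wb")) as out:
    out.write(b"1 rs1 100 A G 0.5 0 1 2\n")

with contextlib.closing(gzip.open(path, "rb")) as handle:
    content = handle.read()

print(content)
```

Running the script with Python 2.7 instead is the other option, if that is acceptable in your environment.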

predict.py error

Hello,
I have a question related to the PrediXcan software.
I am trying to run the predict.py script, but it shows an error.
The command used:
python $PXCN_TOOLS/predict.py --model_db_path $MODELS/en_Whole_Blood.db --model_db_snp_key rsid --vcf_mode genotyped --vcf_genotypes $VCF_FILES/*.vcf --prediction_output $OUTPUT/GVDS_PrediXcan_Test_2021.txt
The error:
[E::bcf_hdr_parse] Could not parse the header, sample line not found
Segmentation fault
I checked the VCF files, and they do have the sample line:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT HG00105 HG00107 HG00115 HG00132 HG00145 HG00157 HG00181 HG00308 HG00365 HG00371 HG00379 HG00380 HG01789 HG01790 HG01791 HG02215 NA06985 NA07346 NA11832 NA11840 NA11881 NA11918 NA12005 NA12156 NA12234 NA12760 NA12762 NA12776 NA12813 NA18488 NA19092 NA19141 NA19143 NA19144 NA19153 NA19159 NA19184 NA19201 NA19206 NA19210 NA19214 NA20514 HG00096 HG00097 HG00099 HG00100 HG00101 HG00102 HG00103 HG00104 HG00106 HG00108 HG00109 HG00110 HG00111 HG00112 HG00114 HG00116 HG00117 HG00118 HG00119 HG00120 HG00121 HG00122 HG00123 HG00124 HG00125 HG00126 HG00127
I searched for this error on Google, and the advice is to make sure the line is tab-delimited and not space-delimited. I checked the VCF files, and the sample line is separated by tabs.
Can you help me out with this?
Thank you.
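
A small script can check the header line in every input file rather than relying on a visual inspection, since tabs and spaces look alike in most viewers. This is only a sketch: `check_vcf_header` is a hypothetical helper, it covers just the `#CHROM` line, and a passing result does not rule out other causes of the parser error.

```python
FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def check_vcf_header(lines):
    """Return (ok, message) describing the first #CHROM line found."""
    for line in lines:
        if line.startswith("##"):
            continue  # meta-information lines come before the column header
        if line.startswith("#CHROM"):
            fields = line.rstrip("\r\n").split("\t")
            if fields[:8] != FIXED_COLUMNS:
                return False, "fixed columns are not tab-separated or are misnamed"
            if len(fields) > 8 and fields[8] != "FORMAT":
                return False, "ninth column should be FORMAT"
            return True, "{} samples".format(max(len(fields) - 9, 0))
        return False, "data line reached before a #CHROM line"
    return False, "no #CHROM line found"


good = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tHG00105\tHG00107\n",
]
bad = [
    "##fileformat=VCFv4.2\n",
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT HG00105 HG00107\n",
]
print(check_vcf_header(good))  # (True, '2 samples')
print(check_vcf_header(bad))   # (False, 'fixed columns are not tab-separated or are misnamed')
```

To use it on a real file, pass an open text handle, for example `check_vcf_header(open(path))`, once per file matched by the `*.vcf` pattern.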

AttributeError: TranscriptionMatrix instance has no attribute 'gene_list'

./PrediXcan.py --predict --assoc --linear
--weights weights/Brain_Frontal_Cortex_BA9/gtex_v7_Brain_Frontal_Cortex_BA9_imputed_europeans_tw_0.5_signif.db
--dosages genotype/
--dosages_prefix "chr22.dosage.txt.gz"
--samples genotype/chr22.txt
--pheno phenotype/phenotype.txt
--output_prefix results/chr22

2018-05-03 15:28:56.746566 Preloading weights...
2018-05-03 15:28:56.970715 Processing chr22.dosage.txt.gz
Traceback (most recent call last):
  File "./PrediXcan.py", line 230, in <module>
    main()
  File "./PrediXcan.py", line 212, in main
    transcription_matrix.save(PRED_EXP_FILE)
  File "./PrediXcan.py", line 117, in save
    outfile.write('FID\t' + 'IID\t' + '\t'.join(self.gene_list) + '\n') # Nb. this lists the names of rows, not of columns
AttributeError: TranscriptionMatrix instance has no attribute 'gene_list'
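
Another traceback on this page shows `self.gene_list` being assigned inside `update`, while `save` reads it. If `update` is never called, for instance because no SNP in the dosage file matches a SNP in the weights database, the attribute would not exist when `save` runs. That reading is an inference from the tracebacks, not a confirmed diagnosis. The sketch below reproduces the pattern with a hypothetical stand-in class, not the real TranscriptionMatrix.

```python
class MatrixSketch(object):
    """Hypothetical stand-in that creates gene_list lazily, on first update."""

    def update(self, gene):
        if not hasattr(self, "gene_list"):
            self.gene_list = []  # first created here, not in __init__
        if gene not in self.gene_list:
            self.gene_list.append(gene)

    def header(self):
        # Fails with AttributeError if update() was never called
        return "FID\tIID\t" + "\t".join(self.gene_list)


empty = MatrixSketch()
try:
    empty.header()
    failed = False
except AttributeError:
    failed = True
print(failed)  # True

used = MatrixSketch()
used.update("ENSG00000227203.3")
print(used.header())
```

Under that assumption, it is worth checking that the SNP identifiers in the dosage file use the same naming as the model (for example rsIDs in both).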

How to handle the missing dosages?

Hi PrediXcan guys,
How should missing dosages be handled when creating the PrediXcan-format dosage files? If a value is 'NA', the script throws an error saying it cannot convert the string 'NA' to float.
Thanks
Veera
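
One option is to fill the gaps before running the script. The sketch below replaces missing entries for a SNP with the mean of that SNP's observed dosages. Mean imputation is a choice made for this example and is not something the PrediXcan documentation on this page prescribes; dropping the SNP or imputing genotypes upstream are alternatives. `impute_missing` is a hypothetical helper that operates only on the per-sample dosage fields of one row.

```python
def impute_missing(dosage_fields, missing="NA"):
    """Replace missing entries with the mean of the observed dosages."""
    observed = [float(x) for x in dosage_fields if x != missing]
    if not observed:
        raise ValueError("no observed dosages for this SNP")
    mean = sum(observed) / len(observed)
    return [mean if x == missing else float(x) for x in dosage_fields]


# One SNP, four samples, one missing value
print(impute_missing(["0", "NA", "2", "1"]))  # [0.0, 1.0, 2.0, 1.0]
```

Apply it to the dosage columns of each row and write the row back with numeric values, leaving the leading SNP annotation columns untouched.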
