Comments (2)
Hi Tota,
Thank you for your question.
Your code seems fine overall.
Let me just present a tidier approach that:
- Relies on `dplyr::left_join()` to join tables. This is safer than assuming that rows match by position across tables; in general they will not, although in your case it happens to work.
- Obtains genes from the `genes` table in the `associations` object, not from the `genomic_contexts` table in the `variants` object. Results should be similar, but the `genes` table reflects the genes annotated to that variant by the authors of the study, whereas the genes listed in `genomic_contexts` are generated by the GWAS Catalog team's automatic workflows. So why am I getting the genes from the `genes` table? Just to show you that you could do it differently. Stick with your approach if you prefer the annotation by the GWAS Catalog team over the original authors'.
- Keeps all gene names associated with a locus, instead of only the first one.
With regard to efficiency, both methods are equivalent. The bottleneck is always the retrieval of the data from the GWAS Catalog. In your case we need to get associations and variants. The rest is just wrangling.
Feel free to ask more questions! Happy coding.
library(gwasrapidd)
study_id <- "GCST004132"
associations <- get_associations(study_id = study_id)
variants <- get_variants(study_id = study_id)
# Because there can be more than one gene associated with a locus,
# collapse the gene names into a single space-separated string
genes <- associations@genes %>%
  dplyr::group_by(association_id, locus_id) %>%
  dplyr::summarise(gene_name = paste(gene_name, collapse = ' '), .groups = 'drop')

# Join risk alleles, collapsed gene names and variant details by key,
# then keep and rename only the columns of interest
association_results <-
  associations@associations %>%
  dplyr::select(association_id, pvalue, beta_number, or_per_copy_number) %>%
  dplyr::left_join(associations@risk_alleles, by = 'association_id') %>%
  dplyr::left_join(genes, by = c('association_id', 'locus_id')) %>%
  dplyr::left_join(variants@variants, by = c('variant_id')) %>%
  dplyr::transmute(
    study_id = study_id,
    association_id = association_id,
    ID = variant_id,
    CHROM = chromosome_name,
    POS = chromosome_position,
    risk_allele = risk_allele,
    gene_name = gene_name,
    P = pvalue,
    beta = beta_number,
    OR = or_per_copy_number
  )
association_results
#> # A tibble: 119 × 10
#> study_id association_id ID CHROM POS risk_allele gene_name P beta
#> <chr> <chr> <chr> <chr> <int> <chr> <chr> <dbl> <dbl>
#> 1 GCST0041… 19144286 rs34… 1 1.60e8 G SLAMF8 1e- 6 NA
#> 2 GCST0041… 19144332 rs25… 3 5.31e7 C intergen… 6e- 9 NA
#> 3 GCST0041… 19144360 rs56… 3 1.89e8 C LPP 6e-10 NA
#> 4 GCST0041… 19144385 rs80… 13 4.23e7 C AKAP11 4e- 8 NA
#> 5 GCST0041… 19144411 rs48… 22 3.69e7 C NCF4 2e- 8 NA
#> 6 GCST0041… 19144456 rs10… 16 8.28e7 A CDH13 1e- 9 NA
#> 7 GCST0041… 19713859 rs14… 7 5.03e7 <NA> C7orf72 … 9e-12 NA
#> 8 GCST0041… 19713864 rs12… 17 4.24e7 <NA> NAGLU ST… 2e-11 NA
#> 9 GCST0041… 19713869 rs51… 19 4.87e7 <NA> IZUMO1 N… 4e-11 NA
#> 10 GCST0041… 19713874 rs10… 2 4.36e7 <NA> THADA ZF… 4e-11 NA
#> # … with 109 more rows, and 1 more variable: OR <dbl>
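If you want to try the gene-collapsing step in isolation, the following sketch reproduces it on a small toy table. The identifiers and gene names are hypothetical placeholders; only the `group_by()` / `summarise()` pattern mirrors the code above.

```r
library(dplyr)

# Hypothetical gene table: association "x1" has two genes at the same locus.
toy_genes <- tibble::tibble(
  association_id = c("x1", "x1", "x2"),
  locus_id = c(1L, 1L, 1L),
  gene_name = c("GENE_A", "GENE_B", "GENE_C")
)

# One row per (association_id, locus_id), with all gene names
# pasted together and separated by a space.
collapsed <- toy_genes %>%
  group_by(association_id, locus_id) %>%
  summarise(gene_name = paste(gene_name, collapse = " "), .groups = "drop")

collapsed
```

Collapsing before joining keeps one row per association and locus, so the later `left_join()` does not multiply rows when a locus has several genes.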
Thank you very much. When I said more efficient I meant tidier, so this is perfect and exactly what I wanted!