myles-lewis / locuszoomr
A pure R implementation of locuszoom for plotting genetic data at genomic loci accompanied by gene annotations.
License: GNU General Public License v3.0
Hello,
I am using the latest version (locuszoomr_0.1.5 via devtools::install_github("myles-lewis/locuszoomr")) and encountered the following error when trying to run locus_ggplot:
Error in `check_breaks_labels()`:
! `breaks` and `labels` must have the same length
The interesting thing is that locus_plot and locus_plotly work without any issue, so I don't think anything is "wrong" with the way I have prepared my data.
I have uploaded a small, randomized dataset (>50kb) with which one can reproduce the error:
https://we.tl/t-eBpXs3UVAh
tmp <- data.table::fread("path/to/file")
loc_test <- locus(
data = as.data.frame(tmp),
chrom = "Chr",
pos = "bp",
p = "pC",
labs = "SNP",
seqname = chr,
flank = 0,
xrange = c(start, stop),
ens_db = "EnsDb.Hsapiens.v75",
LD = "1_247603584_C_T"
)
locus_plot(loc_test)   # works
locus_plotly(loc_test) # works
locus_ggplot(loc_test) # doesn't work
Thank you very much in advance!
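For context on the error text quoted above: ggplot2 raises this message when a scale is built with atomic `breaks` and `labels` vectors of different lengths. The sketch below only reproduces the message in isolation with made-up values; where (or whether) locus_ggplot ends up in that situation for this dataset is not established here.

```r
library(ggplot2)

# Three break positions but only two labels: the lengths disagree,
# so constructing the scale is expected to fail.
msg <- tryCatch(
  {
    scale_x_continuous(breaks = c(1, 2, 3), labels = c("a", "b"))
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Show the captured error message.
print(msg)
```

Comparing the lengths of the axis breaks and labels computed for the locus may therefore be a reasonable first thing to check.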
I'm getting the following error message when using the link_recomb function:
APP.recomb <- link_recomb(APP.test)
Retrieving recombination data from UCSC
Error in normArgTable(value, x) : Table 'recomb1000GAvg' is unavailable
Hello,
I am currently running a colocalization analysis between two traits. For this, I like to stack the plots on top of one another. This works perfectly with the example in your vignette:
library(locuszoomr)
data(SLE_gwas_sub) ## limited subset of data from SLE GWAS
library(EnsDb.Hsapiens.v75)
loc <- locus(gene = 'UBE2L3', SLE_gwas_sub, flank = 1e5, ens_db = "EnsDb.Hsapiens.v75")
set_layers(2)
scatter_plot(loc)
scatter_plot(loc)
genetracks(loc)
However, in my case, I need to use gg_scatter instead of the scatter_plot option, because I need to make some slight modifications to my plot (I want to highlight certain points with circles, which is much easier to do with a ggplot object). The problem I am running into is that when I have two ggplot objects, I can't properly align them above the gene tracks. Here is my code:
library(patchwork)
p1 <- gg_scatter(loc)
p2 <- gg_scatter(loc)
genes <- gg_genetracks(loc)
wrap_plots(p1,p2,genes, ncol=1)
Do you have any ideas how to accomplish this? I think it would be incredibly helpful because it would very much increase the flexibility of the package.
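One general way to keep stacked ggplot panels on a common x range is to apply the same `coord_cartesian(xlim = ...)` to every panel before passing them to `wrap_plots()`. The sketch below uses toy ggplot objects only; the data frames, the range `xr`, and the panel heights are hypothetical, and whether this resolves the misalignment with the output of gg_scatter and gg_genetracks is an untested assumption.

```r
library(ggplot2)
library(patchwork)

set.seed(1)
xr <- c(21.9e6, 22.1e6)  # hypothetical shared genomic window

# Toy association data and toy gene intervals (made-up values).
toy <- data.frame(pos = runif(50, xr[1], xr[2]), logP = rexp(50))
toy_genes <- data.frame(
  start = c(21.92e6, 22.00e6),
  end   = c(21.98e6, 22.08e6),
  row   = c(1, 2)
)

p1 <- ggplot(toy, aes(pos, logP)) + geom_point()
p2 <- ggplot(toy, aes(pos, logP)) + geom_point(colour = "steelblue")
genes <- ggplot(toy_genes) +
  geom_segment(aes(x = start, xend = end, y = row, yend = row))

# Force the same x limits on every panel, then stack them in one column.
panels <- lapply(list(p1, p2, genes), function(p) p + coord_cartesian(xlim = xr))
combined <- wrap_plots(panels, ncol = 1, heights = c(2, 2, 1))
```

Printing `combined` draws the three panels with their plotting areas aligned and the same x limits.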
Any chance you can include an option in the link_LD() function to choose the genome build, as LDlinkR now supports multiple builds? Thanks!
I did notice the other recent issue with the link_recomb function, but that seems to be separate from the issue I have encountered.
Essentially, after running the following:
analysis_locus = locus(index_snp = currrsid, data = plot_frame, flank = 5e5,
chrom = "chr", pos = "pos", p = "p", ens_db = ensDb_v111)
analysis_locus_ld = link_LD(analysis_locus, token = "<MYTOKEN>")
analysis_locus_ld = link_recomb(analysis_locus_ld, genome = "hg38")
I got this in return:
Retrieving recombination data from UCSC
Warning messages:
1: In curlSetOpt(..., .opts = .opts, curl = h, .encoding = .encoding) :
Error setting the option for # 3 (status = 43) (enum = 81) (value = 0x3b408e000): A libcurl function was given a bad argument CURLOPT_SSL_VERIFYHOST no longer supports 1 as value!
2: In curlSetOpt(..., .opts = .opts, curl = h, .encoding = .encoding) :
Error setting the option for # 3 (status = 43) (enum = 81) (value = 0x3bf625180): A libcurl function was given a bad argument CURLOPT_SSL_VERIFYHOST no longer supports 1 as value!
Making the LocusZoom plot with locus_plot still worked as expected and produced the same plots as before, with a "Recombination %" axis. However, there were no blue recombination rate lines within the plot.
Incidentally, this package is excellent! Really nice to use, thanks so much!
Hello, I appreciate this great tool.
I have several issues when drawing plots with locuszoomr and arranging them with other plots using cowplot.
First, when I try to get the recombination rate, it always returns an error.
Here is my code.
> loc <- locus(data = tb_1, seqname = 5, xrange = c(96207000, 96259000), flank = 1e5, ens_db = "EnsDb.Hsapiens.v75")
chromosome 5, position 96207000 to 96259000
186 SNPs/datapoints
> loc <- link_LD(loc, token = my_token)
Obtaining LD on 186 SNPs
LDlink server is working...
> tb_1$r2 <- loc$data$ld
> loc <- link_recomb(loc, genome = "hg19")
Retrieving recombination data from UCSC
Error in .local(object, ...) : tabular output format not available
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] cowplot_1.1.1 EnsDb.Hsapiens.v75_2.99.0 ensembldb_2.10.2 AnnotationFilter_1.10.0 AnnotationHub_2.18.0
[6] BiocFileCache_1.10.2 dbplyr_2.1.1 locuszoomr_0.2.1 GenomicFeatures_1.38.2 AnnotationDbi_1.48.0
[11] Biobase_2.46.0 GenomicRanges_1.38.0 GenomeInfoDb_1.22.1 IRanges_2.20.2 S4Vectors_0.24.4
[16] ggbio_1.34.0 ggplot2_3.5.0 BiocGenerics_0.32.0
loaded via a namespace (and not attached):
[1] colorspace_2.0-2 ellipsis_0.3.2 biovizBase_1.34.1 htmlTable_2.4.1
[5] XVector_0.26.0 base64enc_0.1-3 dichromat_2.0-0.1 rstudioapi_0.15.0
[9] farver_2.1.0 remotes_2.4.2.1 ggrepel_0.9.1 bit64_4.0.5
[13] interactiveDisplayBase_1.24.0 fansi_0.5.0 splines_3.6.3 cachem_1.0.5
[17] knitr_1.45 jsonlite_1.8.8 Formula_1.2-5 Rsamtools_2.2.3
[21] cluster_2.1.4 png_0.1-7 graph_1.64.0 shiny_1.6.0
[25] BiocManager_1.30.22 compiler_3.6.3 httr_1.4.7 backports_1.2.1
[29] assertthat_0.2.1 Matrix_1.5-1 fastmap_1.1.0 lazyeval_0.2.2
[33] cli_3.6.2 later_1.2.0 htmltools_0.5.7 prettyunits_1.1.1
[37] tools_3.6.3 gtable_0.3.0 glue_1.7.0 GenomeInfoDbData_1.2.2
[41] reshape2_1.4.4 dplyr_1.1.4 rappdirs_0.3.3 Rcpp_1.0.7
[45] vctrs_0.6.5 Biostrings_2.54.0 rtracklayer_1.46.0 xfun_0.42
[49] stringr_1.5.1 mime_0.11 lifecycle_1.0.4 XML_3.99-0.3
[53] zoo_1.8-9 zlibbioc_1.32.0 scales_1.3.0 gggrid_0.2-0
[57] BSgenome_1.54.0 VariantAnnotation_1.32.0 promises_1.2.0.1 hms_1.1.0
[61] ProtGenerics_1.18.0 SummarizedExperiment_1.16.1 RBGL_1.62.1 LDlinkR_1.3.0
[65] RColorBrewer_1.1-2 yaml_2.2.1 curl_5.2.1 memoise_2.0.1
[69] gridExtra_2.3 biomaRt_2.42.1 rpart_4.1.19 latticeExtra_0.6-29
[73] stringi_1.7.3 RSQLite_2.3.1 BiocVersion_3.10.1 checkmate_2.3.1
[77] BiocParallel_1.20.1 rlang_1.1.3 pkgconfig_2.0.3 matrixStats_0.59.0
[81] bitops_1.0-7 lattice_0.20-45 purrr_1.0.2 GenomicAlignments_1.22.1
[85] ggstats_0.5.1 htmlwidgets_1.5.3 labeling_0.4.2 bit_4.0.4
[89] tidyselect_1.2.0 GGally_2.2.1 plyr_1.8.6 magrittr_2.0.3
[93] R6_2.5.1 generics_0.1.0 Hmisc_4.7-2 DelayedArray_0.12.3
[97] DBI_1.1.1 pillar_1.9.0 foreign_0.8-81 withr_3.0.0
[101] survival_3.5-5 RCurl_1.98-1.3 nnet_7.3-19 tibble_3.2.1
[105] crayon_1.4.1 utf8_1.2.1 plotly_4.10.4 OrganismDbi_1.28.0
[109] jpeg_0.1-8.1 progress_1.2.2 grid_3.6.3 data.table_1.14.0
[113] blob_1.2.1 digest_0.6.34 xtable_1.8-4 httpuv_1.6.1
[117] tidyr_1.3.1 openssl_2.1.1 munsell_0.5.0 viridisLite_0.4.0
[121] askpass_1.1
How can I fix this? Can I get the recombination rate from UCSC myself and insert it into the data to visualize it?
Second, I'm trying to align the regional plot from locuszoomr and the gene model from ggbio using cowplot. Aligning the y-axes was successful, but the x-axis ranges didn't match. Is there any option to control xlim in the ggplot version of locuszoomr?
The following code was used to align the plots.
loc <- locus(
data = tb,
seqname = 5,
xrange = c(96207000, 96259000),
fix_window = 0,
ens_db = "EnsDb.Hsapiens.v75"
)
p1 <- gg_scatter(loc)
p2 <- gg_genetracks(loc)
# load window
df <- read.table(
"deletion_window.bed",
sep = "\t",
header = FALSE,
stringsAsFactors = FALSE
)
bed <- sapply(
split(df, df$V4),
function(i) {
GRanges(
seqnames = i$V1,
ranges = IRanges(
start = i$V2,
end = i$V3,
names = i$V4
)
)
}
)
# load gene model
txdb <- makeTxDbFromGFF(
"gene_model.gtf",
format = "gtf"
)
p3 <- autoplot(txdb, which = bed$gene)
plot_grid(p1, p2, p3@ggplot, ncol = 1, align = "v")
This is the resulting plot.
Hey Myles,
it's me (yet again)...
I am currently using AnnotationHub to get the latest ens_db object for GRCh38. I noticed that for some loci, I get the message "No genes to plot". However, I know for a fact that there are genes in the region. I am also able to manually search the ens_db object and list the genes that should be plotted.
This is the code to reproduce the issue:
ah <- AnnotationHub()
# this is the latest ensembl annotation
# query(ah, c("EnsDb", "v109"))
ensDB_v109 <- ah[["AH109606"]]
x <- fread("example.tsv")
chr=1
start = 247416156 - 110e3 # these are different from the coloc_susie script because we use hg38 here
stop = 247449108 + 110e3
tmp_loc <- locus(
data =x,
seqname = "chr1",
chrom = "Chromosome_phe",
xrange = c(start, stop),
flank = 0,
p = "Nominal_pvalue",
pos = "Pos_variant",
ens_db = ensDB_v109,
labs = "gene_snp")
locus_plot(tmp_loc) # -> No genes to plot
# Manually look up the ensDB object:
genes(ensDB_v109) %>% as.data.frame() %>%
filter(start > 247416156 - 110e3) %>%
filter(start < 247449108 + 110e3) %>% View() # shows the genes that I expect to be in the region.
The example data can be downloaded here:
https://we.tl/t-wgfpRz9aHd
As always, any insights are much appreciated!
Cheers,
Nick
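A note on the manual lookup in the post above: it filters on `start` only, so it returns genes from every chromosome whose start falls in that coordinate range. A minimal sketch of a check that also restricts on chromosome is shown below, using a toy data frame in place of `as.data.frame(genes(ensDB_v109))`. The gene names and coordinates are hypothetical, and `overlaps_window` is an illustrative helper, not part of locuszoomr or ensembldb. Ensembl annotations name chromosomes without the "chr" prefix (for example "1"), so the second call illustrates how a "chr1" label would match nothing; whether that explains the "No genes to plot" message here is an assumption, not a confirmed diagnosis.

```r
# Toy stand-in for as.data.frame(genes(ensDB_v109)); all values are hypothetical.
genes_df <- data.frame(
  seqnames  = c("1", "1", "2", "1"),
  start     = c(247400000, 247500000, 247450000, 100000),
  end       = c(247420000, 247520000, 247460000, 120000),
  gene_name = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  stringsAsFactors = FALSE
)

# Window used in the post above.
win_start <- 247416156 - 110e3
win_end   <- 247449108 + 110e3

# Hypothetical helper: keep genes on one chromosome that overlap [from, to].
overlaps_window <- function(df, chrom, from, to) {
  df[df$seqnames == chrom & df$end >= from & df$start <= to, , drop = FALSE]
}

hits     <- overlaps_window(genes_df, "1", win_start, win_end)     # Ensembl-style name
hits_chr <- overlaps_window(genes_df, "chr1", win_start, win_end)  # UCSC-style name

print(hits$gene_name)
print(nrow(hits_chr))
```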
Hi Myles,
I've encountered a syntax error while using the locus function, and I'm having trouble diagnosing the problem. The error occurs when I execute the following code:
loc <- locus(data = gwas_select, seqname = chr, xrange = range,ens_db = "EnsDb.Hsapiens.v75")
Chromosome 9, position 91453236 to 91553236
1214 SNPs/datapoints
Error: near ")": syntax error
I have reviewed the gwas_select object, and it appears to be consistent with other objects that I have successfully processed without encountering this issue. For your reference, here is the output for the chr and range variables:
> chr
[1] 9
> range
[1] 91453236 91553236
and the gwas object is gwas.txt
Could you please help me identify what might be causing this syntax error? Any insights or suggestions you could provide would be greatly appreciated.
Thank you!
Xiaotong
Dear Myles,
I really fell in love with your package over the last few weeks, thanks to its great usability, its documentation and also your support! There is just one small thing that would make this package stand out even more: including the recombination rate in the locus zoom plot as a secondary axis. I am imagining something similar to the code from gassocplot2 (https://github.com/jrs95/gassocplot2/blob/master/R/figures.R).
Are there any plans to implement such a functionality in the future?
Cheers!