david-barnett / microviz

R package for microbiome data visualization and statistics. Uses phyloseq, vegan and the tidyverse. Docker image available.

Home Page: https://david-barnett.github.io/microViz/

License: GNU General Public License v3.0

R 99.11% TeX 0.86% Dockerfile 0.04%

microbiome microbiome-analysis microbiota rstats r

microviz's Issues

ord_explore shape errors

part of JOSS review issue #25

Shape gives cryptic error (from ggiraph) when set to some variables:

ggiraph interactive plots cannot handle composite shapes
solution is to manually specify palette without composite shapes
- to include shapes of only one type this is limited to 5 shapes: circle filled, triangle filled, square filled, diamond filled, triangle down filled
- mapping NA to: circle open (instead of blank)
- identify variables with only max 5 unique values (and possibly NA)
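
The constraints listed above can be sketched in base R. This is only an illustration under assumptions: shape_candidates() and shape_palette() are hypothetical helper names, not microViz functions, and the shape names are the standard ggplot2 shape names.

```r
# Hypothetical helper: names of variables with 1 to `max_shapes` distinct non-NA values
shape_candidates <- function(df, max_shapes = 5) {
  n_unique <- vapply(df, function(x) length(unique(x[!is.na(x)])), integer(1))
  names(n_unique)[n_unique >= 1 & n_unique <= max_shapes]
}

# Hypothetical helper: fixed palette using only non-composite, filled shapes
shape_palette <- function(x) {
  filled <- c("circle filled", "triangle filled", "square filled",
              "diamond filled", "triangle down filled")
  lvls <- sort(unique(as.character(x[!is.na(x)])))
  stopifnot(length(lvls) <= length(filled))
  stats::setNames(filled[seq_along(lvls)], lvls)
}

# Toy sample data: `sex` has 2 levels (plus NA), `site` has 7
meta <- data.frame(
  sex = c("F", "M", NA, "F", "M", "F", "M"),
  site = c("a", "b", "c", "d", "e", "f", "g"),
  stringsAsFactors = FALSE
)
ok_vars <- shape_candidates(meta)
sex_shapes <- shape_palette(meta$sex)
```

The open circle for NA values could then be supplied separately, for example through the na.value argument of ggplot2::scale_shape_manual().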

easy way to generate fixed palette for taxa

useful for comp_barplot (and ord_plot_iris)
currently too tricky for a user to manually specify a palette, but being able to generate a template vector for further modification should help a lot
should be easy enough with tax_top and distinct_palette
could be called tax_palette?
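
A minimal sketch of what such a template vector could look like, assuming the top taxa names are already available as a character vector (e.g. from tax_top). tax_palette_sketch() is a hypothetical name, and base R's hcl.colors() stands in for distinct_palette here.

```r
# Hypothetical helper: named colour vector for taxa, plus an "other" entry
tax_palette_sketch <- function(taxa, other = "lightgrey") {
  taxa <- unique(as.character(taxa))
  cols <- grDevices::hcl.colors(length(taxa), palette = "Dark 3")
  c(stats::setNames(cols, taxa), other = other)
}

top_taxa <- c("Bacteroides", "Faecalibacterium", "Prevotella")
pal <- tax_palette_sketch(top_taxa)

# The user can then modify individual entries by name
pal["Prevotella"] <- "#000000"
```

Because the vector is named, the same taxon keeps the same colour across plots regardless of the order in which taxa appear.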

Statement of need

Hi David,

Upon reading through your statement of need, I had some questions. I went by each statement to examine the novelty added by your tool, and maybe you can comment on these:

Statement of need: microViz extends the set of distance measures (including e.g. Generalized UniFrac and Aitchison) of phyloseq and microbiome
- Question: phyloseq has 44 supported distance methods, including UniFrac. What does microViz provide that is not already included, and what additional uses do those distance measures have?
Statement of need: microViz makes use of further ordination calculation and visualization methods more accessible, such as (distance-based) redundancy analysis (RDA) and canonical correspondence analysis (CCA)
- Question: the microbiome R package has options for RDA and CCA. What additional functionality does microViz provide?
Statement of need: microViz generates bi-plots and tri-plots with the ggplot2 R package
- Question: the other standard microbiome packages generate bi-plots. What additional functionality does microViz provide?

ps_dedupe removes all rows when using multiple variables

Hi David,

Thank you for sharing this package, it has a lot of really helpful functions for working with phyloseq objects! It's made my life a lot easier in the past couple of weeks.

I've come across an issue while using ps_dedupe to prune my dataset. When I dedup by Lab ID (single variable), it seems to work as expected, leaving me with one entry per Lab ID. I then try to dedup by multiple variables (Individual ID and date, i.e. vars = c("Individual.ID", "year", "month", "day")), and it seems to be removing ALL rows from groups with >1 row, rather than leaving me just one row per Individual/date. I'm not sure if this is user error or something in the code, but I wondered if anyone else has had the same issue in the past.

#dedup LabID
ps.sp <- ps_dedupe(ps.sp, "LabID", method = "readcount")
nsamples(ps.sp)
test1 <- sample_data(ps.sp)

#dedup by Ind and date
ps.sp2 <- ps_dedupe(ps.sp, vars = c("Individual.ID", "year", "month", "day"), method = "readcount")
nsamples(ps.sp2)
test2 <- sample_data(ps.sp2)

I can send an .Rdata file with ps.sp separately, in case reproducing the issue will help. Any thoughts on what might be happening would be very appreciated!

Thank you,
Amy
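
For reference, the behaviour expected above (one row kept per combination of variables, choosing the row with the most reads) can be sketched on a plain data frame in base R. This is not the ps_dedupe implementation; dedupe_by() and the `reads` column are hypothetical names used only for illustration.

```r
# Hypothetical helper: keep the highest-readcount row per combination of `vars`
dedupe_by <- function(df, vars, readcount = "reads") {
  # Sort so the row with the most reads comes first within each group
  df <- df[order(df[[readcount]], decreasing = TRUE), , drop = FALSE]
  # duplicated() on a data frame compares whole rows, i.e. all `vars` jointly
  keep <- !duplicated(df[, vars, drop = FALSE])
  df[keep, , drop = FALSE]
}

samples <- data.frame(
  Individual.ID = c("A", "A", "B", "B", "C"),
  year = c(2020, 2020, 2020, 2021, 2020),
  reads = c(100, 250, 300, 50, 75)
)
kept <- dedupe_by(samples, vars = c("Individual.ID", "year"))
```

Comparing nrow(kept) with the number of distinct variable combinations is a quick way to check whether a deduplication step removed whole groups instead of leaving one row per group.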

Fast Unifrac

Hey,

(great package and awesome tutorial!)

It would be great to have a fast version of calculating unifrac distances...

Cannot install from github, gives timeout error.

Following the installation instructions I typed the command:

> devtools::install_github("david-barnett/microViz@<version>")

Which gives the output:

Error: Failed to install 'unknown package' from GitHub:
  Timeout was reached: [api.github.com] Resolving timed out after 10000 milliseconds

I already tried with the libcurl and wininet methods, but that doesn't seem to work.

ord_get() extract Eigenvalues for all the samples

Hi,

I'm trying to extract eigenvalues for every sample with the ord_get() function, so that I can use the PC1 values of each sample to build an lm(diagnosis ~ PC1) model and get a p value for the PCA plot. But when I run ord_get() on a ps_extra object, I only get:

Call: rda(formula = OTU ~ 1, data = data)

              Inertia Rank
Total           309.8     
Unconstrained   309.8  116
Inertia is variance 

Eigenvalues for unconstrained axes:
  PC1   PC2   PC3   PC4   PC5   PC6   PC7   PC8 
50.56 26.64 20.40 14.73 11.55 10.95  9.00  8.43 
**(Showing 8 of 116 unconstrained eigenvalues)**

So I wondered: how can I get the PC values of the other 115 samples? Or do you have any suggestions on how to get a p value for a PCA plot?

Thanks!
Leran
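
The printed output above lists eigenvalues, which exist once per axis (PC1, PC2, ...), whereas the per-sample values asked for are the sample scores on each axis. A minimal base R sketch of the difference, using stats::prcomp on simulated data rather than a microViz or vegan object (all data and variable names here are made up for illustration):

```r
set.seed(1)
# Simulated abundance-like matrix: 20 samples (rows) by 6 taxa (columns)
abund <- matrix(
  rnorm(20 * 6), nrow = 20,
  dimnames = list(paste0("s", 1:20), paste0("taxon", 1:6))
)
diagnosis <- factor(rep(c("control", "case"), each = 10))

pca <- stats::prcomp(abund, center = TRUE, scale. = FALSE)

eigenvalues <- pca$sdev^2  # one value per axis
sample_scores <- pca$x     # one row per sample, one column per axis

# Test whether PC1 scores differ by diagnosis
dat <- data.frame(PC1 = sample_scores[, "PC1"], diagnosis = diagnosis)
fit <- stats::lm(PC1 ~ diagnosis, data = dat)
coefs <- summary(fit)$coefficients
```

For a vegan ordination object such as the rda result printed above, vegan::scores() with display = "sites" should return the corresponding sample scores.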

extract ps_cor as helper inside cor_heatmap

produces matrix
can be isolated from rest of cor_heatmap
might be useful standalone?

Beta-binomial regression example error: Error in (function (formula, phi.formula, data, link = "logit", phi.link = "logit", : Model could not be optimized! Try changing initializations or simplifying your model.

Hi,

When I try this example code:

ps0 %>% 
  tax_fix() %>% 
  tax_prepend_ranks() %>% 
  tax_filter(min_prevalence = 0) %>% 
  taxatree_models(
    type = corncob::bbdml, 
    ranks = c("Phylum", "Class", "Order", "Family"),
    variables = c("Treatment","Sample_Site","Time")
  
  )

I got this error message:

Error in (function (formula, phi.formula, data, link = "logit", phi.link = "logit",  : 
  Model could not be optimized! Try changing initializations or simplifying your model.

I'm not sure what this error means.

Thanks!
Leran

ggiraph interactivity on ord_explore

better selection
2 linked ord plots?
interactive bars?

Installation from official source

Congratulations on this cool package. The only thing I could wish for is to have it installable from an official repository.
Is this planned?

I managed to build it but I want to make sure that others can also build it easily.

Alternatively, can I somehow share the compiled binary with my colleagues?

comp_barplot() option not to scale to 1

Hi David,

Thanks for the awesome package. I was wondering if comp_barplot() could skip scaling the bars to 1. The rationale is to be able to plot only selected ASVs/taxa while showing their actual proportions.

I could not find an option for this among the current arguments.

Thanks.

heat_grid default lwd of 0.5 not respected on png device

?png indicates that lwd parameter of less than 1 will be treated as 1, and that 1 pixel is also a minimum width.
Large heatmaps (many cells) look better with lwd < 1, so this is an annoying issue.

png device is default for R markdown html output, on Windows at least, it seems.
The solution is to use another device, such as CairoPNG or ragg_png.

It seems the default device cannot be changed for rmd notebook inline chunk output.

I should note solutions/workarounds somewhere in the comp_heatmap documentation.

knitr::opts_chunk$set(dev = "ragg_png") is a good fix for rmd documents.
The default device for rstudio can also be changed in global options

heatmaps can't show only one variable/sample/taxon without suppressing seriation

workaround: set seriation_method = "Identity" or sample/tax_seriation = "Identity"

solution: set relevant seriation method to Identity automatically if only one col/row detected

Support for Robust CLR and Robust Aitchison

Hi,

it would be great to have support for robust CLR and robust Aitchison. Both of these are now implemented in vegan (vegandevs/vegan#458).

Thanks!
Bela

ord_explore button to get code for ord_plot

allow reproducing ord_plot in script
modal dialog to appear on action button click

ord_explore stat_ellipse button for when colour is set to a variable

conditional panel to appear with ellipse checkbox

RShiny app problems

When testing out the RShiny app functionality with the corncob dataset, few of the selections worked: most either raised an error or didn't change the plot. For example, for 'shape', the race option (in addition to many others) generated the error: 'Error: ids don't have the same length than str (most often, it occurs because of clipping)'. Also, the lasso doesn't always allow accurate selection - the data points I encircle are not the only points selected. I'm not sure why it works some of the time and not at other times.

Mixed models using `taxatree_models()`

Hi @david-barnett,

Thanks again for your effort in developing the package.
How would you recommend running mixed models through taxatree_models?

Following your tutorial, I am not able to specify type = lmer

lm_models <- phylo %>% 
  tax_fix() %>% 
  tax_prepend_ranks() %>% 
  # it makes sense to perform the compositional transformation BEFORE filtering
  tax_transform("compositional", rank = "Genus", keep_counts = TRUE) %>% 
  tax_filter(min_prevalence = 0.1, undetected = 0, use_counts = TRUE) %>% 
  tax_transform(
    trans = "log2", chain = TRUE, zero_replace = "halfmin"
  ) %>% 
  taxatree_models(
    type = lmer, 
    ranks = NULL, # uses every rank available except the first
    # variables = c("UC", "female", "antibiotics", "steroids", "age_scaled")
    formula = ~ UC    + (1| female)
    
  )

Proportional min_prevalence given: 0.1 --> min 7/67 samples.
2022-06-02 10:33:27 - modelling at rank: Phylum
Error in `[[<-`(`*tmp*`, "call", value = f) : 
  [[<- defined for objects of type "S4" only for subclasses of environment

Using the R random effect syntax and type = lm does not generate any issue.

lm_models <- phylo %>% 
  tax_fix() %>% 
  tax_prepend_ranks() %>% 
  # it makes sense to perform the compositional transformation BEFORE filtering
  tax_transform("compositional", rank = "Genus", keep_counts = TRUE) %>% 
  tax_filter(min_prevalence = 0.1, undetected = 0, use_counts = TRUE) %>% 
  tax_transform(
    trans = "log2", chain = TRUE, zero_replace = "halfmin"
  ) %>% 
  taxatree_models(
    type = lm, 
    ranks = NULL, # uses every rank available except the first
    # variables = c("UC", "female", "antibiotics", "steroids", "age_scaled")
    formula = ~ UC    + (1| female)
  )

However, the random effect is not evaluated - which is not surprising.

lm_models %>% 
  taxatree_models2stats() %>% 
  .["taxatree_stats"]


$taxatree_stats
# A tibble: 268 × 7
   term           taxon            rank  estimate std.error statistic  p.value
   <fct>          <chr>            <fct>    <dbl>     <dbl>     <dbl>    <dbl>
 1 UC             P: Firmicutes    Phyl…   -47.4     10.5       -4.50  2.84e-5
 2 1 | femaleTRUE P: Firmicutes    Phyl…    NA       NA         NA    NA      
 3 UC             P: Bacteroidetes Phyl…   -16.7      2.92      -5.72  2.97e-7
 4 1 | femaleTRUE P: Bacteroidetes Phyl…    NA       NA         NA    NA

Thanks.

ancombc for tax_model

on cran

generalise model_tax_corncob to any formula-based stats method

Needs args:

formula (as a string?)
method → switch statement or pass actual function?
document as.formula methods required for user to parse string variables

and a model2stats general function?

Errors making correlation heatmaps

Hi David,

I want to make correlation heatmaps for my thesis relating the sample data (age, gender, BMI, etc.) to the microbiome data.
First I made the data numeric (object ps.minzeros.compositional.num).
I tried to make the correlation heatmap according to your tutorial, but I got several errors.

This code:
cor_heatmap(
  data = ps.minzeros.compositional.num,
  taxa = tax_top(ps.minzeros.compositional.num, 15, by = max, rank = "Genus"),
  vars = c('female', 'older', 'highWeight', 'obesitas', 'highRatio', 'highBodyFat', 'highTG', 'highScore'),
  cor = 'spearman'
)

gives this error:
Error in as(x, "matrix")[i, j, drop = FALSE] : subscript out of bounds
In addition: Warning message:
In ps_counts(data, warn = TRUE) : otu_table of counts is NOT available!
Available otu_table contains non-zero values that are less than 1

And another attempt:
ps.minzeros.compositional.num %>%
  tax_agg("Genus") %>%
  cor_heatmap(vars = c('female', 'older', 'highWeight', 'obesitas', 'highRatio', 'highBodyFat', 'highTG', 'highScore'))

gives this error:
Registered S3 method overwritten by 'seriation':
  method         from
  reorder.hclust vegan
Error in if (!all(x >= 0)) stop("Negative distances not supported!") :
  missing value where TRUE/FALSE needed
In addition: Warning message:
In stats::cor(x = meta_mat, y = otu_mat, use = cor_use, method = cor) :
  the standard deviation is zero

Can you help me? I cannot share my data because it is confidential, so I hope this is enough information to help me.

Kind regards, Brigitte
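
The warning "the standard deviation is zero" in the second attempt suggests that at least one variable (or taxon) has the same value in every sample, so its correlations are NA, which may in turn be what the later "missing value where TRUE/FALSE needed" error trips over. A minimal base R sketch of that situation with made-up data, and one way to drop constant variables before computing correlations:

```r
# Toy metadata: `highScore` is constant across samples
meta <- cbind(
  age = c(30, 45, 52, 61),
  highScore = c(1, 1, 1, 1)
)
# Toy relative abundances for two genera
otu <- cbind(
  g1 = c(0.1, 0.4, 0.2, 0.3),
  g2 = c(0.5, 0.1, 0.3, 0.1)
)

# Correlations involving the constant variable are NA (with a warning)
cors <- suppressWarnings(stats::cor(meta, otu, method = "spearman"))

# Identify and drop zero-variance variables before correlating
constant <- apply(meta, 2, function(x) stats::sd(x, na.rm = TRUE) == 0)
meta_ok <- meta[, !constant, drop = FALSE]
cors_ok <- stats::cor(meta_ok, otu, method = "spearman")
```

The same check could be applied to the taxa, since a taxon with identical abundance in every sample would produce NA correlations in the same way.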

tax_top function

like top_taxa but with options for how "top" is defined (e.g. prevalence/total/max/median)

useful in comp_barplot?

could not find function "sampleAnnotation" in v0.90

Hello,
Thanks for writing such a fantastic package. Gone are the days of manually extracting vegan objects and forcing them into ggplot!!

I'm following the heatmaps vignette and having this issue in v0.90. Is this a bug or user error (always more likely)?

Code:

psf %>%
  # glom to Order
  tax_glom("Order") %>%
  tax_transform("compositional") %>%
  comp_heatmap(
    colors = heat_palette(palette = "Blue-Red"),
    sample_anno = sampleAnnotation(
      State1 = anno_sample("metformin"),
      col = list(State1 = cols), border = FALSE
    )
  )

sessionInfo:

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] RColorBrewer_1.1-2 ggvegan_0.1-0 ggpubr_0.4.0
[4] rstatix_0.7.0 dada2_1.20.0 Rcpp_1.0.8
[7] UpSetR_1.4.0 ComplexHeatmap_2.10.0 shiny_1.7.1
[10] microViz_0.9.0 phyloseq_1.38.0 forcats_0.5.1
[13] stringr_1.4.0 dplyr_1.0.8 purrr_0.3.4
[16] readr_2.1.2 tidyr_1.2.0 tibble_3.1.6
[19] ggplot2_3.3.5 tidyverse_1.3.1 vegan_2.5-7
[22] lattice_0.20-45 permute_0.9-7

loaded via a namespace (and not attached):
[1] utf8_1.2.2 tidyselect_1.1.1
[3] htmlwidgets_1.5.4 TSP_1.1-11
[5] BiocParallel_1.26.2 Rtsne_0.15
[7] devtools_2.4.3 munsell_0.5.0
[9] codetools_0.2-18 DT_0.20
[11] withr_2.4.3 colorspace_2.0-2
[13] Biobase_2.54.0 rstudioapi_0.13
[15] stats4_4.1.2 ggsignif_0.6.3
[17] MatrixGenerics_1.4.3 labeling_0.4.2
[19] GenomeInfoDbData_1.2.7 hwriter_1.3.2
[21] farver_2.1.0 rhdf5_2.38.0
[23] rprojroot_2.0.2 vctrs_0.3.8
[25] generics_0.1.2 R6_2.5.1
[27] doParallel_1.0.17 GenomeInfoDb_1.30.1
[29] clue_0.3-60 seriation_1.3.1
[31] isoband_0.2.5 bitops_1.0-7
[33] rhdf5filters_1.6.0 microbiome_1.16.0
[35] cachem_1.0.6 DelayedArray_0.18.0
[37] assertthat_0.2.1 promises_1.2.0.1
[39] scales_1.1.1 gtable_0.3.0
[41] Cairo_1.5-14 processx_3.5.2
[43] rlang_1.0.1 GlobalOptions_0.1.2
[45] splines_4.1.2 broom_0.7.12
[47] yaml_2.2.2 reshape2_1.4.4
[49] abind_1.4-5 modelr_0.1.8
[51] crosstalk_1.2.0 backports_1.4.1
[53] httpuv_1.6.5 tools_4.1.2
[55] usethis_2.1.5 ellipsis_0.3.2
[57] jquerylib_0.1.4 biomformat_1.22.0
[59] BiocGenerics_0.40.0 sessioninfo_1.2.2
[61] plyr_1.8.6 zlibbioc_1.40.0
[63] RCurl_1.98-1.6 ps_1.6.0
[65] prettyunits_1.1.1 GetoptLong_1.0.5
[67] S4Vectors_0.32.3 SummarizedExperiment_1.22.0
[69] haven_2.4.3 ggrepel_0.9.1
[71] cluster_2.1.2 fs_1.5.2
[73] magrittr_2.0.2 data.table_1.14.2
[75] circlize_0.4.14 reprex_2.0.1
[77] matrixStats_0.61.0 pkgload_1.2.4
[79] hms_1.1.1 mime_0.12
[81] xtable_1.8-4 jpeg_0.1-9
[83] readxl_1.3.1 IRanges_2.28.0
[85] gridExtra_2.3 shape_1.4.6
[87] testthat_3.1.2 compiler_4.1.2
[89] crayon_1.4.2 htmltools_0.5.2
[91] mgcv_1.8-38 later_1.3.0
[93] tzdb_0.2.0 RcppParallel_5.1.5
[95] lubridate_1.8.0 DBI_1.1.2
[97] dbplyr_2.1.1 MASS_7.3-55
[99] ShortRead_1.50.0 Matrix_1.4-0
[101] ade4_1.7-18 car_3.0-12
[103] brio_1.1.3 cli_3.1.1
[105] parallel_4.1.2 igraph_1.2.11
[107] GenomicRanges_1.44.0 pkgconfig_2.0.3
[109] GenomicAlignments_1.28.0 registry_0.5-1
[111] xml2_1.3.3 foreach_1.5.2
[113] bslib_0.3.1 multtest_2.50.0
[115] XVector_0.34.0 rvest_1.0.2
[117] callr_3.7.0 digest_0.6.29
[119] Biostrings_2.62.0 cellranger_1.1.0
[121] curl_4.3.2 Rsamtools_2.8.0
[123] rjson_0.2.21 lifecycle_1.0.1
[125] nlme_3.1-155 jsonlite_1.7.3
[127] Rhdf5lib_1.16.0 carData_3.0-5
[129] desc_1.4.0 fansi_1.0.2
[131] pillar_1.7.0 fastmap_1.1.0
[133] httr_1.4.2 pkgbuild_1.3.1
[135] survival_3.2-13 glue_1.6.1
[137] remotes_2.4.2 png_0.1-7
[139] iterators_1.0.14 stringi_1.7.6
[141] sass_0.4.0 latticeExtra_0.6-29
[143] memoise_2.0.1 ape_5.6-1

Heatmap error: heat_palette() palette argument must be either a: vector of colours, or....

Hi,

I'm trying the example heatmap code on my study,

cols <- distinct_palette(n = 3, add = NA)
names(cols) <- unique(samdat_tbl(psq)$DiseaseState)

psq %>% 
  tax_transform("compositional", rank = "Class") %>% 
  comp_heatmap(
    tax_anno = taxAnnotation(
      Prev. = anno_tax_prev(bar_width = 0.3, size = grid::unit(1, "cm"))
    ),
    sample_anno = sampleAnnotation(
      State1 = anno_sample("DiseaseState"),
      col = list(State1 = cols), border = FALSE,
      State2 = anno_sample_cat("DiseaseState", col = cols)
    )
  )

But I kept getting an error that says:

Error: 
heat_palette() palette argument must be either a:
- vector of colours, or
- palette name from colorspace::hcl_palettes(type = 'diverging')
- palette name from colorspace::divergingx_palettes()
- palette name from colorspace::hcl_palettes(type = 'sequential')

After I added
colors = heat_palette(sym = TRUE) after the "sampleAnnotation()" part in comp_heatmap(), the error was gone.

Is this line necessary here, or did I misunderstand something?

Thanks!
Leran

labeling issue with comp_barplot() and facet_grid()

I am having trouble getting proper labeling of my x axis when making a bar plot. I am faceting by timepoint, and since I have some subjects who do not have samples for every timepoint, I want there to be a blank space if there is a missing sample, so that you can still read across the faceted plots and easily follow each subject over time.

It seems to me like facet_grid() is the only function that will get me what I need in terms of having blank spaces for empty samples. (facet_wrap(), facet_nested(), and the built in "facet_by=" in comp_barplot() did not work).

However, if I use facet_grid() and specify aes(x=Subject), the subject numbers are not appearing on the axis. If I comment out the aes() and instead use "label=" in comp_barplot(), this results in a hugely messy graph because it repeats each subject on the axis.

Not sure I explained that well. Here is my code, and an example of the plot using aes() specifications and using "label=" specifications.

genusplot <- phygenus_plot %>%
  comp_barplot(
    tax_level = "Genus",
    n_taxa = 24,
    label = "Subject",
    sample_order = samples
  ) +
  geom_bar(stat = "identity") +
  coord_flip() +
  facet_grid(~Timepoint, scales = "free") +
  # aes(x = Subject) +
  guides(fill = guide_legend(reverse = TRUE, keywidth = 0.4, keyheight = 0.4, nrow = 5)) +
  labs(y = "Relative Abundance", fill = "Genus") +
  theme_classic() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 15),
    axis.text.x = element_text(size = 9, hjust = 0.5, colour = 1),
    axis.text.y = element_text(size = 9, colour = 1),
    axis.title.x = element_text(size = 13),
    axis.title.y = element_text(size = 13),
    legend.text = element_text(size = 8),
    legend.key.size = unit(0.4, "cm"),
    legend.key.width = unit(0.4, "cm"),
    legend.position = "bottom",
    strip.text.x = element_text(size = 10),
    strip.text.y = element_text(size = 10),
    panel.border = element_rect(colour = "black", fill = NA),
    axis.text = element_text(colour = 1, size = 9)
  )
genusplot

corncob via cran

available on cran now
suggested still?

girafe_html

Hello - can't wait to get to work using this package. Thank you for offering it!

I'm able to initiate a new browser window (in firefox or safari) but when the new window comes up, clicking the edit menu produces no options. My meta data are available in drop down boxes, but there are no images in the view window.

When I stop running the command in Rstudio it provides an error with traceback as follows:

Listening on http://127...
object 'girafe_html' not found

| execCallbacks(timeoutSecs, all, loop$id)
| run_now(timeoutMs/1000, all = FALSE)
| service(timeout)
| serviceApp()
| ..stacktracefloor..(serviceApp())
| withCallingHandlers(expr, error = doCaptureStack)
| domain$wrapSync(expr)
| promises::with_promise_domain(createStackTracePromiseDomain(), expr)
| captureStackTraces({ while (!.globals$stopped) { ..stacktracefloor..(serviceApp()) } ...
| ..stacktraceoff..(captureStackTraces({ while (!.globals$stopped) { ..stacktracefloor..(serviceApp()) } ...
| runApp(x)
| print.shiny.appobj(x)
| (function (x, ...) UseMethod("print"))(x)

Thank you for any insights you may be able to provide!

Andrew Gustin

ps_calc_dominant - add the dominant taxon for each sample to phyloseq sample_data

brainwave generated with the help of @vitorheidrich

see relevant discussion #19

consider similar functions that calculate useful sample_data:

ps_calc_enterotype or ps_calc_cluster
ps_calc_alpha (using probably microbiome::alpha or related function)

Error in if (!all(x >= 0)) stop("Negative distances not supported!") : missing value where TRUE/FALSE needed

Hi,

I used a subset of phyloseq object in comp_barplot(), but got error:

Error in if (!all(x >= 0)) stop("Negative distances not supported!") : missing value where TRUE/FALSE needed

I'm not sure which step of comp_barplot generated this error. Is there any way to fix this?

Thanks!
Leran

better explanation of dist and ord theory in fun help pages

links to specific pages on gusta.me
link to permanova book chapter

vignette for package ?microViz

Can I use microViz with only relative abundance

ps_counts(input, warn = "error") : otu_table of counts is NOT available! Available otu_table contains non-zero values that are less than 1
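
Going only by the wording of the message above, the check appears to flag tables that contain non-zero values below 1. A minimal base R sketch of that kind of check (looks_like_counts() is a hypothetical name, not the ps_counts implementation):

```r
# Hypothetical check: TRUE if no non-zero value in the table is below 1
looks_like_counts <- function(otu) {
  nonzero <- otu[otu != 0]
  !any(nonzero < 1)
}

counts <- matrix(c(0, 3, 10, 1), nrow = 2)
proportions <- counts / sum(counts)

counts_ok <- looks_like_counts(counts)
proportions_ok <- looks_like_counts(proportions)
```

Relative abundance data would fail a check like this, which is consistent with the message appearing when only proportions are available.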

Hierarchical clustering in relative abundances plots

Hello David,

In the correlation heatmaps from the microViz package, hierarchical clustering using Euclidean distances is shown on top of or next to the heatmap.
Is it also possible to include the hierarchical clustering in the relative abundances plots? And would it be possible to get the clusters with the use of the Bray-Curtis dissimilarity (which is already used for the order of the relative abundances plot)?

Kind regards,
Brigitte

Feature request - order taxa in comp_barplot alphabetically

Hello again -
The default behavior of a ggplot2 barplot on melted OTU table is to order the taxa in the legend alphabetically, which (IMO) makes this much easier to read. Can this be added as an option in the comp_barplot, or is it possible to add as a ggplot2 layer? Thanks!

Example of ordered plot in ggplot2 with alphabetically ordered taxa in the legend-

Example of default comp_barplot -

Sorting taxa by abundance per sample in comp_barplot

Hi!

First of all, thank you for creating and sharing this great package, it makes plot generation much easier and more convenient.

I have a question regarding the comp_barplot function, more specifically the tax_order argument. Is there any way to sort, let's say, 10 taxa by their abundance in each sample? For example, the most abundant taxon would be at the bottom, the second most abundant above that, and so forth, optionally with the "Other" category at the top regardless of its abundance.
Unfortunately I could not achieve this with the suggested sorting methods.

Sorry if I have missed or misunderstood something in the documentation.

umap for ord_plot

via uwot
as suggested

and tsne?

tax_info convenience function?

Plan:

tax_info(ps, tax_names = NA, tax_select = NA, tibble = NA, ranks = FALSE, undetected = 0)

provide either

tax_names, 1 or vector
tax_select, 1 or vector for tax_select regex interface (default of 1 typo probably fine)

returns:

(some) ranks
prev (using prev_calc, respecting undetected arg)
min, mean, max etc.?

return format (for tibble = NA):

if one taxon, cat (unless tibble = TRUE)
if multiple, return tbl, unless tibble = FALSE

tax_model support multiple univariate tests?

e.g. wilcox rank-sum
should correlation also be supported here? e.g. spearman's rank

ord_explore plotting error with R 4.1 on windows

The current solution is to install the GitHub version of ggiraph, which includes Windows support for the new R graphics engine.

devtools::install_github('davidgohel/ggiraph')

cran submission to-do list

check licence is right
check other cran requirements
update this

comp_barplot groups in "other" category

works when bar outlines drawn only
not useful in dense iris plots?
another way to indicate diversity?
- internal diversity heatmap ring could work for iris plot?
- and just counting observed taxa wouldn't need helper funs

Error in (function (edges, n = max(edges), directed = TRUE) : At type_indexededgelist.c:117 : cannot create empty graph with negative number of vertices, Invalid value

Hi,

I encountered this error when running the taxatree_plots() function on my lm model. I looked around, and it seems that it expects a numeric edge format but is being given some other format. I'm not sure where I should convert my edges to numeric, though. I also didn't encounter this error when I ran this function on other datasets.

Thanks!
Leran

p values in correlation heatmaps

Hello David,

I want to ask another question; how can I show the p-values of the spearman correlation in the correlation heatmap?
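
Independently of how the heatmap is drawn, the p values themselves can be computed with stats::cor.test. A minimal base R sketch on simulated data, where cor_pvalues() is a hypothetical helper name and the matrices are made up for illustration:

```r
# Hypothetical helper: matrix of correlation test p values,
# rows = columns of x (variables), columns = columns of y (taxa)
cor_pvalues <- function(x, y, method = "spearman") {
  p <- matrix(
    NA_real_, nrow = ncol(x), ncol = ncol(y),
    dimnames = list(colnames(x), colnames(y))
  )
  for (i in seq_len(ncol(x))) {
    for (j in seq_len(ncol(y))) {
      # Warnings about ties are suppressed in this sketch
      p[i, j] <- suppressWarnings(
        stats::cor.test(x[, i], y[, j], method = method)$p.value
      )
    }
  }
  p
}

set.seed(42)
meta <- cbind(age = rnorm(12), bmi = rnorm(12))
taxa <- cbind(g1 = rnorm(12), g2 = rnorm(12), g3 = rnorm(12))
pvals <- cor_pvalues(meta, taxa)
```

With many variable-taxon pairs, the resulting p values would usually need a multiple testing correction (e.g. with stats::p.adjust) before being displayed.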

tax_model checks equality with corncob::bbdml without checking if corncob installed...

should be fixed asap, check for corncob before checking equality

How to understand the "grouping" parameter in taxatree_stats_p_adjust() function?

Hi,

I have a question regarding the "grouping" parameter in the taxatree_stats_p_adjust() function. I found this explanation on the tutorial page:

Define how to group the p values for adjustment with the grouping argument. The default is to adjust the p values in groups at each taxonomic rank, but you could also adjust per "model" / "taxon" or per "term". Or even group by a combination of rank and term with c("rank", "term")

But I still feel confused. Does "groups at each taxonomic rank" mean that it adjusts p values based on the number of unique taxa at the phylum level (for example)? And does "group by taxon" mean that the calculation is based on the number of taxa across the entire taxonomy?

And are there instructions on how to choose the grouping parameter?

Thanks so much!
Leran
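
The general idea of grouped p value adjustment can be illustrated in base R: the correction is applied separately within each group, so the number of tests being corrected for is the number of p values in that group. The table and the choice of the "BH" method below are made up for illustration and are not taken from taxatree_stats_p_adjust itself.

```r
# Toy stats table: 2 phylum-level tests and 4 genus-level tests
stats_tbl <- data.frame(
  rank = c("Phylum", "Phylum", "Genus", "Genus", "Genus", "Genus"),
  taxon = c("P1", "P2", "G1", "G2", "G3", "G4"),
  p.value = c(0.01, 0.04, 0.01, 0.02, 0.03, 0.04)
)

# Adjust within each rank: phylum p values are corrected for 2 tests,
# genus p values for 4 tests
stats_tbl$p.adj.by_rank <- ave(
  stats_tbl$p.value, stats_tbl$rank,
  FUN = function(p) stats::p.adjust(p, method = "BH")
)

# For comparison: adjust all 6 p values together as a single group
stats_tbl$p.adj.all <- stats::p.adjust(stats_tbl$p.value, method = "BH")
```

Grouping by rank therefore treats each rank as its own family of tests, while a coarser grouping corrects for more tests at once.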

dist_calc() Error

Hi,

I was trying to run the command below:

ps0.UC.CD.HHC.fix %>% 
  tax_transform(trans = "clr", rank = "Genus") %>% 
  dist_calc("wunifrac")

And I got this error message:

unifrac distances require un-aggregated taxa and a phylogenetic tree.
Error in access(physeq, "phy_tree", errorIfNULL) : phy_tree slot is empty.

I'm sure I have a tree in the phyloseq object:

ps0.UC.CD.HHC.fix
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 5284 taxa and 518 samples ]
sample_data() Sample Data:       [ 518 samples by 10 sample variables ]
tax_table()   Taxonomy Table:    [ 5284 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 5284 tips and 5282 internal nodes ]

Could you let me know how to fix this issue?

Thanks so much!
Leran

> clean_Bacteria %>% 
+   tax_fix() %>% 
+   #ps_arrange("Treatments") %>% 
+   tax_agg("Class") %>% 
+   cor_heatmap(vars = c("p_h", "om", "t_o_c", "c_n_ratio", "nitrate_n_ppm", "clay"),
+               cor = "spearman")


Error in if (!all(x >= 0)) stop("Negative distances not supported!") : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In stats::cor(x = meta_mat, y = otu_mat, use = cor_use, method = cor) :
  the standard deviation is zero
