leffj / mctoolsr

Microbial community analysis tools in R

Home Page: http://leffj.github.io/mctoolsr/

R 13.98% HTML 86.02%

mctoolsr's Issues

add warning if filtering taxon that doesn't exist from table

pull taxonomic classifications when loading text format otu table

Alpha diversity analyses

Need to add alpha diversity analysis functions. Include calculate and plot functions for multiple metrics.

Does calc_dm() relativize data before calculating bray-curtis?

Just checking, but based on the source code I found for mctoolsr, I don't think calc_dm() currently relativizes the species matrix before calculating bray-curtis. I wanted to make sure I wasn't misreading the bit of code I found. Should we be calculating relative abundance via decostand before running calc_dm in this case?

Thanks,

Alex
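
A small base-R sketch may make the question concrete. It does not call calc_dm(); the toy table and the helper names relativize() and bray_curtis() are illustrative assumptions. Two samples with the same composition but different sequencing depth are far apart on raw counts and identical after conversion to relative abundance, which is the effect a step such as vegan's decostand(..., method = "total") is meant to provide.

```r
# Hypothetical toy table: taxa in rows, samples in columns.
# s2 has exactly ten times the counts of s1 (same composition, deeper sequencing).
counts <- matrix(c(10, 0, 5,
                   100, 0, 50,
                   0, 20, 20),
                 nrow = 3,
                 dimnames = list(c("otu1", "otu2", "otu3"), c("s1", "s2", "s3")))

# Divide each sample (column) by its total so every column sums to 1.
relativize <- function(tab) sweep(tab, 2, colSums(tab), "/")

# Bray-Curtis dissimilarity between two abundance vectors.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

rel <- relativize(counts)

raw_bc <- bray_curtis(counts[, "s1"], counts[, "s2"])
rel_bc <- bray_curtis(rel[, "s1"], rel[, "s2"])
print(c(raw = raw_bc, relativized = rel_bc))
```

Whether to relativize, rarefy, or transform first is an analysis decision; the sketch only shows that the choice changes the result.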

export otu table bug

If a sample ID starts with a number, an X is added to the front of the sample ID in the exported OTU table (the mapping file exports fine).
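
One plausible cause, stated as an assumption and not as a diagnosis of the mctoolsr code: R's default check.names = TRUE rewrites column names that are not syntactically valid, and a name that starts with a digit gets an X prefix. The base-R sketch below reproduces that behaviour on a made-up two-sample table and shows that check.names = FALSE preserves the IDs.

```r
# Made-up tab-delimited table whose sample IDs start with a digit.
txt <- "otu_id\t1A\t2B\notu1\t3\t0\notu2\t1\t7"

# Default behaviour: names are passed through make.names().
mangled <- read.table(text = txt, header = TRUE, sep = "\t")

# With check.names = FALSE the IDs are kept exactly as written.
preserved <- read.table(text = txt, header = TRUE, sep = "\t",
                        check.names = FALSE)

print(names(mangled))
print(names(preserved))
```

data.frame() takes the same check.names argument, so a table built in memory before export can be affected in the same way.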

Issue loading data

I get an error every time I load the biom table and map.

Downloading GitHub repo leffj/mctoolsr@master
from URL https://api.github.com/repos/leffj/mctoolsr/zipball/master
Installing mctoolsr
'/Library/Frameworks/R.framework/Resources/bin/R'
--no-site-file --no-environ --no-save
--no-restore --quiet CMD INSTALL
'/private/var/folders/bt/j1th0gjn1mbdgcr1xh1x3bgw0000gn/T/RtmpmPeLAW/devtools1f2161915170/leffj-mctoolsr-2265f34'
--library='/Library/Frameworks/R.framework/Versions/3.3/Resources/library'
--install-tests

installing source package ‘mctoolsr’ ...
** R
** data
*** moving datasets to lazyload DB
** inst
** tests
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
DONE (mctoolsr)
Reloading installed mctoolsr
You're using mctoolsr (v.0.1.1.0). Direct inquiries to:
'https://github.com/leffj/mctoolsr'

Attaching package: ‘mctoolsr’

The following object is masked from ‘package:phyloseq’:

plot_ordination

tax=system.file('fruits_veggies_taxa_table_wTax.biom', package = 'mctoolsr')
mapp=system.file('fruits_veggies_metadata.txt',package = 'mctoolsr')
inp=load_taxa_table(tax, mapp)
Error in load_taxa_table(tax, mapp) :
Input file must be either biom (.biom) or tab-delimited (.txt) format.

Any suggestions as to what may be going wrong will help.

R version:

version
_
platform x86_64-apple-darwin13.4.0
arch x86_64
os darwin13.4.0
system x86_64, darwin13.4.0
status
major 3
minor 3.2
year 2016
month 10
day 31
svn rev 71607
language R
version.string R version 3.3.2 (2016-10-31)
nickname Sincere Pumpkin Patch

Thanks!
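
A possible explanation, offered as an assumption: system.file() returns an empty string when it cannot find the requested file, and an empty path has neither a .biom nor a .txt extension, which would lead to exactly this message. The check below uses base R only. The helper check_path() is hypothetical, and the "extdata" subdirectory in the commented line is a guess about where the example files might live, not a confirmed location.

```r
# system.file() returns "" (it does not raise an error) when nothing is found.
missing_path <- system.file("no_such_file.biom", package = "stats")
print(nzchar(missing_path))   # FALSE

# Hypothetical guard to run before handing a path to a loader.
check_path <- function(path) {
  if (!nzchar(path) || !file.exists(path)) {
    stop("File not found; check the file name and its subdirectory.")
  }
  invisible(path)
}

# Listing the installed files shows where the example data actually is:
# list.files(system.file(package = "mctoolsr"), recursive = TRUE)
# tax <- system.file("extdata", "fruits_veggies_taxa_table_wTax.biom",
#                    package = "mctoolsr")   # "extdata" is an assumption
```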

Error with calc_pairwise_permanovas

Can you help me with this? I am not sure why this is happening. Thanks

> mctoolsr::calc_pairwise_permanovas(t(otu_ITS), metadata_ITS, "Site")

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

> str(metadata_ITS)
'data.frame':	72 obs. of  8 variables:
 $ barcodenumb: int  417 431 425 388 428 463 415 438 448 429 ...
 $ Site     : Factor w/ 4 levels "Ciba","Longjin",..: 3 3 4 1 4 2 3 4 4 4 ...
 $ Description: Factor w/ 72 levels "T10a-in","T10a-out",..: 63 65 71 35 68 24 50 2 8 67 ...
 $ is.neg     : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...

There is no NA or NaN in the metadata or the otus files.

Gian
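
This message is raised by R's model-matrix code when a factor entering a model has fewer than two levels. Which step triggers it in this report is not clear from the information given. The base-R sketch below only reproduces the error on made-up data and shows a per-level count that can be run on the metadata once it has been restricted to the samples actually being compared; the metadata values are invented.

```r
# Made-up metadata in which only one site is represented.
metadata <- data.frame(Site = factor(c("Ciba", "Ciba", "Ciba")))

# Building a design matrix from a one-level factor fails.
err <- tryCatch(model.matrix(~ Site, data = metadata),
                error = function(e) e)
print(inherits(err, "error"))

# Diagnostic: how many samples does each level have, and how many
# levels are actually represented?
print(table(metadata$Site, useNA = "ifany"))
n_levels_present <- nlevels(droplevels(metadata$Site))
print(n_levels_present)
```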

Mac error: Don't know how to decompress files with extension 4-0

I'm trying to install mctoolsr and I keep getting this error message:

Error: Don't know how to decompress files with extension 4-0

sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
Running under: OS X 10.12.1 (unknown)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] devtools_1.12.0

loaded via a namespace (and not attached):
[1] httr_1.2.1 R6_2.2.0 tools_3.2.1 withr_1.0.2 curl_2.3
[6] memoise_1.0.0 digest_0.6.10

add warning if missing a column in mapping file

warning or error if missing a column in imported mapping file

Add 95% confidence interval ellipses to NMDS

Potentially useful alternative to hulls, which encompass all the data points within a factor level. See ordihull functions in vegan.

calc_pairwise_permanovas error#1

Error in combn(levels(comp_var), 2) : n < m

What does it mean?
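
One way to get this message, shown as a possibility and not a confirmed diagnosis: the call in the error is combn(levels(comp_var), 2). levels() returns NULL for a character column, and since R 4.0 data frames no longer convert strings to factors by default, so combn() is asked to choose two items from none. A factor with a single level fails the same way. The metadata below is made up.

```r
# Character column: levels() is NULL, so combn() has nothing to choose from.
map <- data.frame(Site = c("A", "A", "B", "C"), stringsAsFactors = FALSE)
msg <- tryCatch(combn(levels(map$Site), 2),
                error = function(e) conditionMessage(e))
print(msg)

# Converting the column to a factor gives combn() the levels it expects.
map$Site <- factor(map$Site)
pairs <- combn(levels(map$Site), 2)
print(t(pairs))
```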

problems loading data when mapping file contains less than 3 columns

Filter by multiple sample categories

Say you have two mapping categories, Fruit (with apple, pear, banana as levels) and Root (potato, carrot as levels). It would be nice to filter to keep only "Pear" and "Carrot" samples with one command. ATM you can only use one filter_cat.
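
Until such an option exists, the selection can be sketched directly in base R. The mapping table, the count table, and their layout (samples as mapping rows and as count-table columns) are assumptions made for illustration; no mctoolsr function is called.

```r
# Made-up mapping file with two categories.
map <- data.frame(
  Fruit = c("apple", "pear", "banana", NA, NA),
  Root  = c(NA, NA, NA, "potato", "carrot"),
  row.names = c("s1", "s2", "s3", "s4", "s5"),
  stringsAsFactors = FALSE
)

# Made-up count table: taxa in rows, samples in columns.
counts <- matrix(1:10, nrow = 2,
                 dimnames = list(c("otu1", "otu2"), rownames(map)))

# Keep a sample if it matches in either category; %in% treats NA as no match.
keep <- map$Fruit %in% "pear" | map$Root %in% "carrot"

map_filt <- map[keep, , drop = FALSE]
counts_filt <- counts[, rownames(map_filt), drop = FALSE]
print(colnames(counts_filt))
```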

add functionality to `export_otu_table()`

be able to export mapping file with otu table
be able to export otu table from data frame format

Mantel functionality

Both plotting distances as well as calculating correlations.
Would be useful to correlate subsets of the distance matrix against one another, and also to correlate the distance matrix against metadata in the mapping file. For the latter I suppose a new DM would have to be calculated for the metadata.
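
A rough base-R sketch of the second case, correlating a community distance matrix against a distance matrix built from one metadata variable. The function mantel_sketch() and the simulated data are hypothetical, and the test shown is a simple one-sided permutation version; vegan's mantel() would be the more complete option.

```r
# Permutation Mantel test: correlate the lower triangles of two distance
# matrices, then permute the sample order of one matrix to get a null.
mantel_sketch <- function(d1, d2, n_perm = 999) {
  m1 <- as.matrix(d1)
  m2 <- as.matrix(d2)
  lower <- lower.tri(m1)
  observed <- cor(m1[lower], m2[lower])
  permuted <- replicate(n_perm, {
    idx <- sample(nrow(m1))
    cor(m1[idx, idx][lower], m2[lower])
  })
  # One-sided p-value; the observed value counts as one permutation.
  p_value <- (sum(permuted >= observed) + 1) / (n_perm + 1)
  list(r = observed, p = p_value)
}

set.seed(1)
community <- matrix(rpois(40, lambda = 5), nrow = 8)  # 8 samples x 5 taxa
ph <- c(4.5, 5.0, 5.2, 6.1, 6.3, 6.8, 7.0, 7.4)       # one metadata variable

community_dm <- dist(community)  # Euclidean here; any dist object works
metadata_dm <- dist(ph)          # new distance matrix for the metadata

res <- mantel_sketch(community_dm, metadata_dm)
print(res)
```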

The condition has length > 1 and only the first element will be used (load_taxa_table)

Hi,
I am running the function load_taxa_table using tab delimited taxa and metadata tables and I get the following:

the condition has length > 1 and only the first element will be used

My tables are formatted (to the best of my knowledge) just like the example files provided with the mctoolsr package. When I run the command using these files, like so input <- load_taxa_table("fruits_veggies_taxa_table_wTax.txt", "fruits_veggies_metadata.txt") I do not get this error, so something is wrong with my input files. I cannot for the life of me figure out what the problem is.

Any ideas? Thanks

add informative error messages if no taxonomy and if filepath doesn't exist

accommodate variability in number of comment lines when loading taxa tables

For other software (eg qiime), the taxa tables can have multiple comment lines and this can mess up the loading in mctoolsr. Look into this.
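
One possible approach, sketched in base R under the assumption that every leading comment line starts with # and that the last of them holds the column headers (as in the made-up example below). The function name read_commented_table() is hypothetical.

```r
read_commented_table <- function(lines) {
  is_comment <- startsWith(lines, "#")
  # Number of consecutive comment lines at the top of the file.
  n_leading <- if (all(is_comment)) length(lines) else which(!is_comment)[1] - 1
  if (n_leading > 0) {
    # Use the last leading comment line as the header and drop the others.
    header <- sub("^#\\s*", "", lines[n_leading])
    lines <- c(header, lines[-seq_len(n_leading)])
  }
  read.table(text = lines, header = TRUE, sep = "\t", quote = "",
             comment.char = "", check.names = FALSE)
}

# Made-up table with two comment lines before the data.
example_lines <- c(
  "# Constructed from biom file",
  "#OTU ID\ts1\ts2\ttaxonomy",
  "otu1\t3\t0\tk__Bacteria; p__Firmicutes",
  "otu2\t1\t7\tk__Bacteria; p__Proteobacteria"
)

tab <- read_commented_table(example_lines)
print(names(tab))
```

For a file on disk, readLines(path) would supply the lines argument.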

n_perm on calc_pairwise_permanovas()

Firstly, thank you for making the package; calc_pairwise_permanovas() is awesome! I just had a problem where only 999 permutations gave me non-reproducible results for which pairs were significantly different from each other.

I was wondering if it would be worth allowing the user to specify more permutations if they wanted. It would take hardly any changes. I have already done it, but it would be nice to update the package here.

calc_pairwise_permanovas <- function (dm, metadata_map, compare_header, n_perm) 
{
  comp_var = metadata_map[, compare_header]
  comp_pairs = combn(levels(comp_var), 2)
  pval = c()
  R2 = c()
  for (i in 1:ncol(comp_pairs)) {
    pair = comp_pairs[, i]
    dm_w_map = list(dm_loaded = dm, map_loaded = metadata_map)
    dm_w_map$map_loaded$in_pair = comp_var %in% pair
    dm_w_map_filt = mctoolsr::filter_dm(dm_w_map, filter_cat = "in_pair", 
                              keep_vals = TRUE)
    m = vegan::adonis(dm_w_map_filt$dm_loaded ~ dm_w_map_filt$map_loaded[, compare_header], permutations = n_perm)
    pval = c(pval, m$aov.tab$`Pr(>F)`[1])
    R2 = c(R2, m$aov.tab$R2[1])
  }
  results = data.frame(t(comp_pairs), R2, pval)
  results$pvalBon = pval * length(pval)
  results$pvalFDR = round(pval * (length(pval)/rank(pval, ties.method = "average")), 
                          3)
  results
}

Let me know and I can do a pull request if that's easiest.

Cheers
Dan
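
A side note on the two correction columns in the function above, as a sketch with made-up p-values. Base R's p.adjust() offers Bonferroni and Benjamini-Hochberg ("fdr") adjustments; compared with the manual formulas it caps values at 1 and keeps the adjusted values monotonic in the raw p-values. Whether the package should switch to it is left open here.

```r
# Made-up p-values standing in for the pairwise results.
pval <- c(0.01, 0.04, 0.041, 0.6)

# The manual formulas used in the function above.
manual_bon <- pval * length(pval)
manual_fdr <- pval * length(pval) / rank(pval, ties.method = "average")

# Base R equivalents.
adj_bon <- p.adjust(pval, method = "bonferroni")
adj_fdr <- p.adjust(pval, method = "fdr")

print(data.frame(pval, manual_bon, adj_bon, manual_fdr, adj_fdr))
```

Calling set.seed() before the permutation tests also makes a given run repeatable, independent of the number of permutations.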

Remove OTUs by OTU ID

Something like filter_taxa_from_data( ) for specific OTUs
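
In the meantime, a base-R sketch of the idea, assuming a count table with OTU IDs as row names. The table and the ID list are made up, and no mctoolsr function is used.

```r
# Made-up count table: OTUs in rows, samples in columns.
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("otu", 1:4), c("s1", "s2", "s3")))

# IDs to remove; "otu9" is deliberately absent from the table.
drop_ids <- c("otu2", "otu4", "otu9")

# Report requested IDs that are not in the table.
not_found <- setdiff(drop_ids, rownames(counts))
if (length(not_found) > 0) {
  message("Not found in table: ", paste(not_found, collapse = ", "))
}

filtered <- counts[!rownames(counts) %in% drop_ids, , drop = FALSE]
print(rownames(filtered))
```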

Installation issue

Hi, I am trying to install this package in a new version of RStudio and get an error saying memory.limit() has been discontinued. Any suggestions?

devtools::install_github("leffj/mctoolsr")
Downloading GitHub repo leffj/mctoolsr@HEAD
These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?

1: All
2: CRAN packages only
3: None
4: rlang (1.1.2 -> 1.1.3 ) [CRAN]
5: glue (1.6.2 -> 1.7.0 ) [CRAN]
6: cli (3.6.1 -> 3.6.2 ) [CRAN]
7: vctrs (0.6.4 -> 0.6.5 ) [CRAN]
8: stringi (1.8.2 -> 1.8.3 ) [CRAN]
9: Rcpp (1.0.11 -> 1.0.12) [CRAN]
10: fansi (1.0.5 -> 1.0.6 ) [CRAN]
11: Rhdf5lib (1.24.0 -> 1.24.1) [CRAN]
12: scales (1.2.1 -> 1.3.0 ) [CRAN]
13: rhdf5 (2.46.0 -> 2.46.1) [CRAN]
14: jsonlite (1.8.7 -> 1.8.8 ) [CRAN]

Enter one or more numbers, or an empty line to skip updates:
── R CMD build ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ checking for file 'C:\Users\00113286\AppData\Local\Temp\RtmpaSPtIj\remotes4134403a5a4\leffj-mctoolsr-c7c9404/DESCRIPTION'
─ preparing 'mctoolsr': (540ms)
✔ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
Removed empty directory 'mctoolsr/vignettes'
─ building 'mctoolsr_0.1.1.9.tar.gz'

Warning message:
'memory.limit()' is no longer supported
Error: could not find function "memory.limit"
Execution halted
Warning message:
In i.p(...) :
installation of package ‘C:/Users/AppData/Local/Temp/RtmpaSPtIj/file41345c48282f/mctoolsr_0.1.1.9.tar.gz’ had non-zero exit status

pool samples function

Function to combine/pool samples for analysis in r at the 'input' stage-- (e.g. if I have sampleA and sampleA_duplicate that were run in two different wells w/ different barcodes and want to treat them as 1 sample)
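
A minimal base-R sketch of the pooling step, assuming a count table with samples in columns and a lookup vector (here called pool_id, a made-up name) that maps each sample to the pooled sample it belongs to.

```r
# Made-up count table: OTUs in rows, samples in columns.
counts <- matrix(c(5, 0, 2,
                   3, 1, 0,
                   4, 4, 4),
                 nrow = 3,
                 dimnames = list(c("otu1", "otu2", "otu3"),
                                 c("sampleA", "sampleA_duplicate", "sampleB")))

# Which pooled sample each original sample belongs to.
pool_id <- c(sampleA = "sampleA",
             sampleA_duplicate = "sampleA",
             sampleB = "sampleB")

# rowsum() adds up rows by group, so transpose to put samples in rows,
# sum within each pool, and transpose back.
groups <- unname(pool_id[colnames(counts)])
pooled <- t(rowsum(t(counts), group = groups))
print(pooled)
```

The mapping file would need the same treatment, keeping one row per pooled sample.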

removing OTUs from controls

Hi,
I would like to remove the OTUs within my negative control from all my data. Is there a possibility to do this in mctoolsr?
And another question: can one somehow use normalized data from metagenomeSeq?

Best regards!
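
For the first question, a base-R sketch of one blunt approach: drop every OTU that has any reads in a negative control, then drop the control samples themselves. The count table and the is_control flags are made up, and removing all control OTUs outright may be too strict for real data.

```r
# Made-up count table: OTUs in rows, samples in columns; neg1 is a control.
counts <- matrix(c(10, 0, 3, 0,
                   7, 2, 0, 0,
                   0, 1, 0, 5),
                 nrow = 4,
                 dimnames = list(paste0("otu", 1:4), c("s1", "s2", "neg1")))

# Flag control samples, for example from a column in the mapping file.
is_control <- c(s1 = FALSE, s2 = FALSE, neg1 = TRUE)

# OTUs with at least one read in any control sample.
in_control <- rowSums(counts[, is_control, drop = FALSE] > 0) > 0
control_otus <- rownames(counts)[in_control]

# Remove those OTUs and the control samples.
cleaned <- counts[!in_control, !is_control, drop = FALSE]
print(control_otus)
print(cleaned)
```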

filter by OTU criteria

Filter by abundance (Here, I filter to only OTUs that have at least 25 observations)

input_rar_filt2 = rowSums(input_rar_filt$data_loaded)
class(input_rar_filt2)
input_rar_filt3 = input_rar_filt$data_loaded[input_rar_filt2 >= 25, ]

Filter by number of samples observed in (Here, I filter to only OTUs found in at least 5 samples)

input_rar_filt4 = rowSums(input_rar_filt3 > 0)
input_rar_filt5 = input_rar_filt3[input_rar_filt4 >= 5, ]

Add Spearman's correlation?

This isn't a problem with existing code, just a suggestion to add Spearman's correlation to the Taxa-Based Exploration section. Thanks.
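
For reference, base R already computes Spearman's rank correlation through cor() and cor.test(). The sketch below correlates one taxon's relative abundance with one metadata variable; both vectors are made up.

```r
# Made-up data: one metadata variable and one taxon's relative abundance
# across six samples.
ph <- c(4.5, 5.1, 5.8, 6.2, 6.9, 7.4)
rel_abund <- c(0.02, 0.03, 0.07, 0.06, 0.12, 0.20)

# Rank-based correlation coefficient.
rho <- cor(rel_abund, ph, method = "spearman")

# The same coefficient with a significance test.
test <- cor.test(rel_abund, ph, method = "spearman")

print(rho)
print(test$p.value)
```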

export_otu_table

change name to export_taxa_table

minor figure edits

Is there a way to alter the orientation of the relative abundances on the heatmap?

Is there a way to change the color scheme of the plot_taxa_bars command?

Modify calc_mean_dissimilarities to resolve factor-mismatch bug

I was getting the following error when using the calc_mean_dissimilarities function:

Error in Ops.factor(dm_clmns_wCat[, 4], dm_clmns_wCat[, 5]) : level sets of factors are different

After some sleuthing I figured out that this bug appears after the add_metadata_to_dm_clmns function converts the factor category into two factors. If the summarize_by_factor category contains levels with only one sample, and if that sample happens to be listed first in the distance matrix, then since there are no "self" comparisons, there will be a mismatch in the number of factor levels between the 4th and 5th columns of dm_clmns_wCat. This can easily be solved by converting those factors into characters before reducing the dataframe. Something like this:

dm_clmns = convert_dm_to_3_column(dm)
dm_clmns_wCat = add_metadata_to_dm_clmns(dm_clmns, metadata_map, summarize_by_factor)
# convert the category columns from factor to character to avoid the factor-mismatch error
dm_clmns_wCat[, 4] <- as.character(dm_clmns_wCat[, 4])
dm_clmns_wCat[, 5] <- as.character(dm_clmns_wCat[, 5])
dm_clmns_wCat = dm_clmns_wCat[!is.na(dm_clmns_wCat[, 4]) & !is.na(dm_clmns_wCat[, 5]), ]
dm_clmns_wCat_reduced = dm_clmns_wCat[dm_clmns_wCat[, 4] != dm_clmns_wCat[, 5], ]

biomformat update issue R v3.3.3

I'm having trouble updating mctoolsr, the issue seems to be a conflict with bioconductor's version of the biomformat package and R version 3.3.3.

Downloading GitHub repo leffj/mctoolsr@master
from URL https://api.github.com/repos/leffj/mctoolsr/zipball/master
Installing mctoolsr
'/usr/lib64/R/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL
'/tmp/RtmpkqqHZi/devtools4a08550199e8/leffj-mctoolsr-2265f34'
--library='/home/hannah/R/x86_64-redhat-linux-gnu-library/3.3' --install-tests

ERROR: dependency ‘biomformat’ is not available for package ‘mctoolsr’

removing ‘/home/hannah/R/x86_64-redhat-linux-gnu-library/3.3/mctoolsr’
Error: Command failed (1)

I tried following this solution to install the development version of biomformat as specified in this discussion before updating mctoolsr and it did not work. Any advice would be appreciated.

https://groups.google.com/forum/#!topic/qiime-forum/_TVPiqabwvs

install.packages("devtools") # if not already installed
devtools::install_github("biom", "joey711")

Pairwise PERMANOVA tests

Would be useful to be able to follow up a significant main effect in adonis, to see which factor levels differ from which others. Could be done by multiple pairwise permanovas applying FDR correction.

mctoolsr not working after R update

I've recently updated R and R Studio to the latest versions, however the exact same lines of code that I ran before updating are now giving errors as if mctoolsr is now having trouble reading my .txt files properly when I try using the load_taxa_table command. I did need to reinstall all of my packages, including mctoolsr, after updating so I am not sure if this is an installation issue or an issue with my files. I've tried quitting and restarting R to no avail, is there anything else I can try so I won't get these errors and I can run my code as normal?

This is the error I get running the same code and files that I have previously:

input_r <- load_taxa_table("data_16S.txt", "metadata.txt")
Error in if (class(tmp) == "list") { : the condition has length > 1

I thought there may be an issue with the # in the first line, so I made another file removing the hashtag in the OTU table file and re-ran the code, then received this error instead:

Warning in load_taxa_table("data_16S.txt", "metadata.txt") :
No taxonomy loaded. If taxonomy should have been loaded, check that "taxonomy" header exists.
0 samples loaded
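
The first error is consistent with a change in R itself: since R 4.0 a matrix has the two-element class c("matrix", "array"), so class(x) == "list" returns two values, and since R 4.2 an if() condition longer than one is an error instead of a warning. What tmp holds inside load_taxa_table() is not shown in the report, so the matrix below is an assumption. The sketch shows the single-value checks that avoid the problem.

```r
# Stand-in for whatever object the loader tests; assumed to be a matrix.
tmp <- matrix(1:4, nrow = 2)

# On R >= 4.0 this prints two class names, which is what breaks
# a test written as if (class(tmp) == "list").
print(class(tmp))

# These always return a single TRUE or FALSE.
is_list_safe <- inherits(tmp, "list")
print(is_list_safe)
print(is.list(tmp))
print(inherits(list(a = 1), "list"))
```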

.biom vs .txt loading w/euk SILVA classifications

loading taxa table + mapping file as input works with .biom output but not .txt output for taxonomy from SILVA

error message with text file (and works fine with same file in biom format):

116 samples loaded
Warning message:
In .parse_taxonomy(data$taxonomy) :
Issue with provided taxonomy. Check for quotes near line 2. Proceeding without taxonomy.
Issue with provided taxonomy. Check for quotes near line 9. [... truncated]

Calculation of OTU-level overlap

This could be useful in tracking sources. For example, to ask what percent of the OTUs found in a potential "sink" category (e.g., door handles), were present in a potential source category (e.g., hands)? -> "90% of the OTUs found on door handles were also present on hands"

It might be additionally helpful to filter by OTU relative abundance before the calculation, for the sink or source or both. -> "95% of the dominant OTUs (>2% relative abundance) on door handles ..."
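
A base-R sketch of the calculation described above. The function percent_shared(), its arguments, and the toy data are hypothetical; an OTU counts as present in a category when its mean relative abundance there is above zero, and the optional threshold is applied to the sink category only.

```r
# Percent of sink-category OTUs that also occur in the source category.
percent_shared <- function(counts, category, sink_cat, source_cat,
                           min_rel_abund = 0) {
  # Convert each sample (column) to relative abundances.
  rel <- sweep(counts, 2, colSums(counts), "/")
  mean_rel <- function(cat) rowMeans(rel[, category == cat, drop = FALSE])
  sink_otus <- rownames(counts)[mean_rel(sink_cat) > min_rel_abund]
  source_otus <- rownames(counts)[mean_rel(source_cat) > 0]
  100 * mean(sink_otus %in% source_otus)
}

# Made-up count table: OTUs in rows, samples in columns.
counts <- matrix(c(10,  8, 6, 9,
                    5,  2, 0, 1,
                    0,  0, 3, 0,
                    0,  0, 0, 0,
                    5, 10, 1, 0),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(paste0("otu", 1:5),
                                 c("hand1", "hand2", "door1", "door2")))
category <- c("hands", "hands", "door", "door")

all_otus <- percent_shared(counts, category, "door", "hands")
dominant <- percent_shared(counts, category, "door", "hands",
                           min_rel_abund = 0.1)
print(c(all = all_otus, dominant = dominant))
```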