patroon's Introduction

patRoon

DOI: 10.1186/s13321-020-00477-w

patRoon aims to provide comprehensive mass spectrometry based non-target analysis (NTA) workflows for environmental analysis. The name is derived from a Dutch word that means pattern and may also be an acronym for hyPhenated mAss specTROmetry nOn-target aNalysis.

Project news

December 2023 patRoon 2.3.1 is now released. This is a small maintenance release with several bug fixes. The most important fix is a correction for the prediction of concentrations from SIRIUS fingerprints. Please see the Project NEWS for more details and important notes on updating patRoon.

November 2023 patRoon 2.3 is now released. The most significant changes include improved installation strategies and integration of MS2Tox and MS2Quant to predict feature toxicities and concentrations. Please see the Project NEWS for more details and important notes on updating patRoon.

May 2023 patRoon 2.2 is now released. The most significant change is the addition of a new reporting interface, which brings a much improved HTML interface, many optimizations and other important new functionality. Furthermore, patRoon 2.2 introduces improved SIRIUS 5 support, a new TP screening algorithm using formula libraries and many other improvements, many of which are thanks to the great user feedback. Please see the Project NEWS for details.

March 2023 The Docker images moved to a new host. Please see the installation details in the handbook to obtain the latest images.

May 2022 patRoon 2.1 is now available. This new release integrates prediction of transformation products with CTS, adds several feature intensity normalization methods, adds new functionality and improvements for reporting TP data and supports loading, processing and annotation with MS libraries such as MassBank. Please see the Project NEWS for details.

Introduction

Mass spectrometry based non-target analysis is used to screen large numbers of chemicals simultaneously. For this purpose, high resolution mass spectrometry instruments are used which are typically coupled (or hyphenated) with chromatography (e.g. LC or GC). The size and complexity of the resulting data make manual processing impractical. Many software tools have been developed to facilitate a more automated approach. However, these tools are generally not optimized for environmental workflows and/or only implement parts of the functionality required.

patRoon combines established software tools with novel functionality in order to provide comprehensive NTA workflows. The different algorithms are provided through a consistent interface, which removes the need to know all the details of each individual software tool and to perform tedious data conversions during the workflow. The table below outlines the major functionality of patRoon.

| Functionality | Description | Algorithms |
|---|---|---|
| Raw data pre-treatment | MS format conversion (e.g. vendor to mzML) and calibration. | ProteoWizard, OpenMS, DataAnalysis |
| Feature extraction | Finding features and grouping them across analyses. | XCMS, OpenMS, enviPick, DataAnalysis, KPIC2, SIRIUS, SAFD |
| Suspect screening | Finding features with suspected presence by MS and chromatographic data. Estimation of identification confidence levels. | Native |
| MS data extraction | Automatic extraction and averaging of feature MS(/MS) peak lists. | Native, mzR, DataAnalysis |
| Formula annotation | Automatic calculation of formula candidates for features. | GenForm, SIRIUS, DataAnalysis |
| Compound annotation | Automatic (in silico) compound annotation of features. | MetFrag, SIRIUS, Native |
| Componentization & adduct annotation | Grouping of related features based on chemistry (e.g. isotopes, adducts and homologs), hierarchical clustering or MS/MS similarity into components. Using adduct and isotope annotations for prioritizing features and improving formula/compound annotations. | RAMClustR, CAMERA, nontarget R package, OpenMS, cliqueMS, Native |
| Combining algorithms | Combine data from different algorithms (e.g. features, annotations) and generate a consensus. | Native |
| Sets workflows | Simultaneous processing and combining +/- MS ionization data. | Native |
| Transformation product (TP) screening | Automatic screening of TPs using library/in-silico data, MS similarities and classifications. Tools to improve compound TP annotation. | BioTransformer, PubChemLite, Native |
| Reporting | Automatic reporting of all important workflow data. An example HTML report can be viewed here. | Native |
| Data clean-up & prioritization | Filters for blanks, replicates, intensity thresholds, neutral losses, annotation scores, identification levels and many more. | Native |
| Data curation | Several graphical interactive tools and functions to inspect and remove unwanted data. | Native |

The workflow of non-target analysis typically depends on the aims and requirements of the study and the instrumentation and methodology used for sample analysis. For this reason, patRoon does not enforce a certain workflow. Instead, most workflow steps are optional, fully configurable and algorithms can easily be mixed or even combined.

Implementation details

  • patRoon is implemented as an R package, which allows easy interfacing with the many other R based MS tools and other data processing functionality from R.
  • Fully open-source (GPLv3).
  • Developed on Windows, Linux and macOS.
  • S4 classes and generics are used to implement a consistent interface to all supported algorithms.
  • Continuous integration is used to automatically perform unit tests, update the website and documentation, and maintain installation resources such as the patRoonDeps repository, Docker image and patRoon bundle (see the handbook for more details).
  • Supports all major instrument vendor input formats (through usage of ProteoWizard and DataAnalysis).
  • Optimizations
    • data.table is used internally as a generally much more efficient alternative to data.frame.
    • The processx and future R packages are used for parallelization.
    • Results from workflow steps are cached within a SQLite database to avoid repeated computations.
    • Code for loading MS and EIC data, MS similarity calculations and other tasks was implemented in C++ to reduce computation times.
  • The RDCOMClient R package is used to interface with Bruker DataAnalysis algorithms.
  • The Shiny R package was used to implement several GUI tools.
  • The reporting functionality relies on the excellent R markdown and related packages such as flexdashboard, bslib and reactable.
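
The S4 point in the list above can be illustrated with a toy sketch. The class and generic names below (`toyFeatures`, `toyFeaturesA`, `toyFeaturesB`, `peakCount`) are hypothetical and are not patRoon's actual definitions; the sketch only shows how one generic can serve objects produced by different algorithms.

```r
library(methods)

# Hypothetical toy classes for illustration; NOT patRoon's real class hierarchy.
# A virtual base class holds the data shared by all algorithm-specific results.
setClass("toyFeatures", slots = c(peaks = "data.frame"), contains = "VIRTUAL")
setClass("toyFeaturesA", contains = "toyFeatures")  # e.g. results of algorithm A
setClass("toyFeaturesB", contains = "toyFeatures")  # e.g. results of algorithm B

# One generic with a method on the base class: callers use the same function
# regardless of which algorithm produced the object.
setGeneric("peakCount", function(obj) standardGeneric("peakCount"))
setMethod("peakCount", "toyFeatures", function(obj) nrow(obj@peaks))

fa <- new("toyFeaturesA", peaks = data.frame(mz = c(100.1, 200.2), rt = c(30, 60)))
fb <- new("toyFeaturesB", peaks = data.frame(mz = 150.5, rt = 45))

counts <- c(A = peakCount(fa), B = peakCount(fb))
print(counts)
```

An algorithm-specific subclass could still override the method where its data needs special handling, while downstream code keeps calling the same generic.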

Installation

patRoon and its dependencies can be installed in various ways. Please see the installation section in the handbook for more information.

Getting started

For a very quick start:

library(patRoon)
newProject()

The newProject() function will pop up a dialog screen (requires RStudio), which allows you to quickly select the analyses and common workflow options and subsequently generate a template R processing script.

However, for a better guide to get started it is recommended to read the tutorial. Afterwards the handbook is a recommended read if you want to know more about advanced usage of patRoon. Finally, the reference outlines all the details of the patRoon package.

Citing

When you use patRoon please cite its publications:

Rick Helmus, Thomas L. ter Laak, Annemarie P. van Wezel, Pim de Voogt and Emma L. Schymanski. patRoon: open source software platform for environmental mass spectrometry based non-target screening. Journal of Cheminformatics 13, 1 (2021)

Rick Helmus, Bas van de Velde, Andrea M. Brunner, Thomas L. ter Laak, Annemarie P. van Wezel and Emma L. Schymanski. patRoon 2.0: Improved non-target analysis workflows including automated transformation product screening. Journal of Open Source Software, 7(71), 4029

patRoon builds on many open-source software tools and open data sources. Therefore, it is important to also cite their work when using these algorithms via patRoon.

Contributing

For bug reports, code contributions (pull requests), questions, suggestions and general feedback please use the GitHub page.

patroon's People

Contributors

coltonlloyd, egonw, leonsaal, rhelmus, ricardo-cunha, rickhelmus

patroon's Issues

Using convertMSFiles

I am working with Waters .raw files and wondered if I could use this to convert from .raw to mzML all in one pipeline. However, I am struggling. I have tried two ways, one where I use the example data to convert files in the files location, substituting thermo for waters. The files have already been centroided using MassLynx AccurateMass, but I have also taken the raw data and tried to centroid it as well, with no luck. I typically get the following error: "Error in files[keepFiles] : invalid subscript type 'list'"
The script I have used is as follows, presuming that all files in the folder will be converted to mzML and put into the centroided folder.
convertMSFiles(
files="Data/2-Raw",
outPath = "Data/Centroided",
dirs = TRUE,
from ="waters",
to = "mzML")

I am obviously missing something. Any help would be greatly appreciated.
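
As background on the message itself: "invalid subscript type 'list'" is base R's generic error when an atomic vector is indexed with a list. The sketch below reproduces it with made-up file names; it does not establish what happens inside convertMSFiles(), and the variable names are purely illustrative.

```r
# Made-up file names, for illustration only.
files <- c("a.raw", "b.raw", "c.mzML")

# vapply() returns a plain logical vector, which is a valid subscript.
keepOK <- vapply(files, function(f) grepl("\\.raw$", f), logical(1))
rawFiles <- unname(files[keepOK])

# lapply() returns a list; indexing an atomic vector with a list raises the error.
keepBad <- lapply(files, function(f) grepl("\\.raw$", f))
msg <- tryCatch(files[keepBad], error = function(e) conditionMessage(e))

print(rawFiles)
print(msg)
```

If the error persists, the output of traceback() right after the failure would show which call performs the subsetting.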

[JOSS Review] State of the field: Do the authors describe how this software compares to other commonly-used packages?

The paper lacks a section giving an overview of the field and the actually used methods (short description of BioTransformer). Currently, this is somewhat addressed with Table 1 and the Statement of Need, but it is not a clear review of the state of the art.

If there is a reason for merging these two sections, please let me know and we can discuss things here 😄.

There exist multiple publications on the prediction of transformation products and reaction pathways (10.1021/acs.chemrestox.0c00224; 10.1021/ci200542m; 10.1016/j.ymben.2021.11.009;) - it would be great if the section could give a small overview and argue why BioTransformer was chosen in particular.

Dependency broken - off CRAN

As of Feb 25, 2022, nontarget is no longer available from CRAN, which means patRoon cannot be installed on non-Windows installations of R; and only on Windows if the miniCRAN repository keeps an archived version of nontarget handy.

So I guess figure out how critical this package is, and either strip it, or embed it? Sorry to be a moving target, but dependency hell is the worst thing about the R package ecosystem.

[JOSS Review] Bug: `findFeaturesXCMS` and `findFeaturesXCMS3` fail on Windows

Finding features with XCMS fails on Windows since BiocParallel can't be easily installed on Windows with a default R version - only some combinations work apparently. XCMS requires BiocParallel to be installed, so fails if not.

r$> features <- findFeaturesXCMS(centroid)
Verifying if your data is centroided...
Finding features with XCMS for 12 analyses ...
Error in loadNamespace(name) : there is no package called 'BiocParallel'
Calls: local ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Execution halted

XCMS3 fails also for some other reason - see the log below.

r$> features <- findFeaturesXCMS3(centroid)
Verifying if your data is centroided...
Finding features with XCMS for 12 analyses ...
Loading raw data...
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'X' in selecting a method for function 'bplapply': could not find function "featureData"
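
A quick way to check for this situation before running a workflow is to test whether the required namespaces can be loaded. The helper name `missingPackages` below is hypothetical and not part of patRoon; it is a minimal sketch using only base R.

```r
# Hypothetical helper (not part of patRoon): returns the packages whose
# namespace cannot be loaded in the current R library.
missingPackages <- function(pkgs) {
    installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    pkgs[!installed]
}

needed <- c("xcms", "BiocParallel")
toInstall <- missingPackages(needed)
if (length(toInstall) > 0) {
    message("Missing: ", paste(toInstall, collapse = ", "),
            "; Bioconductor packages can usually be installed with BiocManager::install().")
}
```

Whether BiocManager::install() succeeds still depends on the R and Bioconductor version combination, as noted in the report above.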

screenSuspects fatal error

Hello,
I have been trying to screen suspects with patRoon using the default list, my own suspect screening list, and code from the written tutorial. It seems that every time I run screenSuspects( ) function I run into a fatal error and have to start over. Unfortunately, this also means there is nothing in the error log regarding the issue.
Best,
Rachel

FindFeatures issue with example data

Hi,

I am just trying to follow the patRoon tutorial and keep getting the following error (below) when I run "findFeatures" with the patRoon example data. Have I loaded something in wrong? What might be causing this error? I am pretty new to patRoon and R, so any help is greatly appreciated! Thanks.

fList <- findFeatures(anaInfo, "openms", noiseThrInt = 1000, chromSNR = 3, chromFWHM = 5, minFWHM = 1, maxFWHM = 30)
Verifying if your data is centroided...
Finding features with OpenMS for 6 analyses ...
| | 0%Error in maybeRestartCommand(commandQueue[cmdInds], runningProcInfo[[pi]], :
Failed to run command 'C:/Program Files/OpenMS/bin/FeatureFinderMetabo.exe' with args: '-algorithm:common:noise_threshold_int 1000 -algorithm:common:chrom_peak_snr 3 -algorithm:common:chrom_fwhm 5 -algorithm:mtd:mass_error_ppm 10 -algorithm:mtd:reestimate_mt_sd true -algorithm:mtd:trace_termination_criterion sample_rate -algorithm:mtd:trace_termination_outliers 5 -algorithm:mtd:min_sample_rate 0.5 -algorithm:mtd:min_trace_length 3 -algorithm:mtd:max_trace_length -1 -algorithm:epd:width_filtering fixed -algorithm:epd:min_fwhm 1 -algorithm:epd:max_fwhm 30 -algorithm:ffm:local_rt_range 10 -algorithm:ffm:local_mz_range 6.5 -algorithm:ffm:isotope_filtering_model metabolites (5% RMS) -algorithm:ffm:mz_scoring_13C false -algorithm:ffm:use_smoothed_intensities true -algorithm:ffm:report_convex_hulls true -algorithm:epd:masstrace_snr_filtering false -in C:\Users\AppData\Local\R\win-library\4.2\patRoonData\extdata\pos/solvent-pos-1.mzML -out C:\Users\AppData\Local\Te

[JOSS Review] Update overview graphics

The main diagram showing the workflows could be extended with something closer to the code, showing the individual operations and data flow. I thought about something like this. If you agree, I can share the document and we can make a refined version of this, maybe also including links to the actual function documentation etc.

image

gsub: invalid regular expression '\',

Dear Rick Helmus,

I am a beginner with patRoon and want to practice processing MS data with it.

When I tried making a new project, this message appeared:

Warning: Error in gsub: invalid regular expression '\', reason 'Trailing backslash'

Could you explain this to me?

best regards,
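
For context, this is the message base R produces when a single backslash is passed to gsub() as a regular expression, which commonly happens when Windows paths are processed. The sketch below reproduces it with a made-up path; where exactly the call occurs during project creation is not shown in the report, so this only explains the message.

```r
# Made-up Windows path; each "\\" in the source is one backslash in the string.
winPath <- "C:\\Users\\me\\data"

# "\\" reaches the regex engine as a single backslash, i.e. an incomplete escape.
msg <- tryCatch(gsub("\\", "/", winPath), error = function(e) conditionMessage(e))

# Either escape the backslash for the regex engine, or match it literally.
viaRegex <- gsub("\\\\", "/", winPath)
viaFixed <- gsub("\\", "/", winPath, fixed = TRUE)

print(msg)
print(viaRegex)
```

Both corrected calls give the same forward-slash path, so `fixed = TRUE` is often the simpler choice when no pattern matching is needed.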

Patroon issues

Hi,
I hope I could find some tips with the following issues:

  1. I have been trying to run
    formulas <- generateFormulas(fGroups, mslists, "genform", relMzDev = 5, adduct = "[M+H]+", elements = "CHNOPSClBrF",
    oc = FALSE, calculateFeatures = TRUE,
    featThresholdAnn = 0.75)
    but I receive the following error:
    ''%Error in maybeRestartCommand(commandQueue[cmdInds], runningProcInfo[[pi]], :
    Failed to run command 'C:/Users/a34760/Documents/R/win-library/4.1/patRoon/bin//x64/GenForm.exe' with args: 'exist oei noref dbe cm ppm=5.000000 el=CHNOPSClBrF het ion=M+H ms=C:\Users\a34760\AppData\Local\Temp\RtmpywNZ3p\MSPList5080338636d3.txt m=100.112057 out=C:\Users\a34760\AppData\Local\Temp\RtmpywNZ3p\formulas50803f221388.txt msms=C:\Users\a34760\AppData\Local\Temp\RtmpywNZ3p\MSMSPList5080419b2fa4.txt analyze'. Exit code: 1''

  2. I am using a computer with 16 GB RAM and I have allocated 14 GB RAM to R, but the "Loading all MS peak" step still takes quite a long time, so I decided to include only 6 samples (three samples and three blanks). I wonder if you have a tip for including more samples in the analysis.

Thanks a lot.
Aasim

merging two featuresXCMS objects

Hi,
first thanks for your work, it looks very promising!
I started testing your app with 70 mzML data files (>2 Gb each), by running xcms through findFeatures() function. Since 70 files are too large to handle by my computer CPU and RAM, I was trying to run it only by groups of ~6 samples.
Is there an easy way of merging all the featuresXCMS objects created? I am not really used to S4 objects...
Thanks,
Julien

featureGrouping error

patRoon 1.2.0
xcms 3.8.2

Hi Rick
I was analysing 350 samples (~70GB) w/ xcms3.
During feature grouping (method: PeakDensity) I received this error message:
Error in result_bind(res@ptr, params): long vectors not supported yet: ../include/Rinlinedfuns.h:522

Is this an xcms or patRoon issue and how can it be fixed?

Cheers,
Thomas

OpenMS returning lots of zeroes

Hi Rick

I am running untargeted metabolomics on a 6545QTOF using patRoon/OpenMS and XCMS for feature extraction.
Compared to XCMS, OpenMS is not giving good results, i.e. integrating noise and returning lots of '0' although I bumped up the noise threshold to 5000 (report attached). Standard parameter were optimized w/ patRoon/IPO.

Any idea?

Cheers,
Thomas

OpenMS report:
featureGroupsOpenMS_noise5000.xlsx

optimizedParameters(ftOpt) Feature detection
$chromFWHM
[1] 2.5

$mzPPM
[1] 8

$minFWHM
[1] 1.5

$maxFWHM
[1] 45.5

$chromSNR
[1] 2.5

$noiseThrInt
[1] 5000

optimizedParameters(fgOpt) Feature grouping
$maxAlignRT
[1] 28.5

$maxAlignMZ
[1] 0.002

$maxGroupRT
[1] 6

$maxGroupMZ
[1] 0.0071

Problem finding feature with openms

When working through the tutorial we get the following error when running

fList <- findFeaturesOpenMS(anaInfo)

Finding features with OpenMS for 6 analyses ...
Error in system2(cmd, sapply(args, shQuote), ...) :
error in running command

We are having no problem finding features with xcms

[JOSS Review] Bug: `findFeaturesSIRIUS(...)` fails

When trying to detect features on centroided data (findFeaturesXCMS works fine), I get the following error message.

The files are LC-MS1 files from orbitrap - is it possible that SIRIUS always expects MS2 data to be present?

r$> features <- findFeaturesSIRIUS(info)
Verifying if your data is centroided...
Finding features with SIRIUS for 12 analyses ...
  |=================                                                                                                                                                                                            |   8%Error in setorderv(ret, "mz") : 
  some columns are not in the data.table: mz
In addition: Warning message:
call dbDisconnect() when finished working with a connection 

peak areas altered after conversion and fillChromPeaks()

Hi Rick

The following is more a question than an issue:
I was wondering if the code below is a valid way to employ the fillChromPeaks() function after featureGrouping and conversion or is there a more elegant way (incl. reporting)?

Here comes the issue:
If I run the code seen below and compare peak areas (not of the 'filled' ones) before and after imputation (fGroup vs xdata_filled), suddenly peak areas of all features were altered drastically (x-times larger: 6800 > 142000, 5800 > 163000)!?

I wonder if there is an issue w/ the conversion or fillChromPeaks() function?
Apart from that, the group and retention time column in the exported spreadsheet is also missing.
I can send you a snippet of the spreadsheets if you want.

Cheers,
Thomas

#######################################
### Group and align features between analysis
fGroups <- withOpt(cache.mode="none", groupFeatures(fList, "xcms3", rtalign = TRUE, 
                                                    groupParam = pdp,
                                                    retAlignParam = owp))
### Conversion
xdata <- getXCMSnExp(fGroups)
### Filling missing peak data using the peak area from identified chromatographic peaks
xdata_filled <- fillChromPeaks(xdata, param = ChromPeakAreaParam())

### Reporting xdata
### Extract the feature definitions
ft_def <- featureDefinitions(xdata_filled)
### Get feature information
ft_inf <- featureSummary(xdata_filled, group = xdata_filled$sample_group)
### Get feature abundance
ft_ints <- featureValues(xdata_filled, value = "into")
### Make loop for data export
ft_def_export <- data.frame(ft_def@rownames, ft_def@listData[1])
for(i in 1: (length(ft_def@listData)-1)){
  ft_def_export[,i+1] = ft_def@listData[i]
}

### Export data
ft_def_export <- cbind(ft_def_export, data.frame(ft_ints))
write.csv(ft_def_export, file = "result_xcms_xdata_filled.csv", row.names = FALSE)

Error while running findFeatures

Hi Rick,
I observed an error while running fList <- findFeatures(anaInfo, "xcms3", verbose = TRUE, param = pick_param).

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'X' in selecting a method for function 'bplapply': could not find function "featureData"

I concluded that this error is related to xcms. As soon as I load xcms, the error is gone. I therefore checked the xcms code, but did not find anything conclusive. Maybe @sneumann or @jorainer have an idea?

Thus, I guess that some function import is missing in patRoon to avoid additional loading of xcms.

Best,
Tobias

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2012 R2 x64 (build 9600)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] xcms_3.12.0 MSnbase_2.16.0 ProtGenerics_1.22.0 S4Vectors_0.28.0 mzR_2.24.0 Rcpp_1.0.5
[7] BiocParallel_1.24.0 Biobase_2.50.0 BiocGenerics_0.36.0 data.table_1.13.2 patRoon_1.0.4 forcats_0.5.0
[13] stringr_1.4.0 dplyr_1.0.2 purrr_0.3.4 readr_1.4.0 tidyr_1.1.2 tibble_3.0.4
[19] ggplot2_3.3.2 tidyverse_1.3.0

loaded via a namespace (and not attached):
[1] snow_0.4-3 readxl_1.3.1 backports_1.2.0 Hmisc_4.4-1
[5] plyr_1.8.6 igraph_1.2.6 CAMERA_1.46.0 splines_4.0.3
[9] GenomeInfoDb_1.26.0 TH.data_1.0-10 digest_0.6.27 foreach_1.5.1
[13] htmltools_0.5.0 fansi_0.4.1 memoise_1.1.0 magrittr_1.5
[17] checkmate_2.0.0 cluster_2.1.0 doParallel_1.0.16 limma_3.46.0
[21] modelr_0.1.8 matrixStats_0.57.0 sandwich_3.0-0 jpeg_0.1-8.1
[25] colorspace_1.4-1 blob_1.2.1 rvest_0.3.6 haven_2.3.1
[29] rbibutils_1.4 xfun_0.19 crayon_1.3.4 RCurl_1.98-1.2
[33] jsonlite_1.7.1 graph_1.68.0 impute_1.64.0 survival_3.2-7
[37] zoo_1.8-8 iterators_1.0.13 glue_1.4.2 gtable_0.3.0
[41] zlibbioc_1.36.0 emmeans_1.5.2-1 XVector_0.30.0 DelayedArray_0.16.0
[45] DEoptimR_1.0-8 scales_1.1.1 vsn_3.58.0 mvtnorm_1.1-1
[49] DBI_1.1.0 xtable_1.8-4 htmlTable_2.1.0 bit_4.0.4
[53] foreign_0.8-80 preprocessCore_1.52.0 Formula_1.2-4 MsCoreUtils_1.2.0
[57] htmlwidgets_1.5.2 httr_1.4.2 RColorBrewer_1.1-2 ellipsis_0.3.1
[61] pkgconfig_2.0.3 XML_3.99-0.5 nnet_7.3-14 dbplyr_2.0.0
[65] tidyselect_1.1.0 rlang_0.4.8 later_1.1.0.1 cellranger_1.1.0
[69] munsell_0.5.0 tools_4.0.3 cli_2.1.0 RSQLite_2.2.1
[73] generics_0.1.0 broom_0.7.2 fastmap_1.0.1 mzID_1.28.0
[77] bit64_4.0.5 fs_1.5.0 knitr_1.30 robustbase_0.93-6
[81] RANN_2.6.1 packrat_0.5.0 ncdf4_1.17 RBGL_1.66.0
[85] mime_0.9 xml2_1.3.2 compiler_4.0.3 rstudioapi_0.11
[89] png_0.1-7 affyio_1.60.0 reprex_0.3.0 MassSpecWavelet_1.56.0
[93] stringi_1.5.3 lattice_0.20-41 Matrix_1.2-18 vctrs_0.3.4
[97] pillar_1.4.6 lifecycle_0.2.0 BiocManager_1.30.10 Rdpack_2.0
[101] MALDIquant_1.19.3 estimability_1.3 bitops_1.0-6 gbRd_0.4-11
[105] httpuv_1.5.4 GenomicRanges_1.42.0 R6_2.5.0 latticeExtra_0.6-29
[109] pcaMethods_1.82.0 affy_1.68.0 promises_1.1.1 gridExtra_2.3
[113] IRanges_2.24.0 codetools_0.2-18 MASS_7.3-53 assertthat_0.2.1
[117] SummarizedExperiment_1.20.0 withr_2.3.0 multcomp_1.4-14 GenomeInfoDbData_1.2.4
[121] hms_0.5.3 fst_0.9.4 grid_4.0.3 rpart_4.1-15
[125] coda_0.19-4 MatrixGenerics_1.2.0 rsm_2.10.2 shiny_1.5.0
[129] lubridate_1.7.9 base64enc_0.1-3

Grouping optimization issue (sub-optimal formulation for "score")

Hi,

first off, thanks for your nice code package, I switched over after having some issues with IPO. The comparatively verbose output is particularly useful!

This made me notice an issue with the grouping optimization algorithm, which attempts to optimize norm(GS)+norm(RCS). Unfortunately that can lead to some issues in edge cases as in the results below. The optimization nominally succeeded, but clearly is not a good choice overall based on the abysmal GS.

In many scenarios, it would seem to help to carry forward GS and RCS between iterations and to calculate the overall norms and scores on the total set, rather than just within the iteration (this would also be more consistent with the behavior of feature finding optimization). But overall, some non-linearity in weighting GS and RCS may be required to really avoid falling into parameter space minima where one drastically dominates the other.

Regards, Thomas

Response: (iteration 2)

exp_index good_groups bad_groups GS RCS retcor_done experiment score
1 22 53271 9.085619e-03 7711.6919 1 1 0.9353964
2 3132 20278 4.837471e+02 611.6350 1 2 1.0039817
3 21 51671 8.534768e-03 7452.1002 1 3 0.9013410
4 2948 19128 4.543446e+02 581.2830 1 4 0.9392182
5 49 52094 4.608976e-02 8204.1659 1 5 1.0000776
6 3018 20606 4.420229e+02 630.4261 1 6 0.9201931
7 43 50487 3.662329e-02 7940.6152 1 7 0.9654844
8 2868 19425 4.234453e+02 601.1880 1 8 0.8779533
9 2176 26091 1.814793e+02 1009.5788 1 9 0.4313277
10 37 52515 2.606874e-02 7839.8853 1 10 0.9522485
11 2907 20047 4.215418e+02 604.1467 1 11 0.8744066
12 2215 26942 1.821032e+02 1040.7310 1 12 0.4367042
13 2086 25707 1.692689e+02 1000.2034 1 13 0.4048561
14 2164 26064 1.796691e+02 1009.0638 1 14 0.4275181
15 2494 25309 2.457638e+02 1006.8286 1 15 0.5638580
16 2176 26091 1.814793e+02 1009.5788 1 16 0.4313277

Best params: groupArgs: list(bw = 0.25, mzwid = 0.00912, method = "density"); retcorArgs: list(distFunc = "cor_opt", gapInit = 0.3, gapExtend = 2.4, profStep = 1, method = "obiwarp");
Best results: exp_index: 1; good_groups: 45; bad_groups: 51404; GS: 0.0393938214924909; RCS: 8105.90554502488; retcor_done: 1
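
The effect described in this report can be reproduced from the table above. The sketch below assumes that norm() is a min-max normalization within the iteration; this is an inference, but it reproduces the reported score column for the rows used here (experiments 1 to 5, which include the minimum and maximum of both GS and RCS).

```r
# GS and RCS values copied from experiments 1-5 of the response table above.
resp <- data.frame(
    experiment = 1:5,
    GS  = c(9.085619e-03, 4.837471e+02, 8.534768e-03, 4.543446e+02, 4.608976e-02),
    RCS = c(7711.6919, 611.6350, 7452.1002, 581.2830, 8204.1659)
)

# Assumed normalization: rescale each response to [0, 1] within the iteration.
minMax <- function(x) (x - min(x)) / (max(x) - min(x))

resp$score <- minMax(resp$GS) + minMax(resp$RCS)
print(resp[order(-resp$score), ])
```

Experiment 5 (GS about 0.046) ends up with nearly the same score as experiment 2 (GS about 484), because a maximal RCS alone contributes a full point to the sum.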

issue with xcms3 peakgroups

Hi,

I can't seem to get xcms3 optimization to work with peakgroups algorithm... Seems like the initial grouping isn't called somehow? Switched to a different package for now.

fgOpt <- optimizeFeatureGrouping(optimizedObject(ftOpt), "xcms3", paramsRG, maxIterations = 1)
Starting new DoE (iteration 1):
retAlignMethod: peakgroups
groupMethod: density
bw: c(5, 7)
binSize: 0.005
minFraction: 0.9
extraPeaks: 1
span: c(0.2, 0.3)
---
Design:
   run.order std.order bw span Block
1          1         1  5 0.20     1
2          2         2  7 0.20     1
3          3         3  5 0.30     1
4          4         4  7 0.30     1
5          5         5  6 0.25     1
6          1         1  5 0.25     2
7          2         2  7 0.25     2
8          3         3  6 0.20     2
9          4         4  6 0.30     2
10         5         5  6 0.25     2
---

  |                                                                                                                   
  |                                                                                                             |   0%Error in .local(object, param, ...) : 
  No feature definitions found in 'object'! Please perform first a peak grouping using the 'groupChromPeak' method.

Problem getting started

Hi! I have tried to follow your tutorial using the example data package. After filling in the specified settings in the dialog box that appears after typing in newProject() and clicking "create", RStudio opens a new window which is empty apart from some standard RStudio text (and the window I worked in disappears). I suppose that what is supposed to happen is that there is something to work with in that new window. Do you know why this happens? I have made sure that I have downloaded the example data package. When I launch patRoon I get a message saying that "mzR has been built against a different Rcpp version". I do have the latest version of mzR. I also tried uninstalling and reinstalling it. In doing this, I got the message "Installation path not writeable, unable to update packages: survival". I got the same message when trying to upgrade the Bioconductor installation.

small typo in the install script

Hello Rick,

Today I installed patRoon on a new machine using the automatic Windows install script. The script fails at some point because of a typo in line 558:

                # place in jar from patRoonDeps
                jar <- downloadFile(file.path(dow, "biotransformer"), "BioTransformer jar",
                                    "https://github.com/rickhelmus/patRoonDeps/raw/master/ext/biotransformer-3.0.0.jar",
                                    FALSE)

it says "dow", when it should be "down". Changing that allows the script to finish.

Regards,

Martin

[JOSS Review]: Document option to use SAFD

In the documentation, there is a line describing the options that have to be set in order to use the external dependencies - there is none for SAFD though. Is that intentional? Or how can SAFD be used with patRoon?

Potential bug in optimizeFeatureGrouping of XCMS3 parameters

Hi Rick,

Thank you for creating this excellent pipeline for nontargeted metabolomics analysis.

I have been using the IPO implementation to optimize feature grouping parameters in xcms3 (via optimizeFeatureGrouping), and have run into issues when optimizing binSize and minFraction when groupMethod='density'. It appears that these values are rounded which causes an error when binSize is rounded down to 0 (see output below).

image

Is it possible that this line of code is causing the issue?

fixDesignParam = function(param, value) if (param %in% c("binSize", "minFraction")) round(value) else value,

Thanks again,
Colton
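
The suspected behavior can be checked in isolation. In the sketch below, `fixAsQuoted` copies the rule quoted in the report, while `fixSketch` and its `integerParams` argument are hypothetical and only illustrate one possible direction; treating `extraPeaks` as an integer count is an assumption, not something taken from patRoon's code.

```r
# The rule as quoted in the report above.
fixAsQuoted <- function(param, value)
    if (param %in% c("binSize", "minFraction")) round(value) else value

# Hypothetical alternative (illustration only): round just the parameters that
# are assumed to be integer counts and leave continuous ones untouched.
fixSketch <- function(param, value, integerParams = "extraPeaks")
    if (param %in% integerParams) round(value) else value

quoted <- c(binSize = fixAsQuoted("binSize", 0.005),
            minFraction = fixAsQuoted("minFraction", 0.9))
sketch <- c(binSize = fixSketch("binSize", 0.005),
            minFraction = fixSketch("minFraction", 0.9))
print(quoted)
print(sketch)
```

With the quoted rule, a binSize of 0.005 becomes 0 and a minFraction of 0.9 becomes 1, which matches the symptom described above.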

Installation of patRoon fails

After installing all dependencies I get the following error when trying to install patRoon from GitHub:

install_github("rickhelmus/patRoon", dependencies=TRUE)
Downloading GitHub repo rickhelmus/patRoon@master
from URL https://api.github.com/repos/rickhelmus/patRoon/zipball/master
Installing patRoon
"C:/PROGRA1/R/R-331.2/bin/x64/R" --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL
"C:/Users/PVervliet/AppData/Local/Temp/RtmpMbGbTI/devtools212c51554616/rickhelmus-patRoon-87692e7"
--library="C:/Users/PVervliet/Documents/R/win-library/3.3" --install-tests

  • installing source package 'patRoon' ...
    ** libs
    running 'src/Makefile.win' ...
    g++ -c -o GenForm-svn/main.o GenForm-svn/main.cpp
    make: g++: Command not found
    make: *** [: GenForm-svn/main.o] Error 127
    Warning: running command 'make --no-print-directory -f "Makefile.win"' had status 2
    ERROR: compilation failed for package 'patRoon'
  • removing 'C:/Users/PVervliet/Documents/R/win-library/3.3/patRoon'
    Installation failed: Command failed (1)

I already reinstalled Rtools to be sure but the error keeps showing up.

[JOSS Review] Create individual doc page for every function and class in handbook

The individual pipeline steps are all documented in a single page (here), making it very hard to actually understand the parameters, as the section covers all functions. It would be a lot easier if each function had its own page with the class as its return value, and each class in turn had its own documentation page - I think this would make the documentation a lot better!

issue with matchedFilter (another small xcms3 optimization bug)

Hi again,

I just noticed that congratulations are in order for your publication!

Anyway, similar to my attempt at switching to peakgroups (thanks for the rapid fix!) I'm now also giving matchedFilter a go. It seems that the variable names there need an update:
generateFeatureOptPSet() generates a set containing the parameter "step", which is not recognized by xcms. The correct name should now be "binSize", I believe.

Warning message during run of findFeatures

Hi Rick,
While running fList <- findFeatures(anaInfo, "xcms3", verbose = TRUE, param = xset), these warnings show up. I guess they originate from the parallel run, so you may want to check the run call. It looks as if stats needs to be loaded explicitly on each node.

Best,
Tobias

Warning messages:
1: In serialize(data, node$con) :
'package:stats' may not be available when loading
2: In serialize(data, node$con) :
'package:stats' may not be available when loading
3: In serialize(data, node$con) :
'package:stats' may not be available when loading
4: In serialize(data, node$con) :
'package:stats' may not be available when loading
5: In serialize(data, node$con) :
'package:stats' may not be available when loading
6: In serialize(data, node$con) :
'package:stats' may not be available when loading
7: In serialize(data, node$con) :
'package:stats' may not be available when loading

feature suggestion: include refineChromPeaks in xcms3 feature finding optimization

Hi again, this time for a feature suggestion rather than a bug:
It would be nice to also be able to sweep/optimize the xcms3 MergeNeighboringPeaksParam() parameters of the refineChromPeaks() function. It would probably be easiest to integrate these directly as additional parameters into featuresOptimizerXCMS3(), just like featureGroupsOptimizerXCMS3 integrates both grouping and retention time correction. Just a thought for future versions.

[JOSS Review] BUG: Can't centroid `mzml` data using `convertMSFiles`

To centroid existing mzML files, the convertMSFiles function could be used (I assume). The problem is that the function throws an exception if the conversion does not change the format and only performs centroiding. Maybe a dedicated centroiding function, or lifting the restriction on changing the format, could do the trick?

Linking parents and TPs... Error generating the `precursor` object

Hi,

I hope I can get some help with this.

I've been trying to run non-target screening of predicted TPs. When I run this part of the code, adapted from the example workflows (example 7.3.3 in the handbook):
step 4
suspects <- convertToSuspects(TPs)

fGroupsScr <- screenSuspects(fGroups, suspects, adduct = "[M-H]-", onlyHits = FALSE)

step 5
componTP <- generateComponents(fGroupsScr, "tp", TPs = TPs, MSPeakLists = mslists, formulas = formulas)
I get this error:
in (function (spec, gn, ana) : object 'precursor' not found
I believe the code in step 4 was supposed to generate this precursor object, but it didn't, even though step 4 ran without any errors.

Naming of filter function

Hi Rick,
I have another issue. I suggest renaming the filter function of patRoon. The name clashes with the tidyverse filter function and causes errors if it is not called explicitly.

Best,
Tobias
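
As a stopgap, the clash can be avoided by namespace-qualifying the call. The sketch below reproduces the same masking mechanism with base R only: a locally defined filter() (a hypothetical stand-in) masks stats::filter(), and the qualified call still reaches the intended function.

```r
# A user-defined filter() masks stats::filter() in the same way that attaching
# a package exporting filter() masks an earlier one.
filter <- function(x, keep) x[keep]   # hypothetical stand-in

values <- c(1, 5, 2, 8)

# Unqualified call: resolves to the definition found first on the search path.
kept <- filter(values, values > 2)

# Namespace-qualified call: always reaches the intended function.
smoothed <- stats::filter(values, rep(1 / 2, 2))

print(kept)
print(smoothed)
```

In a workflow script the same idea would mean writing patRoon::filter(...) or dplyr::filter(...) explicitly, so the result does not depend on the order in which the packages were attached.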

[JOSS Review] Make file extension for supported MS file types case insensitive

generateAnalysisInfo finds mzML files but no mzml files, which can be impractical if working with large read-only file storage.

It seems that it collects everything at first, after which filterMSFileDirs filters on the extensions case-sensitively, so those files are removed again.

patRoon/R/convert.R

Lines 86 to 95 in b0bffba

listMSFiles <- function(dirs, from)
{
    allExts <- MSFileExtensions()
    allExts <- unique(unlist(allExts[from]))
    files <- list.files(dirs, full.names = TRUE, pattern = paste0("*\\.", allExts, "$", collapse = "|"),
                        ignore.case = TRUE)
    return(filterMSFileDirs(files, from))
}
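
A case-insensitive extension check can be sketched in base R as follows. The helper name hasMSExtension is hypothetical and this is not patRoon's implementation; it only illustrates comparing lower-cased extensions so that .mzML and .mzml are treated alike.

```r
# Hypothetical helper: TRUE for files whose extension matches one of 'exts',
# ignoring case.
hasMSExtension <- function(files, exts = c("mzML", "mzXML")) {
    tolower(tools::file_ext(files)) %in% tolower(exts)
}

files <- c("a.mzML", "b.mzml", "c.MZXML", "d.txt")
print(files[hasMSExtension(files)])
```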

parallelism and timeouts errors

patRoon 1.2.0

As part of my analysis, I was getting random errors while executing this line:

formulas <- generateFormulas(fGroups, "genform", mslists, relMzDev = 5,
                             adduct = adduct, elements = "CHNOClBrSP",
                             calculateFeatures = FALSE, featThreshold = 0.75)

and this line:

compounds <- generateCompounds(fGroups, mslists, "metfrag", method = "CL",
                               dbRelMzDev = 5, fragRelMzDev = 5,
                               fragAbsMzDev = 0.002, adduct = "[M+H]+",
                               database = "csv",
                               extraOpts = list(LocalDatabasePath = myLocalDatabasePath),
                               scoreTypes = c("fragScore", "metFusionScore", "score",
                                              "individualMoNAScore"),
                               maxCandidatesToStop = 100)

The progress bars would stop at a random point. Rerunning the experiment would sometimes work fine and at other times crash with a NULL message. The crashes were much more frequent when running on an HPC with lots of cores. After setting patRoon.MP.maxProcs = 1 the random behavior was gone, but I would still get some crashes. I only managed to fix those by noticing that the genform binary would sometimes take too much time (fixed with timeout = 5000), and that metfrag would also take too much time (fixed with errorRetries = 200, timeoutRetries = 20).

In general I think there is a bug in the implementation of the parallelism part, but a more pressing issue is to have much better error messages when external commands fail, as in the current state it is really hard to find out what the error was. Is it documented anywhere how to debug such issues?
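
For reference, the single-process workaround mentioned above can be set per session with base R's options(). This is a minimal sketch: the option name is taken from the report, and nothing here is specific to a particular patRoon version.

```r
# Limit patRoon's external-tool parallelism to one process for this session
# (the option name is taken from the report above).
options(patRoon.MP.maxProcs = 1)

# Confirm what the session will use.
print(getOption("patRoon.MP.maxProcs"))
```

Setting the option before the first generateFormulas() or generateCompounds() call keeps the run serial, which makes a failing external command easier to attribute to a single step.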

[JOSS Review] Bug: Let `findFeatures` functions also accept files which have lowercase file extension

Similar to when importing the files into the info, the findFeatures functions don't accept lowercase filenames etc. which have an .mzml extension.

> info <- patRoon::generateAnalysisInfo("myvolume")
> features <- patRoon::findFeaturesXCMS3(info)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201001_003-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201001_098-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201009_003-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201009_098-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201016_003-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201016_098-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201021_003-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201021_098-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201202_003-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201202_098-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201209_003-QCstd_POS_MU.mzml] (in myvolume)
Analysis does not exist: Galaxy23-[qc_solvent_centroids__Tribrid_201209_090-QCstd_POS_MU.mzml] (in myvolume)
Error in patRoon::findFeaturesXCMS3(info) : 1 assertions failed:
 * Variable 'x': No analyses found with correct data format (valid: mzXML, mzML).

Finding features with OpenMS or XCMS

Hello Rick,

I am new to patRoon and wanted to use the workflow. I have completed the installation and can run the example data, but whenever I try to find features in my own data I get the messages below. The data was acquired in full scan mode with ddMS2. Could this be a problem with administrator privileges on the PC?

With OpenMS

fList <- findFeatures(anaInfo, "openms")
Finding features with OpenMS for 2 analyses ...
| 0%Error: Can not open file C:/Users/User/Documents/analyses/raw/SMX.mzML! Original error was: Error in pwizModule$open(filename): Invalid cvParam accession "1003112"

With XCMS

Feature statistics:
SMX: 0 (0.0%)
solvent-1: 35 (100.0%)
Total: 35
Warning messages:
1: In serialize(data, node$con) :
'package:stats' may not be available when loading
2: In serialize(data, node$con) :
'package:stats' may not be available when loading
3: In xcmsSet(files, analysisInfo$analysis, analysisInfo$group, method = method, :
Peak detection failed in 'C:/Users/User/Documents/analyses/raw/SMX.mzML':Error: Can not open file C:/Users/User/Documents/analyses/raw/SMX.mzML! Original error was: Error in pwizModule$open(filename): Invalid cvParam accession "1003112"

Greetings

Error: findFeatures(..., 'openms')

Hello,

When I created a project with newProject(), the script was generated automatically.
However, when I run findFeatures(..., 'openms')

this error came:

0%
Error in errorHandler(cmd = commandQueue[[ci]], exitStatus = exitStatus, :
Failed to run command 'C:/Program Files/OpenMS-2.3.0/bin/FeatureFinderMetabo.exe' with args: '-algorithm:common:noise_threshold_int 1000 -algorithm:common:chrom_fwhm 5 -algorithm:mtd:mass_error_ppm 10 -algorithm:mtd:trace_termination_criterion sample_rate -algorithm:mtd:min_trace_length 3 -algorithm:mtd:max_trace_length -1 -algorithm:epd:width_filtering fixed -algorithm:epd:min_fwhm 3 -algorithm:epd:max_fwhm 60 -algorithm:ffm:report_convex_hulls true -in C:/~/tiny4_LTQ-FT.mzML0.99.0.mzML -out C:~\AppData\Local\Temp\RtmpcFSN1B\tiny4_LTQ-FT.mzML0.99.029ec5cd575ec.featureXML'. Exit code: 3

For reference, I was analysing .mzml data.
I did not use a DataAnalysis method in the setup.

Thank you in advance,

Regards,

Unable to install RDCOMClient

I get the error message that the required RDCOMClient cannot be installed:

After running devtools::install_github('omegahat/RDCOMClient') this message is given (see below for the full script):

WARNING: Rtools is required to build R packages, but no version of Rtools compatible with R 3.5.0 was found. (Only the following incompatible version(s) of Rtools were found:3.4,3.5)

Please download and install the appropriate version of Rtools from http://cran.r-project.org/bin/windows/Rtools/.
Installation failed: Could not find build tools necessary to build RDCOMClient

#What I've run according to your instructions:

source("https://bioconductor.org/biocLite.R")
biocLite(c("mzR", "xcms", "CAMERA"))

install.packages("RDCOMClient") # Not available for R.3.5 . Tried github install after

install.packages("devtools") # needed only if not already installed
devtools::install_github("cbroeckl/RAMClustR", build_vignettes = TRUE, dependencies = TRUE)
#devtools::install_github("c-ruttkies/MetFragR/metfRag") # only when using the R interface (not by default)
devtools::install_github('omegahat/RDCOMClient') # gives error message described above

minor bug in optimization DoE (attempting single parameter sweep)

I seem to have come across a minor bug:
Attempting to optimize feature finding or grouping for a single parameter, at least with xcms, seems to:

  1. not display properly, but instead print a range of -1 to 1
  2. the code seems to run with the actual input parameters as far as I can tell, but the optimization result is clearly off

Regards,
Thomas

input:

paramsPP <- generateFeatureOptPSet('xcms')
paramsPP$max_peakwidth <- 160
paramsPP$min_peakwidth <- 22.5
paramsPP$mzdiff <- 0.00125
paramsPP$ppm <- c(4,10)
paramsPP$snthresh <- 10
paramsPP$noise <- 1500
paramsPP$prefilter <- 5
paramsPP$value_of_prefilter <- 5000
paramsRG <- generateFGroupsOptPSet('xcms')
paramsRG$retcorArgs$gapInit <- 0.3
paramsRG$retcorArgs$gapExtend <- 2.4
paramsRG$retcorArgs$profStep <- 0.1
paramsRG$groupArgs$bw <- c(15,19)
paramsRG$groupArgs$mzwid <- c(0.005,0.015)
ftOpt <- optimizeFeatureFinding(anaInfo, 'xcms', paramsPP, maxIterations = 1)

output:

---
Design:
  run.order   ppm
1         1 -1.00
2         2 -0.75
3         3 -0.50
4         4 -0.25
5         5  0.00
6         6  0.25
7         7  0.50
8         8  0.75
9         9  1.00
---

  |========================================================================| 100%

---
Response:
   featureCount nonRP    RP      PPS experiment    score
1:        73010 38538 16007 6648.608          1 6648.608
2:        73169 38625 16048 6667.658          2 6667.658
3:        73333 38683 16071 6676.758          3 6676.758
4:        73423 38732 16092 6685.750          4 6685.750
5:        73540 38772 16090 6677.192          5 6677.192
6:        73671 38865 16093 6663.699          6 6663.699
7:        73840 38988 16059 6614.637          7 6614.637
8:        73826 39036 15998 6556.410          8 6556.410
9:        73518 38886 15958 6548.829          9 6548.829
---
Best params: ppm: 4; method: centWave; mzdiff: 0.00125; snthresh: 10; noise: 1500; prefilter: c(5, 5000); peakwidth: c(22.5, 160); 
Best results: featureCount: 73010; nonRP: 38538; RP: 16007; PPS: 6648.60784161088;
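
To make the mismatch easier to see, the coded design levels can be mapped back to ppm values. The sketch below assumes a linear coding in which -1 and 1 correspond to the lower and upper bound of the requested range c(4, 10); that is a common DoE convention, but it is an assumption here and not confirmed for this optimizer. The scores are copied from the output above.

```r
# Values copied from the report above.
ppm_range <- c(4, 10)
coded <- seq(-1, 1, by = 0.25)
score <- c(6648.608, 6667.658, 6676.758, 6685.750, 6677.192,
           6663.699, 6614.637, 6556.410, 6548.829)

# Assumed linear decoding: -1 maps to the lower bound, +1 to the upper bound.
centre <- mean(ppm_range)
half_width <- diff(ppm_range) / 2
ppm <- centre + coded * half_width

design <- data.frame(experiment = seq_along(coded), coded = coded,
                     ppm = ppm, score = score)
print(design)

# Experiment with the highest score and the ppm value it corresponds to.
best <- design[which.max(design$score), ]
print(best)
```

Under this assumption the highest score belongs to experiment 4, whereas the reported best parameters and results match experiment 1 (ppm 4, featureCount 73010), which is consistent with the observation that the optimization result is off.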

[JOSS Review] BUG: `groupFeaturesKPIC2` fails after `findFeaturesXCMS` if `loadRawData = TRUE`

I used patRoon via the docker image and used 12 profile LC-ESI+-MS1 mzML files, then ran feature detection using XCMS and tried to group features via KPIC2 and got the error listed in the log below.

After rerunning with centWave as explicit method and adding group information to the analysis I get the following error:

> groups_kpic2_retry <- patRoon::groupFeaturesKPIC2(features)
Grouping features with KPIC2... Error in rep(0, curdiff) : invalid 'times' argument

bug_report.txt

[JOSS Review] BUG: Building rstudio docker image fails

Building the docker image (exclusively for developers) fails in VSCode with the Docker extension with the error message given below. I have quite a vanilla docker and only enabled the docker-build tools which were required for this command. I've tried building the image with different base directories for executing the docker command but I get the same error. I'm likely doing something wrong, but I can't find any hints in the docs about it.

> Executing task: docker build --pull --rm -f "git/hechth/patRoon/docker/rstudio/Dockerfile" -t hechth/patroon:latest "git/hechth/patRoon/docker/rstudio" <

[+] Building 3.7s (12/12) FINISHED                                                                                                                                                      
 => [internal] load build definition from Dockerfile                                                                                                                               0.0s
 => => transferring dockerfile: 753B                                                                                                                                               0.0s
 => [internal] load .dockerignore                                                                                                                                                  0.1s
 => => transferring context: 2B                                                                                                                                                    0.0s
 => resolve image config for docker.io/docker/dockerfile:experimental                                                                                                              1.1s
 => [auth] docker/dockerfile:pull token for registry-1.docker.io                                                                                                                   0.0s
 => CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:600e5c62eedff338b3f7a0850beb7c05866e0ef27b2d2e8c02aa468e78496ff5                                         0.0s
 => [internal] load metadata for docker.io/patroonorg/patroondeps:latest                                                                                                           0.7s
 => [auth] patroonorg/patroondeps:pull token for registry-1.docker.io                                                                                                              0.0s
 => [internal] load build context                                                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                                    0.0s
 => [stage-0 1/3] FROM docker.io/patroonorg/patroondeps@sha256:88e3e5f63c2a52256a8502c82329439a66a5cec85c0d1b9f2fbad46b93013a36                                                    0.0s
 => CACHED [internal] settings cache mount permissions                                                                                                                             0.0s
 => CACHED [stage-0 2/3] ADD --chown=rstudio . patRoon                                                                                                                             0.0s
 => ERROR [stage-0 3/3] RUN --mount=type=cache,id=ccache,target=/home/rstudio/ccache,uid=1000,gid=1000 Rscript -e 'devtools::install(pkg = "patRoon", upgrade = FALSE)'            1.3s
------
 > [stage-0 3/3] RUN --mount=type=cache,id=ccache,target=/home/rstudio/ccache,uid=1000,gid=1000 Rscript -e 'devtools::install(pkg = "patRoon", upgrade = FALSE)':
#12 1.101 Error: Could not find package root. Is your working directory inside a package?
#12 1.101 Execution halted
------
executor failed running [/bin/sh -c Rscript -e 'devtools::install(pkg = "patRoon", upgrade = FALSE)']: exit code: 1
The terminal process "bash '-c', 'docker build --pull --rm -f "git/hechth/patRoon/docker/rstudio/Dockerfile" -t hechth/patroon:latest "git/hechth/patRoon/docker/rstudio"'" terminated with exit code: 1.

[JOSS Review] BUG: `generatePeakListsMzR` fails with feature groups from KPIC2

After creating the feature groups with KPIC2 I tried generating the peak lists and the method threw a bunch of errors, see the log file below.

Is this related to some parameters missing in the previous function call? Or could it be related to the size of the files/data? Each file is about 700MB, so quite large, and acquired in profile mode and converted without centroiding.

> plists <- patRoon::generateMSPeakListsMzR(groups_kpic2)
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201001_003-QCstd_POS_MU'...
  |================================================================================                                  |  70%2022-01-06T14:26:35.239850Z [rsession-rstudio] ERROR system error 2 (No such file or directory) [path: /home/rstudio/cache.sqlite-journal]; OCCURRED AT time_t rstudio::core::FilePath::getLastWriteTime() const src/cpp/shared_core/FilePath.cpp:1052; LOGGED FROM: time_t rstudio::core::FilePath::getLastWriteTime() const src/cpp/shared_core/FilePath.cpp:1052
  |==================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201001_098-QCstd_POS_MU'...
  |========                                                                                                          |   7%2022-01-06T14:28:28.379162Z [rsession-rstudio] ERROR system error 2 (No such file or directory) [path: /home/rstudio/cache.sqlite-journal]; OCCURRED AT uintmax_t rstudio::core::FilePath::getSize() const src/cpp/shared_core/FilePath.cpp:1126
  |==================================================                                                                |  44%2022-01-06T14:29:56.752441Z [rsession-rstudio] ERROR system error 2 (No such file or directory) [path: /home/rstudio/cache.sqlite-journal]; OCCURRED AT time_t rstudio::core::FilePath::getLastWriteTime() const src/cpp/shared_core/FilePath.cpp:1052; LOGGED FROM: time_t rstudio::core::FilePath::getLastWriteTime() const src/cpp/shared_core/FilePath.cpp:1052
  |==================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201009_003-QCstd_POS_MU'...
  |==================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201009_098-QCstd_POS_MU'...
  |==================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201016_003-QCstd_POS_MU'...
  |==================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201016_098-QCstd_POS_MU'...
  |==================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201021_003-QCstd_POS_MU'...
  |==================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201021_098-QCstd_POS_MU'...
  |==================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201202_003-QCstd_POS_MU'...
  |==================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201202_098-QCstd_POS_MU'...
  |===================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201209_003-QCstd_POS_MU'...
  |===================================================================================================================| 100%
Loading all MS peak lists for 3821 feature groups in analysis 'Tribrid_201209_090-QCstd_POS_MU'...
  |===================================================================================================================| 100%
Generating averaged peak lists for all feature groups...
  |===================================================================================================================| 100%
There were 50 or more warnings (use warnings() to see the first 50)

Issues using docker image

Hello,

First, I would like to thank you for all the effort in building this package. I think it integrates non-vendor metabolomics software in a sophisticated way.
I am using the latest Docker image of patRoon and its dependencies. I have experienced the following issues:

  1. When running genform

image

  2. When running reportHTML() I got the following error and the final report never opens in the browser.

image

Thanks again

patRoon does not contain enviMass

Please remove the enviMass web link from your github package description: the described "grouping of features" (= profiling of peaks of similar mass/RT over different files) has recently been confused by users with the componentization step of enviMass (which patRoon takes from our nontarget package); and even the extraction and grouping of features from enviMass can only partly be embedded in patRoon or other workflows. Thus, providing the impression that patRoon harmonizes or contains any version of enviMass is pretty misleading; the more as there are lots of enviMass features which cannot be embedded in other workflows out of their specific enviMass workflow context that easily and without risks. Thanks for your understanding!

findFeatures error

Hello Rick,

I seem to be having an issue with the findFeatures function when applying to my data. I had converted an Agilent .d file externally with MSConvert to mzML.

Upon attempting to find features I receive this error below:
findFeatures error

This only happens with my own files and not with example data. I have tried cutting the file down in size, etc, but have yet to resolve the issue. Any help is greatly appreciated.

Thanks, Rachel

fGroups issue

Hi,

I am getting this error:

fGroups <- groupFeatures(fList, "xcms3", rtalign = TRUE,
                         groupParam = xcms::PeakDensityParam(sampleGroups = analysisInfo(fList)$group,
                                                             bw = 10, minFraction = 0.5, minSamples = 1,
                                                             binSize = 0.01, maxFeatures = 50),
                         retAlignParam = xcms::ObiwarpParam(binSize = 1, centerSample = 10, response = 100,
                                                            distFun = "cor_opt", gapInit = 0.3,
                                                            gapExtend = 2.4, factorDiag = 2, factorGap = 1,
                                                            localAlignment = FALSE, initPenalty = 0))
Grouping features with XCMS...

Performing retention time alignment...
Sample number 10 used as center sample.
Aligning Blank1_HILIC_neg.mzML against 1093_neg.mzML ...
Error: BiocParallel errors
1 remote errors, element index: 1
4 unevaluated and other errors
first remote error: cannot open the connection
In addition: Warning messages:
1: In serialize(data, node$con) :
'package:stats' may not be available when loading
2: In serialize(data, node$con) :
'package:stats' may not be available when loading
3: In serialize(data, node$con) :
'package:stats' may not be available when loading
4: In serialize(data, node$con) :
'package:stats' may not be available when loading
5: In serialize(data, node$con) :
'package:stats' may not be available when loading
6: In serialize(data, node$con) :
'package:stats' may not be available when loading
7: In serialize(data, node$con) :
'package:stats' may not be available when loading
8: stop worker failed:
wrong args for environment subassignment

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
system code page: 65001

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] xcms_3.16.1 MSnbase_2.20.1 ProtGenerics_1.26.0 S4Vectors_0.32.3 mzR_2.28.0 Rcpp_1.0.7 Biobase_2.54.0 BiocGenerics_0.40.0
[9] BiocParallel_1.28.2 patRoon_1.2.0

loaded via a namespace (and not attached):
[1] bitops_1.0-7 matrixStats_0.61.0 bit64_4.0.5 doParallel_1.0.16 RColorBrewer_1.1-2
[6] GenomeInfoDb_1.30.0 backports_1.4.0 tools_4.1.2 utf8_1.2.2 R6_2.5.1
[11] affyio_1.64.0 DBI_1.1.1 colorspace_2.0-2 withr_2.4.3 tidyselect_1.1.1
[16] bit_4.0.4 compiler_4.1.2 MassSpecWavelet_1.60.0 preprocessCore_1.56.0 DelayedArray_0.20.0
[21] checkmate_2.0.0 scales_1.1.1 DEoptimR_1.0-9 robustbase_0.93-9 affy_1.72.0
[26] digest_0.6.29 XVector_0.34.0 pkgconfig_2.0.3 htmltools_0.5.2 fst_0.9.4
[31] MatrixGenerics_1.6.0 fastmap_1.1.0 limma_3.50.0 rlang_0.4.12 RSQLite_2.2.9
[36] impute_1.68.0 shiny_1.7.1 generics_0.1.1 mzID_1.32.0 dplyr_1.0.7
[41] RCurl_1.98-1.5 magrittr_2.0.1 GenomeInfoDbData_1.2.7 MALDIquant_1.20 Matrix_1.3-4
[46] munsell_0.5.0 fansi_0.5.0 MsCoreUtils_1.6.0 lifecycle_1.0.1 vsn_3.62.0
[51] MASS_7.3-54 SummarizedExperiment_1.24.0 zlibbioc_1.40.0 plyr_1.8.6 blob_1.2.2
[56] grid_4.1.2 parallel_4.1.2 promises_1.2.0.1 crayon_1.4.2 lattice_0.20-45
[61] MsFeatures_1.2.0 pillar_1.6.4 GenomicRanges_1.46.1 codetools_0.2-18 XML_3.99-0.8
[66] glue_1.5.1 pcaMethods_1.86.0 data.table_1.14.2 BiocManager_1.30.16 vctrs_0.3.8
[71] httpuv_1.6.3 Rdpack_2.1.3 foreach_1.5.1 gtable_0.3.0 RANN_2.6.1
[76] purrr_0.3.4 clue_0.3-60 assertthat_0.2.1 cachem_1.0.6 ggplot2_3.3.5
[81] mime_0.12 rbibutils_2.2.5 xtable_1.8-4 later_1.3.0 ncdf4_1.18
[86] snow_0.4-4 tibble_3.1.6 iterators_1.0.13 memoise_2.0.1 IRanges_2.28.0
[91] cluster_2.1.2 ellipsis_0.3.2

xcms3 optimizeFeatureFinding requires pre-loading library(xcms)

Hi, I've been using your library (with the xcms algorithm) for a while now and have appreciated the available workflows! I recently tried to "upgrade" my pipeline to the xcms3 algorithm, but my workflow fails at the feature finding optimization step.

Details (this is in R 4.0.3):

  • library(patRoon)
  • create analysisInfo (e.g. for sample data set)
  • paramsPP <- generateFeatureOptPSet('xcms3')
  • ftOpt <- optimizeFeatureFinding(anaInfo, 'xcms3', paramsPP)

Expected output:
optimized parameters

Actual output:
error in evaluating the argument 'X' in selecting a method for function 'bplapply': could not find function "featureData"

To fix:

  • library(xcms)
    [after loading patRoon]
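
The workaround above could be wrapped in a small guard that checks whether a package is attached before the optimization is started. This is only a sketch in base R: `is_attached` is a hypothetical helper, not part of patRoon or xcms, and the commented usage simply repeats the calls from this report.

```r
# Hypothetical helper (not part of patRoon or xcms): returns TRUE if `pkg`
# is attached, i.e. it appears on the search path as "package:<pkg>".
is_attached <- function(pkg) {
    paste0("package:", pkg) %in% search()
}

# The base package is always attached, so this works as a quick self-check.
stopifnot(is_attached("base"))

# Usage mirroring the report (assumes patRoon and xcms are installed and
# that anaInfo has already been created):
# library(patRoon)
# if (!is_attached("xcms")) library(xcms)  # attach xcms after patRoon
# paramsPP <- generateFeatureOptPSet("xcms3")
# ftOpt <- optimizeFeatureFinding(anaInfo, "xcms3", paramsPP)
```

The guard only automates the reported workaround; it does not explain why `featureData` is not found without it.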

Error: Failed to run genform

Hello,

I tried to run your package and it worked well until the formula calculation step.
I ran this command:

formulas <- generateFormulas(fGroupsopenms, "genform", plists, maxMzDev = 5, adduct = "[M+H]+", elements = "CHNOPSCl")

which gives me this error:

Error in errorHandler(cmd = commandQueue[[ci]], exitStatus = exitStatus, :
Failed to run command 'C:/Users/167160/Documents/R/R-3.5.3/library/patRoon/bin//x64/GenForm.exe' with args: 'exist oei noref dbe cm ion=[M+H]+ ppm=5.000000 el=CHNOPSCl het ms=C:\Users\167160\AppData\Local\Temp\Rtmp0EPclk\MSPList2a7c704718b1.txt m=736.136658 out=C:\Users\167160\AppData\Local\Temp\Rtmp0EPclk\formulas2a7c6265a70.txt'. Exit code: 1

So I tried to pass some additional options to the function (oei, dbe and cm), which have to be given as a list of character values, and ran:

formulas <- generateFormulas(fGroupsopenms, "genform", plists, adduct = "[M+H]+", elements = "CHNOPSCl", extraOpts = list(as.character("oei", "dbe" ,"cm")))

And this error appeared:

Error in generateFormulas(fGroupsopenms, "genform", plists, adduct = "[M+H]+", :
1 assertions failed:
Variable 'extraOpts': Vector must be named, but is NULL

When I run the same command with extraOpts = NULL, I get the first error message again.
Do you have any idea how to solve this issue? I looked for more information about genform but could not find how to write the extraOpts argument correctly.
Anyway, thank you for your work and thank you in advance for your answer,

Nina
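
The second error (the failed assertion) can be traced to plain R semantics, independent of GenForm. The following is a minimal base R sketch that assumes nothing about patRoon internals: `as.character()` converts only its first argument, so the extra strings are dropped, and `list()` without tags produces an unnamed list, which is what the "must be named" assertion complains about. The report does not state which names `extraOpts` expects, so `opt1` below is a placeholder and not a real option name.

```r
# as.character() converts only its first argument; the others are ignored.
opts <- as.character("oei", "dbe", "cm")
print(opts)
print(length(opts))

# list() without tags gives a list whose names() is NULL, which is what an
# assertion of the form "must be named, but is NULL" reports.
unnamed <- list(opts)
print(is.null(names(unnamed)))

# c() keeps all three strings in one character vector.
all_opts <- c("oei", "dbe", "cm")
print(all_opts)

# A list only carries names when its elements are tagged. "opt1" is a
# placeholder, not an option name known to patRoon or GenForm.
named <- list(opt1 = all_opts)
print(names(named))
```

Note that the failing command line in the first error already contains `oei`, `dbe` and `cm`, so this sketch only addresses the assertion failure, not the GenForm exit code.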
