
Comments (26)

jorainer avatar jorainer commented on June 8, 2024

That reminds me of random errors we were getting in MSnbase (issues lgatto/MSnbase#151 and lgatto/MSnbase#138). The latter were clearly caused by memory-allocation problems in the C code (more specifically, the garbage collector freeing memory that was still used by the C code). The former were caused by an as-yet-unresolved bug in mzR.

I'll look into this; because of its randomness I suspect memory problems, most likely in some C code in the mzR package (used for reading the files).

Do you get the same error when running it without parallel processing?

from xcms.

eschen42 avatar eschen42 commented on June 8, 2024

Indeed, I do get the same error without parallel processing, both for devel and prod 3.3.

Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘findPeaks.centWave’ for signature ‘"missing"’
Calls: xcmsSet ... FUN -> do.call -> findPeaks.centWave ->

I notice that when parallel processing there are as many separate processes running as computational threads, so it is not surprising that without parallelization I get the same outcome.

I will try again (with parallelization) on my own Ubuntu machine - less power, but we will see what the outcome is - I don't know for sure that there isn't a hardware problem on the virtualization host that hosts the Windows and Debian guests where I have experienced the problem.

Update: I never could get mzR to install on my Ubuntu box (running R from a conda environment), whether from devel or prod 3.3. biocLite("mzR") always failed with:

** testing if installed package can be loaded
Error in dyn.load(file, DLLpath = DLLpath, ...) : 
  unable to load shared object '/zpdrone/drone/home/art/R/x86_64-pc-linux-gnu-library/3.3/mzR/libs/mzR.so':
  /usr/lib/x86_64-linux-gnu/libhogweed.so.4: undefined symbol: __gmpn_cnd_sub_n
Error: loading failed
Execution halted
ERROR: loading failed

Searching on the web failed to turn up any useful alternatives.


jorainer avatar jorainer commented on June 8, 2024

I have implemented a new import function that uses (presumably) newer code from the mzR package to read MS files, i.e. uses mzR::header and mzR::peaks to read the data.
Could you please check if this fixes your problems (I'll run tests in parallel too)?

To install:

library(devtools)
install_github("sneumann/xcms", ref = "xcms3")

library(xcms)
## Enable using the new code:
useOriginalCode(FALSE)

The useOriginalCode(FALSE) call is important, as only this enables the new code.


eschen42 avatar eschen42 commented on June 8, 2024

Johannes,

Please confirm from this log (process-2012-resin-full-xcms3-multicore.log.txt) that I ran the experiment as I should have; this was with MulticoreParam, running four forked processes.

Unfortunately, again, this worked on a small scale but failed when I gave it many files to process with four forked processes.

I will retry with SerialParam instead of MulticoreParam and report the results.

Update: after about 162 ROIs and Peaks I got:

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘findPeaks.centWave’ for signature ‘"NULL"’
Calls: xcmsSet ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>
Execution halted

I will now try Steffen's suggestion to try a large set of netCDF files.


sneumann avatar sneumann commented on June 8, 2024

Hi, to check that it is not mzR, I suggest trying a large set of netCDF files.
One can convert to netCDF with the code below. Caveat: only xcms can read this netCDF flavour;
other tools complain. Yours, Steffen

library(xcms)
library(mtbls2)
data(mtbls2)

## Read each mtbls2 file as xcmsRaw and write it to /tmp as netCDF
dummy <- sapply(filepaths(mtbls2Set), function(x) {
    write.cdf(xcmsRaw(system.file(x, package = "mtbls2")),
              file = paste0("/tmp/", basename(x), ".cdf"))
})


eschen42 avatar eschen42 commented on June 8, 2024

Steffen,

The test passed:
netCDF-test.log.txt


jorainer avatar jorainer commented on June 8, 2024

Interesting.
For reading mzML files, mzR lets us choose either the Ramp backend or the pwiz backend; I'll check both.

Just an update on the errors: with my files (about 350) I don't get any errors (with either BioC 3.3 or BioC 3.4) when I use centWave's default settings. When I use your settings I do get errors, but they are different from yours:

## For BioC 3.3:
  Detecting features in file # 299 : 150616_POOL_IntraP_S_POS_9.mzML
  Error in checkForRemoteErrors(val) :
  one node produced an error: unknown problem while reading peaks

## For BioC 3.4:
  Detecting chromatographic peaks ... 
   % finished: 20 40 60 80 100 
   32  Peaks.
  [MSData::Spectrum::getMZIntensityPairs()] m/z array invalid size.


sneumann avatar sneumann commented on June 8, 2024

Other checks would then be to 1) repeat the test but use write.mzdata() to exercise other code paths in mzR, and 2) just read mzML data with mzR in parallel, completely eliminating xcms from the debugging. Yours, Steffen


jorainer avatar jorainer commented on June 8, 2024

I did read mzML files (1000 times the same file) in parallel and did not encounter any problems:

library(xcms)
library(BiocParallel)

f <- dir("/Users/jo/Desktop/", pattern = "mzML", full.names = TRUE)[1]
allf <- rep(f, 1000)
register(MulticoreParam(3, progressbar = TRUE), default = TRUE)
tmp <- bplapply(allf, function(z) {
    res <- xcmsRaw(z, profstep = 0)
    return(length(res@env$mz))
})


eschen42 avatar eschen42 commented on June 8, 2024

I encountered no problems with reading 432 files 48 times:

library(xcms)
library(BiocParallel)
cores <- 12
mcp <- MulticoreParam(cores, progressbar = TRUE)
show(mcp)
register(mcp, default = TRUE)

f <- dir("/home/art/storage/renata/resin-2012/centroided/resin/"
        , pattern = "mzML", full.names = TRUE)[1]
allf <- rep(f, cores*4)
tmp <- bplapply(allf, function(z) {
    res <- xcmsRaw(z, profstep = 0)
    return(length(res@env$mz))
})


jorainer avatar jorainer commented on June 8, 2024

I am now also having trouble reproducing your error. @eschen42 could you possibly share one of the files on which you got the errors? My files might have too low a resolution or too broad peaks...

Interestingly, I do get errors (but different ones from those you reported; see below) when running centWave with your settings, but with noise = 0 I don't get them. I would now like to dig a little deeper into the centWave code.

     32  Peaks.
    [MSData::Spectrum::getMZIntensityPairs()] m/z array invalid size.
    Error in rampSIPeaks(rampid, scans, scanHeaders$peaksCount[scans]) :
      unknown problem while reading peaks


eschen42 avatar eschen42 commented on June 8, 2024

Johannes,

I will email you the details.

I just reproduced this on another physical machine with an old version in Galaxy (running in Docker), so it doesn't seem to be limited to the hardware that I was using before:

Fatal error: Exit code 1 ()
arguments 'minimized' and 'invisible' are for Windows only
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘findPeaks.centWave’ for signature ‘"missing"’
Calls: do.call ... FUN -> do.call -> findPeaks.centWave -> <Anonymous>
Execution halted

Code called that produced this error (line breaks added; GALAXY_SLOTS was equal to 1):

Rscript \
/shed_tools/toolshed.g2.bx.psu.edu/repos/lecorguille/xcms_xcmsset/69eb0fc05837/xcms_xcmsset/xcms.r \
zipfile /export/galaxy-central/database/files/000/dataset_12.dat \
xfunction xcmsSet \
xsetRdataOutput /export/galaxy-central/database/files/000/dataset_13.dat \
sampleMetadataOutput /export/galaxy-central/database/files/000/dataset_14.dat \
ticspdf /export/galaxy-central/database/files/000/dataset_15.dat \
bicspdf /export/galaxy-central/database/files/000/dataset_16.dat \
nSlaves ${GALAXY_SLOTS:-1} \
method centWave \
ppm 2 \
peakwidth "c(2.5,9)" \
scanrange "c(1184,6046)" \
mzdiff -0.001 \
snthresh 10 \
integrate 1 \
noise 10000 \
prefilter "c(3,100)" ; \
return=$?; \
mv log.txt /export/galaxy-central/database/files/000/dataset_17.dat; \
cat /export/galaxy-central/database/files/000/dataset_17.dat; \
sh -c "exit $return"

Package versions used

PACKAGE INFO
parallel    3.2.2
BiocGenerics    0.18.0
Biobase 2.32.0
Rcpp    0.12.2
mzR 2.4.1
xcms    1.46.0
snow    0.4.1
batch   1.1.4


eschen42 avatar eschen42 commented on June 8, 2024

OMG!!!

In reviewing the files that Johannes requested, I discovered that I had centroided some of the bad reads from the instrument, i.e., the files were very much smaller. Perhaps these small files can confuse xcmsSet, because, when I weeded them out before running xcmsSet, it didn't abort!

So, perhaps there's an opportunity to check for signature ‘"missing"’ in the code.

I will repeat this a few more times to confirm whether small files were the trigger for this behavior.
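
A pre-flight check along these lines could weed out suspiciously small files before they reach xcmsSet. This is only a minimal sketch in base R: the helper name flag_small_files and the cutoff of 10% of the median file size are assumptions made for illustration, not anything xcms provides, and temporary files stand in for real mzML files.

```r
# Hypothetical helper: split files into plausible and suspiciously small ones.
# The cutoff (10% of the median size) is an assumption, not an xcms default.
flag_small_files <- function(paths, min_fraction = 0.1) {
    sizes <- file.size(paths)                      # NA for missing files
    cutoff <- min_fraction * median(sizes, na.rm = TRUE)
    too_small <- is.na(sizes) | sizes < cutoff
    list(keep = paths[!too_small], suspect = paths[too_small])
}

# Demo with temporary files standing in for mzML files
tmp <- tempfile(c("good1_", "good2_", "bad_"), fileext = ".mzML")
writeLines(strrep("x", 10000), tmp[1])
writeLines(strrep("x", 12000), tmp[2])
writeLines("x", tmp[3])

checked <- flag_small_files(tmp)
basename(checked$suspect)   # the tiny "bad_" file
```

Only checked$keep would then be passed on for peak detection, and checked$suspect could be reported to the user.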


sneumann avatar sneumann commented on June 8, 2024

Even then we should have better error messages. Yours Steffen


I blame Android for the brevity and typos


jorainer avatar jorainer commented on June 8, 2024

@eschen42 I've now run your files (excluding the small ones) with your centWave settings using BioC 3.3 (all 423 files) and got no error. Were these the files you were having problems with?
Running just the small ones with

myset <- xcmsSet(files = paste0(the_path, smalls), method="centWave",
                 ppm=2.5, peakwidth=c(2.5,9), mzdiff=-0.001,
                 noise=1e5, snthresh=10, nSlaves = 3)

I do get the following error(s):

Starting snow cluster with 3 local sockets.
Detecting features in file # 1 : 234.mzML 
Detecting features in file # 2 : 234_130325162902.mzML 
Detecting features in file # 3 : 235.mzML 
Detecting features in file # 4 : 240.mzML 
Detecting features in file # 5 : 250.mzML 
Detecting features in file # 6 : 250_130325165102.mzML 
Detecting features in file # 7 : 261.mzML 
Detecting features in file # 8 : 263.mzML 
Detecting features in file # 9 : 264.mzML 
Detecting features in file # 10 : 275.mzML 
Detecting features in file # 11 : 283.mzML 
Detecting features in file # 12 : 287.mzML 
Detecting features in file # 13 : 294.mzML 
Detecting features in file # 14 : 319.mzML 
Detecting features in file # 15 : 320.mzML 
Detecting features in file # 16 : 322.mzML 
Detecting features in file # 17 : 325.mzML 
Detecting features in file # 18 : 335.mzML 
Detecting features in file # 19 : 339.mzML 
Detecting features in file # 20 : 343.mzML 
Detecting features in file # 21 : 348.mzML 
Detecting features in file # 22 : 81.mzML 
Detecting features in file # 23 : Blank.mzML 

Error in checkForRemoteErrors(val) : 
  19 nodes produced errors; first error: No scales ? Please check peak width!

That error, which looks to me more related to the centWave settings or to the small files, is however not the one you reported before (unable to find an inherited method for function 'findPeaks.centWave' for signature '"NULL"'). That one sounded to me as if mzR was not returning the correct data, so that xcmsRaw did not return a valid xcmsRaw object.


eschen42 avatar eschen42 commented on June 8, 2024

Johannes, Thank you. Yes, I did succeed with the larger files.

I found that processing 261.mzML by itself did give the error that I reported:

Fatal error: Exit code 1 ()
arguments 'minimized' and 'invisible' are for Windows only
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘findPeaks.centWave’ for signature ‘"missing"’
Calls: do.call ... FUN -> do.call -> findPeaks.centWave -> <Anonymous>
Execution halted


sneumann avatar sneumann commented on June 8, 2024

Hi, Can you read 261 with xcmsRaw() and show us the output?


I blame Android for the brevity and typos


sneumann avatar sneumann commented on June 8, 2024

Hi,
answering myself, 261 looks very strange:
on average one scan per 15 seconds ?
Yours, Steffen

> xr
An "xcmsRaw" object with 48 mass spectra

Time range: 8-752.2 seconds (0.1-12.5 minutes)
Mass range: 201.7885-875.5236 m/z
Intensity range: 2121330-874077000 

MSn data on  0  mass(es)
    with  0  MSn spectra
Profile method: bin 
Profile step: 1 m/z (675 grid points from 202 to 876 m/z)

Memory usage: 0.752 MB

screenshot from 2016-09-11 11-27-11


eschen42 avatar eschen42 commented on June 8, 2024

Yes it was a bad read.

I am leaving the issue open pending a decision on whether it's appropriate to fail the entire dataset when a bad file is supplied, rather than produce the results of processing the good spectra and give a warning.


jorainer avatar jorainer commented on June 8, 2024

@sneumann what do you think of that idea? In principle it should be fairly simple to put the feature detection call inside a tryCatch.
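
To make the idea concrete, here is a rough sketch of the pattern in base R. It is not the xcms implementation: detect_features is a hypothetical stand-in for the per-file feature detection (it simply fails for any file whose name contains "bad"), and run_all and stop_on_error are names invented for this example.

```r
# Hypothetical stand-in for the per-file feature detection step;
# it fails for files whose name contains "bad".
detect_features <- function(f) {
    if (grepl("bad", f)) stop("No scales ? Please check peak width!")
    data.frame(file = f, n_peaks = 3L)
}

run_all <- function(files, stop_on_error = TRUE) {
    res <- lapply(files, function(f) {
        tryCatch(detect_features(f), error = function(e) {
            if (stop_on_error) stop(e)
            e   # return the condition so it can be inspected later
        })
    })
    names(res) <- files
    failed <- vapply(res, inherits, logical(1), what = "error")
    if (any(failed))
        warning(sum(failed), " file(s) failed: ",
                paste(files[failed], collapse = ", "))
    list(results = res[!failed], errors = res[failed])
}

out <- suppressWarnings(run_all(c("a.mzML", "bad.mzML", "c.mzML"),
                                stop_on_error = FALSE))
names(out$errors)                   # files that failed
conditionMessage(out$errors[[1]])   # the original error message
```

With stop_on_error = TRUE the first failure is re-raised, which matches the current all-or-nothing behaviour.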


jorainer avatar jorainer commented on June 8, 2024

In commit 5313459 (xcms3 branch) I've added the argument stopOnError to the xcmsSet function. Setting it to FALSE performs the feature detection on all provided files without stopping on errors; any errors are reported at the end as warnings.


sneumann avatar sneumann commented on June 8, 2024

I was also thinking about this, and warnings() are quite transient.
What about a slot in xcmsSet that remembers for each filePaths(xs)
or xcmsSource, if peak picking succeeded ? And some way to keep the
error message if possible. This could be shown in show.xcmsSet()
Yours, Steffen


jorainer avatar jorainer commented on June 8, 2024

Good point. As you suggest, one solution would be an (internal) slot .processHistory that keeps track of, e.g., the feature detection per file and whether it caused an error (keeping the error message if it did). It would be nice to keep that general and open, so that other processing steps could be recorded there too.
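
As a very rough sketch of what such a record could look like (the class name ProcessRecord, its slots, and the helpers new_record and has_error are all hypothetical and chosen only for illustration; the actual design may well differ):

```r
library(methods)

# Hypothetical record type; class and slot names are illustrative only.
setClass("ProcessRecord",
         representation(step = "character", file = "character",
                        date = "character", error = "character"))

new_record <- function(step, file, error = NA_character_) {
    new("ProcessRecord", step = step, file = file,
        date = format(Sys.time()), error = error)
}

# TRUE for records that carry an error message
has_error <- function(rec) !is.na(rec@error)

# A list of records plays the role of the processing history
history <- list(
    new_record("feature detection", "123.mzML"),
    new_record("feature detection", "261.mzML",
               error = "No scales ? Please check peak width!")
)

failed <- Filter(has_error, history)
vapply(failed, function(r) r@file, character(1))   # files that failed
```

Because the step is just a character label, the same record type could hold other processing steps as well.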


jorainer avatar jorainer commented on June 8, 2024

OK, the basic functionality to keep track of processing steps has been included (commit 5a50586 in xcms3 branch). This includes:

  • A showError method for xcmsSet objects that lists all errors/files on which the feature detection failed (given that stopOnError = FALSE was set in the xcmsSet call).
  • The show method for xcmsSet objects now reports any errors.

Example (with files from @eschen42 ):

> library(xcms)
> mzfs <- paste0("../../local_data/mzML-files/",
+                c("123.mzML", "261.mzML", "263.mzML"))
> suppressWarnings(
+     res <- xcmsSet(mzfs, method = "centWave", ppm = 2.5,
+                    peakwidth = c(2.5, 9), mzdiff = -0.001,
+                    snthresh = 10, stopOnError = FALSE)
+ )
> show(res)
An "xcmsSet" object with 3 samples

Time range: 10.7-1228.8 seconds (0.2-20.5 minutes)
Mass range: 200.8049-988.1491 m/z
Peaks: 10647 (about 3549 per sample)
Peak Groups: 0 
Sample classes: mzML-files 

Feature detection:
 o Peak picking performed on MS1.
 o Detection errors: 2 files failed.
   Use method 'showError' to list the error(s).

Profile settings: method = bin
                  step = 0.1

Memory usage: 1.28 MB
>
> showError(res)
[[1]]
[1] "Error identifying features in '261.mzML': Error in .local(object, ...): No scales ? Please check peak width!\n\n"

[[2]]
[1] "Error identifying features in '263.mzML': Error in .local(object, ...): No scales ? Please check peak width!\n\n"


eschen42 avatar eschen42 commented on June 8, 2024

This work sounds awesome. Thank you!!

Art Eschenlauer

jorainer avatar jorainer commented on June 8, 2024

Closing this issue for now. The original error could well be related to some nasty internal problem in the C code of mzR (or actually to C libraries within). From time to time this error also shows up in MSnbase's readMSData method.
