Comments (26)
That reminds me of random errors we were getting in MSnbase (issues lgatto/MSnbase#151 and lgatto/MSnbase#138). The latter were clearly caused by memory-allocation problems in the C code (more specifically, the garbage collector freeing memory that was still in use by the C code). The former was caused by some as-yet-unresolved bug in mzR.
I'll look into this; because of its randomness I suspect memory problems, probably in some C code in the mzR package (used for reading the files).
Did you get the same when running it without parallel processing?
from xcms.
Indeed, I do get the same error without parallel processing, both for devel and prod 3.3.
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘findPeaks.centWave’ for signature ‘"missing"’
Calls: xcmsSet ... FUN -> do.call -> findPeaks.centWave ->
I notice that when parallel processing there are as many separate processes running as computational threads, so it is not surprising that without parallelization I get the same outcome.
I will try again (with parallelization) on my own Ubuntu machine - less power, but we will see what the outcome is - I don't know for sure that there isn't a hardware problem on the virtualization host that hosts the Windows and Debian guests where I have experienced the problem.
Update: I never could get mzR to install on my Ubuntu box (running R from a conda environment), whether from devel or prod 3.3. biocLite("mzR") always failed with:
** testing if installed package can be loaded
Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object '/zpdrone/drone/home/art/R/x86_64-pc-linux-gnu-library/3.3/mzR/libs/mzR.so':
/usr/lib/x86_64-linux-gnu/libhogweed.so.4: undefined symbol: __gmpn_cnd_sub_n
Error: loading failed
Execution halted
ERROR: loading failed
Searching on the web failed to turn up any useful alternatives.
from xcms.
I have implemented a new import function that uses (presumably) newer code from the mzR package to read MS files, i.e. it uses mzR::header and mzR::peaks to read the data.
Could you please check if this fixes your problems (I'll run tests in parallel too)?
To install:
library(devtools)
install_github("sneumann/xcms", ref = "xcms3")
library(xcms)
## Enable using the new code:
useOriginalCode(FALSE)
The useOriginalCode(FALSE) call is important, as only this enables the new code.
from xcms.
Johannes,
Please confirm with this log process-2012-resin-full-xcms3-multicore.log.txt that I ran the experiment as I should have; this was with MulticoreParam and ran with four forked processes.
Unfortunately, again, this worked on a small scale but was unsuccessful when I gave it many files to work with as four threads.
I will retry with SerialParam instead of MulticoreParam and report the results.
Update: after about 162 ROIs and Peaks I got:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘findPeaks.centWave’ for signature ‘"NULL"’
Calls: xcmsSet ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>
Execution halted
I will now try Steffen's suggestion of using a large set of netCDF files.
from xcms.
Hi, to check that it is not mzR I suggest trying a large set of netCDF files.
One can convert to netCDF with the code below. Caveat: only xcms can read this netCDF flavour; other tools complain. Yours, Steffen
library(xcms)
library(mtbls2)
data(mtbls2)
dummy <- sapply(filepaths(mtbls2Set), function(x) {
    ## Read each file and write it back out as netCDF under /tmp
    write.cdf(xcmsRaw(system.file(x, package = "mtbls2")),
              file = paste0("/tmp/", basename(x), ".cdf"))
})
from xcms.
Steffen,
The test passed:
netCDF-test.log.txt
from xcms.
Interesting.
For reading mzML files we can choose in mzR to use either the Ramp backend or the pwiz backend; I'll check both.
Just an update on the errors: on my files (about 350) I don't get any errors (neither with BioC 3.3 nor with BioC 3.4) when I use centWave's default settings. When I use your settings I get errors, but they are different from yours:
## For BioC 3.3:
Detecting features in file # 299 : 150616_POOL_IntraP_S_POS_9.mzML
Error in checkForRemoteErrors(val) :
one node produced an error: unknown problem while reading peaks
## For BioC 3.4:
Detecting chromatographic peaks ...
% finished: 20 40 60 80 100
32 Peaks.
[MSData::Spectrum::getMZIntensityPairs()] m/z array invalid size.
from xcms.
Then other checks would be to 1) repeat but use write.mzdata() to check other code paths in mzR, and 2) just read mzML data with mzR in parallel, completely eliminating xcms from the debugging. Yours, Steffen
from xcms.
I did read mzML files (1000 times the same file) in parallel and did not encounter any problems:
library(xcms)
library(BiocParallel)
f <- dir("/Users/jo/Desktop/", pattern = "mzML", full.names = TRUE)[1]
allf <- rep(f, 1000)
register(MulticoreParam(3, progressbar = TRUE), default = TRUE)
tmp <- bplapply(allf, function(z) {
    res <- xcmsRaw(z, profstep = 0)
    return(length(res@env$mz))
})
from xcms.
I encountered no problems with reading 432 files 48 times:
library(xcms)
library(BiocParallel)
cores <- 12
mcp <- MulticoreParam(cores, progressbar = TRUE)
show(mcp)
register(mcp, default = TRUE)
f <- dir("/home/art/storage/renata/resin-2012/centroided/resin/",
         pattern = "mzML", full.names = TRUE)[1]
allf <- rep(f, cores*4)
tmp <- bplapply(allf, function(z) {
    res <- xcmsRaw(z, profstep = 0)
    return(length(res@env$mz))
})
from xcms.
I now also have trouble reproducing your error. @eschen42, could you possibly share one of the files on which you got the errors? My files might have too low a resolution or too broad peaks...
Interestingly, I do get errors (but different ones from those you reported, see below) when running centWave with your settings, but with noise = 0 I don't get them. I would now like to dig a little deeper into the centWave code.
32 Peaks.
[MSData::Spectrum::getMZIntensityPairs()] m/z array invalid size.
Error in rampSIPeaks(rampid, scans, scanHeaders$peaksCount[scans]) :
unknown problem while reading peaks
from xcms.
Johannes,
I will email you the details.
I just reproduced this on another physical machine with an old version in Galaxy (running in Docker), so it doesn't seem to be limited to the hardware that I was using before:
Fatal error: Exit code 1 ()
arguments 'minimized' and 'invisible' are for Windows only
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘findPeaks.centWave’ for signature ‘"missing"’
Calls: do.call ... FUN -> do.call -> findPeaks.centWave -> <Anonymous>
Execution halted
Code called that produced this error (line breaks added; GALAXY_SLOTS was equal to 1):
Rscript \
/shed_tools/toolshed.g2.bx.psu.edu/repos/lecorguille/xcms_xcmsset/69eb0fc05837/xcms_xcmsset/xcms.r \
zipfile /export/galaxy-central/database/files/000/dataset_12.dat \
xfunction xcmsSet \
xsetRdataOutput /export/galaxy-central/database/files/000/dataset_13.dat \
sampleMetadataOutput /export/galaxy-central/database/files/000/dataset_14.dat \
ticspdf /export/galaxy-central/database/files/000/dataset_15.dat \
bicspdf /export/galaxy-central/database/files/000/dataset_16.dat \
nSlaves ${GALAXY_SLOTS:-1} \
method centWave \
ppm 2 \
peakwidth "c(2.5,9)" \
scanrange "c(1184,6046)" \
mzdiff -0.001 \
snthresh 10 \
integrate 1 \
noise 10000 \
prefilter "c(3,100)" ; \
return=$?; \
mv log.txt /export/galaxy-central/database/files/000/dataset_17.dat; \
cat /export/galaxy-central/database/files/000/dataset_17.dat; \
sh -c "exit $return"
Package versions used
PACKAGE INFO
parallel 3.2.2
BiocGenerics 0.18.0
Biobase 2.32.0
Rcpp 0.12.2
mzR 2.4.1
xcms 1.46.0
snow 0.4.1
batch 1.1.4
from xcms.
OMG!!!
In reviewing the files that Johannes requested, I discovered that I had centroided some of the bad reads from the instrument, i.e., the files were very much smaller. Perhaps these small files can confuse xcmsSet, because, when I weeded them out before running xcmsSet, it didn't abort!
So, perhaps there's an opportunity to check for signature ‘"missing"’ in the code.
I will repeat this a few more times to confirm whether small files were the trigger for this behavior.
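One way to weed out suspiciously small files before calling xcmsSet might look like the sketch below. It uses base R only; the helper name drop_small_files and the cutoff (10% of the median file size) are assumptions made for illustration and are not part of xcms. The demo runs on temporary dummy files rather than real mzML data.

```r
## Hypothetical helper (not part of xcms): separate file paths that look
## normal from those much smaller than the typical file in the set.
## The cutoff (a fraction of the median size) is an arbitrary assumption.
drop_small_files <- function(paths, min_fraction = 0.1) {
    sizes <- file.size(paths)                 # bytes; NA if a file is missing
    cutoff <- min_fraction * median(sizes, na.rm = TRUE)
    too_small <- is.na(sizes) | sizes < cutoff
    if (any(too_small))
        warning("Skipping ", sum(too_small), " small or missing file(s): ",
                paste(basename(paths[too_small]), collapse = ", "))
    list(keep = paths[!too_small], skipped = paths[too_small])
}

## Demo on temporary dummy files standing in for mzML files.
tmp_dir <- tempfile("mzml_demo_")
dir.create(tmp_dir)
demo_files <- file.path(tmp_dir, c("a.mzML", "b.mzML", "c.mzML", "bad.mzML"))
for (f in demo_files[1:3]) writeLines(rep("x", 1000), f)
writeLines("x", demo_files[4])                # stands in for a bad read
checked <- suppressWarnings(drop_small_files(demo_files))
basename(checked$skipped)
```

The paths in checked$keep could then be passed on to xcmsSet, while checked$skipped gives a list of files to inspect by hand.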
from xcms.
Even then we should have better error messages. Yours Steffen
I blame Android for the brevity and typos
from xcms.
@eschen42 I've now tried to run your files (excluding the small ones) with your centWave settings using BioC 3.3 (all 423 files): no error. Were these the files you were having problems with?
Running just the small ones with
myset <- xcmsSet(files = paste0(the_path, smalls), method="centWave",
ppm=2.5, peakwidth=c(2.5,9), mzdiff=-0.001,
noise=1e5, snthresh=10, nSlaves = 3)
I do get the following error(s):
Starting snow cluster with 3 local sockets.
Detecting features in file # 1 : 234.mzML
Detecting features in file # 2 : 234_130325162902.mzML
Detecting features in file # 3 : 235.mzML
Detecting features in file # 4 : 240.mzML
Detecting features in file # 5 : 250.mzML
Detecting features in file # 6 : 250_130325165102.mzML
Detecting features in file # 7 : 261.mzML
Detecting features in file # 8 : 263.mzML
Detecting features in file # 9 : 264.mzML
Detecting features in file # 10 : 275.mzML
Detecting features in file # 11 : 283.mzML
Detecting features in file # 12 : 287.mzML
Detecting features in file # 13 : 294.mzML
Detecting features in file # 14 : 319.mzML
Detecting features in file # 15 : 320.mzML
Detecting features in file # 16 : 322.mzML
Detecting features in file # 17 : 325.mzML
Detecting features in file # 18 : 335.mzML
Detecting features in file # 19 : 339.mzML
Detecting features in file # 20 : 343.mzML
Detecting features in file # 21 : 348.mzML
Detecting features in file # 22 : 81.mzML
Detecting features in file # 23 : Blank.mzML
Error in checkForRemoteErrors(val) :
19 nodes produced errors; first error: No scales ? Please check peak width!
That error (which looks to me more related to the centWave settings or to the small files) is, however, not the one you reported before (unable to find an inherited method for function 'findPeaks.centWave' for signature '"NULL"'), which to me sounded as if mzR was not returning the correct data and xcmsRaw did not return a valid xcmsRaw object.
from xcms.
Johannes, Thank you. Yes, I did succeed with the larger files.
I found that processing 261.mzML by itself did give the error that I reported:
Fatal error: Exit code 1 ()
arguments 'minimized' and 'invisible' are for Windows only
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘findPeaks.centWave’ for signature ‘"missing"’
Calls: do.call ... FUN -> do.call -> findPeaks.centWave -> <Anonymous>
Execution halted
from xcms.
Hi, Can you read 261 with xcmsRaw() and show us the output?
I blame Android for the brevity and typos
from xcms.
Hi,
answering myself, 261 looks very strange:
on average one scan per 15 seconds ?
Yours, Steffen
> xr
An "xcmsRaw" object with 48 mass spectra
Time range: 8-752.2 seconds (0.1-12.5 minutes)
Mass range: 201.7885-875.5236 m/z
Intensity range: 2121330-874077000
MSn data on 0 mass(es)
with 0 MSn spectra
Profile method: bin
Profile step: 1 m/z (675 grid points from 202 to 876 m/z)
Memory usage: 0.752 MB
from xcms.
Yes it was a bad read.
I am leaving the issue open pending a determination of whether it's appropriate to fail the entire dataset when a bad file is supplied, rather than produce the results of processing the good spectra and give a warning.
from xcms.
@sneumann what do you think of that idea? In principle it should be fairly simple to put the feature detection call inside a tryCatch.
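As a rough illustration of the tryCatch idea, the base R sketch below runs a per-file function and records the error message instead of aborting. The names detect_all and fake_detect are placeholders invented for this example; fake_detect only simulates a failing file and is not the xcms feature detection code.

```r
## Sketch: call a per-file function inside tryCatch so that one bad file
## does not abort the whole run. `detect_fun` stands in for the real
## feature detection call (placeholder, not an xcms function).
detect_all <- function(files, detect_fun) {
    results <- lapply(files, function(f) {
        tryCatch(
            list(file = f, peaks = detect_fun(f), error = NULL),
            error = function(e)
                list(file = f, peaks = NULL, error = conditionMessage(e))
        )
    })
    failed <- !vapply(results, function(r) is.null(r$error), logical(1))
    if (any(failed))
        warning(sum(failed), " file(s) failed; see the 'error' entries")
    results
}

## Simulated detection: fails for one file name, returns a dummy count otherwise.
fake_detect <- function(f) {
    if (grepl("261", f)) stop("No scales ? Please check peak width!")
    42L
}

res <- suppressWarnings(detect_all(c("123.mzML", "261.mzML"), fake_detect))
res[[2]]$error
```

Results for the good files are kept, and the failures are summarised in a single warning at the end.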
from xcms.
In commit 5313459 (xcms3 branch) I've added the argument stopOnError to the xcmsSet function. Setting that option to FALSE will perform the feature detection on all provided files without stopping on any errors; errors are reported at the end as warnings.
from xcms.
I was also thinking about this, and warnings() are quite transient.
What about a slot in xcmsSet that remembers for each filePaths(xs)
or xcmsSource, if peak picking succeeded ? And some way to keep the
error message if possible. This could be shown in show.xcmsSet()
Yours, Steffen
from xcms.
Good point. As you suggest, a solution would be to have an (internal) slot .processHistory that keeps track of e.g. the feature detection per file and whether it caused an error or not (keeping the error message in the latter case). It would be nice to keep that general and open, so that other processing steps could be put in there too.
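To make the idea of a general, open processing history concrete, here is a minimal base R sketch with one row per processing step and file. It is not the xcms implementation; new_history, add_step and history_errors are hypothetical names chosen for this example.

```r
## Hypothetical sketch of a general processing history (not the xcms
## implementation): one row per processing step and file.
new_history <- function() {
    data.frame(step = character(), file = character(),
               ok = logical(), message = character(),
               stringsAsFactors = FALSE)
}

## Append one entry; `message` keeps the error text when a step failed.
add_step <- function(history, step, file, ok = TRUE,
                     message = NA_character_) {
    rbind(history,
          data.frame(step = step, file = file, ok = ok, message = message,
                     stringsAsFactors = FALSE))
}

## Return only the entries that recorded an error.
history_errors <- function(history) history[!history$ok, , drop = FALSE]

h <- new_history()
h <- add_step(h, "feature detection", "123.mzML")
h <- add_step(h, "feature detection", "261.mzML", ok = FALSE,
              message = "No scales ? Please check peak width!")
history_errors(h)$file
```

Because the step name is free text, later steps (for example grouping or retention time correction) could be logged in the same table.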
from xcms.
OK, the basic functionality to keep track of processing steps has been included (commit 5a50586 in xcms3 branch). This includes:
- showError method on xcmsSet objects that lists all errors/files on which the feature detection failed (given that stopOnError = FALSE in the xcmsSet function).
- show method for xcmsSet objects lists any errors.
Example (with files from @eschen42 ):
> library(xcms)
> mzfs <- paste0("../../local_data/mzML-files/",
+ c("123.mzML", "261.mzML", "263.mzML"))
> suppressWarnings(
+ res <- xcmsSet(mzfs, method = "centWave", ppm = 2.5,
+ peakwidth = c(2.5, 9), mzdiff = -0.001,
+ snthresh = 10, stopOnError = FALSE)
+ )
> show(res)
An "xcmsSet" object with 3 samples
Time range: 10.7-1228.8 seconds (0.2-20.5 minutes)
Mass range: 200.8049-988.1491 m/z
Peaks: 10647 (about 3549 per sample)
Peak Groups: 0
Sample classes: mzML-files
Feature detection:
o Peak picking performed on MS1.
o Detection errors: 2 files failed.
Use method 'showError' to list the error(s).
Profile settings: method = bin
step = 0.1
Memory usage: 1.28 MB
>
> showError(res)
[[1]]
[1] "Error identifying features in '261.mzML': Error in .local(object, ...): No scales ? Please check peak width!\n\n"
[[2]]
[1] "Error identifying features in '263.mzML': Error in .local(object, ...): No scales ? Please check peak width!\n\n"
from xcms.
This work sounds awesome. Thank you!!
On 09/19/2016 07:08 AM, Johannes Rainer wrote:
good point. As you suggest, a solution would be to have an (internal)
slot |.processHistory| that keeps track of e.g. the feature detection
per file and whether it caused an error or not (keeping the error
message in the latter case). Would be nice to keep that general and
open, so that also other processing steps could be put in there too.
Art Eschenlauer
from xcms.
Closing this issue for now. The original error could well be related to some nasty internal problem in the C code of mzR (or actually in the C libraries within it). From time to time this error also shows up in MSnbase's readMSData method.
from xcms.