wkumler / RaMS
R-based access to Mass-Spectrometry data
License: Other
Checklist for RaMS version 1.4 below:
Realized today that m/z group construction could be done with a 1D density-based clustering algorithm like DBSCAN or OPTICS. The perk would be that the "hard" m/z window currently used by mz_group would be relaxed and could instead be determined in a more data-driven way.
There's a paper about this exact idea: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3982975/ and they talk about reducing the computational constraints through some clever preprocessing - necessary, because the current implementation takes a long while for even just 6 files.
Quick proof-of-concept:
library(RaMS)
library(dbscan)
library(ggplot2)
library(magrittr) # for the %>% pipe
ms_filedir <- system.file("extdata", package="RaMS")
ms_files <- list.files(ms_filedir, pattern="LB.*mzML", full.names=TRUE)
msdata <- grabMSdata(ms_files)
# Cluster on the m/z dimension alone
mz_groups <- dbscan(msdata$MS1[,"mz"], eps = 0.0001, minPts = 100)
msdata$MS1$mz_group <- mz_groups$cluster
# Color points by their assigned m/z group
msdata$MS1[mz%between%c(110, 130)] %>%
  ggplot() +
  geom_point(aes(x=rt, y=mz, color=factor(mz_group)))
Hi @wkumler,
Thanks for the nice package, it is very user-friendly and fast.
I am wondering if you could include a function to grab ion injection time?
I understand that such a parameter is not universally present for all types of MS files, but it could be useful for some, e.g., files generated by Orbitraps.
Thanks again.
Dong
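For reference, a hedged sketch of where this value lives when the file records it: in mzML, ion injection time is a cvParam with PSI-MS accession MS:1000927, so it can be pulled with xml2 directly. The file path below is a placeholder and this is not RaMS API, just a proof of concept.

```r
library(xml2)
# "msfile.mzML" is a placeholder path; xml2 names the default namespace "d1"
doc <- read_xml("msfile.mzML")
# MS:1000927 is the PSI-MS accession for "ion injection time"
iit_nodes <- xml_find_all(doc, '//d1:cvParam[@accession="MS:1000927"]')
iit_vals <- as.numeric(xml_attr(iit_nodes, "value"))
```

A grabMSdata-level version would presumably attach these values per-scan alongside the existing rt column.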
At the R Cascadia conference this past weekend, Cari Gostic gave an excellent talk on the interface between tidy data and the arrow package, which handles input/output for Apache Arrow parquet files and datasets. This seems to be a direct upgrade to the tmzML document type and is at least an order of magnitude faster in both creation and retrieval.
Notes:
- `write_dataset` and `open_dataset` plus `collect` from the `dplyr` package.
- `dplyr` commands can be passed directly to an open dataset object, but computations are trickier: `mutate(samp_type=str_extract(filename, "Blk|175m|15m|DCM|Poo|Std"))` needs to be done in R.
- The `filter(mz%between%pmppm(76.039854+1.003355, 5))` computation needs to be done in R (why does `pmppm` work but not simple addition?)
- `str_detect` seems to be very slow if called before `collect`? E.g. `filter(str_detect(filename, "Smp"))`
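A minimal sketch of the round-trip being considered, assuming an already-loaded `msdata$MS1` table (the dataset path and partitioning choice are my assumptions, not a settled design):

```r
library(arrow)
library(dplyr)

# One-time conversion, analogous to the current tmzmlMaker step:
# "MS1_dataset" is a placeholder directory name
write_dataset(msdata$MS1, "MS1_dataset", partitioning = "filename")

# Later sessions: open lazily, filter on-disk, then pull into memory
ds <- open_dataset("MS1_dataset")
eic <- ds %>%
  filter(mz > 118.08, mz < 118.09) %>%
  collect()
```

The filter runs against the parquet files before `collect`, which is where the order-of-magnitude retrieval speedup comes from.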
Hi,
I was wondering if you would be interested in incorporating support for a function to extract DAD data from mzML files? I have some example files that were converted from the Thermo RAW format using the ThermoRawFileParser that encode both MS and DAD data. I figured out how to extract them using the mzR package, but I think I like your approach better since it doesn't come with so many dependencies. If you want, I could try to throw together a grabMzmlUV function and/or send over some example files.
Thanks!
While enjoying the easiness of RaMS, I have a request for a possible new feature.
mzML allows chromatograms to be stored, and when MS data acquired in MRM mode (e.g. from the Sciex .wiff format) is converted to mzML (e.g. via ProteoWizard), the run results are stored as chromatograms rather than spectra. It is possible to convert chromatograms to spectra during conversion to mzML in ProteoWizard, but repeating isolation m/z values become an issue. I was wondering if other chromatograms (besides the TIC and BPC) in mzML could be obtained as a new functionality in RaMS. The information for each chromatogram (i.e., name, precursor m/z, and isolation m/z) is present and should be returned with the chromatogram data. Yet, I do not know exactly where it is stored in the mzML.
Currently, the error below is returned when MRM data is kept as chromatograms.
Error in UseMethod("xml_find_first") :
no applicable method for 'xml_find_first' applied to an object of class "xml_missing"
By converting chromatograms to spectra, the MRM data can be obtained as normal "MS1".
You find attached an example mzML data file with MRM from Nitrosamines as chromatograms.
Example_MRM_Nitrosamines.zip
Please let me know if you have further questions regarding the request/idea.
Thank you in advance for the consideration, and keep up the good work.
Ricardo
Apparently .wiff2 files incorrectly return empty data.tables. Using grabMSdata on them causes the returned object to have MS1, MS2, BPC, and TIC data.tables of zero rows.
Given that RaMS returns data.tables and I often then use data.table functions on them, it may make sense to automatically attach the data.table functions. Of course, this is very rarely recommended and I'm not convinced that it's worth it here but wanted to document it as a potential enhancement.
#' See the package intro on GitHub at https://github.com/wkumler/RaMS and
#' explore the vignettes with \code{vignette("help", package = "RaMS")}
Forgot to include the 1.3.5 details in the NEWS document :( will fix in the next version
Found this error while trying to open direct injection DOM data in RaMS. Makes sense because the single node (accumulation across the entire time of the direct injection) is basically the entire 100MB file. Apparently it's DoS protection but hopefully the MS files are trustworthy.
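One possible workaround (untested against RaMS internals): xml2's `read_xml` accepts libxml2 parser options, and the "HUGE" option lifts the node-size limit behind this DoS protection, so an opt-in flag could pass it through for trusted files. The file path below is a placeholder.

```r
# "DI_DOM.mzML" is a placeholder path; "NOBLANKS" is read_xml's default option
doc <- xml2::read_xml("DI_DOM.mzML", options = c("HUGE", "NOBLANKS"))
```

Since the protection exists for a reason, this probably shouldn't be the default behavior.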
grabMSdata(system.file("extdata", "DDApos_2.mzML.gz", package="RaMS"), grab_what = "MS2")$MS2[premz%between%pmppm(118.0865)]
produces
rt premz fragmz int voltage filename
1: 4.182333 118.0864 51.81098 3809.649 35 DDApos_2.mzML.gz
2: 4.182333 118.0864 58.06422 10133.438 35 DDApos_2.mzML.gz
3: 4.182333 118.0864 58.06590 390179.500 35 DDApos_2.mzML.gz
4: 4.182333 118.0864 59.07371 494165.156 35 DDApos_2.mzML.gz
5: 4.182333 118.0864 59.56195 4696.181 35 DDApos_2.mzML.gz
---
584: 14.897500 118.0865 115.38483 2501.394 35 DDApos_2.mzML.gz
585: 14.897500 118.0865 118.08650 5328035.500 35 DDApos_2.mzML.gz
586: 14.897500 118.0865 118.12283 59140.699 35 DDApos_2.mzML.gz
587: 14.897500 118.0865 119.08417 9048.057 35 DDApos_2.mzML.gz
588: 14.897500 118.0865 119.08983 161270.016 35 DDApos_2.mzML.gz
but
tmzmlMaker(system.file("extdata", "DDApos_2.mzML.gz", package="RaMS"), output_filename = "~/../Desktop/test_tmz.tmzML")
grabMSdata("~/../Desktop/test_tmz.tmzML")$MS2[premz%between%pmppm(118.0865)]
produces
rt premz fragmz voltage int filename
1: 4.182333 118.0864 51.81098 1.72923e-322 3809.649 test_tmz.tmzML
2: 4.182333 118.0864 58.06422 1.72923e-322 10133.438 test_tmz.tmzML
3: 4.182333 118.0864 58.06590 1.72923e-322 390179.500 test_tmz.tmzML
4: 4.182333 118.0864 59.07371 1.72923e-322 494165.156 test_tmz.tmzML
5: 4.182333 118.0864 59.56195 1.72923e-322 4696.181 test_tmz.tmzML
---
584: 14.897500 118.0865 115.38483 1.72923e-322 2501.394 test_tmz.tmzML
585: 14.897500 118.0865 118.08650 1.72923e-322 5328035.500 test_tmz.tmzML
586: 14.897500 118.0865 118.12283 1.72923e-322 59140.699 test_tmz.tmzML
587: 14.897500 118.0865 119.08417 1.72923e-322 9048.057 test_tmz.tmzML
588: 14.897500 118.0865 119.08983 1.72923e-322 161270.016 test_tmz.tmzML
Pretty clearly an encoding error in the voltage column, but I'm not sure where exactly it's coming from; will look into it soon.
There are a couple of functions that I find myself repeatedly needing alongside RaMS that should really just be included in the package. I'm calling these "convenience" functions because I don't strictly need them, but they're convenient to have pre-written (and unit-tested), and this'll make it easier to share them. These would be much like the existing `pmppm` function that I've found very useful.
The first is one for integrating raw MS data using trapezoidal Riemann sums. This is the core step in moving from the mz/rt/int data frame to a feature/area data frame. The general consensus seems to be that trapezoidal integration is the way to go since it 1) nicely handles the uneven spacing between retention times and 2) calculates the exact area under the data points. I quibble with this a little bit and wonder whether an absolute sum would be more accurate/precise because 1) the trapezoidal rule technically underestimates the area under the curve since chromatographic peaks are mostly concave-down, 2) the detector is in fact measuring counts that could be directly summed, and 3) trapezoids can falsely inflate the signal of low-quality peaks by linearly interpolating from a single spike across to the next low data point. But I'm in the minority here, so I think the default of trapezoidal integration is still the way to go. I currently just use some code stolen from `pracma::trapz`, but I'd like to handle this more carefully. The function should just take the rt/int values, and the user should be responsible for filtering those ahead of time.
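A base-R sketch of the trapezoidal option described above (`trapz_int` is a hypothetical name, not part of RaMS):

```r
# Hypothetical helper, not the RaMS API: trapezoidal integration of an EIC.
# Assumes the caller has already filtered down to a single peak's rt/int values.
trapz_int <- function(rt, int){
  stopifnot(length(rt) == length(int), !is.unsorted(rt))
  # Sum of trapezoid areas between consecutive retention times;
  # diff(rt) handles the uneven rt spacing automatically
  sum(diff(rt) * (head(int, -1) + tail(int, -1)) / 2)
}
```

The absolute-sum alternative mentioned above would simply be `sum(int)`.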
The second one is a more heuristic function I wrote because I wanted to pull out EICs from the raw data by grouping things in m/z space. When I extract a large m/z window, I often end up with multiple clear chromatograms in m/z space that can obviously be grouped, but the only apparent way to do this is to specify repeated `ifelse` calls or a `case_when` with lots of m/z cutoffs. Instead, we can use the algorithm currently implemented in ADAP: start with the largest-intensity mass in the data, identify an m/z window around it, then group all of those points into a single mean value. We then remove those points from consideration and repeat the process until there are no points left. The tricky part was figuring out how to do this in a tidy fashion, but I think I'm happy with the implementation that returns a vector in the same order as the data with integer values corresponding to the EICs detected. This can then be passed neatly into a `mutate` statement to create a new column (usually called `mz_group`) which I can then group or color by. More advanced functions could accept the RTs as well and do some clever ROI detection, but I think I'd rather leave this one simple.
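The iterative grouping described above can be sketched in base R like so (this is a hypothetical implementation, not the RaMS one; the local `pmppm` mirrors RaMS's version, which returns a +/- ppm window around a mass):

```r
# Local stand-in for RaMS::pmppm: a (lower, upper) m/z window of +/- ppm
pmppm <- function(mass, ppm = 4) c(mass * (1 - ppm/1e6), mass * (1 + ppm/1e6))

# Hypothetical mz_group: returns integer group IDs in the same order as mz
mz_group <- function(mz, int, ppm = 5){
  group <- integer(length(mz))
  g <- 0L
  while(any(group == 0)){
    g <- g + 1L
    remaining <- which(group == 0)
    # Seed each group on the most intense remaining point
    seed <- remaining[which.max(int[remaining])]
    win <- pmppm(mz[seed], ppm)
    # Assign everything within the window to this group, then repeat
    in_win <- remaining[mz[remaining] >= win[1] & mz[remaining] <= win[2]]
    group[in_win] <- g
  }
  group
}
```

The output slots straight into `mutate(mz_group = mz_group(mz, int))` for grouping or coloring.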
The final one is a plotting option - I originally avoided including this because I wanted people to learn/use the actual `ggplot2` syntax, but my poor fingers are tired of typing the exact same `ggplot() + geom_line(aes(x=rt, y=int, group=filename))` over and over again. I think instead I'm going to add a `qplotMSdata` function that is basically just that line of code. It would also be neat to have arguments for `color=` and `facet=` because I often color/facet by additional columns too. Unfortunately, adding `ggplot2` would add a bunch of dependencies, and I'm trying to keep RaMS lightweight, so I think it'll be a mimic for now. I wonder if I can add `ggplot2` code and only run it if `ggplot2` is already loaded, or if that would cause issues with CRAN checks?
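The conditional-dependency question has a standard answer: list ggplot2 under Suggests: in DESCRIPTION and guard every use with `requireNamespace()`, which CRAN checks accept. A sketch of what that might look like (function name and signature are placeholders, not the final API):

```r
# Hypothetical qplotMSdata: ggplot2 stays in Suggests and is only
# touched if the user actually has it installed
qplotMSdata <- function(ms1_table){
  if(!requireNamespace("ggplot2", quietly = TRUE)){
    stop("qplotMSdata requires the ggplot2 package to be installed")
  }
  ggplot2::ggplot(ms1_table) +
    ggplot2::geom_line(ggplot2::aes(x = rt, y = int, group = filename))
}
```

The `color=` and `facet=` arguments could be layered on the same way, each wrapped in the `requireNamespace` guard.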
First release:
- `usethis::use_cran_comments()`
- Proofread `Title:` and `Description:`
- Check that all exported functions have `@returns` and `@examples`
- Check that `Authors@R:` includes a copyright holder (role 'cph')

Prepare for release:
- `devtools::build_readme()`
- `urlchecker::url_check()`
- `devtools::check(remote = TRUE, manual = TRUE)`
- `devtools::check_win_devel()`
- `rhub::check_for_cran()` (checked, but get PREPERROR on the Linux build due to the libxml2 library not being found)

Submit to CRAN:
- `usethis::use_version('major')` (performed manually)
- `devtools::submit_cran()`

Wait for CRAN...
- `usethis::use_github_release()`
- `usethis::use_dev_version()`
I keep getting burned by non-mzML files included in the files passed to grabMSdata, but I only find out about it after I've sunk a bunch of time into extracting all the actual mzMLs, when it finally hits one and crashes. Two solutions for later: one, check the filetype before loading any of the files, which is simple enough to do by checking the extension but might throw some false positives; or two, throw a warning that the file type isn't recognized and ignore it. Adding to the v1.4 milestone, maybe?
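A sketch of option one, checking extensions up front (the function name and the exact regex are my assumptions about which extensions RaMS handles):

```r
# Hypothetical pre-flight check: warn about and drop anything that isn't
# an mzML/mzXML file (optionally gzipped) before any parsing happens
check_filetypes <- function(files){
  known <- grepl("\\.mzx?ml(\\.gz)?$", files, ignore.case = TRUE)
  if(any(!known)){
    warning("Unrecognized file type(s), skipping: ",
            paste(files[!known], collapse = ", "))
  }
  files[known]
}
```

This is the warn-and-ignore variant of option two folded into option one; a stricter version could `stop()` instead.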
I wonder if it would be possible to add an option to `grabMSdata` that constructs the tmzMLs on the fly rather than making it a separate step. Instead of reading the mzMLs directly into memory, the option (`as_tmzML = TRUE`?) would instead convert the files to tmzML in a temporary directory, then construct and return the tmzML object. Given the (intentional) similarities between the two types, I wonder if it's possible to streamline this, because I often find myself held back by the initial tmzML construction step and end up spending more time waiting for the files to load repeatedly. This could also be enabled automatically if memory limits are approached - if the total size of the files to be loaded exceeds, say, a quarter of the system's RAM, it could throw a warning and suggest using `as_tmzML = TRUE`.
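One possible answer to the cleanup worry: build the temporary tmzMLs under `tempdir()`, which R removes automatically when the session ends normally. A sketch (all names here are assumptions):

```r
# Files under tempdir() are cleaned up by R at normal session exit, which
# sidesteps most of the on.exit() question; a crash still leaves leftovers,
# but the OS eventually clears its own temp space
tmz_dir <- file.path(tempdir(), "RaMS_tmzML")
dir.create(tmz_dir, showWarnings = FALSE)
# Hypothetical conversion loop over the input files:
# tmzml_paths <- vapply(ms_files, function(f){
#   tmzmlMaker(f, output_filename = file.path(tmz_dir, basename(f)))
# }, character(1))
```

This doesn't solve the hard-crash case, but it bounds how long orphaned files can accumulate.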
Expected issues include:
- `on.exit` seems like it requires an active function, but maybe there's an equivalent for when the R session ends overall? How to handle R crashing? This could be a major issue because mass spec files are big and could easily clog up a user's system if not cleared out regularly.
- What if an `as_tmzML` folder is requested?
- `glimpse` functionality...

I've been using RaMS for about a year and it is amazing; finally an easy, dependency-light way to read MS data quickly and manipulate it tidily! My feature request: could there be a way to label the different types of MS1 scans acquired in the same experiment? (I often acquire data like this when varying ion source parameters.)
I have *.mzML
files generated from Sciex *.wiff2
files that were acquired from a qTOF that was running multiple MS1 scan types. Sciex calls the different scan types "experiments" and in the XML (generated via ProteoWizard) these different scan types are referred to like this:
<spectrum index="0" id="sample=1 period=1 cycle=1 experiment=2" defaultArrayLength="2271">
[...]
<spectrum index="1" id="sample=1 period=1 cycle=1 experiment=4" defaultArrayLength="4300">
[...]
<spectrum index="3" id="sample=1 period=1 cycle=1 experiment=7" defaultArrayLength="3">
I'd be happy to supply an example mzML file.
I imagine one output type might be an extra column (relative to what `grab_what = c("MS1")` returns) containing the spectrum id strings like `sample=1 period=1 cycle=1 experiment=4`.
So "streaming" files no longer works - it looks like they switched over to an FTP system and no longer expose the raw URLs to the browser. The README needs updating, as well as any examples that do this (grabMSdata? others?).
grabMzxmlBPC <- function(xml_data, TIC=FALSE, rtrange, incl_polarity){
  # Find all MS1 scan nodes
  scan_nodes <- xml2::xml_find_all(
    xml_data, '//d1:scan[@msLevel="1"]'
  )
  # Parse retention times from ISO 8601 durations (e.g. "PT0.1821S")
  rt_chrs <- xml2::xml_attr(scan_nodes, "retentionTime")
  rt_vals <- as.numeric(gsub(pattern = "PT|S", replacement = "", rt_chrs))
  # Heuristic: values above 150 are assumed to be seconds, convert to minutes
  if(any(rt_vals>150)){rt_vals <- rt_vals/60}
  int_attr <- ifelse(TIC, "totIonCurrent", "basePeakIntensity")
  int_vals <- as.numeric(xml2::xml_attr(scan_nodes, int_attr))
  if(!is.null(rtrange)){
    int_vals <- int_vals[rt_vals%between%rtrange]
    rt_vals <- rt_vals[rt_vals%between%rtrange]
  }
  return(data.table(rt=rt_vals, int=int_vals))
}
This should instead use the method that was patched in another PR (I don't have the brainspace to find it right now).
Hi,
The mzML files written by OpenChrom are missing a namespace declaration that is required for RaMS to parse the files correctly. It throws an error: "Error in xml_find_first.xml_node(xml_data, paste0("//d1:", node_to_check)) : xmlXPathEval: evaluation failed".
If I manually add the namespace declaration to the file, it parses fine.
I'm not sure what the best fix would be here. I'm not too familiar with mzML, so I don't know whether these namespace declarations are "required" for the files to be valid. In any case, I am wondering if there is a way to add a check in RaMS for the namespace declaration and maybe modify the XPaths accordingly so that RaMS can read the OpenChrom files without modification. Alternatively, I guess it would be possible to add the namespace declaration to the XML after importing it into R. I'm not sure which would be the better approach here.
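One possible version of that check (hedged, untested on OpenChrom output): xml2 names a file's default namespace "d1", so its absence can be detected with `xml_ns()` and the XPaths switched to prefix-free ones. The file path below is a placeholder.

```r
# "openchrom_file.mzML" is a placeholder path
doc <- xml2::read_xml("openchrom_file.mzML")
# If no default namespace was declared, drop the "d1:" prefix from XPaths
prefix <- if("d1" %in% names(xml2::xml_ns(doc))) "d1:" else ""
first_spectrum <- xml2::xml_find_first(doc, paste0("//", prefix, "spectrum"))
```

This would let both well-formed and OpenChrom-style files pass through the same query code.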
I'm attaching an example file that was produced by OpenChrom (with the .txt extension because GitHub wouldn't let me upload mzML).
alkanes 0.25 mg-ml_2-10-2021.txt
Thanks!
Ethan
The DDA file from the Skyline folks in their DIA tutorial throws a bunch of warnings when run. Prying into these revealed that the mzXML file sometimes has an encoding of "none" and sometimes has an encoding of zlib (aka gzip).
Scan 1: none
Scan 2: zlib
Scan 3: none
Scan 4: none
Scan 5: none
Scan 6: none
Scan 7: zlib
with no clear pattern. I had always assumed that the same compression would be applied to every scan's binary data (and RaMS only reads out the first peak's value). I'll have to switch it over to per-scan encodings, which is super annoying.
<scan num="1" msLevel="1" peaksCount="17" polarity="+" scanType="Full" filterLine="FTMS + p NSI Full ms [400.00-2000.00]" retentionTime="PT0.1821S" injectionTime="PT0.1000S" lowMz="423.914" highMz="1765.56" basePeakMz="1765.56" basePeakIntensity="779.182" totIonCurrent="10071.9">
<peaks precision="32" byteOrder="network" contentType="m/z-int" compressionType="none" compressedLen="0">Q9P1BUPwxShD3HUiRAlb1EPe6yBD+KoQRAOuX0PvpUdEBuBDRAdZKkQH489EAu6lRAmIZEQfVKJEGxU5RBL2OUQj+/pEIzpVRDzChUQky15EXYj5RAlMgERrSNREDDeVRHfOj0QNYvNEfzuWRDAJVESVvzREGRwVRK8DZUQsofNE3LIIRELLow==</peaks>
</scan>
<scan num="2" msLevel="1" peaksCount="352" polarity="+" scanType="Full" filterLine="FTMS + p NSI Full ms [400.00-2000.00]" retentionTime="PT0.6401S" injectionTime="PT0.1000S" lowMz="400.251" highMz="1753.41" basePeakMz="445.117" basePeakIntensity="2.62395e+006" totIonCurrent="1.77468e+007">
<peaks precision="32" byteOrder="network" contentType="m/z-int" compressionType="zlib" compressedLen="2622">eJwN1vlfzWkfx/GcFnXqtKAmKqQI2UY0JrJ93tf1PaftFJP1niTLhAnNiDv3GLK1nTqRUBNZk5GkRbJlG057Km2kkjZLU9ZKy+2n1+P5H7xINXoi3/bDDFJN2s6uyB+TalcCN3ObQ6pT63Bk8G1SJZfhWFwMqfoOsaT9CZS7c4CtNhVR7q6xaLZwotyvRthgOYvydh5GkEJJeRFOXD7bjfISRfxobyPlpfewhXNXUr7Ehy3dkkr547awxzeaKD/gLA5tEFO+ohdHeB3lZ7Zhl8dMKsh6jpD9elQoXYqfxfepMCZBCGCnqDC1Gxs8t1Nh2nlWoN1GRSYWPNA+j4pSNFhlzU4qulKHhv3mVDyFIdBXl4r3ewhrVUoq0XDiR4sLqCQog7m1r6aSfX3cvuU6lYSlsev631PJ+VY0PE6mJxrq7KBlID0RJyEhbhI92S3iVavK6Mk+d8TlCPQkSV16s0dJTzJrEPiqk54MiLh2+xcqHe8vtMp6qPTPWNZ+JpxKk8r5nj+6qTTrLd48r6LS/jJ8WGVPZZbZzGdBKZVNmMlc3e5S2Q0R81nsQuUXtkPlfI/KU4ayCMer9HTKdpRP1aOnCR8Qtmo3PU014B/yMujp1Q8oun2WKqYlsQbJIqrwz5KJo/2oIuyrTPNXRhUJW+Fv5UkV760ETx8zqtTVkC78okOVHptx+8B3VOkvl/3p6EmVIUuFaQFrqDIsQfjQ3kWVndlC8QM9qtJezhfbelGVOEP4bvUBqtraKnP17qCqAxe49solVBXcyU1zs6gqbD4fNOccVUVZI66rkaou9gh+ojNUldWL0Ht2VNW5WEh89pmqNVW4ZexH1YPv4OFrXaoWH0CO3J+qJ3rx1Ta+VL11rzSl/5uV5TjsakrV2S047jqWqjs6Bb2fd1PNVnPhQvYFqsmah8tLa6jmZg+7clJENR0nuNGjWnq2pYzPW5xGzzrcsD5oMj1P00W6ZwfVHkoURB4nqTZDgwXsNaQXRmncyIDohZ8L+29wBb3YsgVFK5vpRWS302gbB3pxJYsVz9hIL96F4vKs+1RnoCGLuLuN6qbeQNnoLqpTrBY2nGigushjsrGNi6juUqv0bE0u1eu58v8Z7qJ6/d3S6Ws9qH7SEuHhL01Ub3cVmzbLqT70KpsyrpTqw5u5d0A41UfOF+xeiag+WcUDZFnUoPMcvtYO1KD7luHB39Sg78Ht445Rw9i9qAv+Sg2TLfBl1StqiHiJ/Iyh9PIvBVf9HUgv2+wRnFFNjWYvuYmBNzWmN+HoMRW9iqnAwwmf6NW5fI77f1DT2MlMrq5LTTHxPDl2BjUbF3CLm98auAPHpg5Qc4wnS/Jqp+bLH7npL97UMsRTmLOtl1qmbMSXO7HUEsVkmiuyqSVNgYOZN6jV0F3qWrmcWpUNQk+lH7XmVKFZ6xS1GZQJ+imrqU35Gx9+5yi9zuxn6zNy6M2xOgRNLqR3e3RYzqPj1K62AKX2SmofLmG/GY2i9t0PmW/LSmqPnY70R+b074Q0tqJ2FHVkvkHYA0vqvK7Jrfzr6f2sYPTVjKCPB9O4us08+qR1mx2fVk+f9nXjhH84fTq4G2+2jqMvp2NZYnU/dVl2o2JdHHWlnkLFqEPUlf2etwT/RN0OPnxkYCF138hFSHY9fY2QCpWmVvQ1YwhbFvmOeiXLhAXXBlGvqy/+9L9HvWHxgmXkaeqNkPCvQ65Tb/QedmRtAvWJb/CQW87UJ7Fmt9cFUt/QJ8zVyZb6Qobio3ET9YUt5dNb7ahPsQQv//OM+qKHIHyhGfWL3VhMWzT1h2ZjV1oEDcQqmG/pFRq42SuM8X4MtZG2fHW2Empe3vhHqMMgWT9PiZ+BQaH9XC3RDINU52XOsgoMqrXAif5EiLRTpY8q
N0Ek3Sn1fxgFUbAT8zAbDFFIoHDFzAGif/5li/zvQPTYjIn7cyBSyXl72Amoa09kFY2VUPdqZ6pKB6gnWOOO3nRouG7lWaG3oRGxh1XMfwCNwrnspbsXNIqiEPd9EzQlTogfroSm82dhoNEUmgo1IbRxAjTzEtjOy1ehmV8gDO3+AVriFfwF7KClW84Pnv4ILdki9CbsgpbTcs7tvtk5hf1xfz20wqaz9DsHoRVejovzMqGV54Doe0OhswDImb8DOhHX2Lt5b6BTeJIFnboNsUSJnm1rIBb24NbNh9B7GIwUr0+QaOxEx4j5kJAhriZJIBH2orPsOgyFEmHTHicYBtfzkubZMHwUwL237Ybh42x+1GsbjLQ2s4v1FjAafI+9uTQVRtyHHd++HEYHfdDtPoCh+SP5ktHWGKY7jWXnj8EwWYk0epKAYaFVUvO7+himOsqebp2NYbnpQot6N4y1I9mIthkw1kkVRn5aAWPpAra4wQ7GstPc45A/jEOD2PmvCTDR9+Pl/66EidsB7mz6DiaR3qw8eR5MilbhuO9IfFdoKU33HAZTyWTpmqvnYeoyIB0W1wzT8ABuFbMfphHagp9JFkzzd7DzNeUwLWjix5LnYLjuOjavbQDD9bpYwwQZzPXzeeZIe5i7FfHlY6JhHpnO9XQ0YV6Uzf7AZFjoX0JJfjtG6tnj47LfMUb1lf8TuxZWOhIu1bgGKyGKbTySDSuZIVdnF2EV/CtKms1gFfKWrRl8AtbOwTj9KgzjFGuRem0SxuV1SYfUpcBGV024kvoGNk5NwmHzMNiEtfAXw1phk/eQz1iWhfHiRub3Kg8T3WRS69mOmBi5RKj4/AATi8YLyht6sNVTsOjBI2CrP1PQv+sLW5d1KPRKga3bHM7GK2Cr8MNftybANnIGW9R5GFOUHdxYTQtTSky4oXgXphp0s9jvN2Cq/NsPDFGDnUQPipaNsHNpQMVuJWY6WbEyrfGYGWbB4nkpZubZsB3BnbAX6+HE+0T86GIpfJ4VhR8VU4T8gHj8WKAvhPuNhYNuH681T4SDcyUPXLYdDgoJqzibDof8t8jMmIY5xf/jTR71cDQI4dHLRsBRvoZfyDgDx4gHXOSZCkflXa6Rq4Rj4VXmmecMx+LDUH3JwVzJc6Yw8sACww6kDLfEgqi3CHH9CxDacGRUOfDoN5yRb4aQ746cB71wLnQVXB54wkUiF+aec4TLhPfY8PwNXFw9eJ98LlwifLhCZAiXQkuuu78LrhILpK/dDrl7Mte4dw3yqIssNfcj5CWJrLbtJdz17zH3xn1wN/yAv+9Zw92tB/F7f4Z7ZC8KFu7DCsUBJPZ0YEXBKubfcAkr9ZSIN+uDl1sAH2gwgFdkAK81bIZX0a+8hGVilf42PpudxSq3n1hH4iv4RM3E4ZZNWGM4GzErfoefJAaBP6zH1mIl14wUw9/gEh8tNMFfHs+nvk6CvzKevetygH+xN1vq8AFBSXLmm6SJIFUAW5d8HkGtXuz0jJPYq72GedfOw75iJyjV3LF/YwIOLZyOsNbXyLwchPAJYjRdP4VYk2c4dP0Qzn7Oxp6ENOQ+2odzz7XwbLINjiRb/B8ZS+lz</peaks>
</scan>
<scan num="3" msLevel="1" peaksCount="84" polarity="+" scanType="Full" filterLine="FTMS + p NSI Full ms [400.00-2000.00]" retentionTime="PT0.9075S" injectionTime="PT0.0023S" lowMz="407.983" highMz="1790.98" basePeakMz="445.12" basePeakIntensity="3.00796e+006" totIonCurrent="1.64408e+007">
<peaks precision="32" byteOrder="network" contentType="m/z-int" compressionType="none" compressedLen="0">Q8v920asXxtDzZYNRwcRFEPPhMhHQR1XQ8/i7kagqMtD0aWgRqjUyEPRqHNIwsWoQ9Io30fvKtBD1nIdSQBwQkPWi2pJLuT8Q9bykUexHVZD1wttSF/ySUPXccBJcLopQ9eLAkg0avND1/InR0Jy3UPYcVZJMPqXQ9jx1UfkrS5D2XEDSKTaGEPacJNHqFcdQ96Pa0o3l1ND3w94SYGzm0PfjNZHMKAUQ9+O9UlKvzZD36yESJBP00PgDvJIKL3bQ+As6kedu99D55DHSRq49kPnksFGptqKQ+ffWEamyqND6BDLSGjWckPokHtH7zsSQ+kQnkcFr3RD+LpLRw7pfEP7jcZHzBPnQ/wOBkcmXgpD/7xoR7XF/EQBil9G0CXGRAHI7UlzCfREAgjtSKzMwEQCSLdIZwmfRAKIxUeX5FxEBUw3SCa2UEQFjFZHHFy2RAWvRUbFIHpEBcwiRsxJ+kQLNWJGsVU6RAvMcEbNe0FEEMbIRvEJKUQUMsJGr9ZvRBRKKEiE0AhEFIosSExdg0QUygJH74qPRBUJ/0cuEa1EF817SMdhy0QYDX9Iioq7RBhNNUgOsMVEGI02R2ylHEQgUlpGoqg8RCbLZUfS0XBEJwtbR8oaPEQnS0dHQpPURCpOoUgT0eJEKo66R7HmVEQqzshHJfxdRCsOhUbtkfhENQntRsdXuUQ14uRGuu73RDZ3XkbuMw9ENxdNRql+CUQ4zf1GySnfRDlMkUe1PBJEOYyaR53SZUQ5zKxG3sT1RDoMYkbon1NEOvsNRqo4MkRLFOVGyFTLREtyJkbE4DFES4PARrhXdERLzbFHHdvDREwN60cPEzxElaiKRtVP1ESa4JNG6GD2RKnHHUbZld5EuR0xRuc/PkTf34JG/H+s</peaks>
</scan>
and instead of
vals <- lapply(all_peak_nodes, function(binary){
  if(!nchar(binary))return(matrix(ncol = 2, nrow = 0))
  decoded_binary <- base64enc::base64decode(binary)
  raw_binary <- as.raw(decoded_binary)
  decomp_binary <- memDecompress(raw_binary, type = file_metadata$compression)
  final_binary <- readBin(decomp_binary, what = "numeric",
                          n=length(decomp_binary)/file_metadata$precision,
                          size = file_metadata$precision,
                          endian = file_metadata$endi_enc)
  matrix(final_binary, ncol = 2, byrow = TRUE)
})
I'll have to do something like
all_peak_nodes <- xml2::xml_text(xml2::xml_find_all(xml_nodes, xpath = "d1:peaks"))
all_peak_encs <- xml2::xml_attr(xml2::xml_find_all(xml_nodes, xpath = "d1:peaks"), "compressionType")
vals <- mapply(function(binary, encoding_i){
  if(!nchar(binary))return(matrix(ncol = 2, nrow = 0))
  decoded_binary <- base64enc::base64decode(binary)
  raw_binary <- as.raw(decoded_binary)
  # Note: the "zlib" attribute value may need mapping to a type that
  # memDecompress accepts (e.g. "gzip"), and "none" scans should skip
  # decompression entirely
  decomp_binary <- memDecompress(raw_binary, type = encoding_i)
  final_binary <- readBin(decomp_binary, what = "numeric",
                          n=length(decomp_binary)/file_metadata$precision,
                          size = file_metadata$precision,
                          endian = file_metadata$endi_enc)
  matrix(final_binary, ncol = 2, byrow = TRUE)
}, all_peak_nodes, all_peak_encs, SIMPLIFY = FALSE)
Hi @wkumler, I have been looking into ways I could plot XICs for precursors and their respective fragments from DIA data, independently of the search software. Your package seems very promising, but I can see the current functions are restricted to DDA and SRM. I was wondering if you have plans to implement support for DIA, and specifically XICs from MS1 and MS2 for precursors and fragments?