Comments (3)
But currently, I don't believe we support Numpress compression.
from flashlfq.
We could use this Numpress implementation: https://github.com/topdownproteomics/sdk/tree/master/src/TopDownProteomics/Tools
Mol Cell Proteomics. 2014 Jun; 13(6): 1537–1542.
Published online 2014 Mar 27. doi: 10.1074/mcp.O114.037879
PMCID: PMC4047472
PMID: 24677029
Numerical Compression Schemes for Proteomics Mass Spectrometry Data
Johan Teleman, Andrew W. Dowsey, Faviel F. Gonzalez-Galarza, Simon Perkins, Brian Pratt, Hannes L. Röst, Lars Malmström, Johan Malmström, Andrew R. Jones, Eric W. Deutsch, and Fredrik Levander
Abstract
The open XML format mzML, used for representation of MS data, is pivotal for the development of platform-independent MS analysis software. Although conversion from vendor formats to mzML must take place on a platform on which the vendor libraries are available (i.e. Windows), once mzML files have been generated, they can be used on any platform. However, the mzML format has turned out to be less efficient than vendor formats. In many cases, the naïve mzML representation is fourfold or even up to 18-fold larger compared with the original vendor file. In disk I/O limited setups, a larger data file also leads to longer processing times, which is a problem given the data production rates of modern mass spectrometers. In an attempt to reduce this problem, we here present a family of numerical compression algorithms called MS-Numpress, intended for efficient compression of MS data. To facilitate ease of adoption, the algorithms target the binary data in the mzML standard, and support in main proteomics tools is already available. Using a test set of 10 representative MS data files we demonstrate typical file size decreases of 90% when combined with traditional compression, as well as read time decreases of up to 50%. It is envisaged that these improvements will be beneficial for data handling within the MS community.
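To make the idea in the abstract concrete, here is a minimal Python sketch of why a numerical encoding step helps before traditional compression. This is not the MS-Numpress algorithm and not the linked implementation; it only illustrates the general pattern of lossy fixed-point rounding plus delta encoding, followed by zlib. The function names, the synthetic m/z array, and the `scale` value are all assumptions made for illustration.

```python
import struct
import zlib


def encode_fixed_point_delta(values, scale):
    """Round values to multiples of 1/scale, then store successive differences.

    Hypothetical helper for illustration only; not the MS-Numpress wire format.
    """
    if not values:
        return b""
    ints = [round(v * scale) for v in values]
    # First entry is the absolute value, the rest are differences to the previous one.
    deltas = [ints[0]] + [b - a for a, b in zip(ints, ints[1:])]
    return struct.pack(f"<{len(deltas)}q", *deltas)


def decode_fixed_point_delta(payload, scale):
    """Invert encode_fixed_point_delta (up to the rounding error of 0.5/scale)."""
    deltas = struct.unpack(f"<{len(payload) // 8}q", payload)
    values, running = [], 0
    for d in deltas:
        running += d
        values.append(running / scale)
    return values


# Synthetic, evenly spaced m/z axis (an assumption; real spectra are less regular).
mz = [400.0 + 0.0123 * i for i in range(2000)]
scale = 1e5

raw = struct.pack(f"<{len(mz)}d", *mz)         # plain 64-bit floats
packed = encode_fixed_point_delta(mz, scale)   # same byte count before zlib

raw_size = len(zlib.compress(raw))
packed_size = len(zlib.compress(packed))
max_error = max(abs(a - b) for a, b in zip(mz, decode_fixed_point_delta(packed, scale)))

print(f"zlib only: {raw_size} bytes, delta + zlib: {packed_size} bytes")
print(f"max round-trip error: {max_error:.2e}")
```

The trade-off is that the encoding is lossy: precision is bounded by the chosen `scale`, so whether that is acceptable for quantification would need to be checked against real data.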