pnnl-m-q / mzapy
A Python package that provides an interface to raw MS data in the MZA format.
License: BSD 2-Clause "Simplified" License
The MZA interface needs some modification to handle cases where the MZA file being read is not on the local filesystem (and is possibly on a read-only one). At first glance, this will primarily affect caching of scan data, which should only happen locally regardless of whether the MZA file being read is local or remote. It would also be good to examine all interactions between the MZA interface and the underlying file to see what else could be affected by the file being remote rather than local, and accommodate those cases as well.
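One way to keep the scan cache local is to derive the cache location from the MZA path rather than placing it next to the MZA file. The following is only a minimal sketch under that assumption: `local_cache_path`, the `mza_scan_cache` directory name, and the `.cache` suffix are all hypothetical and not part of mzapy.

```python
import hashlib
import os
import tempfile


def local_cache_path(mza_path, cache_dir=None):
    """Return a cache file path on the local filesystem for an MZA file.

    Hypothetical helper (not part of mzapy). The cache file name is derived
    from a hash of the MZA path, so the MZA file itself can live on a remote
    or read-only filesystem.
    """
    if cache_dir is None:
        # assumption: the system temp directory is local and writable
        cache_dir = os.path.join(tempfile.gettempdir(), "mza_scan_cache")
    os.makedirs(cache_dir, exist_ok=True)
    digest = hashlib.sha256(str(mza_path).encode("utf-8")).hexdigest()[:16]
    return os.path.join(cache_dir, f"{digest}.cache")
```

The same MZA path always maps to the same cache file, so a cache built in one session could be reused in the next without writing anything to the remote location.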
Add an option to the mzapy.view.plot_spectrum function to plot spectra in centroid mode (with bars for peaks) instead of only as profile data.
The package needs a module for doing unit tests. As the code base grows, there needs to be a set of standard tests that can assess functionality across the package, making sure that new additions do not end up unknowingly breaking other parts of the code base. This is especially important before pulling changes from the dev branch to main and updating distribution packages/online docs. I am not settled on a particular format for these tests, but once established it needs to remain consistent with new additions to the code base.
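Since the format is not settled, the following is just one possible shape for such tests, sketched with the standard library `unittest` module. The function under test, `ppm_error`, is a hypothetical stand-in and not an mzapy function, and `run_all_tests` is likewise an illustrative name.

```python
import unittest


def ppm_error(mz_observed, mz_reference):
    """Stand-in function under test (hypothetical, not from mzapy)."""
    return 1e6 * (mz_observed - mz_reference) / mz_reference


class TestPpmError(unittest.TestCase):
    """One TestCase per function, one method per behavior being checked."""

    def test_zero_error(self):
        self.assertEqual(ppm_error(500.0, 500.0), 0.0)

    def test_sign(self):
        self.assertGreater(ppm_error(500.001, 500.0), 0.0)
        self.assertLess(ppm_error(499.999, 500.0), 0.0)

    def test_magnitude(self):
        self.assertAlmostEqual(ppm_error(500.0005, 500.0), 1.0, places=3)


def run_all_tests():
    """Run the tests in this sketch and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPpmError)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A single entry point like `run_all_tests` could be called before merging dev into main, with one test module per package module so that the layout stays consistent as new code is added.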
The calibration objects in the calibration module would be more useful if there were a clean way of initializing an instance from already known optimized parameters, without needing to fit data. At present, this can be achieved by initializing the object with the fit=False flag, then manually setting the object's opt_params attribute to the correct fitted parameters. This is too awkward, and it would be nice to have a factory function that does all of that cleanly behind the scenes. In implementing this, it is also worth considering the distinction between creating a calibration from a known calibration function/parameters (without necessarily having access to the individual calibrant information) and loading a previously made calibration (which would include information on the calibration function/parameters as well as the individual calibrant info). The latter (i.e. a mechanism for saving/loading calibrations) is also worth implementing.
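The two cases described above could look roughly like the sketch below. `LinearCalibration` is a hypothetical stand-in, not the mzapy calibration class; only the `fit=False` flag and the `opt_params` attribute come from the description above. The `from_params`, `save`, and `load` names and the JSON layout are assumptions for illustration.

```python
import json


class LinearCalibration:
    """Hypothetical stand-in for a calibration object (not the mzapy class)."""

    def __init__(self, x, y, fit=True):
        self.x = list(x)  # calibrant inputs
        self.y = list(y)  # calibrant reference values
        self.opt_params = None
        if fit:
            self.opt_params = self._fit()

    def _fit(self):
        # ordinary least-squares fit of y = slope * x + intercept
        n = len(self.x)
        mean_x = sum(self.x) / n
        mean_y = sum(self.y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in self.x)
        sxy = sum((xi - mean_x) * (yi - mean_y)
                  for xi, yi in zip(self.x, self.y))
        slope = sxy / sxx
        return (slope, mean_y - slope * mean_x)

    @classmethod
    def from_params(cls, opt_params):
        """Case 1: known parameters only, no calibrant information."""
        cal = cls([], [], fit=False)
        cal.opt_params = tuple(opt_params)
        return cal

    def calibrated(self, x):
        slope, intercept = self.opt_params
        return slope * x + intercept

    def save(self, path):
        """Write parameters and calibrant data to a JSON file."""
        with open(path, "w") as f:
            json.dump({"opt_params": list(self.opt_params),
                       "x": self.x, "y": self.y}, f)

    @classmethod
    def load(cls, path):
        """Case 2: restore a saved calibration, including calibrant data."""
        with open(path) as f:
            data = json.load(f)
        cal = cls(data["x"], data["y"], fit=False)
        cal.opt_params = tuple(data["opt_params"])
        return cal
```

With this layout, `LinearCalibration.from_params((2.0, 1.0))` hides the `fit=False` plus attribute-setting dance, while `save`/`load` round-trip a full calibration together with its calibrants.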
Some of the modules have subheadings under the "Module Reference" heading, which looks nice on the main index page. For those that do not (e.g. the one with the MZA object), the index just lists all the functions directly, which doesn't look very nice. The best fix is to add subheadings under all of the "Module Reference" sections so that the main index page looks nice for all the modules. For the MZA object, the functions can be pretty easily divided into subheadings for things like initialization, functions for extracting data as DFs, functions for extracting data as arrays, and so on. For other modules, there is probably some way of categorizing the functions appropriately as well.
I want to be able to do something like
    peaks = find_peaks_1d_gauss(x_data, y_data, *other_params)
    for peak_x, peak_ht, peak_wt in peaks:
        # iterate one peak at a time
        # and do stuff with the peak parameters
        ...
but instead it just returns separate arrays for each of the fitted parameters (x, ht, wt), so you have to use zip to iterate peak by peak. I feel that iterating peak by peak (without zip, as in the example above) is a much more rational use case, so these functions should return a list split by peaks rather than lists split by peak parameters.
Also a possible improvement would be to make them into generators that yield one peak at a time.
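As a rough sketch of the generator idea, a thin wrapper could turn the per-parameter arrays into per-peak tuples. `iter_peaks` is a hypothetical name, and the three inputs are assumed to be the equal-length x, height, and width arrays that the peak-finding functions currently return.

```python
def iter_peaks(peak_xs, peak_hts, peak_wts):
    """Yield one (x, height, width) tuple per fitted peak.

    Hypothetical wrapper (not part of mzapy). Because this is a generator,
    the length check below runs when iteration starts, not at call time.
    """
    if not (len(peak_xs) == len(peak_hts) == len(peak_wts)):
        raise ValueError("peak parameter arrays must have equal lengths")
    for peak in zip(peak_xs, peak_hts, peak_wts):
        yield peak
```

Callers can then write `for peak_x, peak_ht, peak_wt in iter_peaks(xs, hts, wts):` directly, or wrap the call in `list(...)` when a list split by peaks is needed.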
When scan caching is turned on for the MZA object, increment one of two counters (stored as instance variables) whenever a scan is accessed, reflecting whether the access was a cache hit or a miss. These values could even be stored in the scan cache file itself to keep track of cache hits/misses over the long term. Even as instance variables alone, they could be pretty useful for characterizing performance.
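The counting logic might look something like the sketch below. `CachedScanReader`, `get_scan`, and `hit_rate` are hypothetical names, and the in-memory dict stands in for the actual scan cache; none of this reflects how the MZA object is implemented.

```python
class CachedScanReader:
    """Hypothetical stand-in for an object with scan caching enabled."""

    def __init__(self, load_scan):
        self._load_scan = load_scan  # callable that reads a scan from the file
        self._cache = {}             # scan index -> cached scan data
        self.cache_hits = 0
        self.cache_misses = 0

    def get_scan(self, scan_index):
        """Return scan data, counting whether it came from the cache."""
        if scan_index in self._cache:
            self.cache_hits += 1
        else:
            self.cache_misses += 1
            self._cache[scan_index] = self._load_scan(scan_index)
        return self._cache[scan_index]

    def hit_rate(self):
        """Fraction of accesses served from the cache (0.0 if no accesses)."""
        total = self.cache_hits + self.cache_misses
        return self.cache_hits / total if total else 0.0
```

Persisting the two counters alongside the cached scans would extend the same idea to long-term tracking across sessions.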