Comments (14)
I am not a fan of my earlier suggestions, and I really see the value of having ultimate controll over what is calculated and filtered by, and in what order. I believe we can achieve everything we both want by the following proposal:
The proposal
- The starting point is a function
findpeaks
, a name which is more consistent with the package name Peaks.jl than the currentfindmaxima
. - The return of
findpeaks
is a named tuple (let's say we bind it topks
for later reference), with fieldsinds
anddata
. - There will be a set of functions that calculate and filter by features, such as
peakproms!
,peakheights!
,peakwidths!
,peakseps!
. - Calling e.g.
peakproms!
onpks
the then adds a fieldproms
by Base.merge, and ifmin
and/ormax
keywords are set, the contents of the named tuple will be filtered accordingly. - The
min
andmax
keywords will be consistent for the feature calculating/filtering functions.
This all means that the named tuple will in essence just bundle up the contents to facilitate passing the right things into the functions that filer by and calculate features. This should add negligible overhead and require no new types, while simplifying the API significantly and keep ultimate control over order of filtering. The filtering/calculating functions can also still have strict
set true or false at each step for ultimate controll, and the fields and functions are consistent: peakproms
adds the field proms
, peakwidths
adds widths
(along with leftedge
and rightedge
), etc. The filtering/calculating functions all follow the same prefix in their name, and almost the same methods: Taking the named tuple pks
as the first argument, and filtering by min
and max
, with only minor differences in potential additional keywords.
To take it one step further, we could make it so that if a named tuple is not passed to the filtering/calculating functions as the first argument, a function is returned that performs the action of adding the feature and filtering. This would allow easy chaining of operations:
pks = findpeaks(signal) |> peakproms!(min=3) |> peakwidths!(max=2)
That is almost as elegant as a single function with keyword arguments, but without giving up any control, performance, and without redundant calculations - all the strengths of the current API, as I see it.
A key benefit of this approach is also that you do not need to consult the docs every time to check the order of the e.g. 4 returned vectors of peakwidths
, instead relying on named fields of a named tuple. There is no overhead as far as I know, and named tuples can still be indexed and destructured, almost making it a superset of the old API.
How do you feel about this approach?
Note - A function sortpeaks!
would be nice to have. A rough draft is something like
"""
sortpeaks!(pks, property)
sortpeaks!(pks, property; inds, rev)
Sort the peaks `pks` by the property `property`.
To discard some of the elements of the sorted peaks, supply the keyword argument `inds` to
specify which indices in the sorted peaks you want to keep. For more detail, see `partialsortperm`, which is used internally.
To sort in reverse (descending order), set the keyword argument `rev` to `true`.
"""
function sortpeaks!(pks::NamedTuple, property::Symbol; inds=eachindex(first(pks)), rev::Bool=false)
# check if pks has field `property`
mask = partialsortperm(pks.property, 1:n; rev)
for i in eachindex(pks)
pks[i] = pks[i][mask][end-n+1 : end]
end
return pks
end
This function would allow to only keep the 5 widest peaks, or the 3 highest peaks, which is non-trivial without it due to having multiple vectors in pks
that should be synchronized.
from peaks.jl.
I have been unable to squeeze in the time to make a proposal for the API I am imagining. In a few days, I am leaving my computer behind for the summer (until about 2 weeks into september). It is crazy how a few hours of voluntary coding-work translates to weeks and months of calendar time...
I expect to give this a go, then, if you do not beat me to it. But I just wanted to tell you, so that you know that I will not be working on this for a long time.
from peaks.jl.
Yes, the vectors can be mutated. I was thinking more of adding fields to the named tuple when I talked of mutating.
I think you may be right - a certain inout type will have a known output type. The output type is different depending on if the field existed previously or not, but as that is reflected in the input type, it is likely not an issue. NamedTuples it is.
I will still be away from my computer for a month, but I do like having the exact plan ready ^_^
from peaks.jl.
No rush! I appreciate your interest and contributions.
from peaks.jl.
I acknowledge there are some minor inconveniences due to the API design, but I don't think the alternatives I'm aware of (e.g. a struct) would be better overall.
- Single function with filtering by kwargs VS separate filtering functions
- tl;dr: Separate filtering functions allow controlling the order (wrt. filtering by prominence, width, height) and behavior (
strict
or not) of operations, which isn't possible if the filtering is done in the main function (ie in a fixed order). - In my experience, there are often peaks I wish to filter/remove that have some similar characteristics (prominence, width, or height) to other peaks I wish to keep. With separate filtering functions, I can choose a filtering order/behavior that removes unwanted peaks using dissimilar characteristics before filtering by the similar characteristics (e.g. and can choose a more relaxed threshold for the similar characteristics, now that there are no bad apples that would slip through). In contrast, a combined filtering function would potentially apply the filter of similar characteristics first, and would apply the same strictness setting to all filters.
- I'm open to a PR that adds a function with all the filtering functionality built-in, but it would need to use an existing filter order (ie following MATLAB and/or scipy, hopefully they agree).
- tl;dr: Separate filtering functions allow controlling the order (wrt. filtering by prominence, width, height) and behavior (
- Returning peaks as a struct
- Unnecessarily complex imo, and would require a lot of care to avoid introducing a lot of extra allocations (some would be unavoidable). Working with indices directly is always going to be the simplest and most broadly composable.
redefining
pks
with the return values frompeakproms
andpeakwidths
to be rather weird and un-ergonomic
Providing mutating !
functions and copying, non-mutating functions, as I do with peakproms
, peakwidths
, etc, is a common Julian style. Having both functions offers maximum flexibility/composability.
It would also be nice to be able to provide the x-values, and get the corresponding x-values for the peaks and widths in terms of x-values, rather than having to work with indices always.
I don't understand what you mean here.
from peaks.jl.
I don't understand what you mean here.
On this point, I was hoping to be able to provide a vector of x-values, wher peak of index i
would have an x-coordinate of xvals[i]
, and automatically get peaks width and location in terms of x-coordinated. I have however changed my mind, as I take your point that this is less efficient, and complicated the API significantly. Perhaps a design filosophy note could be added, stating that
"This package only operates with the vector containing the signal, and it's indices. Not the corresponding values of a vector containing the independent variable. This simplifies the API, and increases efficiency. If the corresponding independent variables of peaks and their widths is needed, this is best done manually outside the functions provided in this package."
Or perhaps a much shorter of a similar message. I realize that this message is heavily taylored to what I needed to hear, and that it may not be that relevant for other users.
from peaks.jl.
tl;dr: Separate filtering functions allow controlling the order (wrt. filtering by prominence, width, height) and behavior (strict or not) of operations, which isn't possible if the filtering is done in the main function (ie in a fixed order).
This is an excellent point, and one that I have not at all thought of. My current best idea of an API is a single function findpeaks
that would have somewhat complicated internals, but that would allow all that we want/need in terms of functionality and API:
It would take a keyword argument filters
, a named tuple that specifies filtering options (minprom, maxprom, window (w), minwidth, maxwidth, minheight, maxheight). The order of the filters would determine the order of the filtering to allow full control. I then imagine relheight
and strict
to be seperate keyword arguments.
The return value could be a named tuple, containing (peakkinds, widths, proms, heights, widths, leftedges, rightedges). If the user does not want to spent computation resources calculating any of the quantities, specify a related filer as NaN
, and the return will be be NaN
for e.g. the widths, leftedges and rightedges. This would maintain the return type Vector{<:Number}
in any case, maintaining type stability.
This requires only knowledge of one function, and much less code to perform filtering. This is a very rough idea, and I am very curious if you feel like some solution in this direction is at all possible, and if you see the value add I do from not having separate functions, and from not having to pass around the relevant vectors outside the functions.
from peaks.jl.
That sounds very functional, and would also solve my only minor gripe about the current API, which is that the order of the output arguments and input arguments prevents direct splatting peakproms
into peakwidths
, e.g.:
pks, proms = peakproms(argmaxima(x), y))
peakwidths(pks, y, proms)
making it a superset of the old API
👌 Nice.
I don't have time to get to this in the near future, but if you put together a PR, I will certainly review it.
from peaks.jl.
Nice. I will see if I find the time to put together a proposal soon. I will initially make the current functionality internal by prefixing _'s and not exporting, and building the new API on top of that as a first approximation.
from peaks.jl.
I have realized that adding fields to a named tuple is not possible, so mutating versions of the functions could not be made. Instead, the user will have to rebind the original variable name to the output each time a feature is added.
Also, adding fields changes the type. It is possble this makes type stability impossible, but I am not sure.
Dictionaries could solve both problems, at the cost of adding a dependance and giving up the amazing field-access syntax pks.proms
, replacing it with the rather tedious `pks["proms"]. I am wondering if using a Dictionary (the benefit over Dict is that order is preserved) while fields are being added, and finally advising users to create a NamedTuple from it.
My current though is that it is better to live with unmutability and potential type instability, for the field access syntax and more simplicity.
from peaks.jl.
Mutating a tuple isn't possible, but same as with structs, mutable fields in a immutable struct/object can be mutated; I think that is what would be needed here? I don't believe that using a vector in a tuple copies the vector, so it should be possible to accept a NamedTuple (containing index, values, etc vectors) as an input argument, mutate those vectors, and then return a new NamedTuple (eg containing old and new data, such as peak proms) without making any copies of the original/mutated vectors?
I'm not sure type stability is a concern, since that only applies at the function return type level, and if switching to a NamedTuple design, the returned value/type would be consistent for a given function/input eltype.
from peaks.jl.
I have a draft up and running at https://github.com/KronosTheLate/Peaks.jl/tree/api_rework !
It is still missing a lot of detail, which I tried to collect into a "ToDo" section in the README. Most notably, I have just hidden the old functions by prefixing a "_", and called them internally in the new functions. I do however feel like the user-facing functions work as I imagine, and feel ready to discuss the proposed API with you (item 1 in the ToDo). If/when we get the API right, we move on to the following bullet points, and after that I believe the API will be reworked into something better!
Some example code to get started:
julia> using Peaks # Using the api_rework branch
julia> x = Float64[1, 2, 3, 5, 5, 0, 2, 0, 8, 9, 8, 9, 0]; # If `x` is a vector of eltype int, I get an error. Should be looked into
julia> function test1(x)
pks = findpeaks(x)
pks = peakproms!(pks, min=0.5)
pks = peakwidths!(pks)
pks = peakheights!(pks)
return pks
end
test1 (generic function with 1 method)
julia> function test2(x)
pks = findpeaks(x) |> peakproms!(min=0.5) |> peakwidths!() |> peakheights!()
return pks
end
test2 (generic function with 1 method)
julia> test1(x)
(data = [1.0, 2.0, 3.0, 5.0, 5.0, 0.0, 2.0, 0.0, 8.0, 9.0, 8.0, 9.0, 0.0], inds = [4, 7, 10, 12], proms = [4.0, 2.0, 1.0, 1.0], widths = [2.4000000000000004, 1.0, 1.0, 0.5555555555555554], edges = [(3.0, 5.4), (6.5, 7.5), (9.5, 10.5), (11.5, 12.055555555555555)], heights = [5.0, 2.0, 9.0, 9.0])
julia> values(test1(x)) .== values(test2(x))
(true, true, true, true, true, true)
from peaks.jl.
That example looks promising! Please open a draft PR so that I can review/add comments & suggestions.
Ideally, I'd prefer to have both interfaces (the old and the new) available if possible (for backwards compatibility, etc).
from peaks.jl.
Please open a draft PR
I am a little busy these days with school, and https://juliapackagecomparisons.github.io/. I will get around to it at some point ^_^
from peaks.jl.
Related Issues (19)
- Functions are not importing HOT 1
- Findpeaks.jl
- Unusual behavior of peakprom? HOT 2
- peakprom can't find a peak within the minimum distance `w` of the boundary HOT 6
- findpeaks from pracma HOT 4
- Performance HOT 1
- TagBot trigger issue HOT 13
- Bibliography HOT 1
- add 4 to compat? HOT 2
- plotpeaks doesn't exist? HOT 2
- Add peak separation HOT 2
- Add number of peaks to look for HOT 3
- Add absolute lower value HOT 2
- Better discoverability of functions and keyword arguments
- peakheights and other not working with array of floats HOT 3
- Wrong local maximum HOT 2
- How to set width in `findmaxima` HOT 2
- Add `findpeaks` as alias for `findmaxima` HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from peaks.jl.