Code Monkey home page Code Monkey logo

dynamicaxiswarping.jl's Introduction

DynamicAxisWarping.jl

CI codecov Documentation, latest

Dynamic Time Warping (DTW), matrix profile and related algorithms in Julia.

Dynamic Time Warping is a method used to compare, or measure the "distance" between two signals. Contrary to, e.g., the Euclidean distance between two signals, which operates point-wise, DTW compares the signals by warping them in time to produce the closest possible match, with the constraint that the warping is monotonic. Underneath the hood, DTW solves an $\mathcal{O}(n^2)$ dynamic-programming problem, and can thus be computationally expensive if implemented and used naively. This package attempts at providing a fairly well-optimized implementation, and further supports some well-known tricks to reduce the computational complexity, such as limiting the maximum allowed warping (radius) etc., effectively reducing the complexity to $\mathcal{O}(nr)$ where $r$ is the radius. DTW does not require the two signals under comparison to have the same length, making it a widely applicable measure of (dis)similarity.

This package supports arbitrary metrics and arbitrary "spaces", i.e., as long as you are passing a vector or higher dimensional array of something that your distance can operate on, you're good to go. Time is always considered to be the last dimension.

This package is registered and can be installed with:

using Pkg
pkg"add DynamicAxisWarping"

Simple usage

Inputs of dimension larger than 1 will be treated as sequences where time is in the last dimension. When using higher-dimensional series, make sure the provided distance accepts them.

Any distance implementing the Distances.jl interface works, as well as functions on the form dist(x,y) -> ℝ.

using DynamicAxisWarping, Distances, Plots
cost, i1, i2 = dtw(a,b, [dist=SqEuclidean()]; transportcost = 1)
cost, i1, i2 = fastdtw(a,b, dist, radius) # note https://arxiv.org/abs/2003.11246
cost = dtw_cost(a, b, dist, radius) # Optimized method that only returns cost. Supports early stopping, see docstring. Can be made completely allocation free.

# dtw supports arbitrary upper and lower bound vectors constraining the warping path.
imin,imax = radiuslimits(5,20,20), plot([imin imax])
dtw(a, b, dist, imin, imax) # Cost equivalent to dtw_cost(a, b, dist, 5)

# The Distances.jl interface is supported
d = DTW(radius=5)
d(a,b)

Plotting

dtwplot(a, b, [dist=SqEuclidean()]; transportcost = 1)
matchplot(a, b, [dist=SqEuclidean()])
matchplot2(a, b, [dist=SqEuclidean()])

Example:

using DynamicAxisWarping, Plots

fs = 70
t  = range(0,stop=1,step=1/fs)
y0 = sin.(2pi .*t)
y1 = sin.(3pi .*t)
y  = [y0;y1[2:end]] .+ 0.01 .* randn.()
q  = [y0;y0[2:end]] .+ 0.01 .* randn.()
y[10:15] .+= 0.5
q[13:25] .+= 0.5

f1 = plot([q y])
f2 = dtwplot(q,y,lc=:green, lw=1)
f3 = matchplot(q,y,ds=3,separation=1)
plot(f1,f2,f3, legend=false, layout=3, grid=false)

figure

Example, two-dimension timeseries, simple example

using DynamicAxisWarping, Plots, Distances

# Create signals q(u) ∈ ℜ², y(u) ∈ ℜ²
fs = 70
u  = collect(range(0,stop=1,step=1/fs))
# Create a template query signal
q, tq = [sin.(2pi .* u) cos.(2pi .* u)] .+ 0.01 .* randn.(), u
# Create a similar signal
y = [sin.(3pi .*u)  cos.(3pi .* u)] .+ 0.01 .* randn.()
last_peak = findlast(isapprox.(y[:,2], maximum(y[:,2]),atol=0.05))
y, ty = y[1:last_peak, :], u[1:last_peak]
y[end-10:end,:] .+= q[end-10:end,:]
y[10:13] .+= 0.5

# Plot signals
kws = (;linewidth=3, zlabel="index", xlabel="signal comp. 1", ylabel="signal comp. 2",
	   xticks=-1:1:1, yticks=-1:1:1, asepct_ratio=1, legend=nothing)
cs, cq = theme_palette(:auto).colors.colors[1:2]
orig= plot(eachcol(q)...,1:size(q,1); c=cq, label="query", kws...)
plot!(eachcol(y)..., 1:size(y,1) ; c=cs, label="similar signal", kws...)

# Warp 2D time signals and visualize
cost, i1, i2 = dtw(y', q', SqEuclidean(); transportcost = 1)
kws=(;kws..., legend=nothing)
warped=plot(eachcol(q[i2,:])..., 1:length(i2); c=cq, label="query", kws...);
plot!(eachcol(y[i1,:])..., 1:length(i1); c=cs, label="signal", kws..., linewidth=1);
plot(orig, warped)

figure

# Visualizing matched ℜ² signal points in ℜ²
kws=(;kws..., legend=nothing, ds=3, separation=0, xlabel="signal comp. 1", ylabel="signal comp. 2")
mp1 = matchplot2(y', q'; kws...)
# Or in ℜ³ with a signal index axis
mp2 = matchplot2(y', q'; showindex=true, zlabel="warped signal index", kws...)
plot(mp1, mp2)

figure

Find a short pattern in a long time series

The function dtwnn searches for a pattern in a long time series. By default, it does not normalize the data over each window, to do this, pass, e.g., ZNormalizer as the fifth argument.

using DynamicAxisWarping, Distances, Plots
radius = 5
a      = sin.(0.1 .* (1:100))     .+ 0.1 .* randn.()
b      = sin.(0.1 .* (1:100_000)) .+ 0.1 .* randn.()
res    = dtwnn(a, b, SqEuclidean(), radius, saveall=false, bsf_multiplier=1) # takes < 0.1s # DynamicAxisWarping.DTWSearchResult(0.4625287975222824, 73452, (prune_end = 79108, prune_env = 0))
plot([a b[eachindex(a) .+ (res.loc-1)]])
  • saveall causes the entire distance profile to be computed. This will take longer time to compute. It is stored in res.dists.
  • bsf_multiplier = 1: If > 1, require lower bound to exceed bsf_multiplier*best_so_far. This allows you to find several nearby points without having to compute the entire distance profile.
  • Available normalizers are: ZNormalizer, DiagonalZNormalizer, NormNormalizer

Multi-threaded search

Below is an example of how several long series y ∈ Y can be searched for the occurance of query q in a multithreaded fashion, using tmap from ThreadTools.jl. In this example, we first create a unique workspace object for each thread to save on allocations

using ThreadTools
const workspaces = map(1:Threads.nthreads()) do i
    DTWWorkspace(q, dist, radius)
end
@time results = tmap(Y) do y
    dtwnn(workspaces[Threads.threadid()], y, showprogress = false)
end
mincost, minind = findmin(results) # special method for Vector{DTWSearchResult}

Optimizations

The following optimizations are implemented.

  • Endpoint lower bound pruning
  • Envelope lower bound pruning
  • DTW early termination
  • Online normalization (see ZNormalizer)
  • Sorting of query series
  • All algorithms operate on arbitrary precision numbers. If you pass them Float32 instead of Float64, they can become up to twice as fast.

dtwnn is fairly performant, below is a small benchmark performed on a 2014 laptop

a = sin.(0.1f0 .* (1:100))    .+ 0.1f0 .* randn.(Float32)
b = sin.(0.1f0 .* (1:1000_000)) .+ 0.1f0 .* randn.(Float32)
@btime dtwnn($a, $b, SqEuclidean(), 5, ZNormalizer, prune_endpoints = true, prune_envelope = true)
# 853.336 ms (25519 allocations: 5.00 MiB)

Differentiable Soft-DTW

The Soft-DTW algorithm is provided through the function

soft_dtw_cost(a, b, [SqEuclidean()]; γ = 1, transportcost = 1)

γ is the smoothing parameters and a smaller value of γ makes the distance closer to the standard DTW distance.

To differentiate w.r.t. the first argument, try

using ReverseDiff
da = ReverseDiff.gradient(a->soft_dtw_cost(a,b; γ=1), a)

Zygote.jl will not work due to the array-mutation limitation. See also function soft_dtw_cost_matrix.

The following example illustrates how to calculate a barycenter (generalized average) using Soft DTW and Optim.jl, the result is shown below, together with three instances of the input series

barycenter

Generalized DTW

The gdtw function implements the algorithm from A General Optimization Framework for Dynamic Time Warping, which takes two continuous-time signals x and y on the interval [0,1], and warps them by means of warping functions ϕ, ψ, so that x ∘ ϕ ≈ y ∘ ψ, where either ψ(s) = 2s - ϕ(s) (both signals warped symmetrically, the default), or ψ(s)=s (only the x signal is warped). The method allows regularization by imposing penalties on ϕ(t) - t (the "cumulative warping") and on ϕ'(t) - 1 (the "instantaneous warping").

using LinearAlgebra
ts = range(0, stop=4π, length=128)
x = LinearInterpolation(sin.(ts) .+ 0.1 .* randn.())
y = LinearInterpolation(sin.(1.1 .* ts))

norm(x.(ts) - y.(ts)) # 1.7184237220575787

cost, ϕ, ψ = gdtw(x,y)

norm(x.(ϕ.(ts)) - y.(ψ.(ts))) # 0.9266090849096682

Clustering and barycenter averaging

barycenter = dba(vector_of_arrays)
result     = dbaclust(data, nclust, DTW())

Note that dba is known to not always produce the best barycenters. See, e.g., soft_dtw_cost above and "Soft-DTW: a Differentiable Loss Function for Time-Series" or "Spatio-Temporal Alignments: Optimal transport through space and time" for a method that produces better barycenters at the expense of a much higher computational cost.

Sparse distance matrix

The early termination and some of the stopping heuristics can be used to efficiently calculate a sparse distance matrix where only the k nearest neighbors are fully caluclated and stored. To this end, we have the function

dists, inds = sparse_distmat(y::Vector{Vector}, k, dist, radius)

Matrix Profile

This package defines specialized methods for MatrixProfile.matrix_profile, making use of early stopping to accelerate the computation of the matrix profile. The interface is

profile = matrix_profile(y, m, DTW(radius, [transportcost]))

transportcost

transportcost adds an additional penalty multiplier for "transporting", i.e., deviations from the Euclidean matching. The standard DTW distance does not consider this added cost and the default is 1. A value greater than 1 multiplies the cost of moving horizontally or vertically in the coupling matrix, promoting a diagonal move, corresponding to the standard Euclidean matching. The influence of the transport cost can be visualized with

using DynamicAxisWarping, Plots
a = sin.(1:100); b = sin.(1:100) .+ randn.();
dtwplot(a,b, transportcost=1)    # Default
dtwplot(a,b, transportcost=1.01) # Should be "more diagonal"
dtwplot(a,b, transportcost=1.1)  # Should be almost completely diagonal

You can try a transportcost < 1 as well, but then it is preferable to make weird alignments and I'm not sure how much sense that would make.

See also the keyword argument postprocess that allows you to pass a function D_modified = f(D) that is used to filter/post-process the cost matrix. Low-pass filtering can be used to effectively remove small-scale warping.

Combine with optimal transport

The distance between two datapoints can be any distance supporting the Distances.jl interface.

See the file frequency_warping.jl (notebook) for an example combining dynamic time warping with optimal transport along the frequency axis for spectrograms. This example makes use of SpectralDistances.jl.

Distances.jl interface

d = DTW(radius=radius, dist=SqEuclidean()) # Or FastDTW / SoftDTW
d(a,b)

Signal alignment

See SignalAlignment.jl which uses this package to align out-of-sync signals and signals with different sample rates.

Acknowledgements

This package is a fork of https://github.com/ahwillia/TimeWarp.jl which is no longer maintained.

Special thanks to Joseph Fowler (@joefowler) who contributed a substantial portion of the initial code for TimeWarp.jl

dynamicaxiswarping.jl's People

Contributors

ahwillia avatar baggepinnen avatar bloodworkxgaming avatar ericphanson avatar github-actions[bot] avatar holgerteichgraeber avatar jaksle avatar joefowler avatar maetshju avatar paciops avatar synapticsage avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

dynamicaxiswarping.jl's Issues

Multi-thread version of dbaclust

From the docstring of dbaclust I see n_jobs but it is not implemented, could it be useful if I propose a multi-thread implementation of dbaclust and dbaclust_single where all the distpath are calculated in paralle?

A General Optimization Framework for Dynamic Time Warping

Hi @baggepinnen,

I came across this paper today https://arxiv.org/abs/1905.12893, and thought it was quite nice; it seemed like a sensible framework for introducing regularization similar to your transportcost parameter. I'm also not quite sure how their "time centered mean" corresponds to your barycenters. They also claim 50x speed ups compared to the python FastDTW package, but I'm not sure it would be faster than your code here. I thought I'd mention it here in case it was interesting to you as well.

Cheers,
Eric

[Question] How to measure the similarity between two time series using the package?

Hi,
I was looking for a package that provides DTW in Julia and I think the only active repository is this repo. I saw you've provided brief documentation about Find a short pattern in a long time series. Unfortunately I'm not an expert in the field of Signal Processing, hence I do not know what do you mean by "a short pattern". Anyway, I want to see whether one can use this package to assess the similarity of two time series. If yes, I would appreciate it if you could provide an example of achieving it.
Thank you

Cite this package

I am writing my bachelor thesis and I would like to cite this package because I used it to calculate the distance matrix of time series that I am working with.
Could I insert something like this ?

@software{DynamicAxisWarping,
  author = {Fredrik, Bagge, Carlson},
  title = {{Dynamic Axis Warping}},
  url = {https://github.com/baggepinnen/DynamicAxisWarping.jl},
  version = {0.4.11},
  year = {2020}
}

Implement `result_type` for `DTW`

Hi,

I was trying to use this package with NearestNeighborDescent and ran into an issue using the result_type function.

ERROR: BoundsError
Stacktrace:
 [1] size
   @ ./number.jl:81 [inlined]
 [2] lastlength
   @ ~/.julia/packages/SlidingDistancesBase/9GFtl/src/indexing.jl:17 [inlined]
 [3] evaluate(d::DTW{Euclidean, Nothing, Nothing}, x::Float64, y::Float64; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ DynamicAxisWarping ~/.julia/packages/DynamicAxisWarping/ua7nU/src/distance_interface.jl:62
 [4] evaluate(d::DTW{Euclidean, Nothing, Nothing}, x::Float64, y::Float64)
   @ DynamicAxisWarping ~/.julia/packages/DynamicAxisWarping/ua7nU/src/distance_interface.jl:62
 [5] (::DTW{Euclidean, Nothing, Nothing})(x::Float64, y::Float64; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ DynamicAxisWarping ~/.julia/packages/DynamicAxisWarping/ua7nU/src/distance_interface.jl:90
 [6] (::DTW{Euclidean, Nothing, Nothing})(x::Float64, y::Float64)
   @ DynamicAxisWarping ~/.julia/packages/DynamicAxisWarping/ua7nU/src/distance_interface.jl:90
 [7] result_type(f::DTW{Euclidean, Nothing, Nothing}, a::Type, b::Type)
   @ Distances ~/.julia/packages/Distances/6E33b/src/generic.jl:36
 [8] result_type(dist::DTW{Euclidean, Nothing, Nothing}, a::Vector{Float64}, b::Vector{Float64})
   @ Distances ~/.julia/packages/Distances/6E33b/src/generic.jl:35
 [9] top-level scope
   @ REPL[41]:1

It seems the default definition for result_type in Distances.jl doesn't work for the types in this package. It was my understanding that result_type was part of the interface for Distances.jl, so maybe the solve is implementing it as part of this package's types.

using weighted distances seemingly not compatible with dtw_cost

Using a weighted metric (e.g. WeightedSqEuclidean) for the dynamic time warping seems to not be possible as the weights passed to the distance are a vector, but in lines 190 and 199 in dtw.jl the cost is calculated using distances between scalar values (but in a for loop over the sequence). This results in a dimension mismatch error when trying to calculate the distance.

Dataset function is inactive

DynamicAxisWarping.Datasets.download_ucr() yields:

UndefVarError: download_ucr not defined

It seems that this function is not active any more.

Frequency warping example doesn't work

Based on

dtw(.√(M1.power), .√(M2.power), dist)[1]
, MWE:

julia> using DynamicAxisWarping, SpectralDistances, Distances

julia> X = rand(128,10);

julia> Y = rand(128,10);

julia> n = size(X, 1)

julia> dist = DiscreteGridTransportDistance(Cityblock(), Float64, n, n)
DiscreteGridTransportDistance{Cityblock, Float64, Float64, Vector{Float64}}(Cityblock(), [0.0 1.0  126.0 127.0; 1.0 0.0  125.0 126.0;  ; 126.0 125.0  0.0 1.0; 127.0 126.0  1.0 0.0], [0.0 0.0  0.0 0.0; 0.0 0.0  0.0 0.0;  ; 0.0 0.0  0.0 0.0; 0.0 0.0  0.0 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

julia> dtw(X, Y, dist)
ERROR: ArgumentError: Sizes do not match, should be (length(x1),length(x2)) == size(D), but got (1, 1), (128, 128)
Stacktrace:
  [1] evaluate(d::DiscreteGridTransportDistance{Cityblock, Float64, Float64, Vector{Float64}}, x1::Float64, x2::Float64; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ SpectralDistances ~/.julia/packages/SpectralDistances/pJIYN/src/losses.jl:678
  [2] evaluate
    @ ~/.julia/packages/SpectralDistances/pJIYN/src/losses.jl:675 [inlined]
  [3] #_#49
    @ ~/.julia/packages/SpectralDistances/pJIYN/src/losses.jl:30 [inlined]
  [4] AbstractDistance
    @ ~/.julia/packages/SpectralDistances/pJIYN/src/losses.jl:30 [inlined]
  [5] result_type
    @ ~/.julia/packages/Distances/6E33b/src/generic.jl:36 [inlined]
  [6] result_type
    @ ~/.julia/packages/Distances/6E33b/src/generic.jl:35 [inlined]
  [7] pairwise(metric::DiscreteGridTransportDistance{Cityblock, Float64, Float64, Vector{Float64}}, a::Matrix{Float64}, b::Matrix{Float64}; dims::Int64)
    @ Distances ~/.julia/packages/Distances/6E33b/src/generic.jl:320
  [8] dtw_cost_matrix(seq1::Matrix{Float64}, seq2::Matrix{Float64}, dist::DiscreteGridTransportDistance{Cityblock, Float64, Float64, Vector{Float64}}; transportcost::Int64, postprocess::Nothing)
    @ DynamicAxisWarping ~/.julia/packages/DynamicAxisWarping/BErBL/src/dtw.jl:48
  [9] dtw_cost_matrix(seq1::Matrix{Float64}, seq2::Matrix{Float64}, dist::DiscreteGridTransportDistance{Cityblock, Float64, Float64, Vector{Float64}})
    @ DynamicAxisWarping ~/.julia/packages/DynamicAxisWarping/BErBL/src/dtw.jl:40
 [10] dtw(::Matrix{Float64}, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ DynamicAxisWarping ~/.julia/packages/DynamicAxisWarping/BErBL/src/dtw.jl:26
 [11] dtw(::Matrix{Float64}, ::Vararg{Any})
    @ DynamicAxisWarping ~/.julia/packages/DynamicAxisWarping/BErBL/src/dtw.jl:25
 [12] top-level scope
    @ REPL[14]:1

Julia and packages:

julia> versioninfo()
Julia Version 1.8.3
Commit 0434deb161e (2022-11-14 20:14 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 1 on 8 virtual cores
Environment:
  LD_LIBRARY_PATH = /opt/ros/noetic/lib::/usr/local/cuda/lib64

(jl_sG9PJn) pkg> st
Status `/tmp/jl_sG9PJn/Project.toml`
  [b4f34e82] Distances v0.10.7
  [aaaaaaaa] DynamicAxisWarping v0.4.12
  [2b0dec9d] SpectralDistances v0.1.14

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Questions about the package

The questions are formulated in the attached picture but in summary, for the 1D case:

  • Why continuing linking points in the two input curves when one of them is exhausted?
  • Do the input curves need to have the same start and end points and sampling rate?

DynamicTimeWarping_DTW_questions_for_FC

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.