dtw_resample

Resample data based on DTW alignment

Purpose

The goal of this code is to allow irregular resampling of timestamped data. We assume that, for each timestamp in a time series, one variable indicates how far the underlying process has advanced. We call this variable the base modality in the following. For example, when studying water quality during a flood, the base modality could be the discharge: in this case, aligning discharge corresponds to aligning floods in a plausible way.

We then use this variable to align the other modalities of the dataset. To do so, we record the DTW path obtained when aligning the base modality of each time series with that of a reference time series. This path is then used to perform irregular resampling of each time series in the dataset with respect to the alignment of the base modalities.
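The idea above can be illustrated with a minimal NumPy sketch. This is not the code used by this package: the function names dtw_path and resample_along_path are hypothetical, and averaging the values aligned to each reference timestamp is an assumption made to keep the example short (the actual sampler may resample differently, for instance by interpolation).

```python
import numpy as np

def dtw_path(x, y):
    """Optimal DTW path between 1-D sequences x and y (squared differences)."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the last cell to the first one to recover the path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: cost[s])
        path.append((i - 1, j - 1))
    return path[::-1]

def resample_along_path(series, path, ref_len):
    """Average the rows of series (shape (l, d)) aligned to each reference index."""
    out = np.zeros((ref_len, series.shape[1]))
    counts = np.zeros(ref_len)
    for i, j in path:
        out[j] += series[i]
        counts[j] += 1
    return out / counts[:, None]

# Toy data: the series is a delayed copy of the reference.
ref_base = np.array([0.0, 1.0, 2.0, 1.0, 0.0])           # base modality of the reference
ts_base = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0])       # base modality of another series
ts_other = np.array([[10.0], [10.0], [20.0], [30.0], [20.0], [10.0]])  # another modality

path = dtw_path(ts_base, ref_base)
resampled = resample_along_path(ts_other, path, len(ref_base))
```

Here the path is computed on the base modality only and then reused to resample the other modality, so the resampled series has the same length as the reference.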

We refer the interested reader to the following publication for more details:

@article{dupas:halshs-01228397,
  TITLE = {{Identifying seasonal patterns of phosphorus storm dynamics with dynamic time warping}},
  AUTHOR = {Dupas, R{\'e}mi and Tavenard, Romain and Fovet, Oph{\'e}lie and Gilliet, Nicolas and Grimaldi, Catherine and Gascuel-Odoux, Chantal},
  JOURNAL = {{Water Resources Research}},
  PUBLISHER = {{American Geophysical Union}},
  VOLUME = {51},
  NUMBER = {11},
  PAGES = {8868--8882},
  YEAR = {2015},
  DOI = {10.1002/2015WR017338},
  PDF = {https://halshs.archives-ouvertes.fr/halshs-01228397/file/article_WRR_accepte_avec_fig.pdf}
}

Also, if you use our code in a scientific publication, it would be nice to cite us using the above-mentioned reference :)

Code details

Example tests are provided in the files test_sampling.py and test_clustering.py or, as notebooks, in sampling.ipynb and clustering.ipynb. In short, data should be resampled using the class DTWSampler, which is designed to behave like a standard sklearn transformer. Hence, the sampler can be fitted via:

from sampler import DTWSampler

# d is the number of modalities per timestamp (including the base modality)
s = DTWSampler(scaling_col_idx=0, reference_idx=0, d=d)
s.fit(data)

Here, data is a 2-dimensional array of shape (n_ts, l * d), where n_ts is the number of time series in the dataset, l is the length of a time series and d is the number of modalities provided for each timestamp (including the base modality). If your data is stored in a 3-dimensional array data_3d of shape (n_ts, l, d), you can simply do:

data = data_3d.reshape((data_3d.shape[0], -1))
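As a small illustration of what this reshape produces, the sketch below uses toy sizes chosen only for the example. With NumPy's default row-major order, each row of data lists all d modalities of the first timestamp, then all d modalities of the second one, and so on.

```python
import numpy as np

n_ts, l, d = 2, 3, 2  # toy sizes, for illustration only
data_3d = np.arange(n_ts * l * d).reshape((n_ts, l, d))

# Flatten the (l, d) axes of each time series into a single row.
data = data_3d.reshape((data_3d.shape[0], -1))

# Modality k of every time series sits in columns k, k + d, k + 2 * d, ...
modality_0 = data[:, 0::d]

# Reshaping back recovers the original 3-dimensional array.
restored = data.reshape((n_ts, -1, d))
```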

In the constructor, scaling_col_idx is the index of the base modality and reference_idx is the index of the time series to be used as reference.

Applying the transformation to some new data is done via:

transformed_data = s.transform(newdata)

To fit and transform on the same data in a single call, use:

transformed_data = s.fit_transform(data)

Note that, in order to comply with sklearn standards, transformed_data is a 2-dimensional array. To get back to a 3-dimensional array of shape (n_ts, n_samples, d), just use:

transformed_data = s.fit_transform(data).reshape((data.shape[0], -1, s.d))
