In time series analysis, dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences that may vary in speed. DTW computes an optimal match between two given sequences (e.g. time series) subject to certain restrictions. The sequences are "warped" non-linearly in the time dimension, so the resulting similarity measure is independent of certain non-linear variations in timing.
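To make the warping idea concrete, here is a minimal sketch of the classic exact DTW recurrence in plain Python. It is not the code from the notebook, and it is not what fastdtw does internally (fastdtw is an approximation of DTW). The function name `dtw_distance`, the absolute-difference local cost, and the sample sequences are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Return the DTW cost between two numeric sequences.

    Assumption: the local cost between two points is |x - y|.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed moves:
            # advance in a only, advance in b only, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical data: the same shape traced at two different speeds.
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
fast = [0, 1, 2, 1, 0]
print(dtw_distance(slow, fast))  # 0.0, the speed difference is warped away
```

A point-by-point comparison would not even be defined here because the two sequences have different lengths, whereas DTW aligns each point of `fast` with two points of `slow` at zero cost. This exact version needs time and memory proportional to the product of the two lengths, which is the motivation for approximate methods on long series.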
In this script, I use the Python package fastdtw and apply it to an energy usage time series dataset found here: UCI Machine Learning - Individual household electric power consumption Data Set
I've provided the Zeppelin notebook (note.json), which can be copied into your environment or viewed at https://www.zeppelinhub.com/viewer
zaratsian / dynamic_time_warping
Spark (PySpark) script that applies dynamic time warping to energy usage data (using the Python fastdtw package)