
matrixprofile's Introduction

MatrixProfile

NOTE: THIS LIBRARY IS NOT ACTIVELY SUPPORTED. PLEASE CHECK OUT THE TD AMERITRADE STUMPY LIBRARY INSTEAD: https://github.com/TDAmeritrade/stumpy

MatrixProfile is a Python 3 library, brought to you by the Matrix Profile Foundation, for mining time series data. The Matrix Profile is a novel data structure with corresponding algorithms (stomp, regimes, motifs, etc.) developed by the Keogh and Mueen research groups at UC-Riverside and the University of New Mexico. The goal of this library is to make these algorithms accessible to both the novice and expert through standardization of core concepts, a simple API, and sensible default parameter values.

In addition to this Python library, the Matrix Profile Foundation provides implementations in other languages. These implementations share a largely consistent API, allowing you to switch between them without a steep learning curve.

Python Support

Currently, we support the following versions of Python:

  • 3.5
  • 3.6
  • 3.7
  • 3.8
  • 3.9

Python 2 is no longer supported. There are earlier versions of this library that support Python 2.
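A quick way to check an interpreter against the list above is sketched here. The SUPPORTED set simply mirrors this README, and is_supported is a hypothetical helper, not part of the library.

```python
import sys

# Minor versions listed as supported in this README
# (assumption: the list above is current).
SUPPORTED = {(3, 5), (3, 6), (3, 7), (3, 8), (3, 9)}


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair appears in SUPPORTED."""
    return (version_info[0], version_info[1]) in SUPPORTED


if __name__ == "__main__":
    print("supported" if is_supported() else "not listed as supported")
```

An interpreter outside this list, such as Python 3.10, may have no prebuilt wheel available, in which case pip falls back to building from source.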

Installation

The easiest way to install this library is using pip or conda. If you would like to install it from source, please review the installation documentation for your platform.

Installation with pip

pip install matrixprofile

Installation with conda

conda config --add channels conda-forge
conda install matrixprofile

Getting Started

This article provides introductory material on the Matrix Profile: Introduction to Matrix Profiles

This article provides details about core concepts introduced in this library: How To Painlessly Analyze Your Time Series

Our documentation provides a quick start guide, examples and API documentation. It is the source of truth for getting up and running.

Algorithms

For details about the algorithms implemented, including performance characteristics, please refer to the documentation.

Getting Help

We provide a dedicated Discord channel where practitioners can discuss applications and ask questions about the Matrix Profile Foundation libraries. If you would rather not join Discord, please open a GitHub issue.

Contributing

Please review the contributing guidelines located in our documentation.

Code of Conduct

Please review our Code of Conduct documentation.

Citations

All proper acknowledgements for works of others may be found in our citation documentation.

Citing

Please cite this work using the Journal of Open Source Software article.

Van Benschoten et al., (2020). MPA: a novel cross-language API for time series analysis. Journal of Open Source Software, 5(49), 2179, https://doi.org/10.21105/joss.02179

@article{VanBenschoten2020,
    doi = {10.21105/joss.02179},
    url = {https://doi.org/10.21105/joss.02179},
    year = {2020},
    publisher = {The Open Journal},
    volume = {5},
    number = {49},
    pages = {2179},
    author = {Andrew Van Benschoten and Austin Ouyang and Francisco Bischoff and Tyler Marrs},
    title = {MPA: a novel cross-language API for time series analysis},
    journal = {Journal of Open Source Software}
}

matrixprofile's People

Contributors

bruno-hanzen, burk, demiand, earthgecko, frankiecancino, franzbischoff, ksairahul21, llewellyns96, lmmentel, luyueee, mend-bolt-for-github[bot], nikita-smyrnov, nimasarajpoor, rexking6, spriithy, tylerwmarrs, vanbenschoten

matrixprofile's Issues

Bug when calculating snippets with a small window size

Describe the bug
When calculating snippets with a small snippet size, NumPy emits runtime warnings and the process ends with a segmentation fault.

To Reproduce

import matrixprofile as mp

snippet_size = 10
taxi = mp.datasets.load('nyc-taxi-anomalies')
ts = taxi['data']

snippets = mp.discover.snippets(ts, snippet_size, 2, snippet_size / 2)

Expected behavior

The snippets should be calculated correctly using all allowed snippet sizes

Error log

/usr/local/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/usr/local/lib/python3.7/site-packages/numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/usr/local/lib/python3.7/site-packages/numpy/core/_methods.py:140: RuntimeWarning: Degrees of freedom <= 0 for slice
  keepdims=keepdims)
/usr/local/lib/python3.7/site-packages/numpy/core/_methods.py:110: RuntimeWarning: invalid value encountered in true_divide
  arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
/usr/local/lib/python3.7/site-packages/numpy/core/_methods.py:132: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
[1]    42091 segmentation fault  python3 src/snippets/snippets-bug.py

Desktop (please complete the following information):

  • OS: macOS Catalina
  • Python version: Python 3.7.3
  • Matrixprofile version: 1.1.0

Additional context

The error is also raised with my own dataset where the snippet size is ≤ 180.
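One detail worth ruling out in the reproduction above: under Python 3, snippet_size / 2 evaluates to a float (5.0), while window sizes are normally integers. Whether this contributes to the crash is an assumption, not something the report confirms. A minimal sketch of the difference, using a hypothetical as_window helper:

```python
def as_window(value):
    """Hypothetical helper: coerce a window size to int, rejecting fractions."""
    if int(value) != value:
        raise ValueError("window size must be a whole number, got %r" % (value,))
    return int(value)


snippet_size = 10
true_div = snippet_size / 2    # true division: a float under Python 3
floor_div = snippet_size // 2  # floor division: an int

print(type(true_div).__name__, type(floor_div).__name__)
print(as_window(true_div))
```

Passing snippet_size // 2, or the result of as_window, keeps the argument an integer regardless of how it was computed.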

Unable to install package - error: unsupported option '-fopenmp'

Describe the bug
Hello,
I'm trying to install tsfresh==0.19.0 on macOS 12.1 (arm), but the installation fails while building matrixprofile, with the following error:

clang: error: unsupported option '-fopenmp'
      error: command '/usr/bin/clang' failed with exit code 1

I tried with python 3.8 and 3.9, same error.

To Reproduce
Steps to reproduce the behavior:

  1. Create a python 3.9 environment on macOS X 12.1 arm
  2. conda install llvmlite (pip install fails)
  3. pip install tsfresh
  4. See error

Expected behavior
tsfresh installation without errors.

Desktop (please complete the following information):

  • OS: macOS X 12.1
  • Xcode version 13.2.1
  • python 3.9 (conda env)

[Deprecated] Drop Python 3.5 Support

While building for Python 3.5, the following deprecation warning appears. We should eventually remove support for Python 3.5.

DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.

AttributeError: module 'matrixprofile' has no attribute 'datasets'

I installed matrixprofile a week ago and I am trying to reproduce the example in the documentation
https://matrixprofile.docs.matrixprofile.org/examples/ECG_Annotation_Vectors.html
but as soon as I reach

import matrixprofile as mp
import numpy as np

from matplotlib import pyplot as plt
%matplotlib inline

ecg = mp.datasets.load('ecg-heartbeat-av')
ts = ecg['data']
window_size = 150

I get the error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-93a2bc01725a> in <module>
      6 
      7 
----> 8 ecg = mp.datasets.load('ecg-heartbeat-av')
      9 ts = ecg['data']
     10 window_size = 150

AttributeError: module 'matrixprofile' has no attribute 'datasets'

I even tried mp.compute and I got the error

AttributeError: module 'matrixprofile' has no attribute 'compute'

Am I doing something wrong or are there missing packages or....?
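A common cause of this kind of AttributeError is that Python imported something other than the installed package, for example a local file or folder named matrixprofile that shadows it. That is an assumption about this report, not a confirmed diagnosis. The sketch below uses only the standard library to show where a module was loaded from and which expected attributes it lacks; describe_module is a hypothetical helper.

```python
import importlib


def describe_module(name, expected=()):
    """Import `name` and report its origin plus any expected attributes it lacks."""
    module = importlib.import_module(name)
    origin = getattr(module, "__file__", None)  # None for built-in or namespace packages
    missing = [attr for attr in expected if not hasattr(module, attr)]
    return {"origin": origin, "missing": missing}


if __name__ == "__main__":
    # Demonstrated on a standard-library module so the sketch runs anywhere.
    print(describe_module("json", expected=("loads", "datasets")))
```

For the case above, the call would be describe_module("matrixprofile", expected=("datasets", "compute")). If the reported origin points into the working directory rather than site-packages, a local file is likely shadowing the package.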

Understanding/Use Case: Clustering of repeated patterns

Hi,

I recently found something strange while using a matrix profile.

I used the data (a 1-D array of 8760 hourly values between 0 and 1, so not normalized) saved in this CSV file:
https://drive.google.com/file/d/1dZgaEZjD2HsnXu_wWaGKPCPRbDJ1_dty/view?usp=sharing

And here is my simple code:

import numpy as np
import matplotlib.pyplot as plt
import matrixprofile as mp

# mydata: read from the CSV file (length 8760)
H = 24
profile = mp.compute(mydata, windows=[H])

image

# index of the smallest matrix profile value and of its nearest neighbor
ind_min = np.argmin(profile['mp'])
ind_min_pair = profile['pi'][ind_min]

print('min of dist [best match of all]', profile['mp'][ind_min])
plt.plot(mydata[ind_min:ind_min+H])
plt.plot(mydata[ind_min_pair : ind_min_pair+H])
plt.show()

So, the goal was to find the closest pair of sequences (with length H=24). And here is the result:

image

It is worth mentioning that I also checked the plots for the first few indices (the matches for indices 0 to 20) and they looked good. However, when I tried to get the best match of all, I got the result attached above!

The same problem exists for these two pairs of indices as well: (44,424) and (46,426), with min dist=0
or (63,425):

image

Am I missing something here? Isn't the MP a 1-D array equal to np.min(distance_matrix, axis=1), where distance_matrix[i, j] is the Euclidean distance between ts[i:i+H] and ts[j:j+H]?

If I understand correctly, the computed MP is the Euclidean distance between normalized subsequences. However, mydata[421:421+H] is all zeros. Do you know how the algorithm treats a subsequence x that is all zeros, where x_norm = (x - 0) / 0?

##############################################################################################
I tried to figure out how muinvn function works but I couldn't.
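The question above can be made concrete with a brute-force sketch of a z-normalized matrix profile in plain Python. This is not the library's implementation: the function names, the exclusion-zone width of m // 2, and the choice to map a constant subsequence to all zeros are assumptions made for illustration. Under that last convention, two flat subsequences match each other with distance 0, which would be consistent with the zero-distance pairs reported above.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; a constant one maps to all zeros (assumption)."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n  # avoids the 0 / 0 case raised in the question
    return [(x - mean) / std for x in seq]


def brute_force_profile(ts, m):
    """Return (profile, index) via an O(n^2 * m) scan, skipping trivial matches."""
    n = len(ts) - m + 1
    subs = [znorm(ts[i:i + m]) for i in range(n)]
    exclusion = m // 2  # assumed exclusion-zone width around each index
    profile = [math.inf] * n
    index = [-1] * n
    for i in range(n):
        for j in range(n):
            if abs(i - j) <= exclusion:
                continue  # ignore overlapping, trivially similar subsequences
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index


ts = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0]
profile, index = brute_force_profile(ts, 4)
print(profile[4], index[4])  # the repeated bump at 4 matches the one at 12
print(profile[0], index[0])  # a flat window matches another flat window
```

The profile is the row-wise minimum of the pairwise distance matrix, as the question suggests, but computed on z-normalized subsequences and with an exclusion zone around the diagonal.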

Python 3.9 Binary Wheels for Linux and MacOS

Our wheel build system is reliant on the package cibuildwheel. Currently, our TravisCI script has some environment issues with Python and pulling down the required version of cibuildwheel for Python 3.9 support. The version required for cibuildwheel to build wheels on Python 3.9 is v1.6.2.

See these logs for Linux and MacOS issues:

https://travis-ci.com/github/matrix-profile-foundation/matrixprofile/jobs/464939617
https://travis-ci.com/github/matrix-profile-foundation/matrixprofile/jobs/464939618

Windows is successfully building wheels for Python 3.9.

Documentation: History section

@vanbenschoten would you be interested in writing a section within the README explaining how the library came into existence? Some highlights could include:

  • matrixprofile-ts
  • Target
  • MPF (how we all met)

This may give the community some context into why matrixprofile-ts is not maintained anymore.

`mp.compute` fails in tsfresh example data

Describe the bug
When calling mp.compute on one timeseries of the robot example dataset (e.g. for id = 1), matrixprofile first throws a warning

/home/nils/anaconda3/envs/tsfresh/lib/python3.7/site-packages/matrixprofile/algorithms/skimp.py:339: RuntimeWarning: No windows found with given threshold, try to set a lower threshold

and then raises an exception

TypeError                                 Traceback (most recent call last)
<ipython-input-14-d031a7c608e1> in <module>
----> 1 mp.compute(ts)

~/anaconda3/envs/tsfresh/lib/python3.7/site-packages/matrixprofile/compute.py in compute(ts, windows, query, sample_pct, threshold, n_jobs, preprocessing_kwargs)
    120         # from 8 in steps of 2 until upper w
    121         start = 8
--> 122         windows = range(start, profile['upper_window'] + 1)
    123 
    124         # compute the pmp

TypeError: 'float' object is not subscriptable

Looking at the code, I think that maximum_subsequence returns a NaN when it has not found a window (here), which is not the format expected by compute (which expects a dictionary of some sort).

The root cause is probably that the time series under study is quite short (length 15), so I would be happy to understand whether this is a systematic problem for short time series (the fact that no windows are found, not the fact that the exception is raised :-))

To Reproduce

> conda install tsfresh
> pip install matrixprofile
from tsfresh.examples.robot_execution_failures import download_robot_execution_failures, load_robot_execution_failures
download_robot_execution_failures()
df, _ = load_robot_execution_failures()
ts = df[df.id == 1]['F_x'].values

import matrixprofile as mp
mp.compute(ts, threshold=0.0)

Expected behavior
No exception :-)

Desktop (please complete the following information):

  • OS: ubuntu 20.04
  • Version: 1.1.6

Additional context
Related to blue-yonder/tsfresh#785

Please note that if you are using tsfresh's example robot data, one time series consists of all values for a single id (so the full data sample actually contains 88 time series).
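The failure mode described in this report (a NaN float arriving where a dictionary is expected) can be guarded against in calling code. The sketch below is not the library's fix. The helper names are hypothetical, and the shape of result is an assumption taken from the traceback above, which subscripts profile['upper_window'].

```python
import math


def upper_window_or_none(result):
    """Return result['upper_window'] if result is a dict, else None.

    Assumption: the upstream call returns either a dict with an
    'upper_window' key or a bare float NaN when no window was found.
    """
    if isinstance(result, float) and math.isnan(result):
        return None
    if isinstance(result, dict) and "upper_window" in result:
        return result["upper_window"]
    return None


def candidate_windows(result, start=8):
    """Build the window range seen in the traceback, or an empty range."""
    upper = upper_window_or_none(result)
    if upper is None or upper < start:
        return range(0)
    return range(start, upper + 1)


print(list(candidate_windows(float("nan"))))
print(list(candidate_windows({"upper_window": 10})))
```

Returning an empty range lets the caller decide how to handle a series that is too short, instead of failing with a TypeError.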

Motif discovery returns only one motif and the same neighbor multiple times

Describe the bug
When I've tried to compute and discover motifs on my time series data, I noticed that the method mp.discover.motifs always returns only one motif and the same neighbor multiple times.

I've also tried to compute the motifs on the predefined "ecg-heartbeat-av" dataset by following exactly the steps shown in this example from the documentation.
Unfortunately, I encounter the same problem and get results that differ from the example in the documentation.

Also, when I change the parameter k=3, I get the same motif multiple times. Furthermore, no matter which value I set for the parameter max_neighbor, I always get the same single neighbor that many times.

To Reproduce
Steps to reproduce the behavior:
Just follow the steps mentioned in the example from the documentation:

import matrixprofile as mp
ecg = mp.datasets.load('ecg-heartbeat-av')
ts = ecg['data']
profile = mp.compute(ts, windows=150)
profile = mp.discover.motifs(profile, k=1)
mp.visualize(profile)

The motif plot I get looks like:
motifs_neighbors_1
Note that I only get one neighbor, which is returned 10 times (the default value of max_neighbors).
If I change k to 3, for example, I get the same motif three times.

Expected behavior
The plot from the mentioned example:
motifs_neighbors_2
As seen in the plot from the documentation, I would expect to get different neighbors. I would also expect the method to return different motifs when the parameter k is increased.
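The expectation above, distinct motifs as k grows, is usually met by masking an exclusion zone around each motif that has already been picked. The sketch below illustrates that general idea on a plain list of profile distances. It is not the library's implementation, and top_k_motif_starts and its exclusion parameter are hypothetical.

```python
import math


def top_k_motif_starts(profile, k, exclusion):
    """Pick up to k lowest-distance indices, each more than `exclusion` apart."""
    remaining = list(profile)  # copy so the caller's profile is untouched
    picked = []
    if not remaining:
        return picked
    for _ in range(k):
        best = min(range(len(remaining)), key=remaining.__getitem__)
        if math.isinf(remaining[best]):
            break  # everything left has been masked out
        picked.append(best)
        lo = max(0, best - exclusion)
        hi = min(len(remaining), best + exclusion + 1)
        for i in range(lo, hi):
            remaining[i] = math.inf  # mask the neighbourhood of the chosen motif
    return picked


profile = [0.9, 0.1, 0.15, 0.8, 0.7, 0.2, 0.25, 0.9]
print(top_k_motif_starts(profile, k=3, exclusion=1))
```

With exclusion=0 the adjacent indices 1 and 2 are both selected, which is the kind of near-duplicate result that an exclusion zone is meant to prevent.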

Screenshots
See the plots above.

Desktop (please complete the following information):

  • OS: Ubuntu LTS
  • Python 3.8.10

Error installing matrixprofile with pip

Describe the bug
I was trying to use tsfresh and began by installing it with the pip install tsfresh command. During the command execution I got an error installing matrixprofile. I would really love to use tsfresh in my app and deploy it on a Heroku server. However, I can't solve this problem since I cannot install matrixprofile.

To Reproduce
Steps to reproduce the behavior:

  1. Install tsfresh with pip install tsfresh
  2. Check error log and see that it fails on matrixprofile

Expected behavior
The package should install. Instead, the installation fails because of matrixprofile: the error says Microsoft Visual C++ 14.0 or greater is required, and it still does not work after downloading it.
Moreover, I need it to work on a Linux Heroku server, where I cannot easily install anything.

Screenshots
Not applicable (check logs below)

Desktop (please complete the following information):

  • OS: Windows 11
  • Python 3.10.1
  • pip 22.0.3

Additional context

Copy of all the console logs

pip install tsfresh
Collecting tsfresh
Using cached tsfresh-0.19.0-py2.py3-none-any.whl (97 kB)
Requirement already satisfied: dask[dataframe]>=2.9.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (2022.1.1)
Requirement already satisfied: scikit-learn>=0.22.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (1.0.2)
Requirement already satisfied: numpy>=1.15.1 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (1.21.5)
Requirement already satisfied: tqdm>=4.10.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (4.62.3)
Collecting matrixprofile<2.0.0,>=1.1.10
Using cached matrixprofile-1.1.10.tar.gz (331 kB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: requests>=2.9.1 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (2.27.1)
Requirement already satisfied: statsmodels>=0.13 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (0.13.2)
Requirement already satisfied: pandas>=0.25.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (1.3.5)
Requirement already satisfied: cloudpickle in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (2.0.0)
Requirement already satisfied: patsy>=0.4.1 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (0.5.2)
Requirement already satisfied: scipy>=1.2.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (1.7.3)
Collecting distributed>=2.11.0
Using cached distributed-2022.1.1-py3-none-any.whl (830 kB)
Requirement already satisfied: stumpy>=1.7.2 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tsfresh) (1.10.2)
Requirement already satisfied: toolz>=0.8.2 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from dask[dataframe]>=2.9.0->tsfresh) (0.11.2)
Requirement already satisfied: partd>=0.3.10 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from dask[dataframe]>=2.9.0->tsfresh) (1.2.0)
Requirement already satisfied: fsspec>=0.6.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from dask[dataframe]>=2.9.0->tsfresh) (2022.1.0)
Requirement already satisfied: packaging>=20.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from dask[dataframe]>=2.9.0->tsfresh) (21.3)
Requirement already satisfied: pyyaml>=5.3.1 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from dask[dataframe]>=2.9.0->tsfresh) (6.0)
Requirement already satisfied: click>=6.6 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from distributed>=2.11.0->tsfresh) (7.1.2)
Requirement already satisfied: setuptools in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from distributed>=2.11.0->tsfresh) (58.1.0)
Requirement already satisfied: tblib>=1.6.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from distributed>=2.11.0->tsfresh) (1.7.0)
Requirement already satisfied: jinja2 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from distributed>=2.11.0->tsfresh) (3.0.3)
Requirement already satisfied: tornado>=6.0.3 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from distributed>=2.11.0->tsfresh) (6.1)
Requirement already satisfied: msgpack>=0.6.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from distributed>=2.11.0->tsfresh) (1.0.3)
Requirement already satisfied: zict>=0.1.3 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from distributed>=2.11.0->tsfresh) (2.0.0)
Requirement already satisfied: sortedcontainers!=2.0.0,!=2.0.1 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from distributed>=2.11.0->tsfresh) (2.4.0)
Requirement already satisfied: psutil>=5.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from distributed>=2.11.0->tsfresh) (5.9.0)
Requirement already satisfied: matplotlib>=3.0.3 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from matrixprofile<2.0.0,>=1.1.10->tsfresh) (3.5.1)
Collecting protobuf==3.11.2
Using cached protobuf-3.11.2-py2.py3-none-any.whl (434 kB)
Requirement already satisfied: six>=1.9 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from protobuf==3.11.2->matrixprofile<2.0.0,>=1.1.10->tsfresh) (1.16.0)
Requirement already satisfied: pytz>=2017.3 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from pandas>=0.25.0->tsfresh) (2021.3)
Requirement already satisfied: python-dateutil>=2.7.3 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from pandas>=0.25.0->tsfresh) (2.8.2)
Requirement already satisfied: charset-normalizer~=2.0.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from requests>=2.9.1->tsfresh) (2.0.10)
Requirement already satisfied: idna<4,>=2.5 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from requests>=2.9.1->tsfresh) (3.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from requests>=2.9.1->tsfresh) (1.26.8)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from requests>=2.9.1->tsfresh) (2021.10.8)
Requirement already satisfied: joblib>=0.11 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from scikit-learn>=0.22.0->tsfresh) (1.1.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from scikit-learn>=0.22.0->tsfresh) (3.0.0)
Requirement already satisfied: numba>=0.48 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from stumpy>=1.7.2->tsfresh) (0.55.1)
Requirement already satisfied: colorama in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from tqdm>=4.10.0->tsfresh) (0.4.4)
Requirement already satisfied: pillow>=6.2.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from matplotlib>=3.0.3->matrixprofile<2.0.0,>=1.1.10->tsfresh) (9.0.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from matplotlib>=3.0.3->matrixprofile<2.0.0,>=1.1.10->tsfresh) (1.3.2)
Requirement already satisfied: pyparsing>=2.2.1 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from matplotlib>=3.0.3->matrixprofile<2.0.0,>=1.1.10->tsfresh) (3.0.6)
Requirement already satisfied: cycler>=0.10 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from matplotlib>=3.0.3->matrixprofile<2.0.0,>=1.1.10->tsfresh) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from matplotlib>=3.0.3->matrixprofile<2.0.0,>=1.1.10->tsfresh) (4.28.5)
Requirement already satisfied: llvmlite<0.39,>=0.38.0rc1 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from numba>=0.48->stumpy>=1.7.2->tsfresh) (0.38.0)
Requirement already satisfied: locket in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from partd>=0.3.10->dask[dataframe]>=2.9.0->tsfresh) (0.2.1)
Requirement already satisfied: heapdict in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from zict>=0.1.3->distributed>=2.11.0->tsfresh) (1.0.1)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\35191\appdata\local\programs\python\python310\lib\site-packages (from jinja2->distributed>=2.11.0->tsfresh) (2.0.1)
Building wheels for collected packages: matrixprofile
Building wheel for matrixprofile (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [97 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-3.10
creating build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\analyze.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\compute.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\core.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\discover.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\exceptions.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\preprocess.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\transform.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\utils.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\version.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\visualize.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\__init__.py -> build\lib.win-amd64-3.10\matrixprofile
creating build\lib.win-amd64-3.10\tests
copying tests\test_analyze.py -> build\lib.win-amd64-3.10\tests
copying tests\test_annotation_vector.py -> build\lib.win-amd64-3.10\tests
copying tests\test_compute.py -> build\lib.win-amd64-3.10\tests
copying tests\test_core.py -> build\lib.win-amd64-3.10\tests
copying tests\test_cycore.py -> build\lib.win-amd64-3.10\tests
copying tests\test_datasets.py -> build\lib.win-amd64-3.10\tests
copying tests\test_hierarchical_clustering.py -> build\lib.win-amd64-3.10\tests
copying tests\test_io.py -> build\lib.win-amd64-3.10\tests
copying tests\test_mass2.py -> build\lib.win-amd64-3.10\tests
copying tests\test_mpdist.py -> build\lib.win-amd64-3.10\tests
copying tests\test_mpx.py -> build\lib.win-amd64-3.10\tests
copying tests\test_pairwise_dist.py -> build\lib.win-amd64-3.10\tests
copying tests\test_preprocess.py -> build\lib.win-amd64-3.10\tests
copying tests\test_regimes.py -> build\lib.win-amd64-3.10\tests
copying tests\test_scrimp.py -> build\lib.win-amd64-3.10\tests
copying tests\test_skimp.py -> build\lib.win-amd64-3.10\tests
copying tests\test_snippets.py -> build\lib.win-amd64-3.10\tests
copying tests\test_statistics.py -> build\lib.win-amd64-3.10\tests
copying tests\test_stomp.py -> build\lib.win-amd64-3.10\tests
copying tests\test_top_k_discords.py -> build\lib.win-amd64-3.10\tests
copying tests\test_top_k_motifs.py -> build\lib.win-amd64-3.10\tests
copying tests\test_utils.py -> build\lib.win-amd64-3.10\tests
copying tests\test_visualize.py -> build\lib.win-amd64-3.10\tests
copying tests\__init__.py -> build\lib.win-amd64-3.10\tests
creating build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\hierarchical_clustering.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\mass2.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\mpdist.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\mpx.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\pairwise_dist.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\regimes.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\scrimp.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\skimp.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\snippets.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\statistics.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\stomp.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\top_k_discords.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\top_k_motifs.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\__init__.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
creating build\lib.win-amd64-3.10\matrixprofile\datasets
copying matrixprofile\datasets\datasets.py -> build\lib.win-amd64-3.10\matrixprofile\datasets
copying matrixprofile\datasets\__init__.py -> build\lib.win-amd64-3.10\matrixprofile\datasets
creating build\lib.win-amd64-3.10\matrixprofile\io
copying matrixprofile\io\__init__.py -> build\lib.win-amd64-3.10\matrixprofile\io
copying matrixprofile\io_io.py -> build\lib.win-amd64-3.10\matrixprofile\io
creating build\lib.win-amd64-3.10\matrixprofile\io\protobuf
copying matrixprofile\io\protobuf\protobuf_utils.py -> build\lib.win-amd64-3.10\matrixprofile\io\protobuf
copying matrixprofile\io\protobuf\proto_messages_pb2.py -> build\lib.win-amd64-3.10\matrixprofile\io\protobuf
copying matrixprofile\io\protobuf\__init__.py -> build\lib.win-amd64-3.10\matrixprofile\io\protobuf
running egg_info
writing matrixprofile.egg-info\PKG-INFO
writing dependency_links to matrixprofile.egg-info\dependency_links.txt
writing requirements to matrixprofile.egg-info\requires.txt
writing top-level names to matrixprofile.egg-info\top_level.txt
reading manifest file 'matrixprofile.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.csv' under directory '.'
warning: no files found matching '*.json' under directory '.'
adding license file 'LICENSE'
writing manifest file 'matrixprofile.egg-info\SOURCES.txt'
copying matrixprofile\cycore.c -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\cycore.pyx -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\algorithms\cympx.c -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying tests\mpx_ab_mpa.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_ab_mpb.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_ab_mpia.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_ab_mpib.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_mp.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_mpi.txt -> build\lib.win-amd64-3.10\tests
copying tests\muinvn_mua.txt -> build\lib.win-amd64-3.10\tests
copying tests\muinvn_stda.txt -> build\lib.win-amd64-3.10\tests
copying tests\sampledata.txt -> build\lib.win-amd64-3.10\tests
copying tests\scrimp.mp.txt -> build\lib.win-amd64-3.10\tests
copying tests\scrimp.mpi.txt -> build\lib.win-amd64-3.10\tests
copying matrixprofile\algorithms\cympx.pyx -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\io\protobuf\proto_messages.proto -> build\lib.win-amd64-3.10\matrixprofile\io\protobuf
running build_ext
building 'matrixprofile.algorithms.cympx' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for matrixprofile
Running setup.py clean for matrixprofile
Failed to build matrixprofile
Installing collected packages: protobuf, matrixprofile, distributed, tsfresh
Attempting uninstall: protobuf
Found existing installation: protobuf 3.19.4
Uninstalling protobuf-3.19.4:
Successfully uninstalled protobuf-3.19.4
Running setup.py install for matrixprofile ... error
error: subprocess-exited-with-error

× Running setup.py install for matrixprofile did not run successfully.
│ exit code: 1
╰─> [97 lines of output]
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.10
creating build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\analyze.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\compute.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\core.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\discover.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\exceptions.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\preprocess.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\transform.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\utils.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\version.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\visualize.py -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\__init__.py -> build\lib.win-amd64-3.10\matrixprofile
creating build\lib.win-amd64-3.10\tests
copying tests\test_analyze.py -> build\lib.win-amd64-3.10\tests
copying tests\test_annotation_vector.py -> build\lib.win-amd64-3.10\tests
copying tests\test_compute.py -> build\lib.win-amd64-3.10\tests
copying tests\test_core.py -> build\lib.win-amd64-3.10\tests
copying tests\test_cycore.py -> build\lib.win-amd64-3.10\tests
copying tests\test_datasets.py -> build\lib.win-amd64-3.10\tests
copying tests\test_hierarchical_clustering.py -> build\lib.win-amd64-3.10\tests
copying tests\test_io.py -> build\lib.win-amd64-3.10\tests
copying tests\test_mass2.py -> build\lib.win-amd64-3.10\tests
copying tests\test_mpdist.py -> build\lib.win-amd64-3.10\tests
copying tests\test_mpx.py -> build\lib.win-amd64-3.10\tests
copying tests\test_pairwise_dist.py -> build\lib.win-amd64-3.10\tests
copying tests\test_preprocess.py -> build\lib.win-amd64-3.10\tests
copying tests\test_regimes.py -> build\lib.win-amd64-3.10\tests
copying tests\test_scrimp.py -> build\lib.win-amd64-3.10\tests
copying tests\test_skimp.py -> build\lib.win-amd64-3.10\tests
copying tests\test_snippets.py -> build\lib.win-amd64-3.10\tests
copying tests\test_statistics.py -> build\lib.win-amd64-3.10\tests
copying tests\test_stomp.py -> build\lib.win-amd64-3.10\tests
copying tests\test_top_k_discords.py -> build\lib.win-amd64-3.10\tests
copying tests\test_top_k_motifs.py -> build\lib.win-amd64-3.10\tests
copying tests\test_utils.py -> build\lib.win-amd64-3.10\tests
copying tests\test_visualize.py -> build\lib.win-amd64-3.10\tests
copying tests\__init__.py -> build\lib.win-amd64-3.10\tests
creating build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\hierarchical_clustering.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\mass2.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\mpdist.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\mpx.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\pairwise_dist.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\regimes.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\scrimp.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\skimp.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\snippets.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\statistics.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\stomp.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\top_k_discords.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\top_k_motifs.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\algorithms\__init__.py -> build\lib.win-amd64-3.10\matrixprofile\algorithms
creating build\lib.win-amd64-3.10\matrixprofile\datasets
copying matrixprofile\datasets\datasets.py -> build\lib.win-amd64-3.10\matrixprofile\datasets
copying matrixprofile\datasets\__init__.py -> build\lib.win-amd64-3.10\matrixprofile\datasets
creating build\lib.win-amd64-3.10\matrixprofile\io
copying matrixprofile\io\__init__.py -> build\lib.win-amd64-3.10\matrixprofile\io
copying matrixprofile\io\__io.py -> build\lib.win-amd64-3.10\matrixprofile\io
creating build\lib.win-amd64-3.10\matrixprofile\io\protobuf
copying matrixprofile\io\protobuf\protobuf_utils.py -> build\lib.win-amd64-3.10\matrixprofile\io\protobuf
copying matrixprofile\io\protobuf\proto_messages_pb2.py -> build\lib.win-amd64-3.10\matrixprofile\io\protobuf
copying matrixprofile\io\protobuf\__init__.py -> build\lib.win-amd64-3.10\matrixprofile\io\protobuf
running egg_info
writing matrixprofile.egg-info\PKG-INFO
writing dependency_links to matrixprofile.egg-info\dependency_links.txt
writing requirements to matrixprofile.egg-info\requires.txt
writing top-level names to matrixprofile.egg-info\top_level.txt
reading manifest file 'matrixprofile.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.csv' under directory '.'
warning: no files found matching '*.json' under directory '.'
adding license file 'LICENSE'
writing manifest file 'matrixprofile.egg-info\SOURCES.txt'
copying matrixprofile\cycore.c -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\cycore.pyx -> build\lib.win-amd64-3.10\matrixprofile
copying matrixprofile\algorithms\cympx.c -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying tests\mpx_ab_mpa.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_ab_mpb.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_ab_mpia.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_ab_mpib.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_mp.txt -> build\lib.win-amd64-3.10\tests
copying tests\mpx_mpi.txt -> build\lib.win-amd64-3.10\tests
copying tests\muinvn_mua.txt -> build\lib.win-amd64-3.10\tests
copying tests\muinvn_stda.txt -> build\lib.win-amd64-3.10\tests
copying tests\sampledata.txt -> build\lib.win-amd64-3.10\tests
copying tests\scrimp.mp.txt -> build\lib.win-amd64-3.10\tests
copying tests\scrimp.mpi.txt -> build\lib.win-amd64-3.10\tests
copying matrixprofile\algorithms\cympx.pyx -> build\lib.win-amd64-3.10\matrixprofile\algorithms
copying matrixprofile\io\protobuf\proto_messages.proto -> build\lib.win-amd64-3.10\matrixprofile\io\protobuf
running build_ext
building 'matrixprofile.algorithms.cympx' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> matrixprofile

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Gracefully handle zero window size, zero data

Describe the bug
If the time series passed to matrix profile is all zeroes or any constant, or a zero window-size is passed in, errors are raised which do not make the cause obvious.

To Reproduce
profile, figures = mp.analyze(np.zeros(500)) -> TypeError: 'float' object is not subscriptable

dataset = mp.datasets.load('motifs-discords-small')
ts = dataset['data']
profile, figures = mp.analyze(ts, windows=[40, 0])
...
ValueError: could not broadcast input array from shape (876) into shape (875)

Expected behavior
The code should raise a ValueError, or produce some output that is consistent with the bad input.
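
A minimal sketch of the kind of up-front check that would give a clearer error for both cases. The helper name validate_inputs and the exact window bounds are assumptions for illustration, not part of the matrixprofile API:

```python
import numpy as np


def validate_inputs(ts, windows):
    """Hypothetical pre-check; raises ValueError with an explicit cause."""
    ts = np.asarray(ts, dtype=float)
    if ts.ndim != 1 or ts.size == 0:
        raise ValueError("ts must be a non-empty one-dimensional array")
    # A constant series has zero variance, so z-normalized distances
    # between its subsequences are undefined.
    if np.ptp(ts) == 0:
        raise ValueError("ts is constant; the matrix profile is undefined")
    # Illustrative bounds only: a window needs at least 2 points and must
    # be shorter than the series.
    for w in np.atleast_1d(windows):
        if w < 2 or w >= ts.size:
            raise ValueError(
                f"window size {w} must be between 2 and {ts.size - 1}"
            )
    return ts
```

Calling validate_inputs(np.zeros(500), [40]) or validate_inputs(ts, [40, 0]) would then fail with a message that names the actual problem.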

Documentation review

@tylerwmarrs I really love the documentation: I think it's laid out in a very easy-to-follow fashion and captures the necessary details. I realize there are a few sections yet to add that are pretty critical (installation, examples); I had a couple of other thoughts as well:

  1. The quickstart guide could be made a little more user-friendly, primarily through more detail. That said, are we targeting novices who may have only read the Medium post?

  2. There are a few minor grammar changes I'd make.

I'll take a pass through and propose some edits.

Multi-Dimensional Matrix Profiles

Overview
Right now all of the implemented algorithms support only one-dimensional matrix profiles. There have been a handful of requests on our Discord channel asking for multi-dimensional support. This issue is used to keep track of the request and to give the community an opportunity to contribute.

Todos

  • Data structure supporting multiple dimensions
  • Motif discovery
  • Discord discovery
  • Figure out what algorithms should be updated. I'm guessing MPX and SCRIMP++ only?
  • Update analyze and compute higher-level functions to pass off multi-dimensional time series to the right algorithm.
  • Update I/O module

Relevant papers
Matrix Profile VI: Meaningful Multidimensional Motif Discovery. Chin-Chia Michael Yeh, Nickolas Kavantzas, Eamonn Keogh. [pdf] ICDM 2017.

Domain Agnostic Online Semantic Segmentation for Multi-Dimensional Time Series. Shaghayegh Gharghabi, Chin-Chia Michael Yeh, Yifei Ding, Wei Ding, Paul Hibbing, Samuel LaMunion, Andrew Kaplan, Scott E. Crouter, Eamonn Keogh. Data Mining and Knowledge Discovery.

Threshold

When I use mp.compute or mp.analyze, I get the error "NoSolutionPossible: Given the threshold {:.2f}, no window was found. Please try increasing your threshold." How do I adjust it?

Snippet: A Request_and_Probable Bug(?)_and_Elaboration

Hello,

Thanks for creating this amazing package. It helps others to use it for different applications.
I have three things to discuss regarding the snippets:

(1) It would be great if the locations of all snippets could be returned as well. (By that, I mean all the subsequences that are close to a snippet which is considered as the representative of those subsequences).

(2) "Probable" bug: I have hourly data recorded over a year (an 8760-hour time series). I tried to find two snippets (with length 24 and window size 12) and got a result. Then I deleted the first 12 samples (i.e. new_array = array[12:]). Rationally speaking, the snippets should not change, but the results changed drastically. I also tried array[1:], and the result was the same as the first one but shifted by one hour.
I noticed that, in all three cases, the index at which the snippet starts (given in the result) is a multiple of 24 (my snippet size), which seems a little strange.

(3) I tried to read the snippet paper published by Dr. Eamonn Keogh's team. I found the subsequence length m, which is the snippet size. However, I couldn't find the purpose of the window size used in the snippet function in this package. It would be great if you could elaborate on this.

Best,
Nima

minor change?

Hi,

After a long delay, working on my paper and studying git, I pulled and realized Yue Lu had already modified the snippet algorithm.

As I was reviewing the code, I realized that he added a new key "neighbors" to the snippet dictionary; however, the docstring wasn't updated accordingly.

If I find small issues like that again, should I mention them here, or fix them myself (since I already have the code on my local PC) and open a merge request?

Thanks,
Nima

Deploy matrixprofile library through conda

Is your feature request related to a problem? Please describe.
Currently, conda users have no easy way of downloading the matrixprofile library. This is a problem both from an accessibility standpoint, and for various initiatives on the MPF roadmap.

Describe the solution you'd like
matrixprofile can be installed with conda install matrixprofile.

Describe alternatives you've considered
N/A

Additional context
N/A

I'll be taking this task in tandem with @frankiecancino , unless other members of the community would like to help :)

PMP compatibility with string data types.

In the white paper "Matrix Profile XX: Finding and Visualizing Time Series Motifs of All Lengths using the Matrix Profile", section 3 demonstrates the Pan Matrix Profile with string data, however this capability is not included in the current implementation.

This shouldn't be too hard to implement. The example looks for an identical string match, but Hamming distance (or some other distance metric) could be used to measure similarity or dissimilarity between strings. Such a feature avoids the need for Random Projection to find string motifs when the signal contains noise, while simultaneously searching all possible sequence lengths.

The alternative is Random projection through every motif sequence length (N/2 where N is the length of the time series) and all reasonable discrepancy counts.
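
To make the Hamming-distance idea concrete, here is a brute-force sketch of a matrix-profile-style computation over a string for a single subsequence length. The function names and the choice to exclude every overlapping pair are assumptions for illustration; nothing here comes from the library or the paper's implementation:

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def hamming_profile(text, m):
    """For each length-m substring, find its nearest non-overlapping neighbor.

    Returns (profile, index); entries stay None / -1 when a substring has
    no non-overlapping neighbor. Runs in O(n^2 * m) time.
    """
    n = len(text) - m + 1
    if m < 1 or n < 2:
        raise ValueError("text is too short for this subsequence length")
    profile = [None] * n
    index = [-1] * n
    for i in range(n):
        for j in range(n):
            if abs(i - j) < m:  # skip trivial (overlapping) matches
                continue
            d = hamming(text[i:i + m], text[j:j + m])
            if profile[i] is None or d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index
```

A pan-matrix-profile variant would repeat this for each subsequence length of interest; a small distance at position i marks a motif that tolerates a few mismatched characters.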

Issue with Cython -> gcc dependency on Mac install

The re-design currently leverages Cython, which in turn requires openmp for parallelization. The version of gcc installed on Mac is actually clang without openmp, so the code cannot compile properly. I believe updating to a different version of gcc (w/openmp) should do the trick, but unfortunately I can't figure out how to update setup.py to use a different version of gcc (maybe just change environment variables via os.environ?).

Also, we should think through what this means for our Mac customer base. I worry that requiring the installation of a new compiler (no matter how easy) will be an additional step for our user base that will reduce adoption.

Error in question:

(ds_env) Andrews-MacBook-Pro:matrixprofile-ts-redesign andrewvanbenschoten$ python setup.py install
/usr/local/Cellar/python/3.6.5_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running bdist_egg
running egg_info
writing matrixprofile_ts.egg-info/PKG-INFO
writing dependency_links to matrixprofile_ts.egg-info/dependency_links.txt
writing requirements to matrixprofile_ts.egg-info/requires.txt
writing top-level names to matrixprofile_ts.egg-info/top_level.txt
reading manifest file 'matrixprofile_ts.egg-info/SOURCES.txt'
writing manifest file 'matrixprofile_ts.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.13-x86_64/egg
running install_lib
running build_py
running build_ext
building 'matrixprofile.algorithms.cympx' extension
clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/Users/andrewvanbenschoten/code/python3/ds_env/include -I/usr/local/Cellar/python/3.6.5_1/Frameworks/Python.framework/Versions/3.6/include/python3.6m -c matrixprofile/algorithms/cympx.c -o build/temp.macosx-10.13-x86_64-3.6/matrixprofile/algorithms/cympx.o -O3 -march=native -fopenmp
clang: error: unsupported option '-fopenmp'
error: command 'clang' failed with exit status 1
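
One possible direction is to make the OpenMP flags conditional in setup.py instead of always passing -fopenmp. The sketch below is an illustration under stated assumptions: the helper name openmp_args is made up, and the rule "on macOS, enable OpenMP only when CC names a versioned gcc such as gcc-9" is a guess at a reasonable heuristic, not something the project does today:

```python
import os
import sys


def openmp_args(platform=sys.platform, environ=None):
    """Return (extra_compile_args, extra_link_args) for a Cython extension.

    Hypothetical helper, not taken from the project's setup.py.
    """
    environ = os.environ if environ is None else environ
    if platform == "win32":
        # MSVC spells the OpenMP switch differently and needs no link flag.
        return ["/openmp"], []
    if platform == "darwin":
        # Apple's clang (also installed as plain "gcc") rejects -fopenmp.
        # Assumption: a versioned name such as gcc-9 is a real GCC build.
        compiler = os.path.basename(environ.get("CC", ""))
        if not compiler.startswith("gcc-"):
            return [], []
    return ["-fopenmp"], ["-fopenmp"]
```

The returned lists would be passed as extra_compile_args and extra_link_args to the Extension. With empty lists the extension should still build, just without parallelism, so Mac users would not need a new compiler simply to install.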

TravisCI reports successful release on PyPI when it really fails

In some cases, pushing a release tag to Travis results in a false-positive success. The wheel upload command that publishes the release to PyPI always returns success, even when the upload fails. We need to change this so that the pipeline reports a failure when the upload fails.

Here is an example of the command returning an error response from PyPI:
image

A solution could be to wrap the command output in bash and check for an error or not on these lines:

https://github.com/matrix-profile-foundation/matrixprofile/blob/master/.travis.yml#L81
https://github.com/matrix-profile-foundation/matrixprofile/blob/master/.travis.yml#L95
https://github.com/matrix-profile-foundation/matrixprofile/blob/master/.travis.yml#L109
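
The issue suggests a bash wrapper; the same idea is sketched here in Python instead. The function name run_checked and the ERROR_MARKERS strings are assumptions, and the markers would need to be matched against what the upload tool actually prints on failure:

```python
import subprocess
import sys

# Assumed substrings that indicate a failed upload; adjust to the real output.
ERROR_MARKERS = ("HTTPError", "ERROR")


def run_checked(cmd):
    """Run cmd and return a nonzero code if it fails or prints an error marker."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    output = result.stdout + result.stderr
    sys.stdout.write(output)  # keep the tool's output visible in the CI log
    if result.returncode != 0:
        return result.returncode
    if any(marker in output for marker in ERROR_MARKERS):
        return 1
    return 0
```

In the pipeline, the upload step would call sys.exit(run_checked([...])) with the existing upload command as the argument list, so a failed upload marks the build as failed.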

The paper of mpx

Can someone share the paper of mpx?

I can't find the related papers.

Thank you!!!

Documentation - Snippets Tutorial

Snippets algorithm exists in the library, but no example of how to use it exists in the documentation. The tutorial should cover the following:

  1. High-level layman overview of a "snippet"
  2. A practical use case (real-world data)

Annotation Vectors

Annotation vectors provide a way to "filter" a matrix profile to avoid noisy results. In essence, an annotation vector is a vector containing values ranging from 0 to 1 where 0 means "less desirable" and 1 means "very desirable". The length of the annotation vector is the same as the matrix profile and the annotation vector is applied to a matrix profile to create a "corrected matrix profile" (CMP).
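
As a rough sketch of the correction step, assuming the formulation from the annotation vector paper linked in this issue (CMP = MP + (1 - AV) * max(MP)). The function name is hypothetical and this is not the library's implementation:

```python
import numpy as np


def corrected_matrix_profile(mp, av):
    """Apply an annotation vector to a matrix profile (illustrative only)."""
    mp = np.asarray(mp, dtype=float)
    av = np.asarray(av, dtype=float)
    if mp.shape != av.shape:
        raise ValueError("mp and av must have the same length")
    if av.min() < 0 or av.max() > 1:
        raise ValueError("av values must lie in [0, 1]")
    # Where av is 1 the profile is unchanged; where av is 0 the distance is
    # pushed up by max(mp), so that location is unlikely to win as a motif.
    return mp + (1.0 - av) * np.max(mp)
```

For example, with mp = [1, 2, 4] and av = [1, 0.5, 0], the corrected profile is [1, 4, 8].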

Describe the solution you'd like

  • There should be some default annotation vectors to apply.
  • Where does this fit into the current model? Is it simply a utility? We could introduce a new "transform" concept.
  • The annotation vector should be stored alongside the MP data structure.

Example code with new "transform" concept:

import matrixprofile as mp
ts = None # load some data
w = 100 # the window size
profile = mp.compute(ts, windows=w)
av = None # some default annotation vector or user defined one
profile = mp.transform.apply_av(profile, av)

# a different idea is to just put this in as a utility function
profile = mp.utils.apply_av(profile, av)

The original profile data structure is returned with an additional "av" and "cmp" key in both cases.

Where would we actually store some default annotation vectors? These can be derived from a time series. For example, our Golang library has annotation vectors implemented here:

https://github.com/matrix-profile-foundation/go-matrixprofile/blob/master/av/annotation_vector.go

Additional context

You can find the paper describing annotation vectors here:
https://www.cs.ucr.edu/~eamonn/guided-motif-KDD17-new-format-10-pages-v005.pdf

Produce a source distribution file to pypi (tar.gz format)

Is your feature request related to a problem? Please describe.
In some corporate environments, it's common to import external libraries into an internal repository for internal dependency resolution. In order to import matrixprofile for use, a tar.gz file is needed similar to the python libraries requests or pytest.

Describe the solution you'd like
A tar.gz file uploaded to pypi as part of the release. See requests or pytest as examples. Links below.

Describe alternatives you've considered
None.

Additional context

Discover - Clusters (KMedoids, DBScan, Hierarchical)

Is your feature request related to a problem? Please describe.
MPDist provides a distance between a pair of time series. By itself, it is not really useful. It would be ideal to have a way to actually cluster the time series.

Describe the solution you'd like
Taking the MPDist and a clustering algorithm, such as KMedoids, will provide more insight into many time-series data. The goal is to expose a new "discover" method to provide clustering.

My initial thinking is a function call as follows:

mp.discover.clusters(timeseries, algorithm="alg", **kwargs)

Where timeseries is a list of time series objects and algorithm is one of the algorithms listed in the title. More thought is required to determine the other keyword arguments, the output of the function, and how to visualize the results.

Additional context
MPDist is currently implemented. You may find the documentation here.

KMedoids algorithm
DBScan algorithm
Hierarchical algorithm

In the MPDist paper, the authors utilize hierarchical clustering through Matlab. Scipy does offer similar functionality.

Here is a link to the MPDist paper.
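
A minimal sketch of the hierarchical option using SciPy with a precomputed distance matrix. The function names are hypothetical, and the euclidean function is only a stand-in so the example runs on its own; the intent is that an MPDist function would be passed as distance instead:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_series(series, distance, n_clusters):
    """Cluster a list of time series given a pairwise distance function."""
    n = len(series)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = distance(series[i], series[j])
    # linkage expects the condensed (upper-triangle) form of the matrix.
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="average")
    # Cut the tree so that at most n_clusters flat clusters remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def euclidean(a, b):
    """Placeholder distance for equal-length series; not MPDist."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
```

The returned array holds one cluster label per input series. Because only the distance matrix is needed, the same matrix could also feed a KMedoids or DBScan implementation that accepts precomputed distances.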

pip install matrixprofile forces protobuf==3.11.2

Describe the bug
We need protobuf > 3.11.2 for other dependencies.

To Reproduce
Steps to reproduce the behavior:

  1. pip install matrixprofile
  2. pip install --upgrade protobuf
    This installs protobuf v3.17.3 as expected, but pip also reports: "matrixprofile 1.1.10 requires protobuf==3.11.2, but you have protobuf 3.17.3 which is incompatible." This happens even though matrixprofile/requirements.txt in v1.1.10 says protobuf>=3.11.2,<4.0.0
  3. pip install matrixprofile
    This uninstalls protobuf and installs protobuf==3.11.2, which is the wrong behavior

Expected behavior
I expect protobuf to be left alone, since the stated requirement for matrixprofile is just protobuf>=3.11.2,<4.0.0

Additional context
I believe the problem is that matrixprofile/setup.py required protobuf==3.11.2 until March 8, 2021, and the latest release was released on January 16, 2021. I think the bug would be fixed by creating another release.

pip installing matrixprofile from the commit where setup.py relaxed the protobuf version requirement resolves the issue

Documentation - API - Return output is poorly formatted

Describe the bug
Some of the output in the documentation is formatted poorly. This is because it is a verbose dictionary output. While this issue only covers one example, there are several places where the API docs need to be cleaned up.

To Reproduce
Steps to reproduce the behavior:

Simply browse to the documentation. An example issue can be found here:

http://matrixprofile.docs.matrixprofile.org/api.html#matrixprofile-discover-statistics

Expected behavior
The documentation should render the appropriate dictionary. I think adding code blocks will make this problem go away.

Screenshots
image

Strict dependency on protobuf 3.11.2

Describe the bug

There still seems to be a strict dependency on protobuf 3.11.2, even though this commit seems to try to remove that strict version dependency: 37173af

I'm not able to lock the version of matrixprofile inside a pipenv with another library which requires protobuf >= 3.12.0

To Reproduce
Steps to reproduce the behavior:

  1. Download the latest Python 3.8 matrixprofile linux wheel.
  2. Unzip it.
  3. Look inside the file matrixprofile-1.1.10.dist-info/METADATA
  4. Observe the line Requires-Dist: protobuf (==3.11.2)
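
The manual steps above can also be scripted. This is a small sketch using only the standard library; the function names are made up for illustration, and the wheel path is whatever file was downloaded in step 1:

```python
import re
import zipfile
from email.parser import Parser


def read_metadata(wheel):
    """Return the METADATA text from a wheel (a path or a file-like object)."""
    with zipfile.ZipFile(wheel) as zf:
        name = next(n for n in zf.namelist() if n.endswith(".dist-info/METADATA"))
        return zf.read(name).decode("utf-8")


def requires_dist(metadata_text):
    """Return all Requires-Dist entries; METADATA uses email-style headers."""
    return Parser().parsestr(metadata_text).get_all("Requires-Dist") or []


def exact_pins(requirements, package):
    """Return the entries for `package` that use an exact '==' pin."""
    pins = []
    for req in requirements:
        name = re.match(r"[A-Za-z0-9._-]+", req)
        if name and name.group(0).lower() == package.lower() and "==" in req:
            pins.append(req)
    return pins
```

Running exact_pins(requires_dist(read_metadata(path)), "protobuf") on the affected wheel should report the protobuf (==3.11.2) entry, while a wheel built after the requirement was relaxed should return an empty list.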

Desktop (please complete the following information):

  • OS: Ubuntu 20.10
  • Version: Python 3.8, matrixprofile 1.1.10

Windows Subsystem for Linux - Core Dumped

Describe the bug
When running any Cython code in WSL it errors out with "Illegal instruction (core dumped)". This is due to major differences in the binary wheel of a true Linux versus WSL.

To Reproduce
Steps to reproduce the behavior:

  1. install matrixprofile via pip on WSL
  2. try to run the following
import matrixprofile as mp

ds = mp.datasets.load('motifs-discords-small')
ts = ds['data']
profile = mp.compute(ts)

Result is an error:

Illegal instruction (core dumped)

Expected behavior
This should give the profile as expected.

Desktop (please complete the following information):

  • OS: Windows 10 64 bit
  • WSL: Ubuntu - 4.4.0-18362-Microsoft #476-Microsoft Fri Nov 01 16:53:00 PST 2019 x86_64 x86_64 x86_64 GNU/Linux

Workaround
Installing from source works fine.

sudo apt install build-essential
git clone git@github.com:matrix-profile-foundation/matrixprofile.git
cd matrixprofile
pip install -e .

`matrixprofile.analyze()` raises AttributeError with `matplotlib==3.5.0`

Running matrixprofile.analyze() causes AttributeError: 'NoneType' object has no attribute 'get_subplotspec' when matplotlib version 3.5.0 is installed. All plots seem to be shown correctly, but due to the exception, the function never returns. Upgrading to matplotlib 3.5.1 resolves the issue.

The matrixprofile package lists matplotlib>=3.0.3 in requirements.txt. I haven't tested any other versions for this issue, but it might be worth adding matplotlib!=3.5.0 to the package requirements if this is a matplotlib bug and can't be fixed in matrixprofile itself.

To Reproduce
Environment setup:

python3.9 -m venv env
source env/bin/activate
pip install --upgrade pip
pip install matrixprofile ipykernel matplotlib==3.5.0

Run the following in an IPython notebook:

import numpy as np
import matrixprofile as mp

np.random.seed(1)
vals = np.random.randn(500)
profile, figures = mp.analyze(vals)

It should throw an AttributeError. Then, restart the kernel, upgrade matplotlib with pip install matplotlib==3.5.1, and rerun the cell (it should run successfully).

Expected behavior
matrixprofile.analyze() returns without any errors on both matplotlib 3.5.0 and 3.5.1.

Desktop

  • OS: MacOS 12.0.1
  • Python: 3.9.6

Update Readme about Python 2 Support

With the majority of the Python community dropping Python 2 support, we need to clarify that we support Python 2 through version X. It is no longer maintained after.

Restrict Motif discovery to subsequences starting at specific locations

When I am analyzing data that has daily fluctuations, I create an annotation vector that is 1 at midnight of each day and 0 everywhere else. This helps prioritize subsequences that start at midnight so that each set of motifs has the same 24-hour structure.

The issue is that applying an annotation vector does not prevent the motif algorithm from picking a motif pair where one starts at midnight and the other does not. A new mechanism would have to be defined to restrict these.

Also distance profiles that are calculated inside the motif algorithm do not apply the annotation vector. This could be added and triggered when use_cmp = True without any new mechanisms.

I can write a custom motif finding code that does this, but if others would like the functionality I'd be happy to contribute.
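
A rough sketch of the restriction, assuming the matrix profile and its index array are already available as NumPy arrays. The function name and the boolean allowed mask are hypothetical, and this only filters an existing profile rather than recomputing distance profiles:

```python
import numpy as np


def restricted_motif_pair(mp, mpi, allowed):
    """Return the best motif pair whose two members both start at allowed indices.

    mp: matrix profile distances; mpi: nearest-neighbor indices;
    allowed: boolean mask, True where a subsequence may start.
    Returns (i, j), or None if no pair qualifies.
    """
    mp = np.asarray(mp, dtype=float)
    mpi = np.asarray(mpi, dtype=int)
    allowed = np.asarray(allowed, dtype=bool)
    # Keep index i only if i and its nearest neighbor are both allowed.
    valid = allowed & allowed[mpi]
    if not valid.any():
        return None
    masked = np.where(valid, mp, np.inf)
    i = int(np.argmin(masked))
    return i, int(mpi[i])
```

One limitation: if the nearest neighbor of an allowed subsequence is not allowed, that subsequence is dropped even when an allowed second-nearest neighbor exists. Handling that case properly would mean restricting the distance profiles themselves, which is the new mechanism described above.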

fail to install by source code setup.py

python setup.py install

logs

running install
running bdist_egg
running egg_info
writing matrixprofile.egg-info/PKG-INFO
writing dependency_links to matrixprofile.egg-info/dependency_links.txt
writing requirements to matrixprofile.egg-info/requires.txt
writing top-level names to matrixprofile.egg-info/top_level.txt
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.csv' under directory '.'
warning: no files found matching '*.json' under directory '.'
writing manifest file 'matrixprofile.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.9-x86_64/egg
running install_lib
running build_py
copying matrixprofile/version.py -> build/lib.macosx-10.9-x86_64-3.7/matrixprofile
running build_ext
building 'matrixprofile.algorithms.cympx' extension
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/zhoucheng/anaconda3/include -arch x86_64 -I/Users/zhoucheng/anaconda3/include -arch x86_64 -I/usr/local/opt/openblas/include -I/Users/zhoucheng/anaconda3/lib/python3.7/site-packages/numpy/core/include -I/Users/zhoucheng/anaconda3/lib/python3.7/site-packages/numpy/core/include -I/Users/zhoucheng/anaconda3/include/python3.7m -c matrixprofile/algorithms/cympx.c -o build/temp.macosx-10.9-x86_64-3.7/matrixprofile/algorithms/cympx.o -O2 -fopenmp
clang: error: unsupported option '-fopenmp'
error: command 'gcc' failed with exit status 1

Multiprocessing

Describe the bug
n_jobs argument not passed to "maximum_subsequence" and "skimp" in "Analyze_pmp"

To Reproduce

  1. call mp.analyze with a time series several times, using a different n_jobs value each time
  2. monitor cpu utilization on a multi-core system
  3. whatever the n_jobs value, the task uses a single core, and execution time is the same
  4. See error

Expected behavior

  1. if n_jobs >1, several cores are used
  2. n_jobs has an impact on execution time

Root cause
n_jobs argument not passed to "maximum_subsequence" and "skimp" in "Analyze_pmp"

Solution

Add "n_jobs=n_jobs" to the calls:

    if isinstance(windows, type(None)):
        profile = maximum_subsequence(ts, threshold, include_pmp=True, n_jobs=n_jobs)

    profile = skimp(ts, windows=windows, sample_pct=sample_pct,
                    pmp_obj=profile, n_jobs=n_jobs)

Result
For the same series, on the same computer:
Execution time of "mp.analyze"
n_jobs = 12: 1158.7 s.
n_jobs = 8: 1068.4 s.
n_jobs = 4: 1069.6 s.
n_jobs = 1: 1632.3 s

Publish a version to pip that includes `mstomp` algorithm.

Is your feature request related to a problem? Please describe.
The most recent version of matrixprofile on pip is from January 2021. Since then, the mstomp algorithm was added to the official repo, but it is not available via pip.

Describe the solution you'd like
Could you publish a newer version to PyPI that includes the mstomp algorithm mentioned in this issue?

Describe alternatives you've considered
I tried installing directly from github with pip install git+https://github.com/matrix-profile-foundation/matrixprofile, but this results in an error: clang: error: unsupported option '-fopenmp'. I don't know very much about the PIP installer process, but I could probably look into this error more closely if it's helpful.

Additional context
Many thanks for taking a look!
