Code Monkey home page Code Monkey logo

shaplets-python's Introduction

shaplets

Python implementation of the Learning Time-Series Shapelets method by Josif Grabocka et al., that learns a shapelet-based time-series classifier with gradient descent.

This implementation views the model as a layered graph, where each layer implements a forward, backword and parameters update methods (see below diagram). This abstraction simplifies thinking about the algorithm and implementing it. Network diagram

Differences from the paper

  • This implmenetation employs two (LinearLayer + SigmoidLayer) pairs instead of one (LinearLayer + SigmoidLayer) pair as in the paper (and shown in above diagram). This (using two pairs) has yielded improved results on some datasets. To have a similar setup as in the paper, simply update shapelets_lts/classification/shapelet_models.py:LtsShapeletClassifier._init_network().
  • The loss in this implementation is an updated version of the one in the paper to allow training a single model for all the classes in the dataset (rather than one model/class). The impact on performance was not analysed. For details check shapelets_lts/network/cross_entropy_loss_layer.py

Installation

git clone [email protected]:mohaseeb/shaplets-python.git
cd shaplets-python
pip install .
# or, for dev
# pip install .[dev]

Usage

from shapelets_lts.classification import LtsShapeletClassifier

# create an LtsShapeletClassifier instance
classifier = LtsShapeletClassifier(
    K=20,
    R=3,
    L_min=30,
    epocs=50,
    lamda=0.01,
    eta=0.01,
    shapelet_initialization='segments_centroids',
    plot_loss=True
)

# train the classifier. 
# train_data.shape -> (# train samples X time-series length) 
# train_label.shape -> (# train samples)
classifier.fit(train_data, train_label, plot_loss=True)

# evaluate on test data. 
# test_data.shape -> (# test samples X time-series length)
prediction = classifier.predict(test_data)

# retrieve the learnt shapelets
shapelets = classifier.get_shapelets()


# and plot sample shapelets
from shapelets_lts.util import plot_sample_shapelets
plot_sample_shapelets(shapelets=shapelets, sample_size=36)

Also have a look at example.py. For a stable training, the samples might need to be scaled.

Example plot from plot_sample_shapelets. sample_shapelets

shaplets-python's People

Contributors

mohaseeb avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

shaplets-python's Issues

Algorithm fails when time series is not scaled

Hi,

When I try to run the algorithm on my timeseries dataset which contains raw numbers, I get the below warning:
line 67 in soft_min_layer.py
*** RuntimeWarning: invalid value encountered in double_scalars

I believe its because negative exponential of large number is converted to zero when using numpy's exponent method. As I do not prefer to scale the data, is there any other work around ?

I tried to replace numpy's exp with mpmath's exp method, but I still get the same warning.

Doesn't start the epoch 0

When I use huge dataset(3000 rows, 80 columns) the program doesn't start the epoch 0 also, what is the reason? Any fix?

will it work for multivariate time series prediction both regression and classification

great code thanks
may you clarify :
will it work for multivariate time series prediction both regression and classification
1
where all values are continues values
weight height age target
1 56 160 34 1.2
2 77 170 54 3.5
3 87 167 43 0.7
4 55 198 72 0.5
5 88 176 32 2.3

2
or even will it work for multivariate time series where values are mixture of continues and categorical values
for example 2 dimensions have continues values and 3 dimensions are categorical values

color        weight     gender  height  age  target 

1 black 56 m 160 34 yes
2 white 77 f 170 54 no
3 yellow 87 m 167 43 yes
4 white 55 m 198 72 no
5 white 88 f 176 32 yes

Normalizing the dataset results in a single class prediction

I have used your implementation on my own dataset. It seems to work fine when the data is in its original format. However, whenever I normalize or standardize the data, it always predicts only one class (0 in my case), i.e., all the predictions are 0. I cannot seem to understand why is that. Do you have any idea?

Prediction accuracy for example.py

Hello,

I have ran example.py multiple times (200 epoch for training) for the GUN_point data set. I standardized the training/test data, and changed the code make sure it outputs a prediction label of 0 or 1.

But I can only get an accuracy of around 75%. Am I missing anything?

location shapelets

Hi,
is there a way to get the location of the shapelets with this library?

Using Shapelets Just for Feature Extraction

Great work on this library by the way! I am writing to inquire about how I could best apply shapelet transformation to my use case. If I simply wanted to get the transformed shaplets features on a multivariate time series (cross sectional/panel data), what would I do? What I think I would like to do is to do a shapelet transformation to generate features from a time series to then feed in a keras deep neural network for traditional binary classification. Does this sound advisable to you or would you recommend a different approach? Thanks again, John

availablity of shapelet clustering

hi there,

thanks for sharing the implementation, is it possible to provide the implementation of shapelet clustering in addition to classification?

best,

pandaa

Time Series of different length

Hey there, could you state if your implementation works on a dataset of time series that are of different lenght? Nice work btw. :)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.