
robuststl's Introduction

RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series (AAAI 2019)

This repository contains a Python (3.5.2) implementation of RobustSTL (paper).

Decomposing complex time series into trend, seasonality, and remainder components is an important task to facilitate time series anomaly detection and forecasting.
RobustSTL extracts the trend using LAD loss with sparse regularization, and the seasonality using non-local seasonal filtering.
Compared to previous approaches (such as traditional STL), RobustSTL has advantages in

  1. ability to handle seasonality fluctuation and shift, and abrupt changes in trend and remainder
  2. robustness to data with anomalies
  3. applicability to time series with long seasonality periods.

Requirements & Run

First, install the required libraries using pip, then run the example script.

pip3 install -r requirments.txt
python3 main.py

Sample Results

We generate a synthetic sample (sample_generator.py) and decompose it into trend, seasonality, and remainder. run_example.ipynb contains example code for using RobustSTL, along with its outputs.

Code

  • main.py: runs the example code
  • RobustSTL.py: contains the RobustSTL algorithm and each of its steps
  • utils.py: common utility functions
  • sample_generator.py: code for generating the synthetic sample
  • l1.py: optimizer for l1-norm approximation (source: cvxopt)

Arguments of RobustSTL

  • input : input series
  • season_len : length of the seasonal period
  • reg1 : first-order regularization parameter for trend extraction
  • reg2 : second-order regularization parameter for trend extraction
  • K : number of past seasons sampled in seasonality extraction
  • H : size of the neighborhood used in seasonality extraction
  • dn1, dn2 : hyperparameters of the bilateral filter in the denoising step
  • ds1, ds2 : hyperparameters of the bilateral filter in the seasonality extraction step
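To give a feel for what the two bilateral-filter hyperparameters control, here is a minimal sketch of a generic 1-D bilateral filter. It is not the repository's implementation: the function name and the parameters half_window, sigma_d, and sigma_i are hypothetical, and treating them as analogues of H, dn1, and dn2 is an assumption.

```python
import numpy as np

def bilateral_filter_1d(series, half_window=3, sigma_d=1.0, sigma_i=1.0):
    """Generic 1-D bilateral filter (illustrative, not the repository's code)."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    smoothed = np.empty(n)
    for t in range(n):
        # Neighborhood around t, clipped at the series boundaries.
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        idx = np.arange(lo, hi)
        # One weight for closeness in time, one for closeness in value.
        w_time = np.exp(-((idx - t) ** 2) / (2.0 * sigma_d ** 2))
        w_value = np.exp(-((series[idx] - series[t]) ** 2) / (2.0 * sigma_i ** 2))
        weights = w_time * w_value
        # The center point always has weight 1, so the denominator is never zero.
        smoothed[t] = np.sum(weights * series[idx]) / np.sum(weights)
    return smoothed
```

A larger sigma_d averages over points farther away in time, while a smaller sigma_i makes the filter ignore neighbors whose values differ a lot, which is what lets it smooth noise without blurring abrupt level changes.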

Shape of input sample

RobustSTL is designed for a univariate time series sample.
However, this code also accepts multivariate time series samples: it applies the algorithm to each series independently, using multiprocessing. All series must have the same length in time.

  • Univariate Time Series: [Time] or [Time,1]
  • Multivariate Time Series: [N, Time] or [N, Time, 1]
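The four accepted layouts can be told apart by rank and trailing dimension. The sketch below shows one way to split an input into a list of 1-D series; to_series_list is a hypothetical helper written for illustration, not a function from this repository.

```python
import numpy as np

def to_series_list(sample):
    """Split an input array into a list of 1-D series (hypothetical helper)."""
    arr = np.asarray(sample, dtype=float)
    if arr.ndim == 1:                          # [Time]
        return [arr]
    if arr.ndim == 2 and arr.shape[1] == 1:    # [Time, 1]
        return [arr[:, 0]]
    if arr.ndim == 2:                          # [N, Time]
        return [row for row in arr]
    if arr.ndim == 3 and arr.shape[2] == 1:    # [N, Time, 1]
        return [row[:, 0] for row in arr]
    raise ValueError("unsupported input shape: %r" % (arr.shape,))
```

Note the ambiguity this rule has to resolve: a 2-D array with a trailing dimension of 1 is read here as [Time, 1], so N series of length 1 would be misread as a single series.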

Etc

The original paper has wrong notation in seasonality extraction.
The difference is log

robuststl's People

Contributors

leedoyup


robuststl's Issues

Trend seems to depend on the first value of sample

Hi,
I tested your great code with other samples and found that the trend always depends heavily on the first value of the sample.
Note that since my sample is non-seasonal, I manually set seasons_tilda to zeros.
Do you have any idea why?

Thanks

Update the numpy requirement

Today, I installed Anaconda and updated all of its packages. I couldn't install with numpy==1.14.1 because newer dependencies caused failures. Changing this to numpy==1.18.1 in requirments.txt fixed the problem.

OS: MacOS 10.15.4 (19E287)

Any idea about this error?

Hello.

It would be great if you could help.

C:\XYZ\XYZ\RobustSTL.py:54: RuntimeWarning: invalid value encountered in double_scalars
season_value = np.sum(weight_sample * weights)/np.sum(weights)
[!] 2 iteration will strat

Intel MKL ERROR: Parameter 7 was incorrect on entry to DGELS.
Traceback (most recent call last):
File "", line 2, in
File "", line 16, in main
File "C:\XYZ\XYZ\RobustSTL.py", line 121, in RobustSTL
return _RobustSTL(input, season_len, reg1, reg2, K, H, dn1, dn2, ds1, ds2)
File "C:\XYZ\XYZ\RobustSTL.py", line 97, in _RobustSTL
trend_extraction(denoise_sample, season_len, reg1, reg2)
File "C:\XYZ\XYZ\RobustSTL.py", line 36, in trend_extraction
delta_trends = l1(P,q)
File "C:\XYZ\XYZ\l1.py", line 41, in l1
lapack.gels(+P, uls)
ValueError: -7

All the best

A.B.

Certificate of primal infeasibility found.

Hello,
I tried to use the RobustSTL() function to decompose a time series, but it returned an error:

return sol['x'][:n]

TypeError: 'NoneType' object is not subscriptable.

I then found that this error comes from "sol = solvers.conelp(c, Fi, h, dims, kktsolver = Fkkt,
primalstart={'x': x0, 's': s0}, dualstart={'z': z0})", which reports "Certificate of primal infeasibility found.".
Is there any way to solve this problem? I look forward to hearing from you, thanks.
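The TypeError arises because cvxopt's conelp returns a result dictionary whose 'x' entry is None when no solution is found, and the code then indexes it. The sketch below does not fix the infeasibility itself; it only shows how the failure could be surfaced with a readable message. extract_solution is a hypothetical helper, not part of l1.py.

```python
def extract_solution(sol, n):
    """Return the first n entries of sol['x'], or raise a readable error.

    `sol` is assumed to be the dictionary returned by cvxopt's solvers.conelp,
    which carries a 'status' string alongside the solution entries.
    """
    if sol.get("x") is None:
        raise RuntimeError(
            "solver returned no solution (status: %r)" % sol.get("status")
        )
    return sol["x"][:n]
```

With this check, an infeasible problem fails with the solver status in the message instead of "'NoneType' object is not subscriptable".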

Time series reconstruction

I am trying to run the reverse pipeline of the RobustSTL method, that is, reconstructing the original time series from the three components (trend, seasonality, residual). Did you implement this by any chance?
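If the decomposition is additive (series = trend + seasonality + remainder), which is an assumption here about how the three returned components relate to the input, reconstruction is just an element-wise sum. A minimal sketch with a hypothetical helper name:

```python
import numpy as np

def reconstruct(trend, seasonality, remainder):
    """Rebuild a series from its components, assuming an additive model."""
    trend, seasonality, remainder = (
        np.asarray(c, dtype=float) for c in (trend, seasonality, remainder)
    )
    if not (trend.shape == seasonality.shape == remainder.shape):
        raise ValueError("all three components must have the same shape")
    return trend + seasonality + remainder
```

Comparing the result against the original input with np.allclose is a quick check of whether the additive assumption holds for a given run.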

issue from l1.py --- ValueError: Rank(A) < p or Rank([G; A]) < n

I have encountered this issue and cannot resolve it.

Run results in PyCharm

 pcost       dcost       gap    pres   dres   k/t

0: 1.3224e+15 2.7939e+01 8e+14 8e-05 2e+00 1e+00
Traceback (most recent call last):
File "C:\Users\Admin\PycharmProjects\pythonProject\venv\lib\site-packages\cvxopt\coneprog.py", line 1067, in conelp
f3 = kktsolver(W)
File "C:\Users\Admin\PycharmProjects\RobustSTL\l1.py", line 25, in Fkkt
lapack.potrf(A)
ArithmeticError: 100

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:/Users/Admin/PycharmProjects/RobustSTL/main.py", line 42, in
main()
File "C:/Users/Admin/PycharmProjects/RobustSTL/main.py", line 23, in main
result = RobustSTL(sample_list[0], 50, reg1=10.0, reg2= 0.5, K=2, H=5, dn1=1., dn2=1., ds1=50., ds2=1.)
File "C:\Users\Admin\PycharmProjects\RobustSTL\RobustSTL.py", line 121, in RobustSTL
return _RobustSTL(input, season_len, reg1, reg2, K, H, dn1, dn2, ds1, ds2)
File "C:\Users\Admin\PycharmProjects\RobustSTL\RobustSTL.py", line 97, in _RobustSTL
trend_extraction(denoise_sample, season_len, reg1, reg2)
File "C:\Users\Admin\PycharmProjects\RobustSTL\RobustSTL.py", line 36, in trend_extraction
delta_trends = l1(P,q)
File "C:\Users\Admin\PycharmProjects\RobustSTL\l1.py", line 55, in l1
sol = solvers.conelp(c, Fi, h, dims, kktsolver = Fkkt,
File "C:\Users\Admin\PycharmProjects\pythonProject\venv\lib\site-packages\cvxopt\coneprog.py", line 1080, in conelp
raise ValueError("Rank(A) < p or Rank([G; A]) < n")
ValueError: Rank(A) < p or Rank([G; A]) < n

typo in run_example.ipynb

Running your run_example.ipynb, I found a typo:

from sample_generater import *

should be

from sample_generator import *

Bye

memory error

I got this error when using the RobustSTL function on other data. Can anyone help, please?
MemoryError: Unable to allocate 1.54 GiB for an array with shape (14399, 14399) and data type float64
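The size in the message follows directly from the array shape: a dense square float64 matrix needs 8 bytes per entry, so memory grows with the square of the series length. A small sketch of the arithmetic (dense_matrix_gib is a hypothetical helper):

```python
def dense_matrix_gib(n_rows, n_cols, bytes_per_value=8):
    """Memory needed for a dense matrix, in GiB (float64 uses 8 bytes)."""
    return n_rows * n_cols * bytes_per_value / 2 ** 30

# 14399 * 14399 * 8 bytes = 1,658,649,608 bytes
print(round(dense_matrix_gib(14399, 14399), 2))  # 1.54
```

Doubling the series length roughly quadruples the allocation, so shortening or downsampling the input is the most direct way to stay within memory, assuming the rest of the pipeline is unchanged.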

How is RobustSTL used for prediction?

The seasonal component in STL is fixed, but in RobustSTL the seasonal value at the current position depends on the values in previous cycles.
The current value is used when calculating the weights, but it is not available at prediction time.

So after the decomposition, how should we make predictions?

Is there currently any way to deal with a trend-only signal?

Not all signals necessarily have a seasonal component; some have only a trend. In this case, how do we set the <seasonal_length> parameter?

For example, what are the right settings for the classical airline passengers data set? I tried 12 as the seasonal_length, but RobustSTL confounds trend and seasonality!

unreachable code

I was looking through the code base and have a question: can you please tell me whether this line is reachable or not?
Since it runs inside an infinite loop, there must be a break statement for control to leave the loop, but I cannot find any such break in the code.

If I'm missing something, can you please point it out to me?

Thanks a lot in advance.

Manish Devgan

Hi, DoYup! Function get_relative_trends has a bug when summing delta_trends


def get_relative_trends(delta_trends):
    init_value = np.array([0])
    idxs = np.arange(len(delta_trends))
    relative_trends = np.array(list(map(lambda idx: np.sum(delta_trends[:idx]), idxs)))
    relative_trends = np.concatenate([init_value, relative_trends])
    return relative_trends

In the line

 relative_trends = np.array(list(map(lambda idx: np.sum(delta_trends[:idx]), idxs)))

the expression

np.sum(delta_trends[:idx])

should be

np.sum(delta_trends[:idx+1])

because idxs is [0, 1, 2, 3, 4, 5, ...].
When idx = 0, np.sum(delta_trends[:idx]) is zero, so relative_trends always begins with [0, 0, ...].

This bug delays the trend by one time step.
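The proposed fix is equivalent to a cumulative sum with a leading zero, which can be written without the map over indices. This is a sketch of that alternative, not a patch taken from the repository; the function name is hypothetical.

```python
import numpy as np

def get_relative_trends_cumsum(delta_trends):
    """Relative trend as the running sum of first differences, starting at 0."""
    delta_trends = np.asarray(delta_trends, dtype=float)
    return np.concatenate([[0.0], np.cumsum(delta_trends)])

print(get_relative_trends_cumsum([1, 2, 3]))  # [0. 1. 3. 6.]
```

For the same input, the quoted version with delta_trends[:idx] yields [0, 0, 1, 3], which shows the one-step delay and the dropped final increment.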
