pm-prophet's Introduction

pm-prophet

PyMC3-based universal time series prediction and decomposition library (inspired by Facebook Prophet). However, while Facebook Prophet is a well-defined model, pm-prophet allows for total flexibility in the choice of priors and is thus potentially suited for a wider class of estimation problems.

⚠️ Only supports Python 3

Installing pm-prophet

pm-prophet installation is straightforward using pip:

pip install pmprophet

Note that the key dependency of pm-prophet is PyMC3, a library that depends on Theano.

Key Features

  • Nowcasting & Forecasting
  • Intercept, growth
  • Regressors
  • Holidays
  • Additive & multiplicative seasonality
  • Fitting and plotting
  • Custom choice of priors (not in the original Facebook Prophet model)
  • Changepoints in growth
  • Automatic changepoint location detection (not in the original Facebook Prophet model)
  • Fitting with NUTS/ADVI/Metropolis

Experimental warning ⚠️

  • Note that automatic changepoint detection is experimental

Differences with Prophet:

  • Saturating growth is not implemented
  • Uncertainty estimation is different
  • All components (including seasonality) need to be explicitly added to the model
  • By design, pm-prophet places a big emphasis on posteriors and uncertainty estimates, and therefore it won't use MAP for its estimates.
  • While Facebook Prophet is a well-defined fixed model, pm-prophet allows for total flexibility in the choice of priors and is thus potentially suited for a wider class of estimation problems

Peyton Manning example

Predicting the Peyton Manning time series:

import pandas as pd
from pmprophet.model import PMProphet, Sampler

df = pd.read_csv("examples/example_wp_log_peyton_manning.csv")
df = df.head(180)

# Fit both growth and intercept
m = PMProphet(df, growth=True, intercept=True, n_changepoints=25, changepoints_prior_scale=.01, name='model')

# Add monthly seasonality (order: 3)
m.add_seasonality(seasonality=30, fourier_order=3)

# Add weekly seasonality (order: 3)
m.add_seasonality(seasonality=7, fourier_order=3)

# Fit the model (using NUTS)
m.fit(method=Sampler.NUTS)

ddf = m.predict(60, alpha=0.2, include_history=True, plot=True)
m.plot_components(
    intercept=False,
)

[Figures: Model, Seasonality-7, Seasonality-30, Growth, Change Points]

Custom Priors

One of the main reasons PMProphet was built is to allow custom priors for the modeling.

The default priors are:

Variable      Prior        Parameters
regressors    Laplace      loc: 0, scale: 2.5
holidays      Laplace      loc: 0, scale: 2.5
seasonality   Laplace      loc: 0, scale: 0.05
growth        Laplace      loc: 0, scale: 10
changepoints  Laplace      loc: 0, scale: 2.5
intercept     Normal       loc: y.mean, scale: 2 * y.std
sigma         Half Cauchy  tau: 10

But you can change model priors by inspecting and modifying the distributions stored in

m.priors

which is a dictionary of {prior: pymc3-distribution}.

In the example below we model an additive time series and impose a "positive coefficients" constraint by using an Exponential distribution instead of a Laplace distribution for the regressors.

import pandas as pd
import numpy as np
import pymc3 as pm
from pmprophet.model import PMProphet, Sampler

n_timesteps = 100
n_regressors = 20

regressors = np.random.normal(size=(n_timesteps, n_regressors))
coeffs = np.random.exponential(size=n_regressors) + np.random.normal(size=n_regressors)
# Note that min(coeffs) could be negative due to the white noise

regressors_names = [str(i) for i in range(n_regressors)]

df = pd.DataFrame()
df['y'] = np.dot(regressors, coeffs)
df['ds'] = pd.date_range('2017-01-01', periods=n_timesteps)
for idx, regressor in enumerate(regressors_names):
    df[regressor] = regressors[:, idx]

m = PMProphet(df, growth=False, intercept=False, n_changepoints=0, name='model')

with m.model:
    # Remember to suffix _<model-name> to the custom priors
    m.priors['regressors'] = pm.Exponential('regressors_%s' % m.name, 1, shape=n_regressors)

for regressor in regressors_names:
    m.add_regressor(regressor)

m.fit(
    draws=10 ** 4,
    method=Sampler.NUTS,
)
m.plot_components()

[Figure: Regressors]

Automatic changepoint detection (⚠️experimental)

Pm-prophet is equipped with a non-parametric truncated Dirichlet Process allowing it to automatically detect changepoints in the trend.

To enable it simply initialize the model with auto_changepoints=True as follows:

from pmprophet.model import PMProphet, Sampler
import pandas as pd

df = pd.read_csv("examples/example_wp_log_peyton_manning.csv")
df = df.head(180)
m = PMProphet(df, auto_changepoints=True, growth=True, intercept=True, name='model')
m.fit(method=Sampler.METROPOLIS, draws=2000)
m.predict(60, alpha=0.2, include_history=True, plot=True)
m.plot_components(
    intercept=False,
)

When auto_changepoints=True, n_changepoints is interpreted as the truncation point for the Dirichlet Process.

Pm-prophet will then decide which changepoint values make sense and add a custom weight to them. A call to plot_components() will reveal the changepoint map:

[Figure: Regressors]

A few caveats exist:

  • It's slow to fit since it's a non-parametric model
  • For best results use NUTS as method
  • It will likely require more than the default number of draws to converge

pm-prophet's People

Contributors

jpreszler, luke14free, t8ch, twiecki, twolodzko

pm-prophet's Issues

AttributeError when fitting example model

Hi and thank you for this very useful project! It's been a while since I wanted to give it a try and I finally found a use case :)
However, trying to get started, I couldn't run the model you have in the examples: when hitting m.fit(method='NUTS') I encountered:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-f9eeec8c375f> in <module>
      1 # Fit the model (using NUTS)
----> 2 m.fit(method='NUTS')

~/opt/anaconda3/envs/fund/lib/python3.7/site-packages/pmprophet/model.py in fit(self, draws, chains, trace_size, method, map_initialization, finalize, step_kwargs, sample_kwargs)
    610             if draws:
    611                 if method != Sampler.ADVI:
--> 612                     step_method = method.value(**step_kwargs)
    613                     self.trace = pm.sample(
    614                         draws,

AttributeError: 'str' object has no attribute 'value'

I also tried on the dataframe I'm trying to fit and it has the same behavior. Does it look like an issue? Or could it come from my installation? I'm running on pm-prophet 0.2.8, PyMC3 master and theano 1.0.4.
Thanks in advance for your help, and thanks again for your work!
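
The traceback shows fit() calling method.value(**step_kwargs), which suggests that method is expected to be a member of the Sampler enum (as in the README example, m.fit(method=Sampler.NUTS)) rather than the string 'NUTS'. The sketch below reproduces the failure with stand-ins only: FakeSampler, FakeNUTS and make_step are hypothetical names for illustration and are not part of pm-prophet.

```python
from enum import Enum


class FakeNUTS:
    """Hypothetical stand-in for a PyMC3 step method class."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class FakeSampler(Enum):
    """Hypothetical stand-in for pmprophet.model.Sampler."""

    NUTS = FakeNUTS


def make_step(method, **step_kwargs):
    # Mirrors the line in the traceback: method.value(**step_kwargs)
    return method.value(**step_kwargs)


# Passing the enum member works: .value is the step class, which gets instantiated.
step = make_step(FakeSampler.NUTS, target_accept=0.9)

# Passing a plain string fails, because str has no .value attribute.
try:
    make_step("NUTS")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

Under that assumption, replacing method='NUTS' with method=Sampler.NUTS (imported from pmprophet.model) should avoid the error.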

AttributeError: 'str' object has no attribute 'value'

Trying to run the same simple model with version 0.2.1, I get the error below.

m = PMProphet(df_city, name='model')

# Fit the model (using NUTS)
m.fit(method='NUTS')
~/.pyenv/versions/3.6.7/lib/python3.6/site-packages/pmprophet/model.py in fit(self, draws, chains, trace_size, method, map_initialization, finalize, step_kwargs, sample_kwargs)
    610             if draws:
    611                 if method != Sampler.ADVI:
--> 612                     step_method = method.value(**step_kwargs)
    613                     self.trace = pm.sample(
    614                         draws,

AttributeError: 'str' object has no attribute 'value'

UnboundLocalError: local variable 'w' referenced before assignment

I tried to run a very simple model with weekly data from a pandas dataframe and get the below error. I'm not sure how to troubleshoot this. I get the same error if I add more explicit params for changepoints etc.

FYI, this exact dataset has been fit with fbprophet successfully.

Version: pmprophet-0.2
Python: Python 3.6.7

m = PMProphet(df_city, name='model')
m.fit(method='NUTS')
~/.pyenv/versions/3.6.7/lib/python3.6/site-packages/pmprophet/model.py in generate_priors(self)
    243                 else:
    244                     k = len(self.changepoints)
--> 245                 cgpt = pm.Deterministic('cgpt', w * pm.Laplace('cgpt_inner', 0, self.changepoints_prior_scale, shape=k))
    246                 self.priors['changepoints'] = pm.Deterministic('changepoints_%s' % self.name, cgpt)
    247             if self.intercept and 'intercept' not in self.priors:

UnboundLocalError: local variable 'w' referenced before assignment
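
The traceback shows w being used right after an else branch that only sets k, so one plausible reading is that w is assigned on just one code path. The sketch below illustrates that general Python failure mode and a common fix. The function names and values are hypothetical and this is not the actual pm-prophet code.

```python
def scale_broken(auto_changepoints):
    if auto_changepoints:
        w = 0.5  # w is only assigned on this branch
    else:
        k = 3  # this branch never assigns w
    return w * 2.0


def scale_fixed(auto_changepoints):
    w = 1.0  # neutral default so every path defines w
    if auto_changepoints:
        w = 0.5
    return w * 2.0


try:
    scale_broken(False)
    raised = False
except UnboundLocalError:
    raised = True

fixed_value = scale_fixed(False)
```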

Updated readme for project direction?

As it is, it is hard to gain a sense of where this project is going, or what needs to still be done (and what exactly has been accomplished), which an updated readme would solve.

Not only would this improve readability, but help others help out with contributions as well.

AttributeError: 'DataFrame' object has no attribute 'ix'

Hi, first off, thank you for making this. I have been looking for a PyMC3-based Prophet alternative for a while (not a fan of the installation process for PyStan).

I am trying to run the Peyton Manning example from the README, and the underlying module is using the deprecated ix indexer. Is there any way we could upgrade the module for the latest version of pandas? I've never submitted a pull request, but I can potentially help.
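
For context, DataFrame.ix was deprecated in pandas 0.20 and removed in pandas 1.0. Its two roles are covered by .loc (label-based) and .iloc (position-based). A minimal sketch of the replacement on a toy frame follows. The frame is made up for illustration and is not taken from pm-prophet's code.

```python
import pandas as pd

df = pd.DataFrame({"y": [1.0, 2.0, 3.0]}, index=["a", "b", "c"])

# Label-based lookup: replaces df.ix["b", "y"]
by_label = df.loc["b", "y"]

# Position-based lookup: replaces df.ix[1, 0] on a non-integer index
by_position = df.iloc[1, 0]
```

Which of the two is the right substitute depends on whether the original call indexed by label or by position.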

auto_changepoint issue

Hi!
I am trying to use the pm-prophet feature auto_changepoints to detect changepoints in a dataset. It gives seemingly good results, but when running the script on the same dataset twice, the inferred changepoints are located in different places. How can I make auto_changepoints more accurate? Does this have to do with my sample size?

Example code not working in Python 2.7

I am trying to reproduce the example using example_wp_log_peyton_manning.csv with Python 2.7. I have all of the correct dependencies from the requirements file, but I keep getting the error

ValueError: Mass matrix contains zeros on the diagonal. 
The derivative of RV `intercept_model`.ravel()[0] is zero.
The derivative of RV `changepoints_model`.ravel()[0] is zero.
The derivative of RV `seasonality_model`.ravel()[2] is zero.
The derivative of RV `seasonality_model`.ravel()[3] is zero.
The derivative of RV `seasonality_model`.ravel()[4] is zero.
The derivative of RV `seasonality_model`.ravel()[5] is zero.
The derivative of RV `sigma_model_log__`.ravel()[0] is zero.

The code I used is attached with references to matplotlib removed
test.txt
with versions:
theano == 1.0.3
numpy == 1.15.3
pandas == 0.23.4
pymc3 == 3.5

Multiple extra data

Hello,
I use PMProphet with several features. Since I don't know the future value of the features, I generate a lot of scenarios. Is there any way to pass all these scenarios to model so that the uncertainty of the result consists of two (uncertainty of the model coefficients and uncertainty of the features)? Or is there any way to introduce such functionality using the capabilities of pymc3 (without making a loop through the scenarios) ?

Non-Linear Forecasting with pm-prophet

Hi,
I am trying to use pm-prophet to forecast data with little seasonality. When using the predict function, I would assume that the forecast gets less and less certain over time, with some kind of diverging confidence interval. It seems to me, however, that only a linear forecast is being performed. Does pm-prophet support what I am looking for?

Here is my code for convenience:
import numpy as np
import pandas as pd
import pymc3 as pm
import matplotlib.pyplot as plt
from pmprophet.model import PMProphet, Sampler

# Simulate a random walk
sd = .1
N = 200
deltas = np.random.normal(scale=sd, size=N)
y = np.cumsum(deltas)
x = np.arange(N)
df = pd.DataFrame()
df["ds"] = pd.date_range(start="2018-01-01", periods=N)
df["y"] = y
print(df)
m = PMProphet(df, auto_changepoints=True, growth=True, name="model")
m.fit(method=Sampler.METROPOLIS, chains=1, draws=2000)
pm.traceplot(m.trace)
plt.show()
m.predict(50, alpha=0.2, include_history=True, plot=True)
m.plot_components(intercept=False)

plt.show()
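
The simulated series in this report is a random walk (a cumulative sum of independent normal steps). As a point of comparison, and not a description of what pm-prophet computes, the h-step-ahead forecast standard deviation of such a walk grows as sd * sqrt(h), which is the widening band the question expects. A minimal sketch using only the standard library:

```python
import math

sd = 0.1                # step standard deviation, as in the report above
horizons = [1, 10, 50]  # days ahead

# For independent steps, Var(y[t+h] - y[t]) = h * sd**2,
# so the forecast standard deviation scales with sqrt(h).
forecast_sd = [sd * math.sqrt(h) for h in horizons]
```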

Categorical Variables

Can your package work with multivariate categorical time series, and with multivariate time series where some features are categorical and some are continuous?

[Feature Request] Simplify API for regressors

  1. Can the prior definition for each regressor be moved within the add_regressor() method by passing the distribution object as a parameter?

  2. Can a regressor_coefficients() method be created similar to what Prophet has to get the distribution of coefficients?

Heteroskedasticity

Hi,

Great package. I was considering attempting something similar, but was very grateful to find your implementation!

I'm modelling data that is generally periodic and trending but is heteroskedastic and rather than transform the heteroskedasticity out of the data as is usually recommended, I would like to model it specifically. My sense is that the GAM approach used by Prophet could model the data very well if I could model the varying variance explicitly.

My initial desire is to try to model the variance itself, as opposed to the original timeseries, and see if I can identify the points where the variance changes.

To that end, I'm trying to better understand how the truncated Dirichlet process used herein and copied below accomplishes this. In particular, I don't understand the use of tt.extra_ops.cumprod(1 - beta)[:-1]. Is this the stick-breaking function?

Also, how exactly does the switch work at tt.switch(tt.gt(x, 1e-4), x, 0)?

                    k = self.n_changepoints
                    alpha = pm.Gamma("alpha_%s" % self.name, 1.0, 1.0)
                    beta = pm.Beta("beta_%s" % self.name, 1.0, alpha, shape=k)
                    w1 = pm.Deterministic(
                        "w1_%s" % self.name,
                        tt.concatenate([[1], tt.extra_ops.cumprod(1 - beta)[:-1]])
                        * beta,
                    )
                    w, _ = theano.map(
                        fn=lambda x: tt.switch(tt.gt(x, 1e-4), x, 0), sequences=[w1]
                    )
                    self.w = pm.Deterministic("w_%s" % self.name, w)

Can I access the probabilities for a given time-step being a changepoint after sampling?

Thanks in advance
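
For reference, the quoted expression has the shape of the standard stick-breaking construction, w_i = beta_i * prod_{j < i} (1 - beta_j), and tt.switch(cond, a, b) selects a where cond holds and b elsewhere, element by element. Below is a plain-Python sketch of the same arithmetic. The helper names stick_breaking_weights and threshold are made up for illustration, and the Beta draws use the standard library rather than PyMC3.

```python
import random


def stick_breaking_weights(betas):
    """Compute w_i = beta_i * prod_{j < i} (1 - beta_j)."""
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for b in betas:
        weights.append(b * remaining)
        remaining *= 1.0 - b
    return weights


def threshold(weights, cutoff=1e-4):
    """Element-wise analogue of tt.switch(tt.gt(x, cutoff), x, 0)."""
    return [w if w > cutoff else 0.0 for w in weights]


rng = random.Random(0)
k = 10       # truncation point (n_changepoints in the quoted code)
alpha = 1.0  # fixed here; the quoted code places a Gamma prior on it
betas = [rng.betavariate(1.0, alpha) for _ in range(k)]

w1 = stick_breaking_weights(betas)
w = threshold(w1)
```

The weights are non-negative and sum to at most 1. The thresholding step zeroes out the pieces of the stick that are too small to matter.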

Is GPU included?

Hello Developers of this wonderful repository,
I have tried samples. It looks good. But I haven't got any settings for the usage of GPU. It works with the CPU only. What if I am willing to go with the GPUs in my system for the training?
Can you help and guide me?

additive_seasonality *= self.data.y.max()

I've been reading through the code and I met this line at model.py, line 293:
additive_seasonality *= self.data.y.max()
and in a few other places around the code. Could anyone comment on its exact purpose?

This seems to me to be related to data standardization, but data standardization is also done later, in the model specification:
observed=(self.data['y'] - self.data['y'].mean()) / self.data['y'].std()

At the very least, shouldn't it be
additive_seasonality *= self.data.y.abs().max()
to account for the possibility that, say, the range of y is [-1, 0] and thus y.max() == 0?
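
The concern can be illustrated with a toy series whose range is [-1, 0]. The numbers below are invented for the example and plain Python lists stand in for the pandas column.

```python
y = [-1.0, -0.5, 0.0]
additive_seasonality = [0.2, -0.1, 0.3]

scale_max = max(y)                      # analogous to self.data.y.max()
scale_abs_max = max(abs(v) for v in y)  # analogous to self.data.y.abs().max()

# Scaling by max(y) wipes out the seasonal component when max(y) == 0.
scaled_by_max = [s * scale_max for s in additive_seasonality]

# Scaling by max(|y|) preserves it.
scaled_by_abs_max = [s * scale_abs_max for s in additive_seasonality]
```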

fourier_series coefficients

Hi!
I have been working with the source code of pm-prophet for a while, and I am wondering how the Fourier series and the accompanying coefficients a1, b1, a2, b2, ..., an, bn of the standard Fourier series are sampled. As far as I can tell, the fourier_series() function returns both sine and cosine series with the length of the input data t, but self.data["f%s_%s" % (seasonality, order_idx)] only uses the sine series. As an example, a Fourier series of order 4 should have 8 self.priors["seasonality"] RVs but has only 4.

How is this fourier series modelled?
Thank you in advance!
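
For comparison, a standard truncated Fourier basis of order n has 2n columns, one sine and one cosine per harmonic. The sketch below builds such a basis with the standard library only. fourier_features is a hypothetical helper written for this illustration and is not pm-prophet's fourier_series().

```python
import math


def fourier_features(t, period, order):
    """Return 2 * order columns: sin and cos for each harmonic 1..order."""
    columns = []
    for n in range(1, order + 1):
        angles = [2.0 * math.pi * n * ti / period for ti in t]
        columns.append([math.sin(a) for a in angles])
        columns.append([math.cos(a) for a in angles])
    return columns


# Weekly seasonality of order 4 over two weeks of daily data.
features = fourier_features(range(14), period=7, order=4)
n_columns = len(features)
```

With order 4 this yields 8 columns, so a model using the full basis would need 8 coefficients.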

Example not working on Pandas 1.0

I am really excited about this library but cannot get past fitting the model because of the error "AttributeError: 'DataFrame' object has no attribute 'ix'".

I have tried downgrading pandas to 0.19 and 0.23, but it still won't work. Would love to get this working.

prediction dependent on the order of the added seasonality

Hi,
I have discovered that altering the order in which I add the weekly and monthly seasonalities to my model affects the outcome. The two figures below show weekly added before monthly, and monthly added before weekly. What causes this order dependency?

[Figures: 7then30, 30then7]

PyMC3 has been renamed PyMC

Hi,
PyMC3 has been renamed PyMC. If this affects you and you have questions, or you want someone to direct your rage at, I'm available! Do let me know how I, or any of the PyMC devs, can help.

Ravin

question

Hi
How is this different from FBProphet?
Is there some reference document available for this? Thanks
