
Comments (8)

fkiraly commented on June 15, 2024

How odd.

Strictly speaking, it is superfluous to wrap the TransformedTargetForecaster in ForecastingPipeline, but that should not impact behaviour (the wrapper should just be ignored, because it is a single-element pipeline).

I can see the issue: it is the axis=1 argument. If return_components=True, then the result will be a pd.DataFrame because it is multivariate. If it is False, then it is a pd.Series.
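A minimal sketch of the pandas behaviour behind this, assuming the inverse recombines components with a column-wise `.sum(axis=1)`. The data and variable names are hypothetical and only illustrate the two input types:

```python
import pandas as pd

# Hypothetical stand-ins for what the inverse transform might receive.
components = pd.DataFrame(
    {"trend": [10.0, 11.0], "seasonal": [1.0, -1.0], "resid": [0.1, -0.1]}
)
single = pd.Series([11.1, 9.9])

# With a DataFrame, summing across columns recombines the components.
recombined = components.sum(axis=1)

# A Series has no axis 1, so the same call raises a ValueError.
try:
    single.sum(axis=1)
    series_error = None
except ValueError as exc:
    series_error = str(exc)

print(recombined.tolist())
print(series_error)
```

Guarding on the input type would avoid the exception, but as noted in the following paragraphs it would not add the seasonal components back.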

From a methodological standpoint, the inverse works correctly only in the return_components=True case, if the seasonal component is also forecast.

I see that one would expect that the seasonal component is continued periodically and added back.

So, returning X if pd.Series would remove the exception, but lead to unexpected behaviour, as the seasonal components are not added back.

Perchance, do you know, @eangius, is there an easy way to get an extrapolated form of all seasonal components in statsmodels MSTL? There should be?

Also FYI @luca-miniati who is the author and maintainer.

from sktime.

fkiraly commented on June 15, 2024

To clarify, I think inverse_transform should do:

  • if return_components=True, exactly what it does currently - in this case it needs forecasters for all seasonal components and the residual if used in a pipeline
  • if return_components=False, a naive periodic continuation should be made for seasonal components, and it should be added to the transformed values. This might be slightly challenging, given that the index seen in _inverse_transform need not be contiguous with, or could intersect with, the index seen in fit.
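The return_components=False case could be sketched roughly as below. This is not the sktime implementation: the function names, the use of the last fitted cycle as the pattern, and the assumption of a contiguous integer index in fit are all hypothetical choices made for illustration.

```python
import numpy as np
import pandas as pd


def continue_seasonal(seasonal_fit, period, new_index):
    """Naively repeat the last fitted cycle of one seasonal component.

    Assumes a contiguous integer index in fit (hypothetical simplification).
    """
    fit_start = seasonal_fit.index[0]
    last_cycle = seasonal_fit.iloc[-period:]
    # Position within the cycle of each value in the last fitted cycle.
    phases = (last_cycle.index.to_numpy() - fit_start) % period
    pattern = np.empty(period)
    pattern[phases] = last_cycle.to_numpy()
    # The same arithmetic works for indices inside or beyond the fit range.
    new_phases = (np.asarray(new_index) - fit_start) % period
    return pd.Series(pattern[new_phases], index=new_index)


def inverse_without_components(X, seasonal_fits):
    """Add continuations of all seasonal components back onto X."""
    total = X.copy()
    for period, seasonal_fit in seasonal_fits.items():
        total = total + continue_seasonal(seasonal_fit, period, X.index)
    return total


fit_index = pd.RangeIndex(0, 12)
seasonal_3 = pd.Series([1.0, 0.0, -1.0] * 4, index=fit_index)
seasonal_4 = pd.Series([0.5, -0.5, 0.5, -0.5] * 3, index=fit_index)

# X overlaps the fit index (10, 11) and extends beyond it (12, 13).
X = pd.Series([10.0, 10.0, 10.0, 10.0], index=[10, 11, 12, 13])
restored = inverse_without_components(X, {3: seasonal_3, 4: seasonal_4})
print(restored.tolist())  # [10.5, 8.5, 11.5, 9.5]
```

For indices that intersect the fit range, an implementation might prefer the fitted seasonal values themselves over the repeated last cycle; the sketch uses the repeated cycle everywhere to keep the logic in one place.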


fkiraly commented on June 15, 2024

updated the issue title - imo the root cause is that MSTL.inverse_transform fails whenever return_components=False


fkiraly commented on June 15, 2024

there is also a wider issue, namely that inverse_transform test coverage seems insufficient to detect this, which should be investigated.


luca-miniati commented on June 15, 2024

Hi Franz, long time no see! I'd like to implement the functionality for return_components=False.

Let me know if I understand the solution correctly:

  • make predictions of the seasonal time series, using the provided fh
  • add up all the values, and return as pd.Series

And a clarifying question: why would the index seen in fit potentially not match the index of _inverse_transform?


eangius commented on June 15, 2024

Thanks for the quick diagnostic @fkiraly. Unfortunately I'm still too new to statsmodels to tell how to extract all seasonal components.

For context, we are wrapping MSTL into a TransformedTargetForecaster because we have a previous processing step for the exogenous variables but that was not relevant to reproduce the problem.

As an extra bit of context, we tried with return_components=True and filtering out the other columns in a FunctionTransformer to keep things univariate but that gave us a different type of exception..


fkiraly commented on June 15, 2024

@eangius, as possible workarounds for de/re-trending in a pipeline:

  • you can pipeline multiple Deseasonalizer-s, like Deseasonalizer(sp=24) * Deseasonalizer(sp=24*7) * my_forecaster for daily and weekly seasonality (if your data is hourly)
  • you can try StatsforecastMSTL, this is a forecaster that is optimized and with integrated MSTL, though with a heavier dependency footprint


fkiraly commented on June 15, 2024

> Hi Franz, long time no see!

Nice to hear from you again, as well!

> I'd like to implement the functionality for return_components=False.

Great, let me know if I can help.

> Let me know if I understand the solution correctly:
>
>   • make predictions of the seasonal time series, using the provided fh
>   • add up all the values, and return as pd.Series

Yes, this should happen when it is pipelined with a forecaster.

Though, the MSTL estimator is a transformer, so the transformer needs to carry out the transformation steps only.

So, we need to take the indices in _inverse_transform, and determine the periodic pattern implied by what was fitted on fit.

> And a clarifying question: why would the index seen in fit potentially not match the index of _inverse_transform?

If you work out what happens in a forecasting pipeline, the transformer gets the historic indices in fit, e.g., 0, 1, 2, ..., 100, and the indices corresponding to the fh in predict. For an fh of 1, 2, 3, the X in _inverse_transform would have index 101, 102, 103.

If we have patterns of periodicities 3, 5, 7, denoting the indices of the periodic pattern by 3-0, 3-1, 3-2; 5-0, 5-1, ..., 5-4; 7-0, ..., 7-6 (dashes just for notation, not "minus"), then for indices 101, 102, 103 we should forecast, for the components, the indices 3-2, 3-0, 3-1; 5-1, 5-2, 5-3; 7-3, 7-4, 7-5.
(in Python, we start counting with 0, so X-0 maps onto any index divisible without remainder by X)

I think this already must be done somewhere in transform if return_components=True?
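The index arithmetic for this example can be sketched with a plain modulo, assuming every periodic pattern starts at index 0 as in the fit indices 0, 1, ..., 100. The helper name is hypothetical:

```python
def cycle_positions(indices, periods):
    """Map each integer index to its position within each periodic pattern.

    Assumes every pattern starts at index 0.
    """
    return {sp: [i % sp for i in indices] for sp in periods}


positions = cycle_positions([101, 102, 103], [3, 5, 7])
for sp, phases in positions.items():
    # Print in the "period-position" notation, e.g. 3-2 for position 2 of 3.
    print([f"{sp}-{p}" for p in phases])
```

If the fit index does not start at 0, the offset of the first fitted index would need to be subtracted before taking the remainder.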

