
Comments (5)

DsDev1 commented on May 13, 2024

The expected behavior would be to use the last row (the most recent time step) of the dataset instead of the first one; the rest would then work as expected. The current workaround is to pass only a single row rather than the whole df, so that it is clear what the input and the expected output are.

from mlforecast.

iamyihwa commented on May 13, 2024

Having the same issue.
I created an X and a y variable, where y is simply a slightly deviated value of X.
However, the model cannot be fit at all using the external variable.
Again, the first value of the training data is always what goes into the model in the predict() function.
For comparison, an sklearn model fits very well (error: close to 0% for sklearn, around 30% for mlforecast).

Suggested change:
In core.py, replace df with another variable in lines 465 and 466, since it seems to be using the global df variable and doing a 'merge' on 'unique_id' and 'ds'?

@FedericoGarza @jose-moralez what do you think???

To replicate:


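import numpy as np
import pandas as pd

date_range = pd.date_range(start='2020-01-01', end='2021-01-01', freq='W')
df = pd.DataFrame(date_range, columns=['ds'])
df['unique_id'] = 1
# X is the year plus uniform noise scaled to the range [-1000, 1000)
df['X'] = df['ds'].dt.year + np.random.uniform(low=-1, high=1, size=df.shape[0]) * 1000
# y is 0.8 * X shifted by one random constant (a single scalar, not per-row noise)
df['y'] = df['X'] * 0.8 + np.random.rand()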

cut_off = 20
train_df = df[:-cut_off]
train_df_X = train_df.loc[:,'X']
train_df_Y = train_df[['ds', 'y', 'unique_id']]
test_df = df[-cut_off:]
test_df_X = test_df.loc[:,['X', 'ds','unique_id']]
test_df_Y = test_df[['ds', 'y', 'unique_id']]
df.head()
from statsforecast import StatsForecast
StatsForecast.plot(df)
import lightgbm as lgb
from window_ops.expanding import expanding_mean
from window_ops.rolling import rolling_mean

from mlforecast import MLForecast
from sklearn.linear_model import LinearRegression

mlf = MLForecast(
    models = [LinearRegression(), lgb.LGBMRegressor()],
    #lags=[1, 12],
    #lag_transforms={
    #    1: [expanding_mean],
    #    12: [(rolling_mean, 48)],
    #},
    freq = 'W', 
    #date_features = ['week', 'month', 'year']
)
def inspect_input(x):
    from IPython.display import display
    print('inspect_input')
    display(x)
    return x
prep_df = mlf.preprocess(train_df, id_col='unique_id', time_col='ds', target_col='y')
mlf.fit(train_df, id_col='unique_id', time_col='ds', target_col='y')
y_hat = mlf.predict(cut_off, dynamic_dfs=[test_df_X], before_predict_callback=inspect_input)
 
The input passed to the model is always the first value of the training dataset.

As a result, the forecast is flat, because the model cannot get any information from the external variable.
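To illustrate why a frozen input produces a flat line, here is a minimal sketch with a toy linear model. The coefficients and the future values are made up for illustration and are not taken from the thread, and `predict_linear` is a hypothetical helper, not part of mlforecast.

```python
def predict_linear(slope, intercept, feature_values):
    """Apply y = slope * x + intercept to each feature value."""
    return [slope * x + intercept for x in feature_values]


# Hypothetical fitted coefficients and made-up future values of X.
slope, intercept = 0.8, 0.5
future_x = [1500.0, 2400.0, 1900.0, 2800.0]

# Case 1: the feature is frozen at a single value for the whole horizon.
frozen = predict_linear(slope, intercept, [future_x[0]] * len(future_x))

# Case 2: every step sees its own value of the feature.
varying = predict_linear(slope, intercept, future_x)

print("frozen input: ", frozen)
print("varying input:", varying)
```

With the frozen input every prediction is identical, which is the flat line seen in the plot; with the varying input the predictions follow X.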


For comparison, if I use a model from sklearn directly, it fits well, with an error of almost 0.

from sklearn.linear_model import LinearRegression

# Copy so that adding columns does not assign into a slice of df
y_hat = test_df[['ds', 'unique_id']].copy()
clf = LinearRegression()
clf.fit(train_df.loc[:, ['X']], train_df.loc[:, 'y'])
y_hat_val = clf.predict(test_df.loc[:, ['X']])
y_hat['lr'] = y_hat_val


from sklearn.ensemble import GradientBoostingRegressor
clf = GradientBoostingRegressor()
clf.fit(train_df.loc[:, ['X']], train_df.loc[:, 'y'])
y_hat_val = clf.predict(test_df.loc[:, ['X']])
y_hat['gb'] = y_hat_val

StatsForecast.plot(test_df_Y, y_hat)

The difference also shows in the numbers, in case the fit is hard to see in the plots:
Error for the mlforecast model: [image]
Error for the sklearn model: [image]
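The thread does not say which error metric was used for these numbers. As one possible way to get such a percentage, here is a sketch using the mean absolute percentage error (MAPE) in plain Python. The choice of metric, the function name, and the sample numbers are all assumptions.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return the MAPE in percent for two equal-length lists.

    Assumes that no actual value is zero.
    """
    if not actual:
        raise ValueError("inputs must not be empty")
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up numbers for illustration only, not the values from the thread.
actual = [100.0, 200.0, 400.0]
flat_forecast = [200.0, 200.0, 200.0]
mape = mean_absolute_percentage_error(actual, flat_forecast)
print(f"MAPE of the flat forecast: {mape:.1f}%")
```

For the dataframes in the thread, something like `mean_absolute_percentage_error(test_df_Y['y'].tolist(), y_hat['lr'].tolist())` would give the figure for one model, assuming the rows are aligned by date.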

jmoralez commented on May 13, 2024

Hi @iamyihwa, thanks for the thorough example. I believe this is the same as #122. The TL;DR is that you have to set static_features=[] in your fit call, because right now mlforecast treats X as a static feature and thus only repeats its last value forward when predicting.
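Applied to the reproduction code earlier in the thread, the suggested change might look like the sketch below. The helper `fit_and_predict_with_exog` and its argument names are hypothetical; `forecaster` is assumed to be an MLForecast instance such as `mlf`. The keyword arguments mirror the calls already shown in the thread, and `dynamic_dfs` in particular may be named differently in other mlforecast releases, so check the version you have installed.

```python
def fit_and_predict_with_exog(forecaster, train_df, future_exog_df, horizon):
    """Fit with no static features so that X is treated as time-varying.

    Hypothetical helper: `forecaster` is assumed to expose the same
    `fit` and `predict` calls that the thread uses on `mlf`.
    """
    forecaster.fit(
        train_df,
        id_col="unique_id",
        time_col="ds",
        target_col="y",
        static_features=[],  # declare that no column is static
    )
    # `dynamic_dfs` is the argument used in the thread's snippet; it carries
    # the future values of X for each forecast step.
    return forecaster.predict(horizon, dynamic_dfs=[future_exog_df])
```

With the variables from the reproduction, the call would be `y_hat = fit_and_predict_with_exog(mlf, train_df, test_df_X, cut_off)`.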

iamyihwa commented on May 13, 2024

Thanks @jmoralez for your quick answer!
Yes, indeed it solves the issue!

Thanks again to the team for providing a great tool and also for the help!

jmoralez commented on May 13, 2024

Closing as it seems we've solved the original issue.
