[BUG] `ForecastByLevel` with native parallelization is not faster than without (sktime, closed)

fkiraly commented on July 19, 2024

[BUG] `ForecastByLevel` with native parallelization is not faster than without

from sktime.

Comments (18)

ninedigits commented on July 19, 2024 1

@marrov, @ninedigits, what happens if you nest joblib Parallel calls?

Will it do the sensible thing, or something problematic? E.g., if the two n_jobs multiply to a number that's less than your cores, will it do the sensible thing?

I'm not 100% certain this is the issue, but I suspect that having both parallel options enabled may cause the performance to behave unexpectedly. I've run into this issue before. A quick check would be to run htop to monitor core performance while executing the job. It should look like this:

Cores should max out at 100%. Anything over means performance degradation.

fkiraly commented on July 19, 2024

A possible explanation is that the distribution of runtimes is very heavy tailed.

For instance, let's assume that most runs run in a few seconds or less, but there's one run that has 4 minutes runtime because the optimizer gets stuck or similar. This one run has to complete.

Seems implausible though, because if it were stochastic, it would likely result in more runs like this, so one would still expect a speedup to half the runtime or less.
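
To make the heavy-tail argument concrete, here is a small sketch with made-up numbers (99 runs of 2 seconds and one run of 4 minutes; these durations are illustrative assumptions, not measurements from this issue). It hands tasks, in order, to whichever worker frees up first and reports the resulting wall-clock time. The function name `makespan` is hypothetical.

```python
import heapq


def makespan(durations, n_workers):
    """Wall-clock time when each task goes to the worker that frees up first."""
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# Illustrative durations in seconds: one stuck run, many fast ones.
durations = [240.0] + [2.0] * 99

serial = sum(durations)                      # all runs one after another
parallel = makespan(durations, n_workers=6)  # 6 workers, as on a 6-core machine
speedup = serial / parallel

print(f"serial={serial:.0f}s parallel={parallel:.0f}s speedup={speedup:.2f}x")
```

Under these assumptions the parallel run can never finish before the single long task does, so the speedup stays well below the number of workers no matter how many cores are available.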

fkiraly commented on July 19, 2024

Also, I'm getting a lot of warnings -

FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.

Perhaps that is the reason?

fkiraly commented on July 19, 2024

On my local machine, I find 4 min 41 sec vs 6 min 48 sec. I have 6 cores.

fkiraly commented on July 19, 2024

@ninedigits noted that the code is invoking joblib Parallel in a nested fashion.

His full answer from #6216:

@fkiraly @marrov

I'm not sure if this is the sole issue here, but I do see at least one problem with your setup. You're running a multiprocess job on top of another multiprocess job.

("autoets", AutoETS(auto=True, sp=12, n_jobs=-1)),

...

model_config = {
    "backend:parallel": "joblib",
    "backend:parallel:params": {"backend": "loky", "n_jobs": -1},
}

model.set_config(**model_config)

This might be locking up some of your cores or degrading performance. This is the change I made:

("autoets", AutoETS(auto=True, sp=12, n_jobs=1))

When I run the code with this line it takes about 2 min to complete with multiprocessing enabled and 13 minutes without. I can dig a bit deeper to see if there are other issues. When I do my multiprocessing, I usually use the multiprocessing library. I can try this setup to see if there are any performance differences.

fkiraly commented on July 19, 2024

@marrov, @ninedigits, what happens if you nest joblib Parallel calls?

Will it do the sensible thing, or something problematic? E.g., if the two n_jobs multiply to a number that's less than your cores, will it do the sensible thing?
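
As a rough way to reason about the "do the two n_jobs multiply" question, here is a sketch of the worst-case arithmetic. It assumes every outer worker really starts its own full set of inner workers; joblib may cap or serialize nested calls depending on the backend, so the real count can be lower. The helper names `resolve_n_jobs` and `worst_case_workers` are hypothetical, and the negative-value rule follows joblib's documented convention (-1 means all CPUs, -2 all but one).

```python
import os


def resolve_n_jobs(n_jobs, n_cpus):
    """Turn a joblib-style n_jobs value into a positive worker count."""
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 -> all CPUs, -2 -> all but one, and so on, never below 1.
        return max(n_cpus + 1 + n_jobs, 1)
    return n_jobs


def worst_case_workers(outer_n_jobs, inner_n_jobs, n_cpus=None):
    """Naive upper bound on simultaneously running workers for two nested levels."""
    n_cpus = n_cpus or os.cpu_count() or 1
    outer = resolve_n_jobs(outer_n_jobs, n_cpus)
    inner = resolve_n_jobs(inner_n_jobs, n_cpus)
    return outer * inner


# Both levels set to -1 on a 6-core machine: up to 36 workers compete for 6 cores.
print(worst_case_workers(-1, -1, n_cpus=6))
# Inner level pinned to 1, as in the change described above: 6 workers for 6 cores.
print(worst_case_workers(-1, 1, n_cpus=6))
```

If the product stays at or below the core count (for example 2 outer and 3 inner jobs on 6 cores), this bound suggests no oversubscription, but it says nothing about process start-up or serialization overhead.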

fkiraly commented on July 19, 2024

Here's a post by @ogrisel on the topic, he says it should work:

joblib/joblib#842 (comment)

Not sure whether the conditions are satisfied - there is no explicit release of the GIL, and I do not know what happens by default.

fkiraly commented on July 19, 2024

why are some of your zeroes red?

ninedigits commented on July 19, 2024

Might be that some of my cores are maxed out, or that I have a million tabs open and a bunch of other programs sitting in the background 🤷🏻‍♂️

ninedigits commented on July 19, 2024

Side question -- does the parallel compute config option have a progress bar indicator? ForecastByLevel could potentially replace a huge section of code that I have to run parallel jobs, but I like having a progress bar in case things lock up.

fkiraly commented on July 19, 2024

Hm, what would the progress bar do?

If you let me know how you get the progress bar for joblib, I can point you to the locations.

If it is coming from Parallel, you can pass any args via the config field "backend:parallel:params".

ninedigits commented on July 19, 2024

I'd have to dig into joblib, but at least for the multiprocessing module, the setup would look something like this:

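import multiprocessing

import tqdm

# `work` and `args` are placeholders for the task function and its inputs.
pbar = tqdm.tqdm(total=len(args))

with multiprocessing.Pool(processes=10) as pool:
    for _ in pool.imap(work, args):
        pbar.update()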

It can be helpful to understand where a script is in its execution, or if it fails, where it's failing.

fkiraly commented on July 19, 2024

@ninedigits, you can pass a verbose argument to joblib Parallel (via backend:parallel:params) - would that not do the same thing? Have you used that?
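
For illustration, the config quoted earlier in this thread could be extended with joblib's `verbose` parameter roughly as below. This is a sketch that assumes the params dict is forwarded to joblib `Parallel` as described above; the value 10 is an arbitrary choice, and `model` stands for the `ForecastByLevel` instance from the thread and is not defined here.

```python
# Same keys as the config quoted earlier in the thread, plus joblib's `verbose`.
model_config = {
    "backend:parallel": "joblib",
    "backend:parallel:params": {
        "backend": "loky",
        "n_jobs": -1,
        # Non-zero values make joblib print progress messages;
        # higher values report more frequently.
        "verbose": 10,
    },
}

# Applied the same way as in the original snippet (model not defined here):
# model.set_config(**model_config)
```

Note that this produces periodic log lines about completed tasks and not a tqdm-style bar.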

ninedigits commented on July 19, 2024

@fkiraly I think this one can be closed, right? Parallelization works on my end.

fkiraly commented on July 19, 2024

Yes, I suppose so, as it turned out to not be a bug.
Please reopen if you think otherwise, @marrov.

marrov commented on July 19, 2024

Also, I'm getting a lot of warnings -

FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.

Perhaps that is the reason?

Indeed the base error turned out not to be a bug, but this is still an issue and actually I am not able to remove those warnings! I tried many different ways but I'm not really able to - it seems like it's coming from _fh.py. Maybe this should be opened as a separate issue @fkiraly?

ninedigits commented on July 19, 2024

Also, I'm getting a lot of warnings -

FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.

Perhaps that is the reason?

Indeed the base error turned out not to be a bug, but this is still an issue and actually I am not able to remove those warnings! I tried many different ways but I'm not really able to - it seems like it's coming from _fh.py. Maybe this should be opened as a separate issue

Also getting those errors. I believe it's coming from pandas and it can't be muted because it's using a separate warning system.
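
For reference, this is roughly how one might try to filter that specific message with the standard `warnings` module. Whether it actually silences the warning in this setup is not confirmed in the thread. One assumption worth checking is that a filter set in the parent process does not carry over to separately started worker processes, which is why the sketch optionally sets the `PYTHONWARNINGS` environment variable as well. The function name `silence_freq_warning` is hypothetical.

```python
import os
import warnings

# Matched as a case-insensitive regex against the start of the warning text.
FREQ_MSG = r"'M' is deprecated"


def silence_freq_warning(include_child_processes=False):
    """Ignore the frequency-alias FutureWarning in the current process."""
    warnings.filterwarnings("ignore", message=FREQ_MSG, category=FutureWarning)
    if include_child_processes:
        # Assumption: workers started afterwards inherit the environment and
        # read PYTHONWARNINGS at startup. This is much broader than the filter
        # above, since it hides every FutureWarning in those processes.
        os.environ["PYTHONWARNINGS"] = "ignore::FutureWarning"


silence_freq_warning()
```

The filter has to be installed before the code that emits the warning runs, and a library that resets warning filters internally could still undo it.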

fkiraly commented on July 19, 2024

yes, we should address this - there already is an issue here:
#6245
