
Comments (18)

ninedigits avatar ninedigits commented on July 19, 2024 1

@marrov, @ninedigits, what happens if you nest joblib Parallel calls?

Will it do the sensible thing, or something problematic? E.g., if the two n_jobs multiply to a number that's less than your cores, will it do the sensible thing?

I'm not 100% certain this is the issue, but I suspect that having both parallel options enabled may cause the performance to behave unexpectedly. I've run into this issue before. A quick check would be to run htop to monitor core performance while executing the job. It should look like this:

[image: htop screenshot of per-core CPU usage]

Cores should max out at 100%. Anything over means performance degradation.

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

A possible explanation is that the distribution of runtimes is very heavy tailed.

For instance, let's assume that most runs run in a few seconds or less, but there's one run that has 4 minutes runtime because the optimizer gets stuck or similar. This one run has to complete.

Seems implausible though, because if it were stochastic, it would likely result in more runs like this, so one would still expect a speedup to half the runtime or less.

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

Also, I'm getting a lot of warnings -

FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.

Perhaps that is the reason?

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

On my local machine, I find 4min 41 sec, vs 6m 48 sec. I have 6 cores.

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

@ninedigits noted that the code is invoking joblib Parallel in a nested fashion.


His full answer from #6216:

@fkiraly @marrov

I'm not sure if this is the sole issue here, but I do see at least one problem with your setup. You're running a multiprocess job on top of another multiprocess job.

("autoets", AutoETS(auto=True, sp=12, n_jobs=-1)),

...

model_config = {
    "backend:parallel": "joblib",
    "backend:parallel:params": {"backend": "loky", "n_jobs": -1},
}

model.set_config(**model_config)

This might be locking up some of your cores or degrading performance. This is the change I made:

("autoets", AutoETS(auto=True, sp=12, n_jobs=1))

When I run the code with this line it takes about 2 min to complete with multiprocessing enabled and 13 minutes without. I can dig a bit deeper to see if there's other issues. When I do my multiprocessing, I usually use the multiprocessing library. I can try this setup to see if there are any performance differences.

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

@marrov, @ninedigits, what happens if you nest joblib Parallel calls?

Will it do the sensible thing, or something problematic? E.g., if the two n_jobs multiply to a number that's less than your cores, will it do the sensible thing?
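For reference, the naive worst case behind this question can be sketched in a few lines. This is only arithmetic under an assumption: that nothing limits the nesting, so every outer worker starts its own full set of inner workers. joblib may well cap nested parallelism itself, in which case the real number is lower. The helper names `resolve_n_jobs` and `worst_case_workers` are made up for illustration and are not part of joblib or sktime.

```python
import os


def resolve_n_jobs(n_jobs, n_cores):
    """Turn a joblib-style n_jobs value into a worker count."""
    if n_jobs is None:
        return 1
    if n_jobs < 0:
        # joblib convention: -1 is all cores, -2 is all cores but one, ...
        return max(1, n_cores + 1 + n_jobs)
    return n_jobs


def worst_case_workers(outer_n_jobs, inner_n_jobs, n_cores=None):
    """Upper bound on concurrent workers if nothing limits the nesting."""
    if n_cores is None:
        n_cores = os.cpu_count() or 1
    outer = resolve_n_jobs(outer_n_jobs, n_cores)
    inner = resolve_n_jobs(inner_n_jobs, n_cores)
    return outer * inner


# On a 6-core machine, with n_jobs=-1 at both levels:
print(worst_case_workers(-1, -1, n_cores=6))  # 36
# With the inner estimator set to n_jobs=1:
print(worst_case_workers(-1, 1, n_cores=6))   # 6
```

If the product stays at or below the core count, the naive bound suggests no oversubscription; whether joblib actually schedules it that way is the open part of the question.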

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

Here's a post by @ogrisel on the topic, he says it should work:

joblib/joblib#842 (comment)

Not sure whether the conditions are satisfied - there is no express release of GIL, and I do not know what happens by default.

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

why are some of your zeroes red?

from sktime.

ninedigits avatar ninedigits commented on July 19, 2024

Might be that some of my cores are maxed out, or that I have a million tabs open and a bunch of other programs sitting in the background 🤷🏻‍♂️

from sktime.

ninedigits avatar ninedigits commented on July 19, 2024

Side question -- does the parallel compute config option have a progress bar indicator? ForecastByLevel could potentially replace a huge section of code that I have to run parallel jobs, but I like having a progress bar in case things lock up.

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

Hm, what would the progress bar do?

If you let me know how you get the progress bar for joblib, I can point you to the locations.

If it is coming from Parallel, you can pass any args via the config field "backend:parallel:params".

from sktime.

ninedigits avatar ninedigits commented on July 19, 2024

I'd have to dig into joblib, but at least for the multiprocessing module, the setup would look something like this:

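import multiprocessing

import tqdm

# `work` and `args` stand in for the worker function and its list of inputs
pbar = tqdm.tqdm(total=len(args))

with multiprocessing.Pool(processes=10) as pool:
    for _ in pool.imap(work, args):
        pbar.update()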

It can be helpful to understand where a script is in its execution, or if it fails, where it's failing.

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

@ninedigits, you can pass a verbose argument to joblib Parallel (via backend:parallel:params) - would that not do the same thing? Have you used that?
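A minimal sketch of what that could look like, reusing the config keys from the snippet quoted earlier in this thread. The only addition is `verbose`, which is a joblib Parallel argument; the value 10 is an arbitrary choice for illustration. This prints periodic progress messages rather than a tqdm-style bar, so it may or may not cover the use case.

```python
# Same config as quoted earlier in the thread, plus joblib's `verbose`.
# In joblib, any non-zero verbose prints progress messages, and higher
# values print them more frequently. The value 10 is only an example.
model_config = {
    "backend:parallel": "joblib",
    "backend:parallel:params": {"backend": "loky", "n_jobs": -1, "verbose": 10},
}

# `model` is whatever estimator is being configured, as in the original post:
# model.set_config(**model_config)
```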

from sktime.

ninedigits avatar ninedigits commented on July 19, 2024

@fkiraly I think this one can be closed, right? Parallelization works on my end.

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

Yes, I suppose so, as it turned out to not be a bug.
Please reopen if you think otherwise, @marrov.

from sktime.

marrov avatar marrov commented on July 19, 2024

Also, I'm getting a lot of warnings -

FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.

Perhaps that is the reason?

Indeed the base error turned out not to be a bug, but this is still an issue and actually I am not able to remove those warnings! I tried many different ways but I'm not really able to - it seems like it's coming from _fh.py. Maybe this should be opened as a separate issue @fkiraly?

from sktime.

ninedigits avatar ninedigits commented on July 19, 2024

Also, I'm getting a lot of warnings -

FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.

Perhaps that is the reason?

Indeed the base error turned out not to be a bug, but this is still an issue and actually I am not able to remove those warnings! I tried many different ways but I'm not really able to - it seems like it's coming from _fh.py. Maybe this should be opened as a separate issue

Also getting those errors. I believe it's coming from pandas and it can't be muted because it's using a separate warning system.
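For anyone trying to mute these in the meantime, here is a sketch of the standard-library approach, wrapping the call in a local warnings filter. `noisy_step` is a made-up stand-in that emits the same kind of FutureWarning, and `run_quietly` is a hypothetical helper, not an sktime or pandas function. One guess as to why muting appears not to work under parallel execution: a filter set in the parent process is not necessarily active inside worker processes, so the wrapper might need to run inside the worker. That part is an assumption and has not been checked against sktime.

```python
import warnings


def noisy_step():
    """Stand-in for a call that triggers the deprecation warning."""
    warnings.warn(
        "'M' is deprecated and will be removed in a future version, "
        "please use 'ME' instead.",
        FutureWarning,
    )
    return "done"


def run_quietly(func, *args, **kwargs):
    """Call func with FutureWarning suppressed for the duration of the call."""
    with warnings.catch_warnings():
        # The filter is restored to its previous state when the block exits.
        warnings.simplefilter("ignore", category=FutureWarning)
        return func(*args, **kwargs)


result = run_quietly(noisy_step)
```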

from sktime.

fkiraly avatar fkiraly commented on July 19, 2024

yes, we should address this - there already is an issue here:
#6245

from sktime.
