Comments (24)

ogrisel avatar ogrisel commented on June 12, 2024 1

Now that notebook exports have been implemented I think we should bump up the priority of this one. Building the scikit-learn documentation takes ages and lazy reviewers tend to not check the rendering and the integration of example generated figures because of that. I have 12 cores on my workstation and only one is working at the moment...

from sphinx-gallery.

choldgraf avatar choldgraf commented on June 12, 2024 1

I dunno - I hear those joblib folks are a buncha jerks ;-)

(in seriousness, I'm +1 on a joblib dependency if it means avoiding a lot of multiprocessing complexity here...)

larsoner avatar larsoner commented on June 12, 2024 1

I imagine it could pretty easily be made an optional dependency only used/imported if the default is changed from 1 to something else, so +1 from me.
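
A minimal sketch of that optional-dependency idea, assuming a hypothetical `run_examples` helper and `n_jobs` option (neither is an existing sphinx-gallery name): joblib is imported only when more than one job is requested, so the default serial path needs nothing extra.

```python
def run_examples(build_one, paths, n_jobs=1):
    """Apply build_one to every path, in parallel only when asked to.

    `run_examples`, `build_one` and `n_jobs` are hypothetical names used
    for illustration; they are not sphinx-gallery functions or options.
    """
    if n_jobs == 1:
        # Default: plain serial loop, joblib is never imported.
        return [build_one(path) for path in paths]
    try:
        from joblib import Parallel, delayed  # optional dependency
    except ImportError as exc:
        raise ImportError(
            "n_jobs != 1 requires joblib; install it or keep n_jobs=1"
        ) from exc
    return Parallel(n_jobs=n_jobs)(delayed(build_one)(path) for path in paths)
```

Calling `run_examples(build_one, paths)` with the default therefore behaves the same whether or not joblib is installed.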

NelleV avatar NelleV commented on June 12, 2024 1

It depends how it is implemented. Using multiprocessing (/joblib), there should be no problem.

GaelVaroquaux avatar GaelVaroquaux commented on June 12, 2024

Hi,

I don't think this is high priority. I would much rather focus on getting IPython notebook output.

lesteve avatar lesteve commented on June 12, 2024

Agreed. FWIW I have some hacky lines of bash using GNU parallel to run the examples in parallel and make sure that they all work. For some reason I haven't fully investigated, the speed-up I got running the nilearn examples was only 2x (i.e. roughly 10 minutes instead of 20 minutes), even on 4 cores.

All I am trying to say is that such a speed-up is not going to change your life that much.

larsoner avatar larsoner commented on June 12, 2024

FYI you can now run your own examples using make html_dev-pattern, so some lines of bash can be used to customize it for each repo. It's not automated, but it works e.g. for CircleCI.

ogrisel avatar ogrisel commented on June 12, 2024

@lesteve this could probably be implemented with joblib.Parallel quite easily instead of using GNU parallel. One just needs to make sure that the example-building function does not return large Python objects to the main process but instead writes the output of the execution directly to disk (e.g. images and joblib-cached data).
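
A rough sketch of that constraint, assuming a hypothetical `build_example` worker (not a sphinx-gallery function): each call writes its output straight to disk and returns only a short path string, so very little would need to be pickled back to the parent process. The calls are made serially here to keep the sketch dependency-free.

```python
import tempfile
from pathlib import Path


def build_example(src_path, out_dir):
    """Hypothetical worker: run one example and persist its output.

    Executing the script and capturing figures is stood in for by a
    simple text transformation so the sketch stays self-contained.
    """
    src_path, out_dir = Path(src_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / (src_path.stem + ".rst")
    # Stand-in for the (potentially large) execution result.
    rendered = "Output of " + src_path.name + "\n"
    out_file.write_text(rendered)
    # Return something small and picklable, never the result itself.
    return str(out_file)


with tempfile.TemporaryDirectory() as tmp:
    written = [build_example(name, tmp) for name in ["plot_a.py", "plot_b.py"]]
    names = sorted(Path(p).name for p in written)
```

With joblib installed, the list comprehension could be swapped for `Parallel(n_jobs=4)(delayed(build_example)(name, tmp) for name in [...])` without changing the worker.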

lesteve avatar lesteve commented on June 12, 2024

> Now that notebook exports have been implemented I think we should bump up the priority of this one.

I wasn't aware that generating notebooks would take a lot of time... are you confident that's the culprit here?

I guess using joblib.Parallel would alleviate the problem seen in #57 (seaborn style set in one example was kept for all the other examples) since you could run each example in a separate process as mentioned in #140 (comment).
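
To illustrate why separate processes help, here is a small sketch that runs each script in a fresh interpreter via `subprocess` (a stand-in for joblib workers, chosen only to keep the example in the standard library). The two scripts and the `SG_DEMO_STYLE` variable are invented for illustration; the environment variable plays the role of the seaborn style that leaked between examples.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Two made-up example scripts: the first mutates global state, the
# second reports what it sees.
SETS_STATE = "import os\nos.environ['SG_DEMO_STYLE'] = 'dark'\n"
READS_STATE = "import os\nprint(os.environ.get('SG_DEMO_STYLE', 'default'))\n"


def run_isolated(script_path):
    """Run one script in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, str(script_path)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "plot_first.py"
    second = Path(tmp) / "plot_second.py"
    first.write_text(SETS_STATE)
    second.write_text(READS_STATE)
    run_isolated(first)  # changes state only inside its own process
    seen_by_second = run_isolated(second)
```

Because the first script's change dies with its process, the second script starts from the parent's unmodified state. Running both in one interpreter would instead carry the change over.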

NelleV avatar NelleV commented on June 12, 2024

That would be super useful for Matplotlib!

agramfort avatar agramfort commented on June 12, 2024

I think this is now timely

NelleV avatar NelleV commented on June 12, 2024

I've worked on this, and I have a working proof of concept using joblib. There are still a bunch of things I need to figure out, such as how to get the number of jobs the user passes to Sphinx (which isn't documented at all…).
How do you guys feel about adding joblib as a dependency? Or should I work only from the stdlib?

lesteve avatar lesteve commented on June 12, 2024

@NelleV out of interest, what kind of speed-up do you get with multiprocessing?

NelleV avatar NelleV commented on June 12, 2024

I'll make it optional.

choldgraf avatar choldgraf commented on June 12, 2024

(in particular here's the joblib checking code in MNE: https://github.com/mne-tools/mne-python/blob/master/mne/parallel.py#L77)

jschueller avatar jschueller commented on June 12, 2024

Hi @NelleV, do you still have that code somewhere?

NelleV avatar NelleV commented on June 12, 2024

Not easily available: I've changed computers since then, and I don't have the branch on GitHub… Also, the code has changed so much since then that I'm pretty sure my code would be useless these days.

jschueller avatar jschueller commented on June 12, 2024

I tried a simple approach: using ProcessPoolExecutor to parallelize the loop in generate_dir_rst, which iterates over the files within the same directory. No luck so far; maybe someone would want to check #877.

larsoner avatar larsoner commented on June 12, 2024

Thinking about this a bit more, I'd expect this not to work for (at least) the matplotlib, mayavi, and pyvista scrapers because these are all global-state based. And then there will be tricky interactions with reset_modules, which also by default does global state stuff at least for matplotlib. So I'm not sure this will ever work (easily) at least for the majority of our users :(

larsoner avatar larsoner commented on June 12, 2024

Ahh right, I hadn't thought about that!

jschueller avatar jschueller commented on June 12, 2024

ProcessPoolExecutor uses multiprocessing, right?
https://github.com/python/cpython/blob/3.10/Lib/concurrent/futures/process.py
