Comments (24)
Now that notebook exports have been implemented I think we should bump up the priority of this one. Building the scikit-learn documentation takes ages, and because of that lazy reviewers tend not to check the rendering or the integration of example-generated figures. I have 12 cores on my workstation and only one is working at the moment...
from sphinx-gallery.
I dunno - I hear those joblib folks are a buncha jerks ;-)
(in seriousness, I'm +1 on a joblib dependency if it means avoiding a lot of multiprocessing complexity here...)
I imagine it could pretty easily be made an optional dependency only used/imported if the default is changed from 1 to something else, so +1 from me.
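A minimal sketch of what such an optional dependency could look like, assuming a hypothetical `run_examples` helper and `n_jobs` option (neither is an existing sphinx-gallery name). joblib is imported only when the default of 1 is changed:

```python
def run_examples(func, paths, n_jobs=1):
    """Apply ``func`` to every path, going parallel only when n_jobs != 1.

    ``run_examples`` and ``n_jobs`` are illustrative names, not part of
    sphinx-gallery.
    """
    if n_jobs == 1:
        # Default path: plain serial loop, joblib is never imported.
        return [func(path) for path in paths]
    try:
        # Imported lazily so joblib stays an optional dependency.
        from joblib import Parallel, delayed
    except ImportError as exc:
        raise RuntimeError("n_jobs != 1 requires joblib to be installed") from exc
    return Parallel(n_jobs=n_jobs)(delayed(func)(path) for path in paths)


if __name__ == "__main__":
    # Serial run; works whether or not joblib is installed.
    print(run_examples(str.upper, ["plot_a.py", "plot_b.py"]))
```

With the default `n_jobs=1`, users who never ask for parallelism would not need joblib at all.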
It depends on how it is implemented. Using multiprocessing (/joblib), there should be no problem.
Hi,
I think that this is not high priority. I would much rather focus on getting IPython notebook output.
Agreed. FWIW I have some hacky lines of bash using GNU parallel to run the examples in parallel and make sure that they all work. For some reason I haven't fully investigated, the speed-up I got running the nilearn examples was only 2x (i.e. roughly 10 minutes instead of 20 minutes), even on 4 cores.
All I am trying to say is that such a speed-up is not going to change your life that much.
FYI you can now run your own examples using make html_dev-pattern, so some lines of bash can be used to customize it for each repo. It's not automated, but it works e.g. for CircleCI.
@lesteve this could probably be implemented with joblib.Parallel quite easily instead of using GNU parallel. One just needs to make sure that the example building function does not return large Python objects back to the main process but instead writes the output of the execution directly to disk (e.g. images and joblib cached stuff).
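To make the "write to disk, return little" idea concrete, here is a rough sketch using only the standard library. `build_example` and the `.out.txt` naming are made up for illustration and are not sphinx-gallery's actual building function. The worker runs one script, saves its captured stdout next to the other outputs, and hands back only a short path string:

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path


def build_example(src_path, out_dir):
    """Execute one example script and write its captured stdout to disk.

    Hypothetical helper: only the output file path (a short string) is
    returned, so little data would have to travel back from a worker.
    """
    src_path = Path(src_path)
    out_path = Path(out_dir) / (src_path.stem + ".out.txt")
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Run the script as if it were invoked as a top-level program.
        runpy.run_path(str(src_path), run_name="__main__")
    out_path.write_text(buffer.getvalue(), encoding="utf-8")
    return str(out_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        example = Path(tmp) / "plot_demo.py"
        example.write_text("print('hello from the example')\n", encoding="utf-8")
        written = build_example(example, tmp)
        print(Path(written).read_text(encoding="utf-8"))
```

A function shaped like this could then be handed to joblib.Parallel via `delayed(build_example)(path, out_dir)`, since both its arguments and its return value are small and picklable.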
> Now that notebook exports have been implemented I think we should bump up the priority of this one.
I wasn't aware that generating notebooks would take a lot of time. Are you confident that's the culprit here?
I guess using joblib.Parallel would alleviate the problem seen in #57 (seaborn style set in one example was kept for all the other examples) since you could run each example in a separate process as mentioned in #140 (comment).
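As a rough illustration of the process-isolation point (not the joblib-based approach itself), the sketch below runs each script in a fresh interpreter via `subprocess`, dispatched from a small thread pool. The function names and the `json.STYLE` attribute are invented for the demo; the attribute merely stands in for something like a plotting style set by one example:

```python
import subprocess
import sys
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_in_fresh_interpreter(path):
    """Run one script in its own Python process; return (name, code, stdout)."""
    proc = subprocess.run(
        [sys.executable, str(path)],
        capture_output=True,
        text=True,
    )
    return Path(path).name, proc.returncode, proc.stdout


def run_all(paths, n_jobs=2):
    """Dispatch scripts from a few threads; each thread waits on a subprocess."""
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # pool.map yields results in the same order as the input paths.
        return list(pool.map(run_in_fresh_interpreter, paths))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "plot_a.py"
        second = Path(tmp) / "plot_b.py"
        # plot_a mutates module-level state, standing in for a style change.
        first.write_text("import json\njson.STYLE = 'dark'\nprint('a done')\n")
        # plot_b checks whether that mutation is visible to it.
        second.write_text("import json\nprint(hasattr(json, 'STYLE'))\n")
        for name, code, out in run_all([first, second]):
            print(name, code, out.strip())
```

Because every script gets its own interpreter, state set by one example cannot leak into the next, at the cost of paying interpreter start-up and imports once per example.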
That would be super useful for Matplotlib!
I think this is now timely
I've worked on this, and I have a working proof of concept using joblib. There are still a bunch of things I need to figure out, such as how to get the number of jobs provided by the user to sphinx (which isn't documented at all…).
How do you guys feel about adding joblib as a dependency? Should I work only from the stdlib?
@NelleV out of interest, what kind of speed-up do you get with multiprocessing?
I'll make it optional.
(in particular here's the joblib checking code in MNE: https://github.com/mne-tools/mne-python/blob/master/mne/parallel.py#L77)
Hi @NelleV, do you still have that code somewhere?
Not easily available: I have changed computers since then, and I don't have the branch on GitHub… Also, the code has changed so much since then that I'm pretty sure my code would be useless these days.
I tried a simple approach: using ProcessPoolExecutor to parallelize the loop in generate_dir_rst, which iterates over the files within the same directory, but with no luck. Maybe someone would want to check #877.
Thinking about this a bit more, I'd expect this not to work for (at least) the matplotlib, mayavi, and pyvista scrapers because these are all global-state based. And then there will be tricky interactions with reset_modules, which also by default does global state stuff at least for matplotlib. So I'm not sure this will ever work (easily) at least for the majority of our users :(
Ahh right, I hadn't thought about that!
ProcessPoolExecutor uses multiprocessing, right?
https://github.com/python/cpython/blob/3.10/Lib/concurrent/futures/process.py