Comments (6)
Ideas on how to delineate what to document
I think it could be good to try to draw a line between setting up and starting dask-gateway clusters and working against them. Setting up a dask-gateway cluster for a user in a 2i2c hub means using a dask-gateway client to request its creation and configure various details.
Using the dask cluster created by dask-gateway shouldn't require dask-gateway-specific details; it's just typical dask work against a scheduler and workers that happened to be created via dask-gateway.
The docs of relevance can be grouped into:
- Use of the dask-gateway client in 2i2c hubs to create dask clusters
  This could be docs in dask/dask-gateway to some degree, but not fully, because we pre-configure things that aren't obvious candidates for pre-configuration, such as using the same docker image for scheduler/worker pods as the user image. Due to this, I think we merit having our own docs page about this topic rather than trying to upstream it.
- Use of dask to work against a dask cluster
  This shouldn't be 2i2c specific, and ideally we settle for providing a basic example of doing test work.
- More in-depth technical notes on 2i2c's dask-gateway setup
  We've made miscellaneous decisions that can influence users and that should be documented somewhere.
  - dask-scheduler pod's memory/cpu resource requests
  - use of cheaper pre-emptible nodes, and use of 16 CPU / 128GB machines currently
  - ...
Example code to create a dask-gateway cluster
Here is code from input cells, plus screenshots of output cells, in a Jupyter notebook I use to test that dask-gateway functions as it's set up in 2i2c hubs. This can be tested via https://dask-staging.2i2c.cloud.
# Create a gateway object to speak with dask-gateway,
# which in turn can create the dask cluster for you.
#
from dask_gateway import Gateway
gateway = Gateway()
# Request information about the options you can configure
# on a to-be-created dask cluster.
#
# All options are optional.
#
options = gateway.cluster_options()
options
# Now let's create a cluster. After running this cell, you get
# a control panel view to add/remove workers. Manually add at
# least one.
#
# If a new server needs to be started, it will take ~5 minutes
# for it to register and update the number of workers.
#
cluster = gateway.new_cluster(options)
cluster
cluster.shutdown()
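To go with the cells above, here is a minimal sketch of the "basic example of doing test work" idea: once the cluster exists, the work itself is plain dask. The helper name `run_smoke_test` and the `main()` wrapper are hypothetical names chosen for illustration; `main()` assumes it runs on a hub where `dask-gateway` is installed and pre-configured, and it is not called automatically.

```python
def run_smoke_test(client):
    """Submit one trivial task and return its result.

    `client` only needs a `submit()` method that returns a future with a
    `result()` method, which is the interface a dask `Client` provides.
    """
    future = client.submit(sum, range(10))
    return future.result()


def main():
    # Assumption: dask-gateway is installed and the hub pre-configures the
    # gateway address and authentication, so Gateway() needs no arguments.
    from dask_gateway import Gateway

    gateway = Gateway()
    cluster = gateway.new_cluster()
    try:
        # Ask for one worker; this can take several minutes if a new
        # node has to be started first.
        cluster.scale(1)
        client = cluster.get_client()
        print(run_smoke_test(client))
    finally:
        # Always release the cluster, even if the test fails.
        cluster.shutdown()
```

Calling `main()` in a notebook cell on the hub runs the whole round trip. Nothing in `run_smoke_test` is dask-gateway specific, which is the point of separating the two topics in the docs.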
EDIT: Ah, I've just figured out that the 'Configurator' was set to quay.io/jupyter/scipy-notebook:2024-03-18. Does this need to be changed back?
ORIGINAL POST:
The workflow above worked for me last week, but I am now having trouble reproducing it on https://dask-staging.2i2c.cloud/.
I think the configs specify pangeo/pangeo-notebook:latest for the image to pull; however, on the hub I get the quay.io/jupyter/scipy-notebook:2024-03-18 image, which does not have dask-gateway installed.
jovyan@jupyter-jwong-402i2c-2eorg:~$ env | grep IMAGE
JUPYTER_IMAGE=quay.io/jupyter/scipy-notebook:2024-03-18
DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE={JUPYTER_IMAGE_SPEC}
JUPYTER_IMAGE_SPEC=quay.io/jupyter/scipy-notebook:2024-03-18
Currently working around this by manually specifying a custom image on the Community Showcase Hub instead.
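For anyone hitting the same thing, the check above can be scripted. This is a rough sketch only: it assumes the dask-gateway client fills `{NAME}` placeholders in `DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE` from environment variables, as the output above suggests, and the function names are hypothetical.

```python
import importlib.util
import os


def resolve_image_option(env=None):
    """Return the image the cluster option resolves to, or None if unset.

    Assumption: "{NAME}" placeholders in the option value are filled from
    environment variables of the same name.
    """
    env = os.environ if env is None else env
    template = env.get("DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE")
    if template is None:
        return None
    # Raises KeyError if the template names a variable that is not set.
    return template.format(**env)


def has_dask_gateway():
    """Report whether the dask_gateway package is importable in this image."""
    return importlib.util.find_spec("dask_gateway") is not None
```

Running `resolve_image_option()` and `has_dask_gateway()` in a notebook on the hub shows which image the scheduler and worker pods would use and whether the current user image can talk to the gateway at all.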
@jnywong I reset the dask-staging hub to use pangeo/pangeo-notebook:latest again; I figure that image makes sense for that hub. I think it could have been me who updated it to scipy-notebook and forgot to change it back. Sorry for the trouble!
Reference material for some technical context as to why we resource dask clusters the way we do: 2i2c-org/infrastructure#2687
Sorry I've been unable to make progress, as I have only just finished work on 2i2c-org/team-compass#859. This item on documenting dask-gateway will be committed to my next sprint today.
Thank you Erik for providing guidance on scope and context. I don't have a great deal of experience using dask-gateway, so these notes are appreciated.
2024-05-09. @jnywong is serving as shepherd on this one. This will likely be addressed in the next sprint. This is not critical but needs to be written. This can be pulled into the next iteration so that an associated FreshDesk ticket can be closed.
@choldgraf asks where the documentation will appear as the SSOT. The mirrored documentation in the 2i2c site will be dropped in favor of the docs site.