Multi-user hub which spawns, manages, and proxies multiple workspace instances.
Highlights • Getting Started • Features & Screenshots • Support • Report a Bug • Contribution
MLHub is based on JupyterHub. It lets you create and manage multiple workspaces, for example to distribute them to a group of people or within a team.
- 💫 Create, manage, and access Jupyter notebooks.
- 🎛️ Set configuration parameters such as CPU limits for started workspaces.
- 🖥 Access additional tools within the started workspaces via secured routes.
- 🛠 Tunnel SSH connections to workspace containers.
- Docker
Most of the configuration is identical to that of JupyterHub 1.0.0. One difference is that SSL is not activated at the proxy or hub level, but on our nginx proxy.
docker run \
-p 8091 \
--name mlhub \
-v /var/run/docker.sock:/var/run/docker.sock \
-v jupyterhub_data:/data \
ml-hub:latest
To persist the hub data, such as started workspaces and created users, mount a directory to `/data` (`-v`).
A name (`--name`) should be set for the mlhub container, because workspace containers connect to the hub via its Docker name rather than its Docker ID. This way, the workspaces can still connect to the hub after it has been deleted and re-created (for example during an update).
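The command and the two notes above could also be captured in a small deployment script. The following is only a minimal sketch: the helper name `build_mlhub_run_command` and its parameters are hypothetical, and it uses nothing beyond the flags and values shown in the command above.

```python
import shlex


def build_mlhub_run_command(name="mlhub", data_volume="jupyterhub_data",
                            image="ml-hub:latest", port=8091):
    """Return the `docker run` argument list for an MLHub container (illustrative)."""
    if not name:
        # Workspaces reach the hub via its Docker name, so a name is required.
        raise ValueError("a container name is required")
    return [
        "docker", "run",
        "-p", str(port),
        "--name", name,
        # Docker socket mount, as in the command above.
        "-v", "/var/run/docker.sock:/var/run/docker.sock",
        # Persist hub data (users, started workspaces) under /data.
        "-v", f"{data_volume}:/data",
        image,
    ]


if __name__ == "__main__":
    # Print the command as a single shell-quoted line.
    print(shlex.join(build_mlhub_run_command()))
```

Keeping the name and the data volume as explicit parameters makes it harder to forget either of them when the hub container is re-created.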
For Kubernetes deployment, we forked and modified zero-to-jupyterhub-k8s; the fork is zero-to-mlhub-k8s.
MLHub is based on SSH Proxy. Check out SSH Proxy for ssh-related configurations.
Variable | Description | Default |
---|---|---|
START_SSH | Start the sshd process which is used to tunnel ssh to the workspaces. | true |
START_NGINX | Whether to start the nginx proxy. If the hub is used without additional tool routing to workspaces, this can be disabled; SSH port 22 then needs to be published separately. This option was added to work with zero-to-mlhub-k8s. | true |
START_JHUB | Start the JupyterHub hub. This option was added to work with zero-to-mlhub-k8s, where the image is also used as the CHP image. | true |
START_CHP | Start the JupyterHub proxy process separately. The hub should then not start the proxy itself, which can be configured via the JupyterHub config file. This option was added to work with zero-to-mlhub-k8s, where the image is also used as the CHP image. | false |
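To make the defaults in the table concrete, here is a minimal sketch of how such flags could be resolved from the environment. It is not MLHub's actual startup code: the function name `read_start_flags` and the rule that only the string "true" (in any case) enables a flag are assumptions made for illustration.

```python
import os

# Defaults as listed in the table above.
DEFAULTS = {
    "START_SSH": True,
    "START_NGINX": True,
    "START_JHUB": True,
    "START_CHP": False,
}


def read_start_flags(environ=None):
    """Resolve the START_* flags from an environment mapping (illustrative)."""
    environ = os.environ if environ is None else environ
    flags = {}
    for name, default in DEFAULTS.items():
        raw = environ.get(name)
        # Assumption: an unset variable keeps its default; otherwise only
        # the string "true" (case-insensitive) turns the flag on.
        flags[name] = default if raw is None else raw.strip().lower() == "true"
    return flags


if __name__ == "__main__":
    for flag, enabled in read_start_flags().items():
        print(f"{flag}={'true' if enabled else 'false'}")
```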
JupyterHub itself is configured via a `config.py` file. In the case of MLHub, a default config file is stored under `/resources/jupyterhub_config.py`. If you want to override settings or set extra ones, you can put another config file under `/resources/jupyterhub_user_config.py`. The following settings should probably not be overridden:
- `c.Spawner.environment` - we set default variables there. Instead of overriding it, you can add extra variables to the existing dict, e.g. via `c.Spawner.environment["myvar"] = "myvalue"`.
- `c.DockerSpawner.prefix` and `c.DockerSpawner.name_template` - if you change those, check whether your SSH environment variables permit those names as a target. Also, think about setting `c.Authenticator.username_pattern` to prevent a user from having a username that is also a valid container name.
- If you override ip and port connection settings, make sure to use Docker images that can handle those.
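A user config following the advice above might look like the sketch below. It assumes that the default config has already populated `c.Spawner.environment` as a dict, as described above. The `SimpleNamespace` stand-in, the `EXISTING_DEFAULT` key, and the concrete username pattern are hypothetical and exist only so the example can be run outside JupyterHub.

```python
# Sketch of /resources/jupyterhub_user_config.py
import re

try:
    c = get_config()  # noqa: F821 - injected when JupyterHub loads the file
except NameError:
    # Stand-in so the sketch runs outside JupyterHub; not part of a real config.
    from types import SimpleNamespace

    c = SimpleNamespace(
        Spawner=SimpleNamespace(environment={"EXISTING_DEFAULT": "value"}),
        Authenticator=SimpleNamespace(),
    )

# Add to the default environment instead of replacing the whole dict.
c.Spawner.environment["myvar"] = "myvalue"

# Hypothetical pattern: lowercase letters, digits, and dashes only. Choose a
# pattern that cannot produce one of your container names.
c.Authenticator.username_pattern = r"^[a-z][a-z0-9-]{0,30}$"

if __name__ == "__main__":
    for candidate in ("alice-1", "ws-Alice/1"):
        allowed = bool(re.match(c.Authenticator.username_pattern, candidate))
        print(candidate, "allowed" if allowed else "rejected")
```

Note that `c.Spawner.environment` is only extended here; `c.DockerSpawner.prefix` and `c.DockerSpawner.name_template` are deliberately left untouched.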
MLHub will automatically start with HTTPS. If you don't provide a certificate, it will generate one during startup. This is to make routing SSH connections possible as we use nginx to handle HTTPS & SSH on the same port.
If you have your own certificate, mount the certificate and key files as `cert.crt` and `cert.key`, respectively, as read-only at `/resources/ssl`, so that the container has access to `/resources/ssl/cert.crt` and `/resources/ssl/cert.key`.
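As an illustration of the mount described above, the following sketch checks that a host directory contains both files and returns the matching `docker run` volume arguments. The helper name `ssl_mount_args` and the idea of validating the directory up front are assumptions; only the file names and the `/resources/ssl` target come from the text above.

```python
from pathlib import Path


def ssl_mount_args(host_cert_dir):
    """Return `docker run` volume arguments for a read-only certificate mount."""
    cert_dir = Path(host_cert_dir).resolve()
    missing = [
        name for name in ("cert.crt", "cert.key")
        if not (cert_dir / name).is_file()
    ]
    if missing:
        raise FileNotFoundError(f"missing in {cert_dir}: {', '.join(missing)}")
    # The ":ro" suffix mounts the directory read-only inside the container.
    return ["-v", f"{cert_dir}:/resources/ssl:ro"]
```

The returned list can be appended to the other `docker run` arguments before the image name.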
We override DockerSpawner and KubeSpawner for Docker and Kubernetes, respectively. We do so to add convenient labels and environment variables. Further, we return a custom option form to configure the resources of the workspaces.
- DockerSpawner: We create a separate Docker network for each user, which means that (named) workspaces of the same user can see each other but workspaces of different users cannot. This adds another security layer in case a user starts a service within their own workspace and does not properly secure it.
- KubeSpawner: We create and delete services for a workspace, so that the hub can access them via Kubernetes DNS.
The ML Hub project is maintained by @raethlein and @LukasMasuch. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.
Type | Channel |
---|---|
🚨 Bug Reports | |
🎁 Feature Requests | |
👩‍💻 Usage Questions | |
🗯 General Discussion | |
WIP: Describe features with screenshots
- Pull requests are encouraged and always welcome. Read `CONTRIBUTING.md` and check out help-wanted issues.
- Submit GitHub issues for any feature enhancements, bugs, or documentation problems.
- By participating in this project you agree to abide by its Code of Conduct.
Licensed Apache 2.0. Created and maintained with ❤️ by developers from SAP in Berlin.