ML Hub

Multi-user hub which spawns, manages, and proxies multiple workspace instances.

Highlights • Getting Started • Features & Screenshots • Support • Report a Bug • Contribution

MLHub is based on Jupyterhub. It lets you create and manage multiple workspaces, for example to distribute them to a group of people or within a team.

Highlights

  • 💫 Create, manage, and access Jupyter notebooks.
  • 🖊️ Set configuration parameters such as CPU limits for started workspaces.
  • 🖥 Access additional tools within the started workspaces via secured routes.
  • 🎛 Tunnel SSH connections to workspace containers.

Getting Started

Prerequisites

  • Docker

Most of the configuration is identical to that of Jupyterhub 1.0.0. One difference is that SSL is not activated at the proxy or hub level, but on our nginx proxy.

Start an instance via Docker

docker run \
    -p 8091 \
    --name mlhub \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v jupyterhub_data:/data \
    ml-hub:latest

To persist the hub data, such as started workspaces and created users, mount a directory to /data (-v). Set a name (--name) for the mlhub container, because the workspace containers connect to the hub via its Docker name rather than its Docker ID. This way, the workspaces can still connect to the hub after it has been deleted and re-created (for example during an update).

For Kubernetes deployment, we forked and modified zero-to-jupyterhub-k8s which you can find here.

Configuration

Environment Variables

MLHub is based on SSH Proxy. Check out SSH Proxy for SSH-related configuration options.

| Variable | Description | Default |
| --- | --- | --- |
| START_SSH | Start the sshd process, which is used to tunnel SSH to the workspaces. | true |
| START_NGINX | Whether or not to start the nginx proxy. If the hub is used without additional tool routing to workspaces, this can be disabled. SSH port 22 then needs to be published separately. This option exists to work with zero-to-mlhub-k8s. | true |
| START_JHUB | Start the Jupyterhub hub. This option exists to work with zero-to-mlhub-k8s, where the image is also used as the CHP image. | true |
| START_CHP | Start the Jupyterhub proxy process separately. (The hub should then not start the proxy itself, which can be configured via the Jupyterhub config file.) This option exists to work with zero-to-mlhub-k8s, where the image is also used as the CHP image. | false |
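
As a rough illustration of how these variables combine with the Docker command from Getting Started, the following Python sketch assembles a `docker run` argument list. The helper name `docker_run_command` is hypothetical, and passing the values as the strings "true" and "false" via Docker's `-e` flag is an assumption based on the defaults listed above, not something the project documents in this form.

```python
# Defaults as listed in the environment variable table above.
DEFAULTS = {
    "START_SSH": "true",
    "START_NGINX": "true",
    "START_JHUB": "true",
    "START_CHP": "false",
}


def docker_run_command(overrides=None):
    """Build a `docker run` argument list for MLHub (hypothetical helper)."""
    settings = {**DEFAULTS, **(overrides or {})}
    unknown = sorted(set(settings) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"unknown variables: {', '.join(unknown)}")

    # Same flags as the Getting Started command.
    cmd = [
        "docker", "run",
        "-p", "8091",
        "--name", "mlhub",
        "-v", "/var/run/docker.sock:/var/run/docker.sock",
        "-v", "jupyterhub_data:/data",
    ]
    # Without nginx, SSH port 22 has to be published separately.
    if settings["START_NGINX"] == "false" and settings["START_SSH"] == "true":
        cmd += ["-p", "22"]
    # Only pass variables that differ from their default.
    for key, default in DEFAULTS.items():
        if settings[key] != default:
            cmd += ["-e", f"{key}={settings[key]}"]
    cmd.append("ml-hub:latest")
    return cmd
```

The returned list can be handed to `subprocess.run`, or joined with `shlex.join` to inspect the resulting command line before running it.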

Jupyterhub Config

Jupyterhub itself is configured via a config.py file. In the case of MLHub, a default config file is stored under /resources/jupyterhub_config.py. If you want to override settings or set extra ones, you can put another config file under /resources/jupyterhub_user_config.py. The following settings should probably not be overridden:

  • c.Spawner.environment - we set default variables there. Instead of overriding it, you can add extra variables to the existing dict, e.g. via c.Spawner.environment["myvar"] = "myvalue".
  • c.DockerSpawner.prefix and c.DockerSpawner.name_template - if you change those, check whether your SSH environment variables permit those names as a target. Also, consider setting c.Authenticator.username_pattern to prevent a user from having a username that is also a valid container name.
  • If you override ip and port connection settings, make sure to use Docker images that can handle those.
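
The points above can be sketched in Python as follows. This is a minimal example, assuming `c` is the config object available inside /resources/jupyterhub_user_config.py and that MLHub has already populated `c.Spawner.environment` with its defaults. The function wrapper, the variable `MY_VAR`, the container prefix `ws`, and the username pattern are all hypothetical; in the actual config file the two assignments would be written at the top level.

```python
import re

# Hypothetical values for illustration; adapt them to your deployment.
CONTAINER_PREFIX = "ws"
# Accept lowercase names, but reject anything that starts with "<prefix>-".
USERNAME_PATTERN = rf"^(?!{re.escape(CONTAINER_PREFIX)}-)[a-z][a-z0-9-]*$"


def apply_user_overrides(c):
    """Apply extra settings on top of the defaults already set on `c`."""
    # Add to the existing dict instead of replacing it, so the defaults survive.
    c.Spawner.environment["MY_VAR"] = "my-value"
    # Keep usernames from looking like container names.
    c.Authenticator.username_pattern = USERNAME_PATTERN
    return c
```

With this pattern, a username such as `alice` is accepted while `ws-alice` is rejected. The right pattern depends on the prefix and name template in use.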

Enable SSL/HTTPS

MLHub will automatically start with HTTPS. If you don't provide a certificate, it will generate one during startup. This is to make routing SSH connections possible as we use nginx to handle HTTPS & SSH on the same port.

If you have your own certificate, mount the certificate and key files as cert.crt and cert.key, respectively, as read-only at /resources/ssl, so that the container has access to /resources/ssl/cert.crt and /resources/ssl/cert.key.
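
One way to prepare such a mount is sketched below in Python. The helper `ssl_mount_args` is hypothetical; it checks that a local directory contains cert.crt and cert.key and returns the `-v` arguments for a read-only bind mount at /resources/ssl. Mounting the whole directory rather than the two files individually is a choice made for this sketch.

```python
from pathlib import Path

REQUIRED_FILES = ("cert.crt", "cert.key")


def ssl_mount_args(cert_dir):
    """Return `docker run` volume arguments for a certificate directory."""
    cert_dir = Path(cert_dir).resolve()
    # Fail early if one of the expected file names is missing.
    missing = [name for name in REQUIRED_FILES if not (cert_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {cert_dir}: {', '.join(missing)}")
    # ":ro" mounts the directory read-only inside the container.
    return ["-v", f"{cert_dir}:/resources/ssl:ro"]
```

The returned arguments can be added to the `docker run` command before the image name.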

Spawner

We override DockerSpawner and KubeSpawner for Docker and Kubernetes, respectively, to add convenient labels and environment variables. In addition, we return a custom options form for configuring the resources of the workspaces.

DockerSpawner

  • We create a separate Docker network for each user, which means that (named) workspaces of the same user can see each other, but workspaces of different users cannot. This adds another security layer in case a user starts a service within their own workspace and does not properly secure it.

KubeSpawner

  • We create and delete services for a workspace, so that the hub can access them via Kubernetes DNS.

Support

The ML Hub project is maintained by @raethlein and @LukasMasuch. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.

Type Channel
🚨 Bug Reports
🎁 Feature Requests
👩‍💻 Usage Questions
🗯 General Discussion

Features

WIP: Describe features with screenshots

Contribution


Licensed Apache 2.0. Created and maintained with ❤️ by developers from SAP in Berlin.
