
JupyterLab for Data Science and Knowledge Graphs

JupyterLab image with VisualStudio Code server integrated, based on the jupyter/docker-stacks scipy image, with additional packages and kernels installed for data science and knowledge graphs.

🔋 Features

List of features for the images available to run on CPU.

ghcr.io/maastrichtu-ids/jupyterlab:latest

This is the base image with useful interfaces and libraries for data science preinstalled:

๐Ÿ“‹๏ธ VisualStudio Code server is installed, and accessible from the JupyterLab Launcher

๐Ÿ Python 3.8 with notebook kernel supporting autocomplete and suggestions (jupyterlab-lsp)

โ˜•๏ธ Java OpenJDK 11 with IJava notebook kernel

๐Ÿ Conda and mamba are installed, each conda environment created will add a new option to create a notebook using this environment in the JupyterLab Launcher (with nb_conda_kernels). You can create environments using different version of Python if necessary.

๐Ÿง‘โ€๐Ÿ’ป ZSH is used by default for the JupyterLab and VisualStudio Code terminals

The following JupyterLab extensions are also installed: jupyterlab-git, jupyterlab-system-monitor, jupyter_bokeh, plotly, jupyterlab-spreadsheet, jupyterlab-drawio.
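The conda integration described above can be exercised from the JupyterLab terminal. This is a hedged sketch with a hypothetical environment name; it relies on mamba being preinstalled, as the features list states:

```shell
# Create an environment with a different Python version.
# Including ipykernel (and having nb_conda_kernels available)
# is what makes it appear as a notebook option in the Launcher:
mamba create -n py39-demo -y python=3.9 ipykernel
```

After a short delay, a "Python [conda env:py39-demo]" entry should show up on the JupyterLab Launcher page.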

ghcr.io/maastrichtu-ids/jupyterlab:knowledge-graph

Extended from ghcr.io/maastrichtu-ids/jupyterlab:latest, it adds:

โœจ๏ธ SPARQL kernel to query RDF knowledge graphs

โœจ๏ธ Apache Spark and PySpark are installed for distributed data processing

💎 OpenRefine is installed, and accessible from the JupyterLab Launcher

🦀 Oxigraph SPARQL database

โšก๏ธ Blazegraph SPARQL database

โ˜•๏ธ Java .jar programs for knowledge graph processing are pre-downloaded in the /opt folder, such as RDF4J, Apache Jena, OWLAPI, RML mapper.

ghcr.io/maastrichtu-ids/jupyterlab:r-notebook

📈 R kernel

Automatically install your code and dependencies

With these Docker images, you can optionally provide the URL of a Git repository to be automatically cloned into the workspace when the container starts, using the GIT_URL environment variable.
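For example, cloning a repository at startup might look like this (the repository URL is just a placeholder to replace with your own):

```shell
docker run --rm -it -p 8888:8888 \
    -e JUPYTER_TOKEN=password \
    -e GIT_URL=https://github.com/your-user/your-repo \
    -v $(pwd)/data:/home/jovyan/work \
    ghcr.io/maastrichtu-ids/jupyterlab:latest
```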

The following files will be automatically installed if they are present at the root of the provided Git repository:

  • The conda environment described in environment.yml will be installed. Make sure you add ipykernel and nb_conda_kernels to the environment.yml to be able to easily start notebooks using this environment from the JupyterLab Launcher page. See this repository as an example.
  • The Python packages in requirements.txt will be installed with pip
  • The Debian packages in packages.txt will be installed with apt-get
  • The JupyterLab extensions in extensions.txt will be installed with jupyter labextension
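A minimal environment.yml that works with the Launcher integration might look like this (the environment name and extra packages are illustrative):

```yaml
name: custom-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - ipykernel          # required to start notebooks with this environment
  - nb_conda_kernels   # exposes the environment in the JupyterLab Launcher
  - pandas
```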

You can also create a conda environment from a file in a running JupyterLab (we use mamba which is like conda but faster):

mamba env create -f environment.yml

You'll need to wait a minute before the new conda environment becomes available on the JupyterLab Launcher page.

๐Ÿ“ Extend a CPU image

The easiest way to build a custom image is to extend the existing images.

For notebooks running on CPU, we use images from the official jupyter/docker-stacks, which run as a non-root user, so you will need to make sure the folder permissions are properly set for the notebook user.

Here is an example Dockerfile to extend ghcr.io/maastrichtu-ids/jupyterlab:latest:

FROM ghcr.io/maastrichtu-ids/jupyterlab:latest
# Change to root user to install packages requiring admin privileges:
USER root
RUN apt-get update && \
    apt-get install -y vim
RUN fix-permissions /home/$NB_USER
# Switch back to the notebook user for other packages:
USER ${NB_UID}
RUN mamba install -c defaults -y rstudio
RUN pip install jupyter-rsession-proxy

For Docker images that are not based on the jupyter/docker-stacks, such as the GPU images, the root user is used by default. See further in this README for more information on how to extend GPU images.

๐Ÿณ Run a CPU image with Docker

For the ghcr.io/maastrichtu-ids/jupyterlab:latest image, volumes should be mounted into the /home/jovyan/work folder.

This command will start JupyterLab as the jovyan user with sudo privileges; use JUPYTER_TOKEN to define your password:

docker run --rm -it --user root -p 8888:8888 -e GRANT_SUDO=yes -e JUPYTER_TOKEN=password -v $(pwd)/data:/home/jovyan/work ghcr.io/maastrichtu-ids/jupyterlab

You should now be able to install anything in the JupyterLab container, try:

sudo apt-get update

You can check the docker-compose.yml file to run it easily with Docker Compose.
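The exact contents live in the repository's docker-compose.yml; a roughly equivalent sketch, mirroring the docker run command above (values are illustrative, not the actual file), might look like:

```yaml
services:
  jupyterlab:
    image: ghcr.io/maastrichtu-ids/jupyterlab:latest
    user: root
    environment:
      - GRANT_SUDO=yes
      - JUPYTER_TOKEN=password
    ports:
      - "8888:8888"
    volumes:
      - ./data:/home/jovyan/work
```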

Run with a restricted jovyan user, without sudo privileges:

docker run --rm -it --user $(id -u) -p 8888:8888 -e CHOWN_HOME=yes -e CHOWN_HOME_OPTS='-R' -e JUPYTER_TOKEN=password -v $(pwd)/data:/home/jovyan/work ghcr.io/maastrichtu-ids/jupyterlab:latest

โš ๏ธ Potential permission issue when running locally. The official jupyter/docker-stacks images use the jovyan user by default which does not grant admin rights (sudo). This can cause issues when writing to the shared volumes, to fix it you can change the owner of the folder, or start JupyterLab as root user.

To create the folder with the right permissions, replace 1000:100 with your user:group if necessary, and run:

mkdir -p data/
sudo chown -R 1000:100 data/

📦 Build CPU images

Instructions to build the various images intended to run on CPU.

JupyterLab for Data Science

This repository contains multiple folders with Dockerfiles to build various flavors of JupyterLab for Data Science.

With Python 3.8, conda integration, VisualStudio Code, Java and SPARQL kernels

Build:

docker build -t ghcr.io/maastrichtu-ids/jupyterlab .

Run:

docker run --rm -it --user root -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/home/jovyan/work ghcr.io/maastrichtu-ids/jupyterlab

Push:

docker push ghcr.io/maastrichtu-ids/jupyterlab

JupyterLab for Knowledge Graphs

With the Oxigraph and Blazegraph SPARQL databases, and additional Python/Java libraries for RDF processing:

docker build -f knowledge-graph/Dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:knowledge-graph .
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:knowledge-graph

Python 2.7

With a python2.7 kernel only (python3 not installed). Build and run (workdir is /root):

docker build -t ghcr.io/maastrichtu-ids/jupyterlab:python2.7 ./python2.7
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:python2.7

Ricopili

Based on https://github.com/bruggerk/ricopili_docker. Build and run (workdir is /root):

docker build -t ghcr.io/maastrichtu-ids/jupyterlab:ricopili ./ricopili
docker run --rm -it -p 8888:8888 -v $(pwd)/data:/root -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:ricopili

FSL on CPU

Built with https://github.com/ReproNim/neurodocker. Build and run (workdir is /root):

docker build -t ghcr.io/maastrichtu-ids/jupyterlab:fsl ./fsl
docker run --rm -it -p 8888:8888 -v $(pwd)/data:/root -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:fsl

โšก๏ธJupyterLab on GPU

To deploy JupyterLab on GPU we use the official Nvidia images. We defined the same gpu.dockerfile to install additional dependencies, such as JupyterLab and VisualStudio Code, on top of different base images from Nvidia:

๐Ÿ—œ๏ธ TensorFlow with nvcr.io/nvidia/tensorflow:

  • ghcr.io/maastrichtu-ids/jupyterlab:tensorflow

🔥 PyTorch with nvcr.io/nvidia/pytorch:

  • ghcr.io/maastrichtu-ids/jupyterlab:pytorch

๐Ÿ‘๏ธ CUDA with nvcr.io/nvidia/cuda:

  • ghcr.io/maastrichtu-ids/jupyterlab:cuda

Volumes should be mounted into the /workspace/persistent or /workspace folder.

๐Ÿ“ Extend a GPU image

The easiest way to build a custom image is to extend the existing images.

Here is an example Dockerfile to extend ghcr.io/maastrichtu-ids/jupyterlab:tensorflow based on nvcr.io/nvidia/tensorflow:

FROM ghcr.io/maastrichtu-ids/jupyterlab:tensorflow
RUN apt-get update && \
    apt-get install -y vim
RUN pip install jupyter-tensorboard

📦 Build GPU images

You will find here the commands to build our different GPU Docker images; most of them use the gpu.dockerfile.

Tensorflow on GPU

Change the build-arg and run from the root folder of this repository:

docker build --build-arg NVIDIA_IMAGE=nvcr.io/nvidia/tensorflow:21.11-tf2-py3 -f gpu.dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:tensorflow .

Run an image on http://localhost:8888

docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/workspace/persistent ghcr.io/maastrichtu-ids/jupyterlab:tensorflow

CUDA on GPU

Change the build-arg and run from the root folder of this repository:

docker build --build-arg NVIDIA_IMAGE=nvcr.io/nvidia/cuda:11.4.2-devel-ubuntu20.04 -f gpu.dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:cuda .

Run an image on http://localhost:8888

docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/workspace/persistent ghcr.io/maastrichtu-ids/jupyterlab:cuda

PyTorch on GPU

Change the build-arg and run from the root folder of this repository:

docker build --build-arg NVIDIA_IMAGE=nvcr.io/nvidia/pytorch:21.11-py3 -f gpu.dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:pytorch .

Run an image on http://localhost:8888

docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/workspace/persistent ghcr.io/maastrichtu-ids/jupyterlab:pytorch

FSL on GPU

This build uses a different image: go to the fsl-gpu folder and check its README.md for more details.

Build:

docker build -t ghcr.io/maastrichtu-ids/jupyterlab:fsl-gpu ./fsl-gpu

Run (workdir is /workspace):

docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:fsl-gpu

โ˜๏ธ Deploy on Kubernetes and OpenShift

This image is compatible with OpenShift and OKD security constraints to run as non root user.

We recommend using this Helm chart to deploy these JupyterLab images on Kubernetes or OpenShift: https://artifacthub.io/packages/helm/dsri-helm-charts/jupyterlab
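Deploying with the chart typically looks like the following sketch; the Helm repository URL and release name are assumptions, so check the Artifact Hub page above for the exact instructions and available values:

```shell
helm repo add dsri-helm-charts https://maastrichtu-ids.github.io/dsri-helm-charts
helm repo update
# Install a release named "jupyterlab" (configuration values depend on the chart):
helm install jupyterlab dsri-helm-charts/jupyterlab
```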

If you are working or studying at Maastricht University, you can easily deploy this notebook on the Data Science Research Infrastructure (DSRI) 🌉

๐Ÿ•Š๏ธ Contribute to this repository

Choose which image fits your needs: latest, tensorflow, cuda, pytorch, freesurfer, python2.7...

  1. Fork this repository.

  2. Clone the forked repository.

  3. Edit the Dockerfile for the image you want to improve. Preferably use mamba or conda to install new packages; you can also install with apt-get (needs to run as root or with sudo) and pip.

  4. Go to the folder and rebuild the Dockerfile:

docker build -t jupyterlab -f Dockerfile .

  5. Run the built Docker image on http://localhost:8888 to test it:

docker run -it --rm -p 8888:8888 -e JUPYTER_TOKEN=yourpassword jupyterlab

If the built Docker image works well, feel free to send a pull request to get your changes merged into the main repository and integrated into the corresponding published Docker image.

You can check the size of the built image in MB:

expr $(docker image inspect ghcr.io/maastrichtu-ids/jupyterlab:latest --format='{{.Size}}') / 1000000

Contributors

  • vemonet