infuseai / primehub

open-source MLOps platform

Home Page: https://docs.primehub.io

License: Apache License 2.0

Python 26.34% Shell 27.37% Smarty 0.05% Makefile 0.52% Jsonnet 2.73% NASL 2.10% CSS 1.90% JavaScript 11.80% HTML 5.88% Gherkin 18.09% Mustache 2.81% Dockerfile 0.17% HCL 0.21%
jupyter jupyterhub keycloak kubernetes machine-learning data-science docker primehub-ce primehub distributed-systems

primehub's Introduction

PrimeHub Community Edition

Welcome to the PrimeHub Community Edition repository. PrimeHub is an effortless infrastructure for machine learning built on top of Kubernetes. It provides cluster computing, one-click research environments, easy dataset loading, and management of resources and access control, all designed around a project- and team-centric concept.

PrimeHub CE provides a few fundamental features of the Enterprise Edition.

For IT leaders, PrimeHub offers the flexibility and administrative authority to configure resources and settings for their teams, and to prepare and manage production workloads.

For data scientists, PrimeHub provides a Jupyter Notebook-ready environment that is just a few clicks away.

This community repository contains a Helm Chart for PrimeHub CE and a guide on how to install PrimeHub CE with Helm.

AWS launch links

Edition Launch link
Community Edition Launch Stack
Enterprise Edition Launch Stack

Fundamental Features

  • Opinionated JupyterHub distribution
  • Group & user based resource management
  • Instance, image & secret management
  • Support for different types of datasets
  • Dataset uploader
  • SSH server (allow access into JupyterHub via ssh remotely)

What makes PrimeHub different

Please see the comparison.

Installation

Please see the installation guide.

Contributions

We welcome contributions. See Set up dev environment and the Contributing guideline to get started.

Project Status

PrimeHub CE is released alongside PrimeHub EE, and the project is under steady development. We keep improving PrimeHub's robustness, enhancing the user experience, and releasing more features with the community. Suggestions and discussions are always welcome and appreciated.

Documentation

Designs & Concepts

PrimeHub is built on top of well-designed distributed systems. We use Kubernetes as the orchestration platform and utilize its resource management and fault-tolerance abilities.

You can read more about the designs & concepts of PrimeHub, or visit our documentation site to learn more about PrimeHub.

primehub's People

Contributors

aar0ntw, chanyilin, clkao, ctiml, daveflynn, even-wei, gblpedia, ggosiang, hlb, hsatac, huaichehuang, jackklpan, jimmyliao, kentwelcome, kristenwei, liuyuwei, neighborhood999, popcornylu, qrtt1, raymondnguyen8

primehub's Issues

logout not working

Reproduce step

  • login
  • logout

Result

The user is not actually logged out

Expected

The user should go to login page.

Idempotent bootstrap (with service account)

Introduction

When running helm upgrade, the bootstrap script should update the client settings idempotently.

Tasks

  • add the primehub client that plays the admin-ui role in CE
  • bootstrap with a service account
  • primehub should enable the service account with the realm-management client role
  • check whether the Keycloak mapper update API behaves idempotently (e.g. when adding a timezone mapper)

Testing Scenario

  1. Install
  2. Upgrade
  3. Upgrade again
  4. Add a new protocol mapper (e.g. timezone) and upgrade again.
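
The idempotency requirement in the tasks above can be illustrated with a minimal sketch. It works on an in-memory list rather than the real Keycloak API; `ensure_mapper`, `client_mappers`, and the mapper fields are hypothetical names chosen for illustration only.

```python
from typing import Dict, List


def ensure_mapper(mappers: List[Dict[str, str]], desired: Dict[str, str]) -> bool:
    """Make sure `desired` is present in `mappers`, matched by its name.

    Returns True if the list was changed, False if it was already up to date.
    """
    for index, existing in enumerate(mappers):
        if existing["name"] == desired["name"]:
            if existing == desired:
                return False                 # already in the desired state
            mappers[index] = dict(desired)   # update in place, no duplicate
            return True
    mappers.append(dict(desired))            # first run: create it
    return True


# Hypothetical client state standing in for what a Keycloak client would hold.
client_mappers: List[Dict[str, str]] = []
timezone = {"name": "timezone", "claim": "timezone"}

# Install, upgrade, upgrade again: only the first run changes anything.
changes = [ensure_mapper(client_mappers, timezone) for _ in range(3)]
print(changes)              # [True, False, False]
print(len(client_mappers))  # 1
```

Matching on the mapper name and comparing before writing is what keeps repeated upgrades from creating duplicates or failing on "already exists" responses.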

[Bug] one-click version of the README badge differs from https://one.primehub.io/

What happened:
The link of the badge: v1.1.3 starter
The link from the official site: v1.1.4 https://one.primehub.io/

What you expected to happen:

  • By using the one-click badge I can build a stack that allows me to deploy at least one model.
  • By using the one-click badge I can select ee on building the stack.

How to reproduce it (as minimally and precisely as possible):

  • Click the badge from this repository and the official site.
  • After building the stack, go to the Deployment tab of your PrimeHub console.

Anything else we need to know?:
Actual result: v1.1.3 (the badge of this repository) only provides ce and (of course) does not allow us to deploy models.

warning: cannot overwrite table with non table for extraEnv

Problem

When installing, these warning messages appear:

2019/06/20 18:49:49 warning: cannot overwrite table with non table for extraEnv (map[])
2019/06/20 18:49:49 warning: cannot overwrite table with non table for extraEnv (map[])
2019/06/20 18:49:49 warning: cannot overwrite table with non table for extraEnv (map[])
2019/06/20 18:49:49 warning: cannot overwrite table with non table for extraEnv (map[])

Possible Solution

Change jupyterhub.hub.extraEnv from an array to a map.

https://github.com/InfuseAI/primehub/blob/master/helm/primehub/values.yaml#L20-L37

Reference

https://zero-to-jupyterhub.readthedocs.io/en/latest/reference.html#hub-extraenv
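
As a rough illustration of the proposed change, the sketch below converts a list of name/value entries into the map form. It assumes the array entries look like `{"name": ..., "value": ...}`; the function name and sample values are hypothetical and are not taken from the chart's values.yaml.

```python
from typing import Dict, List


def env_list_to_map(entries: List[Dict[str, str]]) -> Dict[str, str]:
    """Convert [{"name": ..., "value": ...}, ...] into {name: value, ...}."""
    result: Dict[str, str] = {}
    for entry in entries:
        name = entry["name"]
        if name in result:
            # A map cannot hold the same key twice, so surface the conflict.
            raise ValueError(f"duplicate environment variable: {name}")
        result[name] = entry["value"]
    return result


# Illustrative values only; these are not the chart's real settings.
as_list = [
    {"name": "EXAMPLE_FLAG", "value": "true"},
    {"name": "EXAMPLE_URL", "value": "http://example.invalid"},
]
print(env_list_to_map(as_list))
```

With both the parent chart and the subchart using a map for this key, Helm has two tables to merge instead of a table and a list, which is the mismatch the warning describes.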

[Bug] Install CE script freezes

What happened:

I ran into problems with the PrimeHub CE install script.

What you expected to happen:

That the install script runs smoothly

How to reproduce it (as minimally and precisely as possible):

sudo ufw disable
sudo iptables --policy INPUT ACCEPT
sudo iptables --policy FORWARD ACCEPT

curl -O https://storage.googleapis.com/primehub-release/bin/primehub-install
chmod +x primehub-install
./primehub-install create singlenode
./primehub-install create primehub --primehub-version v3.5.2 --primehub-ce --helm-timeout 20m --enable-https

Anything else we need to know?:

Output:

[Search] Folder primehub-v3.5.2
[Not Found] Folder primehub-v3.5.2
[Search] tarball primehub-v3.5.2.tar.gz
[Not Found] tarball primehub-v3.5.2.tar.gz
[Search] primehub helm chart with version: v3.5.2
[Preflight Check]
[Preflight Check] Pass
[Prepare] PrimeHub require values
Please enter PRIMEHUB_DOMAIN: XXXX (I took the right domain here, this can't be the problem...)
Please enter KC_PASSWORD: XXXX
Please enter PH_PASSWORD: XXXX
[Init] primehub config
[Create] /home/mhoellmann/.primehub/config/microk8s/.env
[Verify] Domain Name: https://primehub.ni.dfki.de/healthz

[Check] Cert Manager
[Install] Cert Manager
"jetstack" already exists with the same configuration, skipping
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "ingress-nginx" chart repository
...Successfully got an update from the "primehub" chart repository
...Successfully got an update from the "jetstack" chart repository
...Successfully got an update from the "infuseai" chart repository
Update Complete. ⎈Happy Helming!⎈
NAME: cert-manager
LAST DEPLOYED: Tue May 25 09:35:24 2021
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager has been deployed successfully!

In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).

More information on the different types of issuers and how to configure them
can be found in our documentation:

https://cert-manager.io/docs/configuration/

For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the ingress-shim
documentation:

https://cert-manager.io/docs/usage/ingress/
Waiting for deployment "cert-manager-webhook" rollout to finish: 0 of 1 updated replicas are available...
deployment "cert-manager-webhook" successfully rolled out
No resources found in default namespace.
[Apply] Cluster Issuer: letsencrypt-prod
clusterissuer.cert-manager.io/letsencrypt-prod created
[Install] PrimeHub
[Check] primehub.yaml
[Generate] primehub.yaml for CE
[Install] PrimeHub
"infuseai" already exists with the same configuration, skipping
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "primehub" chart repository
...Successfully got an update from the "ingress-nginx" chart repository
...Successfully got an update from the "jetstack" chart repository
...Successfully got an update from the "infuseai" chart repository
Update Complete. ⎈Happy Helming!⎈
Release "primehub" does not exist. Installing it now.
coalesce.go:196: warning: cannot overwrite table with non table for extraEnv (map[])

Output of

Every 2.0s: kubectl get pod -n hub primehub: Tue May 25 10:01:45 2021
NAME READY STATUS RESTARTS AGE
csi-controller-rclone-0 3/3 Running 0 26m
csi-nodeplugin-rclone-qh862 2/2 Running 0 26m
hub-64cc75f4b5-vxcmn 0/1 CreateContainerConfigError 0 26m
keycloak-0 0/1 Init:0/2 0 26m
keycloak-postgres-0 0/1 Pending 0 26m
metacontroller-0 1/1 Running 0 26m
primehub-admission-655bdb9f75-8pcnm 1/1 Running 0 26m
primehub-bootstrap-j55kx 1/1 Running 0 26m
primehub-console-d94676478-px2t9 0/1 CreateContainerConfigError 0 26m
primehub-controller-b7d878c54-q6trx 2/2 Running 0 26m
primehub-fluentd-q66p6 1/1 Running 0 26m
primehub-graphql-657b7478ff-5djrl 0/1 CreateContainerConfigError 0 26m
primehub-metacontroller-webhook-7597f68459-lx9pc 1/1 Running 0 26m
primehub-minio-0 1/1 Running 0 26m
primehub-shared-space-tusd-5c9cd99d98-psx9h 1/1 Running 0 26m
primehub-watcher-5b4c84f5b9-fdg4g 0/1 CreateContainerConfigError 0 26m
proxy-6bb567848c-sttzb 1/1 Running 0 26m

kubectl logs -n hub $(kubectl get pod -n hub | grep primehub-bootstrap | cut -d' ' -f1) -f

repeats
[xxx] http://keycloak-http.hub/auth
[xxx] http://keycloak-http.hub/auth
[xxx] http://keycloak-http.hub/auth
[xxx] http://keycloak-http.hub/auth
[xxx] http://keycloak-http.hub/auth
[xxx] http://keycloak-http.hub/auth
[xxx] http://keycloak-http.hub/auth
[xxx] http://keycloak-http.hub/auth
...

firewall was disabled so this can't be the problem

Environment:

  • PrimeHub version (use helm ls):

gives only

NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION

  • Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.11", GitCommit:"ea5f00d93211b7c80247bf607cfa422ad6fb5347", GitTreeState:"clean", BuildDate:"2020-08-13T15:20:25Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.17", GitCommit:"f3abc15296f3a3f54e4ee42e830c61047b13895f", GitTreeState:"clean", BuildDate:"2021-01-13T13:13:00Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

  • Cloud provider or hardware configuration:

self hosted virtual machine

  • OS (e.g: cat /etc/os-release):

NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.2 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

  • Kernel (e.g. uname -a):

Linux primehub 5.4.0-73-generic #82-Ubuntu SMP Wed Apr 14 17:39:42 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Require installation guide for gke

We need a document describing how to install PrimeHub on GKE.

  1. Create a cluster
  2. Set up ingress
  3. Set up a public IP
  4. Install Helm
  5. Install PrimeHub

support for k8s v1.22 v1.23

What would you like to be added: support for k8s version v1.22 v1.23

Why is this needed: Error: failed to install CRD crds/metacontroller/crd.yaml: unable to recognize "": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
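
The error arises because the `apiextensions.k8s.io/v1beta1` API for CustomResourceDefinition was removed in Kubernetes 1.22. A minimal sketch for spotting such manifests before installing is shown below; the function name and the sample manifest are hypothetical, and the regex only inspects top-level `apiVersion:` lines rather than parsing YAML.

```python
import re
from typing import List

# apiextensions.k8s.io/v1beta1 was removed in Kubernetes 1.22.
REMOVED_API_VERSIONS = {"apiextensions.k8s.io/v1beta1"}

# Matches only unindented "apiVersion: <value>" lines.
API_VERSION_LINE = re.compile(r"^apiVersion:\s*(\S+)\s*$", re.MULTILINE)


def find_removed_api_versions(manifest_text: str) -> List[str]:
    """Return the removed apiVersion values found at the top level of a manifest."""
    found = API_VERSION_LINE.findall(manifest_text)
    return [version for version in found if version in REMOVED_API_VERSIONS]


# Hypothetical two-document manifest used only to exercise the function.
sample = """\
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: example.invalid
---
apiVersion: v1
kind: ConfigMap
"""
print(find_removed_api_versions(sample))
```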

stay in page "Your server is starting up. You will be redirected automatically when it's ready for you"

When I click Start My Server, the page stays on:
"Your server is starting up.
You will be redirected automatically when it's ready for you."

kubectl logs hub-865cf58d65-9hgtc -c hub -n primehub

[D 2019-06-10 05:52:00.550 JupyterHub reflector:202] Connecting pods watcher
[D 2019-06-10 05:52:01.548 JupyterHub reflector:263] events watcher timeout
[D 2019-06-10 05:52:01.548 JupyterHub reflector:202] Connecting events watcher
[D 2019-06-10 05:52:10.557 JupyterHub reflector:263] pods watcher timeout
[D 2019-06-10 05:52:10.557 JupyterHub reflector:202] Connecting pods watcher
[D 2019-06-10 05:52:11.558 JupyterHub reflector:263] events watcher timeout
[D 2019-06-10 05:52:11.558 JupyterHub reflector:202] Connecting events watcher
[D 2019-06-10 05:52:20.565 JupyterHub reflector:263] pods watcher timeout
[D 2019-06-10 05:52:20.565 JupyterHub reflector:202] Connecting pods watcher
[D 2019-06-10 05:52:21.568 JupyterHub reflector:263] events watcher timeout
[D 2019-06-10 05:52:21.568 JupyterHub reflector:202] Connecting events watcher
[D 2019-06-10 05:52:30.572 JupyterHub reflector:263] pods watcher timeout
[D 2019-06-10 05:52:30.572 JupyterHub reflector:202] Connecting pods watcher
[D 2019-06-10 05:52:31.578 JupyterHub reflector:263] events watcher timeout
[D 2019-06-10 05:52:31.578 JupyterHub reflector:202] Connecting events watcher
[D 2019-06-10 05:52:32.825 JupyterHub proxy:765] Proxy: Fetching GET http://10.233.50.202:8001/api/routes
[I 2019-06-10 05:52:32.827 JupyterHub proxy:319] Checking routes

Update Prometheus and Grafana helm chart version inside PrimeHub

What would you like to be added:

  • Update Prometheus and Grafana helm chart versions.
  • Use kube-prometheus-stack to replace the old prometheus-operator.

Why is this needed:

  • Currently we use prometheus-operator to install Prometheus and Grafana; however, that chart is outdated.
  • Because prometheus-operator has been renamed to kube-prometheus-stack, we need to update to the new chart.

Job Scheduling: notifications

When a job is finished (Succeeded or Failed), the user can receive a notification by email or webhook.

(optional: group admin can receive the notification)

rename chart directory

The directory name of the Helm chart should be the same as the name written in Chart.yaml.

$ helm package helm
Error: directory name (helm) and Chart.yaml name (primehub) must match
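
A check like the one below could catch this mismatch before packaging. It is only a sketch: the helper names are hypothetical, and the `name:` field is read with a simple regex instead of a YAML parser.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional

# Reads the top-level "name:" field; quotes around the value are tolerated.
NAME_LINE = re.compile(r"^name:\s*['\"]?([^'\"\s#]+)", re.MULTILINE)


def chart_name(chart_dir: Path) -> Optional[str]:
    """Return the chart name declared in Chart.yaml, or None if it is missing."""
    text = (chart_dir / "Chart.yaml").read_text(encoding="utf-8")
    match = NAME_LINE.search(text)
    return match.group(1) if match else None


def directory_matches_chart(chart_dir: Path) -> bool:
    """True when the directory name equals the name declared in Chart.yaml."""
    return chart_dir.name == chart_name(chart_dir)


# Recreate the situation from the error message in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for dirname in ("helm", "primehub"):
        chart_dir = Path(tmp) / dirname
        chart_dir.mkdir()
        (chart_dir / "Chart.yaml").write_text("name: primehub\nversion: 0.0.0\n")
        print(dirname, directory_matches_chart(chart_dir))
```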

How to Access the Portal Interface After I have installed all

hub hub-7986bdc7-ddgsg 1/1 Running 0 49m
hub keycloak-0 1/1 Running 1 49m
hub keycloak-postgres-0 1/1 Running 0 49m
hub metacontroller-0 1/1 Running 0 49m
hub primehub-admission-77ffb567ff-kzpgr 1/1 Running 0 49m
hub primehub-bootstrap-wmt4g 0/1 Completed 0 49m
hub primehub-console-755b6c4d49-whbc4 1/1 Running 0 49m
hub primehub-controller-747dddc4d6-72tb9 2/2 Running 0 49m
hub primehub-graphql-79dbcb47ff-8ck25 1/1 Running 3 49m
hub primehub-metacontroller-webhook-7ffb6dffb-lgs6r 1/1 Running 0 49m
hub primehub-watcher-7d789bd6b6-45twj 1/1 Running 0 49m
hub proxy-6677bcf9cf-5j6nv 1/1 Running 0 49m
ingress-nginx ingress-nginx-admission-create-q8tdj 0/1 Completed 0 39h
ingress-nginx ingress-nginx-admission-patch-nj6tf 0/1 Completed 2 39h
ingress-nginx ingress-nginx-controller-dl5gr 1/1 Running 1 39h
jupyter continuous-image-puller-m4vbk 1/1 Running 1 23h
jupyter hub-74dfcd4b48-6b5p8 1/1 Running 1 23h
jupyter proxy-74487c9bd-r98qq 1/1 Running 1 23h
jupyter user-scheduler-7b4594f97c-dj8ts 1/1 Running 2 23h
jupyter user-scheduler-7b4594f97c-mnxsh 1/1 Running 1 23h
kube-system coredns-57d4cbf879-92gdw 1/1 Running 1 47h
kube-system coredns-57d4cbf879-jz5wh 1/1 Running 1 26h
kube-system etcd-k8s-master 1/1 Running 3 47h
kube-system kube-apiserver-k8s-master 1/1 Running 5 47h
kube-system kube-controller-manager-k8s-master 1/1 Running 2 25h
kube-system kube-proxy-b7zgh 1/1 Running 0 47h
kube-system kube-scheduler-k8s-master 1/1 Running 2 25h
kuboard kuboard-agent-2-84fb67bf67-6pb58 1/1 Running 4 43h
kuboard kuboard-agent-7fb4dffc4f-rhnl6 1/1 Running 1 43h
kuboard kuboard-agent-mi7gb5-2-78b8b4846-l6zw6 1/1 Running 3 43h
kuboard kuboard-agent-mi7gb5-5b58c6d688-2xq9k 1/1 Running 4 43h
kuboard kuboard-etcd-vkbcq 1/1 Running 0 43h
kuboard kuboard-questdb-54c5bb79b8-mjtdp 1/1 Running 1 43h
kuboard kuboard-v3-5fc46b5557-7vrxv 1/1 Running 1 43h

setup CI to publish charts

todos

  • create a repo to store Helm chart tarballs (helm-chart)
  • configure CI to publish charts

circle-ci

  • add ssh key to project
  • verify watched branches
  • update config.yml
  • update key path in script (or replace it with an environment variable)
  • review git author

branch to watch

branches:
  only:
    - master

config.yml

steps:
  - add_ssh_keys:
      fingerprints:
        - "0b:b8:8d:e6:cc:04:d9:6b:a9:57:aa:9c:5c:9d:07:fc"

key path

primehub/ci/publish.sh

Lines 9 to 12 in 0c0cc68

# it is registered into both circlie-ci and github
# by the way, circlie-ci only accept PEM private key format
# it should be generated by "ssh-keygen -m pem"
CI_PUBLISH_KEY=~/.ssh/id_rsa_0bb88de6cc04d96ba957aa9c5c9d07fc

review commit author

primehub/ci/common.sh

Lines 60 to 62 in 0c0cc68

# configure publish author
git config --global user.email "[email protected]"
git config --global user.name "circle-ci"

https://github.com/InfuseAI/helm-chart

need to be confirmed

  • Which branch should CI watch?
    • every change in master
    • or a release branch

Better way to inject jupyter_primehub.py

Currently we need users to run helm with --set-file to inject jupyter_primehub.py.

See if subPath works for injecting jupyter_primehub.py.

Another option is to package the authenticator class

secret "primehub-client-admin-ui" or "primehub-client-jupyterhub" not found

What happened:
I followed the steps given on the official website, and set the following variables:
PRIMEHUB_DOMAIN=test.corp.com.cn
PRIMEHUB_PASSWORD=password
KEYCLOAK_DOMAIN=test.corp.com.cn
KEYCLOAK_PASSWORD=password
STORAGE_CLASS=k8s-rbd
GRAPHQL_SECRET_KEY=$(openssl rand -hex 32)s
HUB_AUTH_STATE_CRYPTO_KEY=$(openssl rand -hex 32)
HUB_PROXY_SECRET_TOKEN=$(openssl rand -hex 32)

and then install it using helm:
helm upgrade \
  --install \
  --reset-values \
  --namespace hub \
  --values primehub-values.yaml \
  --timeout 3000 \
  primehub infuseai/primehub

however, it reported errors after installation

kubectl get all -n hub

NAME READY STATUS RESTARTS AGE
pod/admission-post-install-job-d77m5 0/1 Completed 0 4m25s
pod/hub-6d56565db7-rq557 0/1 CreateContainerConfigError 0 4m28s
pod/pod-image-replacing-webhook-b64ff775b-jdcs4 1/1 Running 0 4m28s
pod/primehub-bootstrap-4h8tz 1/1 Running 0 4m19s
pod/primehub-console-7865c5cb89-r4tpd 0/1 CreateContainerConfigError 0 4m27s
pod/primehub-graphql-796596c56f-tj5sl 0/1 CreateContainerConfigError 0 4m28s
pod/primehub-group-7c4669f57f-8tqww 1/1 Running 0 4m28s
pod/primehub-watcher-fbd4cbc96-wwdxc 0/1 CreateContainerConfigError 0 4m28s
pod/proxy-6dfd9b946-fgvt2 1/1 Running 0 4m28s
pod/resources-validation-webhook-67755979f7-jrz22 1/1 Running 0 4m27s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/hub ClusterIP 10.43.196.87 8081/TCP 4m28s
service/pod-image-replacing-webhook-svc ClusterIP 10.43.235.106 443/TCP 4m28s
service/primehub-console ClusterIP 10.43.11.61 80/TCP 4m28s
service/primehub-graphql ClusterIP 10.43.129.79 80/TCP 4m28s
service/primehub-group ClusterIP 10.43.45.241 80/TCP 4m28s
service/proxy-api ClusterIP 10.43.191.100 8001/TCP 4m28s
service/proxy-public ClusterIP 10.43.195.14 80/TCP 4m28s
service/resources-validation-webhook-svc ClusterIP 10.43.121.126 443/TCP 4m28s

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/hub 0/1 1 0 4m28s
deployment.apps/pod-image-replacing-webhook 1/1 1 1 4m28s
deployment.apps/primehub-console 0/1 1 0 4m28s
deployment.apps/primehub-graphql 0/1 1 0 4m28s
deployment.apps/primehub-group 1/1 1 1 4m28s
deployment.apps/primehub-watcher 0/1 1 0 4m28s
deployment.apps/proxy 1/1 1 1 4m28s
deployment.apps/resources-validation-webhook 1/1 1 1 4m28s

NAME DESIRED CURRENT READY AGE
replicaset.apps/hub-6d56565db7 1 1 0 4m28s
replicaset.apps/pod-image-replacing-webhook-b64ff775b 1 1 1 4m28s
replicaset.apps/primehub-console-7865c5cb89 1 1 0 4m27s
replicaset.apps/primehub-graphql-796596c56f 1 1 0 4m28s
replicaset.apps/primehub-group-7c4669f57f 1 1 1 4m28s
replicaset.apps/primehub-watcher-fbd4cbc96 1 1 0 4m28s
replicaset.apps/proxy-6dfd9b946 1 1 1 4m28s
replicaset.apps/resources-validation-webhook-67755979f7 1 1 1 4m27s

NAME READY AGE
statefulset.apps/user-placeholder 0/0 4m28s

NAME COMPLETIONS DURATION AGE
job.batch/admission-post-install-job 1/1 4s 4m25s
job.batch/primehub-bootstrap 0/1 4m19s 4m19s

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Running kubectl describe pod primehub-watcher-fbd4cbc96-wwdxc -n hub returned:
Events:
Type Reason Age From Message


Normal Scheduled 7m default-scheduler Successfully assigned hub/primehub-watcher-fbd4cbc96-wwdxc to shoya15plpegpu087
Normal Pulling 7m kubelet, shoya15plpegpu087 Pulling image "infuseai/primehub-console-watcher:bc0adf0"
Normal Pulled 7m kubelet, shoya15plpegpu087 Successfully pulled image "infuseai/primehub-console-watcher:bc0adf0"
Warning Failed 5m (x12 over 7m) kubelet, shoya15plpegpu087 Error: secret "primehub-client-admin-ui" not found
Normal Pulled 2m (x24 over 7m) kubelet, shoya15plpegpu087 Container image "infuseai/primehub-console-watcher:bc0adf0" already present on machine

or returned like this:
Error: secret "primehub-client-jupyterhub" not found

Environment:

  • PrimeHub version (use helm ls): 2.7
  • Kubernetes version (use kubectl version): 1.15

I want to know if there is something wrong with my installation steps?

[Bug] "nginx-ingress-ingress-nginx-controller" exceeded its progress deadline

What happened:
I can't install PrimeHub CE (single node).
I encountered this problem:

error: deployment "nginx-ingress-ingress-nginx-controller" exceeded its progress deadline

What you expected to happen:
see something like [Require Action] Please relogin this session and run create singlenode again

How to reproduce it (as minimally and precisely as possible):

curl -O https://storage.googleapis.com/primehub-release/bin/primehub-install
chmod +x primehub-install
./primehub-install create singlenode

Anything else we need to know?:

Environment: ubuntu20.04 (in vmware)

  • PrimeHub version (use helm ls):
  • Kubernetes version (use kubectl version):
{
  "clientVersion": {
    "major": "1",
    "minor": "21",
    "gitVersion": "v1.21.2",
    "gitCommit": "092fbfbf53427de67cac1e9fa54aaa09a28371d7",
    "gitTreeState": "clean",
    "buildDate": "2021-06-16T12:59:11Z",
    "goVersion": "go1.16.5",
    "compiler": "gc",
    "platform": "linux/amd64"
  },
  "serverVersion": {
    "major": "1",
    "minor": "20+",
    "gitVersion": "v1.20.7-34+984a1cd176537e",
    "gitCommit": "984a1cd176537e2d36d7db36e49e787e1c42a0aa",
    "gitTreeState": "clean",
    "buildDate": "2021-06-11T05:08:05Z",
    "goVersion": "go1.15.13",
    "compiler": "gc",
    "platform": "linux/amd64"
  }
}
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.2 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
  • Kernel (e.g. uname -a):
    Linux ub 5.4.0-74-generic #83-Ubuntu SMP Sat May 8 02:35:39 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Design the notification mechanism

This is a product design ticket.

Types of notifications:

  • (user level) Job is finished (succeeded or failed) #690
  • (group level) Group member is changed (join or leave)
  • (admin) system alert
  • (admin) resource threshold

Types of integrations:

  • Email
  • Webhook

[PH-27] PrimeHub Job email notifications

  • Objectives
    • keep users informed about the job's final state via email notifications
  • Results
    • admin can configure email notifications for group
    • user can receive email notification
  • Flow:
    • Create Job -> Click if the user wants to get the email notifications -> Fill in the receivers' email -> Job start -> Job end -> notify
  • Notification content:

Kubernetes 1.22+ support

Kubernetes 1.19 ~ 1.21 are supported in PrimeHub v3.x. Since Kubernetes 1.21 is EOL, in PrimeHub v4 we will support Kubernetes 1.21 ~ 1.24+

[Bug] Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: unknown object type "nil" in Secret.data.kcpassword

What happened:
I followed the installation guide, and it seems that there is no manifest in the Helm repository.

Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: unknown object type "nil" in Secret.data.kcpassword

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Prepare the primehub-values.yaml

$ helm repo add infuseai https://charts.infuseai.io
$ helm repo update
$ helm upgrade \
  primehub infuseai/primehub \
  --install \
  --create-namespace \
  --namespace hub  \
  --values primehub-values.yaml

coalesce.go:196: warning: cannot overwrite table with non table for extraEnv (map[])
Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: unknown object type "nil" in Secret.data.kcpassword

Anything else we need to know?:

Environment:

  • PrimeHub version (use helm ls):
$ helm ls
NAME	NAMESPACE	REVISION	UPDATED	STATUS	CHART	APP VERSION
  • Helm version
$ helm version
version.BuildInfo{Version:"v3.2.3+4.el8", GitCommit:"2160a65177049990d1b76efc67cb1a9fd21909b1", GitTreeState:"clean", GoVersion:"go1.13.4"}
  • Kubernetes version (use kubectl version):
$ oc version
Client Version: openshift-clients-4.4.0-202005231254
Server Version: 4.3.13
Kubernetes Version: v1.16.2
  • Cloud provider or hardware configuration:
OpenShift 4.3 on AWS 
  • OS (e.g: cat /etc/os-release):
sh-4.4# cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
VERSION="43.81.202004130853.0"
VERSION_ID="4.3"
OPENSHIFT_VERSION="4.3"
RHEL_VERSION=8.0
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 43.81.202004130853.0 (Ootpa)"
ID="rhcos"
ID_LIKE="rhel fedora"
ANSI_COLOR="0;31"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.3"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.3"
OSTREE_VERSION='43.81.202004130853.0'
  • Kernel (e.g. uname -a):
sh-4.4# uname -a
Linux ip-10-0-141-73 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Wed Feb 26 03:08:15 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

not connect service/proxy-public

I installed PrimeHub on an on-premise k8s cluster.

When I run
"helm upgrade --install --namespace primehub primehub primehub --values config-kind.yaml --values config-persist.yaml --set-file jupyterhub.hub.extraConfig.primehub=./primehub/jupyterhub_primehub.py"
these warnings appear:
cannot overwrite table with non table for extraEnv (map[])
cannot overwrite table with non table for extraEnv (map[])
cannot overwrite table with non table for extraEnv (map[])
...

kubectl get svc -n primehub
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hub ClusterIP 10.233.23.104 8081/TCP 41m
primehub-keycloak-headless ClusterIP None 80/TCP 41m
primehub-keycloak-http ClusterIP 10.233.7.25 80/TCP 41m
primehub-postgresql ClusterIP 10.233.27.174 5432/TCP 41m
proxy-api ClusterIP 10.233.48.32 8001/TCP 41m
proxy-public ClusterIP 10.233.26.185 80/TCP 41m

Keycloak responds:
curl 10.233.7.25
....
The hub does not respond:
curl 10.233.26.185

please help me!!

use post-install hook to run bootstrap jobs

The bootstrap jobs include:

  • create a realm
  • create a client for JupyterHub and store the secret
  • create a default user with profiles

They should be idempotent.

implementations

  • post-install hook to bootstrap
  • idempotent (if the {realm, client, user} has already been created, skip setting it up)

known issues

  • user should remove unmanaged primehub-secret secret
  • helm upgrade might break keycloak in-memory storage deployment (no admin created)

[Bug]failed processing release primehub: failed to render values files "primehub.yaml.gotmpl": failed to render [primehub.yaml.gotmpl], because of template: stringTemplate:12:30: executing "stringTemplate" at <requiredEnv "PRIMEHUB_STORAGE_CLASS">: error calling requiredEnv: required env var `PRIMEHUB_STORAGE_CLASS` is not set

[root@primehub install]# ./primehub-install create primehub --primehub-version v3.6.2 --primehub-ce
[Preflight Check]
[Skip] Domain Name Check:
No resources found in default namespace.
[Preflight Check] Pass
[Verify] Mininal k8s resources
Total Resources: ( k8s node x 1 )
CPU: 4000m, Memory: 15Gi, GPU: 0
[Skip] Domain Name Check: /healthz
[Install] PrimeHub
[Check] primehub.yaml
[Found] /root/.primehub/config/kubernetes-admin@kubernetes/helm_override/primehub.yaml
[Install] PrimeHub
Adding repo infuseai https://charts.infuseai.io
"infuseai" has been added to your repositories

Affected releases are:
primehub (infuseai/primehub) UPDATED

in /root/temp/primehub-master/install/helmfiles/primehub/helmfile.yaml: failed processing release primehub: failed to render values files "primehub.yaml.gotmpl": failed to render [primehub.yaml.gotmpl], because of template: stringTemplate:12:30: executing "stringTemplate" at <requiredEnv "PRIMEHUB_STORAGE_CLASS">: error calling requiredEnv: required env var PRIMEHUB_STORAGE_CLASS is not set
[Error] Generate PrimeHub error log at primehub.log.

I found where PRIMEHUB_STORAGE_CLASS is set, but I don't know what value it expects:

export PRIMEHUB_STORAGE_CLASS="$(kubectl get sc | grep 'default' | cut -d ' ' -f 1)"
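
The pipeline above takes the first column of every `kubectl get sc` line containing the text "default". A stricter selection can be sketched as follows, relying on kubectl marking the default class with a `(default)` suffix after its name. The function name and the sample output are illustrative assumptions. One possible cause of the error, not confirmed in this report, is that no storage class is marked as default, in which case the command yields an empty value.

```python
from typing import List


def default_storage_classes(kubectl_output: str) -> List[str]:
    """Return names of storage classes flagged "(default)" in `kubectl get sc` output."""
    names = []
    for line in kubectl_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # kubectl prints the default class as "<name> (default)".
        if len(fields) >= 2 and fields[1] == "(default)":
            names.append(fields[0])
    return names


# Illustrative output, not taken from the reporter's cluster.
sample_output = """\
NAME                 PROVISIONER            RECLAIMPOLICY   VOLUMEBINDINGMODE   AGE
default-slow         example.invalid/slow   Delete          Immediate           5d
standard (default)   example.invalid/fast   Delete          Immediate           5d
"""
print(default_storage_classes(sample_output))
```

In this sample a plain text match on "default" would also pick up `default-slow`, while an empty result makes the missing default class explicit instead of silently exporting an empty variable.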

[Bug] INSTALLATION FAILED: chart requires kubeVersion: >=1.19.0-0 which is incompatible with Kubernetes v1.16.15

What happened:

I followed this instruction to install K8s v1.16.15 on my workstation.

But during the installation of ingress-nginx/ingress-nginx, Helm reports that the chart requires kubeVersion >= 1.19:

Error: INSTALLATION FAILED: chart requires kubeVersion: >=1.19.0-0 which is incompatible with Kubernetes v1.16.15

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

$ helm install nginx-ingress ingress-nginx/ingress-nginx --create-namespace --namespace ingress-nginx --set controller.hostNetwork=true --set rbac.create=true

Error: INSTALLATION FAILED: chart requires kubeVersion: >=1.19.0-0 which is incompatible with Kubernetes v1.16.15

Anything else we need to know?:

Environment:

  • PrimeHub version (use helm ls):
  • Kubernetes version (use kubectl version): v1.16.15
  • Cloud provider or hardware configuration: on-premise
  • OS (e.g: cat /etc/os-release): Ubuntu Server 20.04.3
  • Kernel (e.g. uname -a): 5.4.0-89-generic
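
The failure comes from the chart's kubeVersion constraint being checked against the cluster version. The sketch below is a simplified comparison, not Helm's actual semver implementation: it ignores pre-release and build suffixes, and the function names are hypothetical.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse "v1.16.15" or "1.19.0-0" into (major, minor, patch), dropping suffixes."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"not a version: {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def satisfies_minimum(cluster_version: str, minimum: str) -> bool:
    """True when the cluster version is at least the required minimum."""
    return parse_version(cluster_version) >= parse_version(minimum)


print(satisfies_minimum("v1.16.15", "1.19.0"))                   # False
print(satisfies_minimum("v1.20.7-34+984a1cd176537e", "1.19.0"))  # True
```

Under this simplified check, the v1.16.15 cluster from the report is below the chart's minimum, so either the cluster has to be upgraded or an older chart version that still supports it has to be selected.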

Change the configuration value `primehub.domain` to `primehub.keycloak.url`

The reasons are:

  1. The domain is only used in the Keycloak URL setting.
  2. The domain may be different for the hub and Keycloak, so it is confusing which domain should be set.
  3. For the external Keycloak use case, we should set the full URL for Keycloak.
  4. With the current design, we cannot support both http and https, because the scheme of the Keycloak URL is hard-coded.
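
To illustrate why a full URL addresses these points, the sketch below splits a URL into the parts a chart would otherwise have to guess from a bare domain. The function name and the `example.invalid` host are assumptions made for illustration; only the standard library's `urlsplit` is used.

```python
from urllib.parse import urlsplit


def keycloak_endpoint(url: str) -> dict:
    """Split a full Keycloak URL into scheme, host, and path."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        # A bare domain carries no scheme, which is the limitation listed above.
        raise ValueError(f"expected a full http(s) URL, got {url!r}")
    return {"scheme": parts.scheme, "host": parts.hostname, "path": parts.path or "/"}


# An external Keycloak over https and an in-cluster one over http.
print(keycloak_endpoint("https://id.example.invalid/auth"))
print(keycloak_endpoint("http://keycloak-http.hub/auth"))
```

Because the scheme and host travel with the value, the hub and Keycloak can live on different domains and either scheme can be used without a hard-coded default.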

If helm delete and reinstall, it will use old secret.

Reproduce steps

  1. helm upgrade --install
  2. helm delete
  3. helm upgrade --install
  4. go to hub page

Current

  1. Show 500
  2. The client secret is wrong

Expected

No error

Note

The primehub-secret should be recreated after installation.
