
dex-auth-operator's Introduction

Dex Auth Operator - a component of the Charmed Kubeflow distribution from Canonical

This repository hosts the Kubernetes Python Operator for Dex Auth (see CharmHub).

Upstream documentation can be found at https://github.com/dexidp/dex

Usage

The Dex Auth Operator may be deployed using the Juju command line as follows:

juju deploy dex-auth

Looking for a fully supported platform for MLOps?

Canonical Charmed Kubeflow is a state of the art, fully supported MLOps platform that helps data scientists collaborate on AI innovation on any cloud from concept to production, offered by Canonical - the publishers of Ubuntu.

Kubeflow diagram

Charmed Kubeflow is free to use: the solution can be deployed in any environment without constraints, paywall or restricted features. Data labs and MLOps teams only need to train their data scientists and engineers once to work consistently and efficiently on any cloud – or on-premise.

Charmed Kubeflow offers a centralised, browser-based MLOps platform that runs on any conformant Kubernetes – offering enhanced productivity, improved governance and reduced risk from shadow IT.

Learn more about deploying and using Charmed Kubeflow at https://charmed-kubeflow.io.

Key features

  • Centralised, browser-based data science workspaces: familiar experience
  • Multi user: one environment for your whole data science team
  • NVIDIA GPU support: accelerate deep learning model training
  • Apache Spark integration: empower big data driven model training
  • Ideation to production: automate model training & deployment
  • AutoML: hyperparameter tuning, architecture search
  • Composable: edge deployment configurations available

What’s included in Charmed Kubeflow 1.4

  • LDAP Authentication
  • Jupyter Notebooks
  • Work with Python and R
  • Support for TensorFlow, PyTorch, MXNet, XGBoost
  • TFServing, Seldon-Core
  • Katib (AutoML)
  • Apache Spark
  • Argo Workflows
  • Kubeflow Pipelines

Why engineers and data scientists choose Charmed Kubeflow

  • Maintenance: Charmed Kubeflow offers up to two years of maintenance on select releases
  • Optional 24/7 support available, contact us here for more information
  • Optional dedicated fully managed service available, contact us here for more information or learn more about Canonical’s Managed Apps service.
  • Portability: Charmed Kubeflow can be deployed on any conformant Kubernetes, on any cloud or on-premise

Documentation

Please see the official docs site for complete documentation of the Charmed Kubeflow distribution.

Bugs and feature requests

If you find a bug in our operator or want to request a specific feature, please file a bug here: https://github.com/canonical/dex-auth-operator/issues

License

Charmed Kubeflow is free software, distributed under the Apache Software License, version 2.0.

Contributing

Canonical welcomes contributions to Charmed Kubeflow. Please check out our contributor agreement if you're interested in contributing to the distribution.

Security

Security issues in Charmed Kubeflow can be reported through LaunchPad. Please do not file GitHub issues about security issues.


dex-auth-operator's Issues

`config_hash` used to cause deployments to upgrade when config is changed is orphaned

This line appears to do nothing. I cannot find a reference to config_hash in any of the applied manifests. In the past, this value was passed to a container as envConfig, which I think triggered a rollout that used the updated config? I am guessing it works similarly to this?

If we want a more general solution, we could also look at using Reloader, which can automatically trigger upgrades whenever it spots configmaps or secrets changing on monitored workloads.
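
For reference, the rollout pattern this value presumably supported can be sketched as follows (a minimal sketch; the env var name and config shape are assumptions, not the charm's actual code):

import hashlib
import json

def config_hash(config: dict) -> str:
    """Return a stable hash of the charm config."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

# Injecting the hash into the pod spec as an environment variable changes
# the pod template whenever the config changes, which makes Kubernetes
# roll out new pods:
env_config = {"CONFIG_HASH": config_hash({"connectors": "..."})}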

Integration tests fail in CI

Integration tests for the prometheus and grafana integrations occasionally fail in CI.

The same PR's tests ran twice:

The first run failed with response_metric = response["data"]["result"][0]["metric"] raising IndexError: list index out of range, which implies that dex-auth was unavailable even though juju status showed the app as active.

The following is the content of response_metric captured while running the test on an ec2 instance:

{'__name__': 'up', 'instance': 'test-kfp-update_b6d8e8b0-62f4-429b-8a4b-2dc58204fe88_dex-auth_dex-auth/0', 'job': 'juju_test-kfp-update_b6d8e8b_dex-auth_dex-auth_prometheus_scrape', 'juju_application': 'dex-auth', 'juju_charm': 'dex-auth', 'juju_model': 'test-kfp-update', 'juju_model_uuid': 'b6d8e8b0-62f4-429b-8a4b-2dc58204fe88', 'juju_unit': 'dex-auth/0'}

This test never failed locally or on an EC2 instance. A similar issue was observed for the kfp-api integration.

Sidecar version of charm (current version in `main`) does not update config when `juju config` invoked

The charm currently does not update the actual dex-auth workload whenever a config change occurs at the charm level. The charm correctly handles the config change, updating the underlying configmaps, etc., but the workload that mounts the configmap is not restarted. This is similar to canonical/minio-operator#47. This can be demonstrated by:

juju config dex-auth connectors="this config should not work"
# The charm will fire config-changed and the configmap will update, but the workload itself will not break as expected
kubectl delete pod DEX-AUTH-WORKLOAD-POD-NAME
# The new pod created to replace the deleted one will go into a crash loop; `kubectl logs NEW-POD` shows the connector config is the cause
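
A minimal sketch of one possible fix, following the usual ops sidecar pattern of pushing the rendered config and restarting the workload (the container/service name "dex" and the _render_config() helper are assumptions):

from ops.charm import CharmBase

class DexAuthCharm(CharmBase):
    def _on_config_changed(self, event):
        container = self.unit.get_container("dex")
        if not container.can_connect():
            event.defer()
            return
        # Push the freshly rendered config, then force the workload to
        # reload it rather than leaving the old process running.
        container.push("/etc/dex/config.docker.yaml", self._render_config())
        container.restart("dex")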

Uncaught test issue

When running the integration tests on microk8s with RBAC enabled, I see this:

INFO     test_charm:test_charm.py:83 Prometheus available at http://10.1.18.114:9090
INFO     test_charm:test_charm.py:86 Testing prometheus deployment (attempt 1)
INFO     test_charm:test_charm.py:97 Response status is success
PASSED
------------------------------------------------------------------------------------------- live log teardown -------------------------------------------------------------------------------------------
INFO     pytest_operator.plugin:plugin.py:768 Model status:

Model            Controller  Cloud/Region        Version  SLA          Timestamp
test-charm-kfom  controller  microk8s/localhost  2.9.29   unsupported  15:01:28Z

App                           Version                Status  Scale  Charm                         Channel     Rev  Address         Exposed  Message
dex-auth                                             active      1  dex-auth                                    0  10.152.183.138  no
grafana-k8s                                          active      1  grafana-k8s                   beta         18  10.152.183.221  no
istio-pilot                   res:oci-image@87fc646  active      1  istio-pilot                   1.5/stable   61  10.152.183.49   no
oidc-gatekeeper               res:oci-image@4e7f8dd  active      1  oidc-gatekeeper               stable       57  10.152.183.232  no
prometheus-k8s                                       active      1  prometheus-k8s                beta         20  10.152.183.35   no
prometheus-scrape-config-k8s                         active      1  prometheus-scrape-config-k8s  beta         18  10.152.183.99   no

Unit                             Workload  Agent  Address      Ports                                   Message
dex-auth/0*                      active    idle   10.1.18.108
grafana-k8s/0*                   active    idle   10.1.18.115
istio-pilot/0*                   active    idle   10.1.18.113  8080/TCP,15010/TCP,15012/TCP,15017/TCP
oidc-gatekeeper/0*               active    idle   10.1.18.112  8080/TCP
prometheus-k8s/0*                active    idle   10.1.18.114
prometheus-scrape-config-k8s/0*  active    idle   10.1.18.116


INFO     pytest_operator.plugin:plugin.py:774 Juju error logs:

controller-0: 15:00:22 ERROR juju.worker.caasapplicationprovisioner.runner exited "prometheus-k8s": Operation cannot be fulfilled on pods "prometheus-k8s-0": the object has been modified; please apply
your changes to the latest version and try again
unit-grafana-k8s-0: 15:00:52 ERROR unit.grafana-k8s/0.juju-log Unable to patch the Kubernetes service: Failed to patch k8s service: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '1d3e6a52-e09f-47
ef-9261-ebd6d4994646', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'e19f6a1d-9ec5-4ee9-9de1-bb38c278b8cf', 'Date': 'Tue, 31 May 2022 15:00:52 GMT', 'Content-Length': '353'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"services \"grafana-k8s\" is forbidden: User \"system:serviceaccount:test-charm-kfom:grafana-k8s\" cann
ot delete resource \"services\" in API group \"\" in the namespace \"test-charm-kfom\"","reason":"Forbidden","details":{"name":"grafana-k8s","kind":"services"},"code":403}

It seems like this should cause a test failure, and the charm should be fixed to work with RBAC enabled.

$ snap list
Name        Version        Rev    Tracking       Publisher   Notes
charmcraft  1.6.0          904    latest/stable  canonical✓  classic
core18      20220428       2409   latest/stable  canonical✓  base
core20      20220512       1494   latest/stable  canonical✓  base
juju        2.9.31         19414  latest/stable  canonical✓  classic
kubectl     1.24.0         2433   latest/stable  canonical✓  classic
lxd         4.0.9-8e2046b  22753  4.0/stable/…   canonical✓  -
microk8s    v1.21.12       3202   1.21/stable    canonical✓  classic
multipass   1.9.2          7174   latest/stable  canonical✓  -
snapd       2.55.5         15904  latest/stable  canonical✓  snapd

Branch is main, commit 5ab49ca

Make charm's images configurable in track/<last-version> branch

Description

The goal of this task is to make all images configurable so that when this charm is deployed in an airgapped environment, all image resources are pulled from an arbitrary local container image registry (avoiding pulling images from the internet).
This serves as a tracking issue for the required changes and backports to the latest stable track/* Github branch.

TL;DR

Mark the following as done

  • Required changes (in metadata.yaml, config.yaml, src/charm.py)
  • Test on airgap environment
  • Publish to /stable

Required changes

WARNING: No breaking changes should be backported into the track/<version> branch. A breaking change is anything that requires extra steps to refresh from the previous /stable beyond a plain juju refresh. Please avoid these situations at all costs.

The following files have to be modified and/or verified to enable image configuration:

  • metadata.yaml - the container image(s) of the workload containers have to be specified in this file. This only applies to sidecar charms. Example:
containers:
  training-operator:
    resource: training-operator-image
resources:
  training-operator-image:
    type: oci-image
    description: OCI image for training-operator
    upstream-source: kubeflow/training-operator:v1-855e096
  • config.yaml - needed when the charm deploys container images that are used by resource(s) the operator creates. Example:
apiVersion: v1
kind: ConfigMap
metadata:
  name: seldon-config
  namespace: {{ namespace }}
data:
  predictor_servers: |-
    {
        "TENSORFLOW_SERVER": {
          "protocols" : {
            "tensorflow": {
              "image": "tensorflow/serving", <--- this image should be configurable
              "defaultImageVersion": "2.1.0"
              },
            "seldon": {
              "image": "seldonio/tfserving-proxy",
              "defaultImageVersion": "1.15.0"
              }
            }
        },
...
  • tools/get-images.sh - a bash script that returns a list of all the images used by this charm. In a multi-charm repo, this is located at the root of the repo and gathers images from all charms in it.

  • src/charm.py - verify that nothing inside the charm code is calling a subprocess that requires internet connection.

Testing

  1. Spin up an airgap environment following canonical/bundle-kubeflow#682 and canonical/bundle-kubeflow#703 (comment)

  2. Build the charm making sure that all the changes for airgap are in place.

  3. Deploy the charms manually and observe the charm go to active and idle.

  4. Additionally, run integration tests or simulate them, for instance by creating a workload (like a PyTorchJob, a SeldonDeployment, etc.).

Publishing

After completing the changes and testing, this charm has to be published to its stable risk in Charmhub. For that you must wait for the charm to be published to /edge, which is the revision to be promoted to /stable. Use the workflow dispatch for this (Actions>Release charm to other tracks...>Run workflow).

Suggested changes/backports

Integrate Dex ROCK in CKF 1.7

What needs to get done

Once #172 is done, we'll need to use the ROCK from our charm.

Why it needs to get done

To ensure we use our ROCKs in the charm

dex-auth service won't initialise if no connectors are specified

Disabling the static login (through juju config dex-auth enable-password-db=false) will prevent the dex-auth service from starting in the absence of a properly configured connector. The workload container will log this message:
2023-08-07T08:39:31.967Z [dex] failed to initialize server: server: no connectors specified

And the juju debug-log will show:

  File "./src/charm.py", line 172, in _update_layer
    self._container.restart(self._container_name)
  File "/var/lib/juju/agents/unit-dex-auth-0/charm/venv/ops/model.py", line 1466, in restart
    self._pebble.restart_services(service_names)
  File "/var/lib/juju/agents/unit-dex-auth-0/charm/venv/ops/pebble.py", line 1699, in restart_services
    return self._services_action('restart', services, timeout, delay)
  File "/var/lib/juju/agents/unit-dex-auth-0/charm/venv/ops/pebble.py", line 1721, in _services_action
    raise ChangeError(change.err, change)
ops.pebble.ChangeError: cannot perform the following tasks:
- Start service "dex" (cannot start service: exited quickly with code 2)
----- Logs from task 0 -----
2023-08-07T08:47:10Z INFO Service "dex" has never been started.

Finally, changing this configuration will set the unit to error status without a proper recovery method:
dex-auth/0* error idle 10.1.235.140 hook failed: "config-changed"

All the above goes away when changing dex-auth's connectors config.

Proposed fix

Handle the case where the static login is disabled and no connector has been configured, providing enough logs and messaging to users.
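
A minimal sketch of what that guard could look like (config key names follow the juju config shown above; this is a proposal, not current charm code):

from ops.model import BlockedStatus

def _validate_auth_config(self):
    """Block instead of crashing when dex has no way to authenticate."""
    if not self.config["enable-password-db"] and not self.config["connectors"]:
        self.unit.status = BlockedStatus(
            "no connectors configured and enable-password-db is false; "
            "set `connectors` or re-enable the static login"
        )
        return False
    return True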

Update `dex-auth-operator` manifests

Context

Each charm has a set of manifest files that have to be upgraded to their target version. The process of upgrading manifest files usually means going to the component’s upstream repository, comparing the charm’s manifest against the one in the repository and adding the missing bits in the charm’s manifest.

What needs to get done

https://docs.google.com/document/d/1a4obWw98U_Ndx-ZKRoojLf4Cym8tFb_2S7dq5dtRQqs/edit?pli=1#heading=h.jt5e3qx0jypg

Definition of Done

  • Manifests are updated
  • Upstream image is used

dex-auth-operator fails tests track/2.31

Description

Discovered while running pull-request workflows when an unrelated image-retrieval script was checked in.
Integration tests do not start after many attempts; they fail on setup:

 /usr/bin/sg microk8s -c juju bootstrap --debug --verbose microk8s github-pr-08a8c --model-default test-mode=true --model-default automatically-retry-hooks=false --model-default logging-config="<root>=DEBUG" --agent-version=2.9.34 --bootstrap-constraints=""
  ERROR cannot load ssh client keys: mkdir /home/runner/.local: permission denied
  Error: The process '/usr/bin/sg' failed with exit code 2

PR #135 has the logs/traces for the failing tests.

Two problems were observed:

Need to review and address failing tests.

Solution

Solutions were proposed in canonical/bundle-kubeflow#648.
The SDI schema update in canonical/serialized-data-interface#51 solved the issue with the integration tests.

dex-auth provides incorrect service name to ingress relation

When renaming the charm-created k8s objects (svc, etc.), #28 introduced a bug where the service name provided to the ingress relation is incorrect. #28 added a suffix to all charm-created objects but did not include it in the relation data, resulting in Istio creating virtual services that point to the wrong k8s service.

Use Dex connector configurations for setting up multiple configuration values

Dex's configuration (e.g. which IdP to connect to and its settings, staticPasswords, passwordDB, etc.) is done via connector configurations. These files are in YAML format and contain information that tells Dex how it should be configured for authentication. An example of this is the GitHub connector.
Our current dex-auth-operator has a charm configuration option to pass connectors (a YAML document in str format), which can be leveraged.

Proposal

Change how we configure Dex slightly: instead of going straight to setting up a static user and password, ask users to provide a connector config. This applies to static configurations as well.

Potential tasks

  • We should revisit our charm code to ensure that applying a connector config will override whatever we currently have in the charm, and if not, make the corresponding changes (probably for 1.8?).
  • We probably want to completely remove the static config from dex-auth-operator. This will change the UX: we will have to pass a "dev" connector, like this one, to keep using dex the way we are used to, or properly set up new IdPs. This is already done upstream.
  • Add instructions to the docs for setting up a connector.

Update Exception and Status Handling: Transient BlockedStatus when getting relation interface

NoVersionsListed is a transient exception raised by SerializedDataInterface.get_interface(). However, the current code handles all exceptions from SerializedDataInterface.get_interface() with BlockedStatus, so an integration test that relates istio-pilot and oidc-gatekeeper with raise_on_blocked fails.

Observed behaviour in test:
Test code

async def test_relations(ops_test: OpsTest):
    oidc_gatekeeper = "oidc-gatekeeper"
    istio_pilot = "istio-pilot"
    await ops_test.model.deploy(oidc_gatekeeper, config=OIDC_CONFIG)
    await ops_test.model.deploy(istio_pilot, channel="1.5/stable")
    await ops_test.model.add_relation(oidc_gatekeeper, APP_NAME)
    await ops_test.model.add_relation(f"{istio_pilot}:ingress", f"{APP_NAME}:ingress")

    await ops_test.model.wait_for_idle(
        [APP_NAME, oidc_gatekeeper, istio_pilot],
        status="active",
        raise_on_blocked=True,
        raise_on_error=True,
        timeout=600
    )

Failed test error

INFO     pytest_operator.plugin:plugin.py:276 Model status:

Model            Controller  Cloud/Region        Version  SLA          Timestamp
test-charm-npia  micro       microk8s/localhost  2.9.22   unsupported  10:32:35-05:00

App              Version                Status   Scale  Charm            Store     Channel     Rev  OS          Address         Message
dex-auth                                active       1  dex-auth         local                   0  ubuntu      10.152.183.249  
istio-pilot      res:oci-image@87fc646  waiting      1  istio-pilot      charmhub  1.5/stable   61  kubernetes                  waiting for container
oidc-gatekeeper                         waiting      1  oidc-gatekeeper  charmhub  stable       57  kubernetes                  Waiting for Client Secret

Unit                Workload  Agent      Address     Ports  Message
dex-auth/0*         blocked   executing  10.1.64.91         List of ingress versions not found for apps: istio-pilot
istio-pilot/0*      waiting   executing                     waiting for container
oidc-gatekeeper/0*  waiting   executing                     (config-changed) Waiting for Client Secret

To do:
Use kfp-operators as a reference and update status and exception handling in dex-auth, along the lines sketched below.
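
A minimal sketch of that split (exception names follow the serialized-data-interface library; treating NoCompatibleVersions as the only non-transient case is an assumption):

from ops.model import BlockedStatus, WaitingStatus
from serialized_data_interface import (
    NoCompatibleVersions,
    NoVersionsListed,
    get_interface,
)

def _get_relation_interface(self, name):
    try:
        return get_interface(self, name)
    except NoVersionsListed as err:
        # Transient: the remote app has not published its versions yet.
        self.unit.status = WaitingStatus(str(err))
    except NoCompatibleVersions as err:
        # Permanent: the relation really is incompatible.
        self.unit.status = BlockedStatus(str(err))
    return None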

Intermittently and randomly logged out of KF

Hi folks,
I'm seeing intermittent and random issues with session timeouts and being kicked out of Kubeflow by dex, leading to random failure (possibly triggered by a pod getting bounced, but not completely sure).

But then dex seems to get confused, and usually doesn't redirect me to the login screen but instead bombs with a generic unauthorised error => need to use an Incognito tab to be able to log back in again.

Please advise

Browser - Google Chrome Stable 93.0.4577.63-1
Baseline - MicroK8s 1.21/stable
Kubeflow - cs:kubeflow (current release version)
OS - Ubuntu 21.04 hirsute

Hardware - Host1: Mem: 15G, Swap 3G, i7 g11 @4 cores each with 2 threads @2.8Ghz

Fresh MicroK8s install:

sudo snap install microk8s --classic --channel=1.21
microk8s status --wait-ready
microk8s enable storage dns
microk8s enable kubeflow --bundle=cs:kubeflow
microk8s enable dashboard

Dex fails to parse oidc relation's client_id when container is replanned

Concerns PR #62
When using container.replan() in a sidecar charm, the oidc relation gets broken without this being explicitly noted in the juju logs.
Bundle deployed on microk8s 1.21/stable with the dns, storage, rbac, ingress, and metallb:10.64.140.43-10.64.140.49 addons enabled:

bundle: kubernetes
name: kubeflow
applications:
  dex-auth:
    charm: "/home/ubuntu/dex-auth-operator/dex-auth_ubuntu-20.04-amd64.charm"
    resources:
      oci-image: "dexidp/dex:v2.31.2"
    scale: 1
    trust: true
  istio-ingressgateway:          { charm: istio-gateway,           channel: 1.5/stable, scale: 1, trust: true}
  istio-pilot:                   { charm: istio-pilot,             channel: 1.5/stable, scale: 1, options: { default-gateway: "kubeflow-gateway" } }
  kubeflow-dashboard:            { charm: kubeflow-dashboard,      channel: latest/edge, scale: 1 }
  kubeflow-profiles:             { charm: kubeflow-profiles,       channel: latest/edge, scale: 1 }
  oidc-gatekeeper:               { charm: oidc-gatekeeper,         channel: latest/stable, scale: 1 }
relations:
- [dex-auth:oidc-client, oidc-gatekeeper:oidc-client]
- [istio-pilot:ingress, dex-auth:ingress]
- [istio-pilot:ingress, kubeflow-dashboard:ingress]
- [istio-pilot:ingress, oidc-gatekeeper:ingress]
- [istio-pilot:ingress-auth, oidc-gatekeeper:ingress-auth]
- [istio-pilot:istio-pilot, istio-ingressgateway:istio-pilot]
- [kubeflow-profiles, kubeflow-dashboard]

The charms go active, but the login page shows an invalid OIDC client error.
The following can be observed in the dex container logs:

$ kubectl logs dex-auth-0 -c dex -n kubeflow
[...]
2022-06-24T13:50:14.025Z [dex] time="2022-06-24T13:50:14Z" level=error msg="Failed to parse authorization request: Invalid client_id (\"authservice-oidc\")."

However, when inspecting the config file that the relation data is written to, the client id seems to be correct:

$ kubectl exec -it dex-auth-0 -c dex -n kubeflow -- cat /etc/dex/config.docker.yaml
connectors: null
enablePasswordDB: true
issuer: http://10.64.140.43.nip.io/dex
logger:
  format: text
  level: debug
oauth2:
  skipApprovalScreen: true
staticClients:
- id: authservice-oidc
  name: Ambassador Auth OIDC
  redirectURIs:
  - /authservice/oidc/callback
  secret: VAVFN2LPMVAUY577DULT42HXYR2SNL
staticPasswords:
- email: admin
  hash: $2b$12$vXZg1qLLMAqIXiXSYKSN5.ohyeQTbuLhCWvB.8wo9aML3CuR6zPi6
  userID: fdef551a-7377-4445-b070-5f2fdaa0b48b
  username: admin
storage:
  config:
    inCluster: true
  type: kubernetes
web:
  http: 0.0.0.0:5556

This is similar to issue #31.
This issue is not observed when container.restart() is used instead.
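
For reference, a minimal sketch of why the two calls behave differently (handler and helper names are assumptions):

def _on_config_changed(self, event):
    container = self.unit.get_container("dex")
    container.add_layer("dex", self._dex_layer(), combine=True)
    # replan() only (re)starts services whose Pebble layer changed, so a
    # change that only touches the config file or relation data can leave
    # the old process running with stale settings; restart() always
    # bounces the service.
    container.restart("dex")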

Create a ROCK for Dex for CKF 1.7

What needs to get done

  1. Create a dedicated repo for dex-rock
  2. Create the rockcraft.yaml file
  3. Copy our CI automation for building the image on every PR and when a PR is merged

Why it needs to get done

In order for us to have our own built OCI Image for that component

Progress

Add license

Hi folks,
Please ensure that this repo has a license associated with it, concretely that there is a valid LICENSE file present. Please also consider adding license headers to all files as per generally accepted best practices.
Thanks

hook failed: "oidc-client-relation-broken"

I'm getting this hook failed: "oidc-client-relation-broken" error consistently with the latest/stable version of dex-auth when executing a config change with the command juju config dex-auth static-username="blah" static-password="blahblah".

It means that the new pod with the new configuration is never promoted to leader, and the deployment is trapped in a broken state.

dex's oidc-gateway staticClients.id incorrect due to literal quotes

When connected to an oidc-client relation, the dex workload receives staticClients.id="authservice-oidc" in the dex-auth configmap and dex parses the quotes literally, meaning that it tried to log into oidc-gatekeeper with "authservice-oidc" (quotes included) rather than authservice-oidc. If we try to log in with dex (for example, by connecting dex+oidc to an ingress/ingress-auth from istio), we see this error as dex passes on the message from oidc-gatekeeper: "Bad Request. Invalid client_id ("authservice-oidc")" (with the escaped quotes included in the message). Messages about incorrect logins are also visible in the dex and oidc-gatekeeper workload pods' kubectl logs.
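
A minimal sketch of one possible remedy: build the config as Python data and let the YAML serializer decide on quoting, rather than templating pre-quoted strings (the relation key name is illustrative):

import yaml

relation_data = {"client-id": '"authservice-oidc"'}  # as received, with quotes

# Defensively strip literal quotes, then serialize the whole structure so
# values are only quoted when YAML actually requires it.
client_id = relation_data["client-id"].strip('"')
config = {"staticClients": [{"id": client_id, "name": "Ambassador Auth OIDC"}]}
print(yaml.safe_dump(config))  # emits `id: authservice-oidc`, no quotes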


add `http://` to public_url if it is missing

A public_url without http:// or https:// fails silently, with no errors in juju debug-log. It is easy to miss because browsers automatically add http://, resulting in mismatched URLs. It would be more user-friendly if the scheme were added when it is missing.

curling the public_url results in a 403; it is not apparent that the failure comes from the missing URL scheme in public_url:

$ curl a457f5a1648e042908d1386f0a2057a9-687417382.us-east-1.elb.amazonaws.com -v
*   Trying 3.233.203.235:80...
* TCP_NODELAY set
* Connected to a457f5a1648e042908d1386f0a2057a9-687417382.us-east-1.elb.amazonaws.com (3.233.203.235) port 80 (#0)
> GET / HTTP/1.1
> Host: a457f5a1648e042908d1386f0a2057a9-687417382.us-east-1.elb.amazonaws.com
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 403 Forbidden
< date: Mon, 17 Jan 2022 18:27:30 GMT
< server: istio-envoy
< content-length: 0
<
* Connection #0 to host a457f5a1648e042908d1386f0a2057a9-687417382.us-east-1.elb.amazonaws.com left intact
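
A minimal sketch of the proposed normalization (the helper name is illustrative):

def normalize_public_url(url: str) -> str:
    """Prepend a default scheme when the operator omitted one."""
    if not url.startswith(("http://", "https://")):
        return f"http://{url}"
    return url

print(normalize_public_url("example.elb.amazonaws.com"))
# -> http://example.elb.amazonaws.com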

Expected content of `connectors` config is not intuitive, add validation and documentation

See discussion here. The connectors config implements the config like the one shown here, but it expects only the content inside the connectors key, not the entire connectors: [ ... ] mapping. This is not intuitive and has caught a few users.

We should either improve the documentation or add validation around this. Since the likely user mistake is providing connectors: [ ... ] instead of just [ ... ], we could easily check for a connectors key with a nested array and unpack it, and similarly make sure the array contains what look like valid connectors (maybe by checking for a type and an id), as sketched below.
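
A minimal sketch of that validation (the exact rules are a suggestion, not settled behaviour):

import yaml

def parse_connectors(raw: str) -> list:
    parsed = yaml.safe_load(raw)
    # Unwrap the common mistake of pasting the whole `connectors:` mapping.
    if isinstance(parsed, dict) and set(parsed) == {"connectors"}:
        parsed = parsed["connectors"]
    if not isinstance(parsed, list):
        raise ValueError("connectors must be a YAML list of connector configs")
    for connector in parsed:
        if not isinstance(connector, dict) or not {"type", "id"} <= connector.keys():
            raise ValueError("each connector needs at least a `type` and an `id`")
    return parsed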

oidc-client-relation-broken hook failed

The dex-auth charm goes into an error state on an oidc-client-relation-broken hook failure. Will update the issue when I can determine the exact steps to reproduce it.

2022-03-03 06:42:40 INFO juju-log Running legacy hooks/upgrade-charm.
2022-03-03 06:42:41 INFO juju.worker.caasoperator.uniter.dex-auth/5.operation runhook.go:152 ran "upgrade-charm" hook (via hook dispatching script: dispatch)
2022-03-03 06:42:41 INFO juju.worker.caasoperator.uniter.dex-auth/5 resolver.go:154 found queued "config-changed" hook
2022-03-03 06:42:43 ERROR juju-log oidc-client:2: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/ops/model.py", line 1284, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/var/lib/juju/tools/unit-dex-auth-4/relation-get', '-r', '2', '-', '', '--app', '--format=json')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./src/charm.py", line 209, in <module>
    main(Operator)
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/ops/main.py", line 394, in main
    charm = charm_class(framework)
  File "./src/charm.py", line 44, in __init__
    self.interfaces = get_interfaces(self)
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/serialized_data_interface/__init__.py", line 263, in get_interfaces
    requires = {
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/serialized_data_interface/__init__.py", line 264, in <dictcomp>
    name: SerializedDataInterface(
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/serialized_data_interface/__init__.py", line 110, in __init__
    others = {
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/serialized_data_interface/__init__.py", line 111, in <dictcomp>
    app.name: bag.get("_supported_versions")
  File "/usr/lib/python3.8/_collections_abc.py", line 660, in get
    return self[key]
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/ops/model.py", line 400, in __getitem__
    return self._data[key]
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/ops/model.py", line 384, in _data
    data = self._lazy_data = self._load()
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/ops/model.py", line 748, in _load
    return self._backend.relation_get(self.relation.id, self._entity.name, self._is_app)
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/ops/model.py", line 1351, in relation_get
    return self._run(*args, return_output=True, use_json=True)
  File "/var/lib/juju/agents/unit-dex-auth-4/charm/venv/ops/model.py", line 1286, in _run
    raise ModelError(e.stderr)
ops.model.ModelError: b'ERROR "" is not a valid unit or application\n'
2022-03-03 06:42:43 ERROR juju.worker.caasoperator.uniter.dex-auth/4.operation runhook.go:146 hook "oidc-client-relation-broken" (via hook dispatching script: dispatch) failed: exit status 1
2022-03-03 06:42:43 INFO juju.worker.caasoperator.uniter.dex-auth/4 resolver.go:150 awaiting error resolution for "relation-broken" hook

dex-auth2.log
juju-crashdump-677b46e1-2d75-4ed2-887a-989e22e1d807.zip
kf-bundle.zip

Remove installation of bcrypt in Charm code

bcrypt is installed while the charm runs, which is bad practice:

subprocess.check_call(["apt", "install", "-y", "python3-bcrypt"])

It is only used to hash the password. We can hash the password without installing an additional apt package at runtime.

Additionally, packages installed this way are not covered by CVE scanning.
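
A minimal sketch of the replacement: keep bcrypt (dex's staticPasswords expect bcrypt hashes, like the $2b$ value seen in the config elsewhere in these issues) but ship it as a pip dependency in the charm's venv instead of apt-installing it at runtime:

import bcrypt  # the pip-installable package, declared in requirements.txt

def hash_password(password: str) -> str:
    """Return a bcrypt hash suitable for dex's staticPasswords."""
    return bcrypt.hashpw(password.encode("utf-8"), bcrypt.gensalt()).decode("utf-8")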

selenium: Replace Chrome Webdriver with Firefox

The integration tests are currently using the Chrome driver for selenium.

We have seen our CI fail due to issues with this driver:

selenium.common.exceptions.WebDriverException: Message: unknown error: no chrome binary at /opt/google/chrome/google-chrome

This is most probably what's causing the integration tests to fail:

WARNING  selenium.webdriver.common.selenium_manager:selenium_manager.py:133 The chromedriver version (115.0.5790.110) detected in PATH at /usr/bin/chromedriver might not be compatible with the detected chrome version (115.0.5790.170); currently, chromedriver 115.0.5790.170 is recommended for chrome 115.*, so it is advised to delete the driver in PATH and retry

Goal

Replace the Chrome driver with Firefox, which is OSS, as well as the default browser in Ubuntu.
Note that this is what we're already using for the bundle-kubeflow tests.

Support for multiple static users

Type of issue
Feature Request

What
Multiple Static User Support

Why
Currently the charm configuration supports only one static username and a corresponding password:

static_config = {
    "enablePasswordDB": True,
    "staticPasswords": [
        {
            "email": static_username,
            "hash": hashed,
            "username": static_username,
            "userID": self.state.user_id,
        }
    ],
}

Dex auth supports multiple static users (they just need to be added to the staticPasswords list).

I understand that static passwords are only meant as a bootstrapping measure and are not recommended for production, but the ability to have multiple users would help better demonstrate the multi-user isolation feature of Kubeflow.
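
A minimal sketch of how the snippet above could generalize to a list of users (the multi-user config shape is an assumption, not an agreed interface):

import uuid

import bcrypt
import yaml

raw_users = """
- username: admin
  password: admin-password
- username: alice
  password: alice-password
"""

static_config = {
    "enablePasswordDB": True,
    "staticPasswords": [
        {
            "email": user["username"],
            "hash": bcrypt.hashpw(user["password"].encode(), bcrypt.gensalt()).decode(),
            "username": user["username"],
            "userID": str(uuid.uuid4()),  # the charm would persist these IDs
        }
        for user in yaml.safe_load(raw_users)
    ],
}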

How to use the patched version for 1.21

I have noticed #8 and dexidp/dex#2082 (comment).
Since this patch has been merged, how can I use the newest version of dex-auth?

For now, I tried:

sudo microk8s juju upgrade-charm dex-auth

but got this:

Added charm "cs:~kubeflow-charmers/dex-auth-107" to the model.
ERROR cannot upgrade application "dex-auth" to charm "cs:~kubeflow-charmers/dex-auth-107": would break relation "dex-auth:service-mesh istio-pilot:service-mesh"

Dex-auth units in error state due to oidc-client relation broken

Summary

After installing the kubeflow-lite bundle on MicroK8s, dex-auth units end up in an error state due to a broken relation with oidc-client.

The dex-auth units show a traceback (see the introspection reports below).

Reproduction Steps

Install microK8s following this guide.
Then deploy the kubeflow bundle following this guide.

Versions:

snap list | grep -E '(juju|microk8s)'
juju               2.9.28                      18717  latest/stable    canonical*        classic
juju-crashdump     1.0.2+git100.fed9b56        258    latest/stable    jason-hobbs       classic
juju-kubectl       0.1.0                       15     latest/stable    kennethkoski      classic
juju-wait          2.8.4~2.8.4                 96     latest/stable    stub              classic
microk8s           v1.21.11                    3058   1.21/stable      canonical*        classic

Host OS is:

lsb_release -a                       
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.4 LTS
Release:	20.04
Codename:	focal

Introspection reports

Microk8s inspection report
Juju crashdump
