
minio-operator's Introduction

MinIO Operator

Overview

This charm encompasses the Kubernetes operator for MinIO (see Charmhub).

The MinIO operator is a Python script that wraps the latest released MinIO, providing lifecycle management for each application and handling events such as install, upgrade, integrate, and remove.

Install

To install MinIO, run:

juju deploy minio

For more information, see https://juju.is/docs

MinIO console

The MinIO console is available on port 9001. To change this port, use the console-port configuration option:

juju config minio console-port=9999

For more information, see the MinIO Console documentation.

Operation Modes

MinIO can be operated in the following modes:

  • server (default): MinIO stores data "locally", handling all aspects of data storage within the deployed workload and its in-cluster storage
  • gateway: MinIO acts as a gateway to a separate blob storage service (such as Amazon S3), providing an access layer to your data for in-cluster workloads

Example using gateway mode

This charm supports using the following backing data storage services:

  • s3
  • azure

To install MinIO in gateway mode for s3, run:

juju deploy minio minio-s3-gateway \
    --config mode=gateway \
    --config gateway-storage-service=s3 \
    --config access-key=<aws_s3_access_key> \
    --config secret-key=<aws_s3_secret_key>

To install MinIO in gateway mode for azure, run:

juju deploy minio minio-azure-gateway \
    --config mode=gateway \
    --config gateway-storage-service=azure \
    --config access-key=<azurestorageaccountname> \
    --config secret-key=<azurestorageaccountkey>

If your storage service uses a private endpoint, specify it via the storage-endpoint-service configuration option. This option is not needed when using the public S3 or Azure endpoints.
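
For example, using the option name as given above (the endpoint value is a placeholder):

juju config minio-s3-gateway storage-endpoint-service=<private-endpoint-url>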

By default, the backing storage credentials are also used as the credentials to connect to the MinIO gateway itself. If you do not want to share your data storage service credentials with users, you can create users in the MinIO console with proper permissions for them.

For more information, see: https://docs.min.io/docs/minio-multi-user-quickstart-guide.html

The access-key and secret-key credentials differ between Azure and AWS. Errors caused by improper credentials are visible in the container logs.

For more information, see: https://docs.min.io/docs/minio-gateway-for-azure.html and https://docs.min.io/docs/minio-gateway-for-s3.html

Charm Release Versioning

Note: Rather than versioning this charm by the workload itself, releases for this charm are versioned with ckf-x.y, indicating the Charmed Kubeflow version they're released with.

minio-operator's People

Contributors

barteus, beliaev-maksim, ca-scribner, colmbhandal, dnplas, dparv, i-chvets, jardon, johnsca, kimwnasptd, knkski, kwmonroe, misohu, natalian98, neoaggelos, nohaihab, orfeas-k, phoevos, renovate[bot], variabledeclared


minio-operator's Issues

The charm does not restart MinIO if someone changes `access-key`/`secret-key` via `juju config`

When credentials are updated via juju config minio access-key=somethingNew, the credential secret in k8s is updated. However, this update does not automatically reach the minio workload's pod, because a change to a secret used by a pod does not cause the pod to restart and reload the secret.

Possible solutions:

  • have the charm initiate a restart of the pod whenever the config changes. This could be achieved directly via lightkube or maybe juju, or indirectly by using a spec annotation that includes a hash of the config
  • similar to the above, this solution treats the configmap as immutable: on any config change, create a new configmap and then modify the existing deployment to point to the new one. k8s will only scale down if the new configmap is valid
  • use an existing controller like Reloader that automatically restarts workloads when secrets are refreshed.

Note that any solution restarting minio might result in downtime/interruption: the actual minio workload will go down. This should be documented properly. There might be more nuanced solutions, too (can we communicate with the running minio workload to tell it to update the credentials? That sounds like a possible sidecar thing).
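
Until the charm handles this automatically, a manual restart after a credential change is a possible workaround. A minimal sketch, assuming the workload runs as a StatefulSet named minio in the kubeflow namespace:

juju config minio access-key=somethingNew
kubectl -n kubeflow rollout restart statefulset minio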

`minio-service` added for Charmed Kubeflow 1.8 should be moved to kfp-api to avoid bugs with deploying multiple minios at once

Bug Description

[the minio-service added to minio charm to fix an upstream kfp bug]

While prepping for the KF 1.8 release, minio-operator #151 added a new service called minio-service to the minio charm. This service was added because upstream Kubeflow Pipelines has an issue where it hard-codes the minio service name.

I propose we revert minio-operator #151 and instead fix the bug in kfp by adding this svc/minio-service to the kfp-api charm. The main reasons are:

  • as-is, we cannot deploy two instances of the minio charm in the same model (both MinIO instances will deploy a service of the same name, at best with one overwriting the other). This is an issue because sometimes we have a minio for kubeflow + another minio for mlflow
  • as-is, the service that fixes a kfp bug is added for everyone, not just kfp. If we deploy mlflow+minio, the service is added anyway even though it isn't needed

In general, the minio-service is something not needed by minio, just to fix a kfp bug, so it doesn't feel right imo to put it in the minio charm


Add integration test for upgrades

To avoid an issue with upgrades like we found when solving #78, we should have an integration test that:

  • deploys minio
  • puts some data in the storage
  • updates minio
  • confirms the data is still there
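
A manual approximation of these steps, assuming the MinIO client (mc) is installed and the unit IP and credentials are known (values in angle brackets are placeholders):

juju deploy minio --channel <current-stable>
mc alias set local http://<minio-unit-ip>:9000 <access-key> <secret-key>
mc mb local/upgrade-test
echo "upgrade-canary" | mc pipe local/upgrade-test/canary.txt
juju refresh minio --channel <new-channel>
mc cat local/upgrade-test/canary.txt   # should still print "upgrade-canary"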

Have an action that returns the secret-key of MinIO

Context

Right now the secret-key config value of MinIO defaults to "". This means that MinIO will create a long random password and use it as the secret-key:
https://github.com/canonical/minio-operator/blob/track/ckf-1.8/src/charm.py#L219-L234

Users should have a way in this case to be able to see the secret-key value that was autogenerated by the Charm.

Note that this change is only focused on the "interface" (actions) that users can use to get the value of secret-key. This could go hand in hand with #167, but can be a separate effort as well.

What needs to get done

  1. Have an action that returns the value of the secret-key that MinIO will use, which is either:
    • the value of the config, or
    • the value that MinIO generated
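
For illustration, invoking such an action could look like the following (the action name get-credentials is hypothetical and not yet defined by the charm; Juju 3.x syntax):

juju run minio/0 get-credentials
# hypothetical output:
#   secret-key: <configured-or-autogenerated-value>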

Definition of Done

  1. Ensure the action returns the value, whether it was generated or not

Change in Minio credentials does not propagate to 'mlpipeline-minio-artifact' in user namespace

I'm not sure which component should be addressed here. I'm filing it against minio because the change starts from here.

When the MinIO credentials are changed, the mlpipeline-minio-artifact secret in the kubeflow namespace is updated, but the copy in the user's namespace is not.
A single user installation is also impacted.

Result
It is not possible to run workflows using KFP.

This step is in Error state with this message: Error (exit code 1): failed to put file: The Access Key Id you provided does not exist in our records.

Reproduce:
Deploy Kubeflow and after all is working change the access-key / secret-key for minio.

Workaround:
Copy the mlpipeline-minio-artifact secret from the kubeflow namespace to the user namespace.
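
A sketch of this workaround, assuming kubectl and jq are available and <user-namespace> is the affected profile namespace:

kubectl get secret mlpipeline-minio-artifact -n kubeflow -o json \
  | jq 'del(.metadata.namespace, .metadata.uid, .metadata.resourceVersion, .metadata.creationTimestamp)' \
  | kubectl apply -n <user-namespace> -f -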

Missing `secret-key` config value validation

It is not clear from the charm docs that the secret-key needs to be at least 8 characters long.
It is also not obvious from Juju's point of view what is actually happening in the charm.

Ideally, the charm would validate the config value and, for example, go into a blocked state rather than letting the actual service go down.

Reproduce

juju config minio secret-key=minio

Logs

I cannot access MinIO website

$ juju status | grep minio
minio                      res:oci-image@1755999    waiting      1  minio                    ckf-1.7/stable  186  10.152.183.165  no       
mlflow-minio               res:oci-image@1755999    active       1  minio                    ckf-1.7/edge    186  10.152.183.108  no       
minio/0*                      error     idle   10.1.149.251  9000/TCP,9001/TCP  crash loop backoff: back-off 2m40s restarting failed container=minio pod=minio-0_kubeflow(1980c8fe-8cb3-4099-b9eb-2c6...
mlflow-minio/0*               active    idle   10.1.150.41   9000/TCP,9001/TCP
$ microk8s.kubectl get pods -n kubeflow | grep minio
minio-operator-0                                1/1     Running            0              40d
mlflow-minio-operator-0                         1/1     Running            0              67m
mlflow-minio-0                                  1/1     Running            0              66m
minio-0                                         0/1     CrashLoopBackOff   5 (105s ago)   6m24s
$ microk8s.kubectl logs -n kubeflow minio-0
Defaulted container "minio" out of: minio, juju-pod-init (init)
ERROR Unable to validate credentials inherited from the shell environment: Invalid credentials
      > Please provide correct credentials
      HINT:
        Access key length should be at least 3, and secret key length at least 8 characters
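
One way to recover is to set a secret-key that satisfies the length requirement, for example (assuming openssl is available):

juju config minio secret-key=$(openssl rand -hex 16)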

`minio` charm does not refresh relations if relation is removed

If minio is related over the object-storage relation to something that does not fulfil the relation contract (for example, it does not send the SDI versions data), it will be stuck like:

minio/0*   waiting   idle   10.1.208.162  9000/TCP,9001/TCP  List of <ops.model.Relation object-storage:0> versions not found for apps: kfp-ui

If we then break the relation (juju remove-relation minio kfp-ui), the charm status remains as seen above. This is probably because we don't observe the relation broken event, or don't account for how the relation-broken event may or may not have the departing application's data in it.

The minio charm does work properly when we re-relate it to something that functions properly, and it likely also recovers on a config-changed event.

Note: this issue likely affects other charms as well

Use JuJu application-secrets for MinIO credentials

Context

Right now the secret-key config value of MinIO defaults to "". This means that MinIO will create a long random password and use it as the secret-key:
https://github.com/canonical/minio-operator/blob/track/ckf-1.8/src/charm.py#L219-L234

These values are currently generated by the charm and handled as config options. We should move to using Juju application secrets for storing them, to make their handling more secure.

We propose that we currently go with application secrets (and not user secrets) since we would not expect users for now to need to update the credential values of MinIO.

What needs to get done

  1. Convert the secret-key and access-key from config options to application secrets
  2. Keep the logic of autogenerating the values, so MinIO can generate secure ones

For this work, though, we might need to keep in mind the effort of being compliant with the s3-interface (#160).

Definition of Done

  1. Secrets are used for the sensitive values of MinIO
  2. Spike for exploring if the Charm should generate values by default (we believe yes, but let's get feedback from DP team)
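
For illustration, once the credentials live in application secrets, a model admin could inspect them with standard Juju 3.x commands (the secret ID is a placeholder):

juju secrets
juju show-secret <secret-id> --reveal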

Support the `s3` charm relation interface

Context

It currently is not possible to integrate s3 requirers with minio using the s3 charm relation interface. Adding support for this interface to the minio-k8s charm would make it much easier to integrate requirers (like vault-k8s) with minio directly, improving user experience.

Reference

test_prometheus_data_set unit test failed

After upgrading the Prometheus scrape library to v0.30 in PR #109, the following error appears in unit tests:

Traceback (most recent call last):
  File "/home/runner/work/minio-operator/minio-operator/tests/unit/test_charm.py", line 410, in test_prometheus_data_set
    assert json.loads(harness.get_relation_data(rel_id, harness.model.app.name)["scrape_jobs"])[0][
KeyError: 'scrape_jobs'

update config hash unit tests

The config hash unit tests no longer pass after upgrading ops from 1.2 to 1.4. The update_config unset functionality changed from removing the config value to resetting it to the config.yaml default.
A new solution is needed for removing configs.

Add `ingress` relation to expose console, api

#36 enabled accessing the minio console through a specified port on the container. This should be exposed through an ingress relation to make it easier to access.

The minio API can also be exposed through the ingress, although I'm not sure how that would work once authentication is pulled into the loop.
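
Once such a relation exists, usage might look like the following sketch (nginx-ingress-integrator is one possible ingress provider; the minio endpoint name is hypothetical):

juju deploy nginx-ingress-integrator
juju relate minio:ingress nginx-ingress-integrator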

Make charm's images configurable in track/<last-version> branch

Description

The goal of this task is to make all images configurable so that when this charm is deployed in an airgapped environment, all image resources are pulled from an arbitrary local container image registry (avoiding pulling images from the internet).
This serves as a tracking issue for the required changes and backports to the latest stable track/* Github branch.

TL;DR

Mark the following as done

  • Required changes (in metadata.yaml, config.yaml, src/charm.py)
  • Test on airgap environment
  • Publish to /stable

Required changes

WARNING: No breaking changes should be backported into the track/<version> branch. A breaking change is anything that requires extra steps to refresh from the previous /stable beyond a plain juju refresh. Please avoid these situations at all costs.

The following files have to be modified and/or verified to enable image configuration:

  • metadata.yaml - the container image(s) of the workload containers have to be specified in this file. This only applies to sidecar charms. Example:
containers:
  training-operator:
    resource: training-operator-image
resources:
  training-operator-image:
    type: oci-image
    description: OCI image for training-operator
    upstream-source: kubeflow/training-operator:v1-855e096
  • config.yaml - in case the charm deploys containers that are used by resource(s) the operator creates. Example:
apiVersion: v1
kind: ConfigMap
metadata:
  name: seldon-config
  namespace: {{ namespace }}
data:
  predictor_servers: |-
    {
        "TENSORFLOW_SERVER": {
          "protocols" : {
            "tensorflow": {
              "image": "tensorflow/serving", <--- this image should be configurable
              "defaultImageVersion": "2.1.0"
              },
            "seldon": {
              "image": "seldonio/tfserving-proxy",
              "defaultImageVersion": "1.15.0"
              }
            }
        },
...
  • tools/get-images.sh - a bash script that returns a list of all the images used by this charm. In the case of a multi-charm repo, it is located at the root of the repo and gathers images from all charms in it (a minimal sketch follows this list).

  • src/charm.py - verify that nothing inside the charm code is calling a subprocess that requires internet connection.
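
A minimal sketch of what tools/get-images.sh could look like, assuming yq v4 is available and all workload images are declared as upstream-source entries in metadata.yaml (an illustration, not the actual script):

#!/bin/bash
# Print every OCI image referenced as a charm resource in metadata.yaml.
yq '.resources[] | select(.type == "oci-image") | .["upstream-source"]' metadata.yaml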

Testing

  1. Spin up an airgap environment following canonical/bundle-kubeflow#682 and canonical/bundle-kubeflow#703 (comment)

  2. Build the charm making sure that all the changes for airgap are in place.

  3. Deploy the charms manually and observe the charm go to active and idle.

  4. Additionally, run integration tests or simulate them. For instance, creating a workload (like a PytorchJob, a SeldonDeployment, etc.).

Publishing

After completing the changes and testing, this charm has to be published to its stable risk in Charmhub. For that you must wait for the charm to be published to /edge, which is the revision to be promoted to /stable. Use the workflow dispatch for this (Actions>Release charm to other tracks...>Run workflow).

Suggested changes/backports

minio revisions>57 cannot be deployed in charmed kubernetes

Observed behaviour

minio ckf-1.6/beta hangs in a WaitingStatus for a long time, and the storage attached to the unit also remains in a pending status. This causes minio to never become active.

juju status
minio/0*    waiting   idle    waiting for container

Steps to reproduce

juju add-model minio-test
juju deploy minio --channel ckf-1.6/beta
juju status

Environment

  • Charmed Kubernetes 1.22 on AWS
  • RBAC and Metallb enabled
  • Node constraints (kubernetes workers): kubernetes-worker cores=8 mem=32G root-disk=100G

Workaround

Remove the application and deploy an older version

juju remove-application minio
juju deploy minio --channel latest/stable

The share file option uses the Pod internal IP to share, which is inaccessible outside of the cluster

Go to the “Object Browser” tab, select the bucket and the file you want to create a link for, and click on it to see its details. Select the share button in the top right and copy the share link from the popup window.

Expected
Use the IP/URL which will be accessible outside the cluster.

Workaround
Replace the first part of the URL with "localhost" (when using port-forward) or with the external IP where MinIO was exposed.
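
For example, assuming the charm's Kubernetes service is named minio and lives in the kubeflow namespace:

kubectl -n kubeflow port-forward svc/minio 9000:9000
# then replace the pod IP and port in the copied share link with localhost:9000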

Minio failed to upgrade 1.6 to 1.7

Minio failed to upgrade with error message:

ERROR Juju on containers does not support updating storage on a statefulset.
The new charm's metadata contains updated storage declarations.
You'll need to deploy a new charm rather than upgrading if you need this change.
 not supported (not supported)

Jira

Should Minio provide relation to create a bucket?

In working on canonical/mlflow-operator#34, it was discussed whether a charm that wants to relate to and use the object store should be responsible (and thus carry the code/tests) for creating its own bucket, or whether the minio charm should provide this somehow. This idea should be elaborated on.

Pros: it removes complexity from all the charms that would use object storage. Without a central function for this in the minio charm, every other charm that needs a bucket (e.g. mlflow) needs the logic to create its own if it does not exist.

Cons: not sure, but I think there are some. Not sure how this would map to multi-user scenarios (it isn't as complex for our single-user minio, but if we want per-user buckets etc this gets much more complex).
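
For reference, this is roughly the bucket-creation boilerplate each consuming charm currently has to carry, shown here with the MinIO client for illustration (service name, credentials and bucket name are placeholders):

mc alias set minio http://<minio-service>:9000 <access-key> <secret-key>
mc mb --ignore-existing minio/<bucket-name>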

Add charm for minio console (web portal for minio)

Logging this here as it relates to minio, although it does not really affect this charm specifically.

We could add an additional charm to provide the minio console (similar to the Argo server charm)

`minio` fails to build during integration tests

Seems like minio cannot be built during integration tests, resulting in the following error message:

RuntimeError: Failed to build charm .:
Packing the charm.
Launching environment to pack for base name='ubuntu' channel='20.04' architectures=['amd64'] (may take a while the first time but it's reusable)
Packing the charm
Packing the charm.
Building charm in '/root'
Running step PULL for part 'charm'
Running step BUILD for part 'charm'
Parts processing error: Failed to run the build script for part 'charm'.
Failed to build charm for bases index '0'.

For details and logs see here and follow the latest CI runs of #134

This seems to be affecting the publish job of that PR as well.

Move backup/restore to the Charm

Context

Similarly to canonical/mlmd-operator#80.

Right now our backup/restore guide has the following issues:

  1. It has manual commands the user needs to run to push the data to S3 (rclone)
  2. Users need to download binaries to parse the secret-key and download the data locally (rclone)
  3. The data needs to go via the host-machine that runs the rclone command, for which we use kubectl port-forward

We should move all of this logic to the Charm, to ensure users don't need to install any binaries and the data will go directly from the Charm to S3
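
For context, the manual flow the guide describes looks roughly like this (it assumes rclone remotes named minio and s3 are already configured, and uses an example bucket name):

kubectl -n kubeflow port-forward svc/minio 9000:9000 &
rclone sync minio:mlpipeline s3:<backup-bucket>/mlpipeline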

What needs to get done

  1. Include all needed binaries (or python libraries) for backup to the Charm
  2. Have an action that pushes the buckets of MinIO to another S3

Definition of Done

  1. Have a spike to confirm our understanding of rclone sync, and which files it copies
  2. Have a spike to discuss if we want to copy all files, or ones that fit a timeframe (could get field input)
  3. The action can be executed in an airgap environment
  4. Users don't need to run any other commands from their machine
  5. The data will go directly to the S3 from the Charm

HA storage for MinIO

I understand the MinIO charm does not support being clustered (from an application point of view).
I also understand that for HA storage (from a storage point of view) you could use gateway mode + S3 (be it a public cloud or Ceph).

However, do we have a supported or tested way to do HA storage when deploying on-premises (no Ceph involved)?

I know that for COS Lite we have used the Mayastor MicroK8s add-on,
which has this bug at the moment.
