bitpoke / mysql-operator

Asynchronous MySQL Replication on Kubernetes using Percona Server and Openark's Orchestrator.

Home Page: https://www.bitpoke.io/docs/mysql-operator/getting-started/

License: Apache License 2.0

Shell 3.09% Makefile 12.84% Go 81.49% Dockerfile 1.52% Python 0.40% Starlark 0.07% Mustache 0.23% Smarty 0.37%
kubernetes kubernetes-controller kubernetes-operator mysql mysql-async mysql-cluster operator percona-server

mysql-operator's People

Contributors

aacaz, amecea, bananaappletw, bluven, calind, chasebolt, cndoit18, desero, dougfales, drivebyer, hbo2, imriss, jgleonard, jianhaiqing, jiuker, kahirokunn, marratj, merceaotniel, mhumpula, milero, pedep, rstefan1, smanpathak, stankevich, surajssd, tanordheim, tonybogdanov, vasiioandre, w33dw0r7d, yanjianbo1983

mysql-operator's Issues

make image fails

`make image` depends on cmd/gen-crds-yaml and expects its directory and Dockerfile to exist under the build directory.

CGO_ENABLED=0 \
GOOS=linux \
GOARCH=amd64 \
go build -ldflags "-X github.com/presslabs/mysql-operator/pkg/util.AppGitState=dirty -X github.com/presslabs/mysql-operator/pkg/util.AppGitCommit=09e611034b4a166498414615f798a985ed3eb7d8 -X github.com/presslabs/mysql-operator/pkg/util.AppVersion=v0.1.4-dirty" \
    -v -o bin/mysql-operator_linux_amd64 cmd/mysql-operator/main.go
CGO_ENABLED=0 \
GOOS=linux \
GOARCH=amd64 \
go build -ldflags "-X github.com/presslabs/mysql-operator/pkg/util.AppGitState=dirty -X github.com/presslabs/mysql-operator/pkg/util.AppGitCommit=09e611034b4a166498414615f798a985ed3eb7d8 -X github.com/presslabs/mysql-operator/pkg/util.AppVersion=v0.1.4-dirty" \
    -v -o bin/mysql-helper_linux_amd64 cmd/mysql-helper/main.go
CGO_ENABLED=0 \
GOOS=linux \
GOARCH=amd64 \
go build -ldflags "-X github.com/presslabs/mysql-operator/pkg/util.AppGitState=dirty -X github.com/presslabs/mysql-operator/pkg/util.AppGitCommit=09e611034b4a166498414615f798a985ed3eb7d8 -X github.com/presslabs/mysql-operator/pkg/util.AppVersion=v0.1.4-dirty" \
    -v -o bin/gen-crds-yaml_linux_amd64 cmd/gen-crds-yaml/main.go
set -e; for cmd in mysql-operator mysql-helper gen-crds-yaml; do \
    install -m 755 bin/${cmd}_linux_amd64 hack/docker/${cmd}/${cmd} ; \
done
install: hack/docker/gen-crds-yaml/gen-crds-yaml: No such file or directory
make: *** [install-docker] Error 71

Resync orchestrator registration

Currently, if orchestrator is down during pod creation, the pod never gets registered with orchestrator. A better approach would be to continuously check the registration.

Move from ncat to http

Currently, xtrabackup snapshots are passed around through netcat. This works as a prototype, but leveraging HTTP can improve security (e.g. basic auth) and interoperability (e.g. exposing the endpoint through an ingress).

Add support for pt-kill

https://www.percona.com/doc/percona-toolkit/LATEST/pt-kill.html

We should expose some parameters in spec. Maybe couple it with #33.

spec:
  targetSLO:
    queryLimits:
      maxIdleTime: ...                # pt-kill --idle-time
      maxQueryTime: ...               # pt-kill --busy-time
      kill: oldest|all|all-but-oldest # pt-kill --victims
      killMode: query|connection      # pt-kill --kill-query or pt-kill --kill
      ignoreDb: []                    # pt-kill --ignore-db ...
      ignoreCommand: []               # pt-kill --ignore-command ...
      ignoreUser: []                  # pt-kill --ignore-user

Minimal documentation

  1. Project description, status (beta but used in production at Presslabs)
  2. Usage
    a. helm install ... (install the operator)
    b. kubectl apply -f examples/mysql-cluster.yaml (deploy a mysql cluster)
    c. scale the cluster
    d. how to access orchestrator
    e. make a backup
  3. Development
  4. License

Sync status from orchestrator

We should sync the cluster status from orchestrator:

status:
  nodes:
    - name: mysql-0
      conditions:
        - type: lagged
          status: true/false/unknown
          lastTransitionStatus: 2018-...
        - type: replicating
          status: true/false/unknown
          lastTransitionStatus: 2018-...
        - type: master
          status: true/false/unknown
          lastTransitionStatus: 2018-...
      ...
  conditions:
    - type: pendingFailoverAck
      status: true/false/unknown
      lastTransitionStatus: 2018-...

Add `--install-crds` flag

Also the helm chart should have a .Values.installCRDs which adds the proper permissions and starts the controller with --install-crds.

Handle password rotations

We should handle credential rotations, e.g. updates of:

  1. utility user credentials
  2. root password
  3. user password
  4. replication password

Allow specifying minAvailable

Add a minAvailable field in spec, which creates a PodDisruptionBudget. The default should be 50% of replicas if replicas is greater than one.

If minAvailable is 0, delete the PDB.
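The defaulting and deletion rules can be sketched as a small pure function. It represents minAvailable as a string such as "1" or "50%" to stay dependency-free; the function name is hypothetical, and treating a single-replica cluster as "no PDB" is an assumption, since the text only defines the default for more than one replica.

```go
package main

import "fmt"

// ResolveMinAvailable returns the minAvailable value to put in the
// PodDisruptionBudget and whether a PDB should exist at all.
// specified is the value from spec ("" means not set).
func ResolveMinAvailable(specified string, replicas int) (value string, wantPDB bool) {
	v := specified
	if v == "" {
		if replicas > 1 {
			v = "50%" // default from the proposal
		} else {
			v = "0" // assumption: no PDB for a single-replica cluster
		}
	}
	if v == "0" || v == "0%" {
		return "", false // minAvailable of 0 means the PDB is deleted
	}
	return v, true
}

func main() {
	for _, c := range []struct {
		spec     string
		replicas int
	}{{"", 3}, {"", 1}, {"0", 3}, {"2", 3}} {
		v, want := ResolveMinAvailable(c.spec, c.replicas)
		fmt.Printf("spec=%q replicas=%d -> minAvailable=%q pdb=%v\n", c.spec, c.replicas, v, want)
	}
}
```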

Scalability

The StatefulSet should create read-only replicas when scaled up.

Curious to know about production readiness

Hi,

This Operator seems interesting to us as it focuses on handling Percona. You mention in README that the operator is in alpha stage and that you are using it internally on non-critical workloads. I was wondering if you can elaborate on some of the essential missing pieces that would make it production ready.

Thanks!
Devdatta

Add safeguard for failed init

The initialization should be considered completed when a marker file exists. Otherwise the init process should cleanup everything and run again.

e2e tests

  • create cluster
  • cluster scale-up
  • cluster scale-down
  • node removed from service when latency > spec.maxSlaveLatency
  • node removed from service when not replicating
  • spec.readOnly flag with failover
  • update nodes when mysql config changes
  • change in cpu, memory (storage size) config of mysql object

Set backupURI into MysqlBackup spec field

The controller should write backup final URI into status field of the MysqlBackup object, and that should be the source of truth regarding the backup location.

kind: MysqlBackup
spec:
  cluster: ...
status:
  backupURI: gs://...
  backupTime: 2018-05-02T16:04:00Z
  complete: true

Generate ConfigMap for my.cnf

  • generate a ConfigMap with values from .spec.mysqlConfig (see #1)
  • at container init copy my.cnf from config map into /etc/mysql/my.cnf
  • at container init generate a /etc/mysql/conf.d/10-dynamic.cnf - with dynamic settings (server-id, innodb-buffer-size)
  • at container init generate a /etc/mysql/conf.d/10-utility-user.cnf - with utility user credentials

The StatefulSet should be annotated with the ConfigMap hash so that the servers are redeployed when the configuration changes.
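The hash annotation can be computed deterministically from the ConfigMap data, for example as below. The function name is an assumption of this sketch; the important property is that keys are sorted first, because Go map iteration order is random and would otherwise produce a different hash for identical data.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// ConfigHash returns a stable SHA-256 hex digest of ConfigMap-style data.
func ConfigHash(data map[string]string) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		// NUL separators keep ("ab","c") distinct from ("a","bc").
		h.Write([]byte(k))
		h.Write([]byte{0})
		h.Write([]byte(data[k]))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	cfg := map[string]string{
		"my.cnf": "[mysqld]\nskip-character-set-client-handshake=true\n",
	}
	// The digest would be stored as an annotation on the pod template;
	// the annotation key is left to the implementation.
	fmt.Println(ConfigHash(cfg))
}
```

Putting the annotation on the pod template (rather than only on the StatefulSet object) is what makes the StatefulSet controller roll the pods when the value changes.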

Initial operator

Objects required for creating a cluster:

DB credentials

apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-db-credentials
type: Opaque
data:
  ROOT_PASSWORD: ~RANDOM
  REPLICATION_USER: ~replication
  REPLICATION_PASSWORD: ~RANDOM
  USER: ~DATABASE
  PASSWORD: ~RANDOM
  DATABASE: ~if exists and not empty, fill in USER and PASSWORD if not specified

Backup credentials (the init credentials use the same format)

apiVersion: v1
kind: Secret
metadata:
  name: backup-credentials
type: Opaque
data:
  AWS_ACCESS_KEY_ID:
  AWS_SECRET_ACCESS_KEY:
  S3_STORAGE_CLASS: (if not specified STANDARD_IA)
  S3_REGION: (if not specified us-east-1)
  S3_ENDPOINT_URL:
  GOOGLE_SERVICE_ACCOUNT_JSON_KEY:
  GOOGLE_PROJECT_ID:
  GCS_STORAGE_CLASS: (if not specified COLDLINE)

Mysql cluster CRD

apiVersion: titanium.presslabs.net/v1alpha1
kind: MysqlCluster
metadata:
  name: my-cluster
spec:
  readReplicas: 0
  secretName: ^^MysqlCluster.name-db-credentials
  mysqlVersion: 5.7.10

  initBucketURI: gs://..., s3://
  initBucketSecretName: ~

  mysqlConf:
    skip-character-set-client-handshake: true
    ...

  backupBucketURI: gs://name/prefix/a/b, s3://name/prefix/a/b
  backupBucketSecretName: ~
  backupSchedule: ""
  podSpec:
    labels:
      app: mysql
      titanium-cluster: ^^MysqlCluster.name
    annotations: {}
    resources: {}
    affinity: {}
    nodeSelector: {}
  volumeClaimTemplate:
    # https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims

This should create/update the following resources (in this order):

  1. DB Credentials secret
  2. A secret with utility user credentials
  3. Headless service
  4. StatefulSet

We should continuously sync the StatefulSet status with our CRD.

Introduce the concept of target SLOs

For example, the cluster spec should specify that slave lag is at most X seconds. We can use this to take a node out of rotation or to automatically acknowledge node recovery in orchestrator.

spec:
  targetSLO:
    maxSlaveLatency: 10s

Sane defaults

Allow setting the cluster readOnly

We should add a spec.readOnly which would set the entire cluster read-only (even the master node).

The controller should make sure that once the readOnly condition in status is set to true, there is no way the master becomes writable. It is OK to have a spec.readOnly=false cluster with all its pods read-only, but it is a bug to have spec.readOnly=true and status.conditions[readOnly]=true with writable pods.

For this we need to do the following:

  1. all cluster pods start read-only
  2. the controller is responsible for making the master pod writable
  3. make sure that orchestrator is started with ApplyMySQLPromotionAfterMasterFailover set to false
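The invariant described above is easy to state as a check that an e2e test or the controller itself could run. The types and function name below are hypothetical and exist only to make the rule concrete.

```go
package main

import "fmt"

// PodState is a hypothetical summary of one MySQL pod.
type PodState struct {
	Name     string
	Writable bool
}

// ReadOnlyViolations returns the names of pods that break the invariant:
// spec.readOnly=true and status.conditions[readOnly]=true must never
// coexist with a writable pod. Any other combination is allowed,
// including a spec.readOnly=false cluster whose pods are all read-only.
func ReadOnlyViolations(specReadOnly, statusReadOnly bool, pods []PodState) []string {
	if !specReadOnly || !statusReadOnly {
		return nil
	}
	var bad []string
	for _, p := range pods {
		if p.Writable {
			bad = append(bad, p.Name)
		}
	}
	return bad
}

func main() {
	pods := []PodState{
		{Name: "mysql-0", Writable: true},
		{Name: "mysql-1", Writable: false},
	}
	fmt.Println(ReadOnlyViolations(true, true, pods))  // invariant broken by mysql-0
	fmt.Println(ReadOnlyViolations(false, false, pods)) // writable master is fine here
}
```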
