crunchydata / postgres-operator-examples

Examples for deploying applications with PGO, the Postgres Operator from Crunchy Data

Home Page: https://access.crunchydata.com/documentation/postgres-operator/v5/

License: Apache License 2.0

Smarty 31.01% Handlebars 68.99%
operator kubernetes-operator postgres postgresql database high-availability kubernetes postgres-operator database-management disaster-recovery

postgres-operator-examples's Introduction

Examples for Using PGO, the Postgres Operator from Crunchy Data

This repository contains a collection of installers and examples for deploying, operating and maintaining Postgres clusters using PGO, the Postgres Operator from Crunchy Data as part of Crunchy Postgres for Kubernetes.

The use of these examples with PGO and other container images (aside from those provided by Crunchy Data) will require modifications of the examples.

Using these Examples

The examples are grouped by various tools that can be used to deploy them. Each of the examples has its own README that guides you through the process of deploying it. The best way to get started is to fork this repository and experiment with the examples. The examples as provided are designed for the use of PGO along with Crunchy Data's Postgres distribution, Crunchy Postgres, as Crunchy Postgres for Kubernetes. For more information on the use of container images downloaded from the Crunchy Data Developer Portal or other third party sources, please see 'License and Terms' below.
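As a hedged sketch of getting started (the exact overlay path depends on the revision you check out), the Kustomize route typically looks like:

git clone https://github.com/CrunchyData/postgres-operator-examples.git
cd postgres-operator-examples
# Deploy the operator with Kustomize; "kustomize/install/default" is an assumed
# path -- adjust it to whatever install overlay your checkout actually contains.
kubectl apply --server-side -k kustomize/install/default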

Help with the Examples

  • For general questions or community support, we welcome you to join our community Discord.
  • If you believe you have discovered a bug, please open an issue in the PGO project.
  • You can find the full Crunchy Postgres for Kubernetes documentation here.
  • You can find out more information about PGO, the Postgres Operator from Crunchy Data, at the project page.

FAQs, License and Terms

For more information regarding PGO, the Postgres Operator project from Crunchy Data, and Crunchy Postgres for Kubernetes, please see the frequently asked questions.

For information regarding the software versions of the components included and Kubernetes version compatibility, please see the components and compatibility section of the Crunchy Postgres for Kubernetes documentation.

The examples provided in this project repository are available subject to the Apache 2.0 license with the PGO logo and branding assets covered by our trademark guidelines.

The examples as provided in this repo are designed for the use of PGO along with Crunchy Data's Postgres distribution, Crunchy Postgres, as Crunchy Postgres for Kubernetes. The unmodified use of these examples will result in downloading container images from Crunchy Data repositories - specifically the Crunchy Data Developer Portal. The use of container images downloaded from the Crunchy Data Developer Portal are subject to the Crunchy Data Developer Program terms.

postgres-operator-examples's People

Contributors

akirill0v, alex1989hu, andrewlecuyer, benjaminjb, cbandy, cbrianpace, dajeffers, derlin, dsessler7, james-callahan, jfhenriques, jkatz, jmckulk, joedo, jyanez900, leonsteinhaeuser, pratikbin, rgherta, sathieu, scott-grimes, tjmoore4, tony-landreth, tsykoduk, valclarkson, yk


postgres-operator-examples's Issues

Metrics endpoint is not exposed by Service

I found during Postgres Operator startup that the metrics server is bound to port 8080, but unfortunately it is not exposed by a Kubernetes Service.

Actual log line:

time="2021-11-19T12:44:10Z" level=info msg="metrics server is starting to listen" addr=":8080" file="sigs.k8s.io/[email protected]/pkg/log/deleg.go:130" func="log.(*DelegatingLogger).Info" version=5.0.3-0

Prometheus metrics are available at the :8080/metrics URL.
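Until the installer ships a Service for it, a minimal sketch of one could look like the following; the selector is an assumption and must match the labels on your pgo Deployment's pods:

apiVersion: v1
kind: Service
metadata:
  name: pgo-metrics
  namespace: postgres-operator
spec:
  selector:
    postgres-operator.crunchydata.com/control-plane: postgres-operator  # assumption: match your operator pod labels
  ports:
  - name: metrics
    port: 8080
    targetPort: 8080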

Helm does not update CRDs

https://helm.sh/docs/topics/charts/

Limitations on CRDs
Unlike most objects in Kubernetes, CRDs are installed globally. For that reason, Helm takes a very cautious approach in managing CRDs. CRDs are subject to the following limitations:

CRDs are never reinstalled. If Helm determines that the CRDs in the crds/ directory are already present (regardless of version), Helm will not attempt to install or upgrade.
CRDs are never installed on upgrade or rollback. Helm will only create CRDs on installation operations.
CRDs are never deleted. Deleting a CRD automatically deletes all of the CRD's contents across all namespaces in the cluster. Consequently, Helm will not delete CRDs.
Operators who want to upgrade or delete CRDs are encouraged to do this manually and with great care.
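A common workaround, assuming the chart keeps its CRDs in a crds/ directory, is to apply them out-of-band before upgrading:

# Apply (or update) the CRDs manually, then let Helm upgrade everything else.
# The helm/install/crds/ path is an assumption based on this chart's layout.
kubectl apply --server-side -f helm/install/crds/
helm upgrade postgres-operator -n postgres-operator helm/install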

PGBouncer config

Can we currently customize the PgBouncer settings, like max_conn, etc.?
The config file is managed by the operator and I see nothing helpful in the CRDs.
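For reference, a hedged sketch of what the v5 spec exposes for this (field names follow the PostgresCluster CRD's proxy section; exact option support may vary by version):

spec:
  proxy:
    pgBouncer:
      config:
        global:
          max_client_conn: "200"    # example value; keys follow pgbouncer.ini option names
          default_pool_size: "10"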

configmap hippo-ssh-config is missing

I am not sure if this is a bug, but I just upgraded from 5.0.2 to 5.0.3 using the helm upgrade command.
However, the first pod is being rescheduled but fails because of some missing ConfigMaps.
I didn't change anything in my config so far.

  Type     Reason       Age               From               Message
  ----     ------       ----              ----               -------
  Normal   Scheduled    40s               default-scheduler  Successfully assigned postgresql/hippo-instance1-xqp6-0 to k8s-production-fr-standard-node-b4452d
  Warning  FailedMount  8s (x7 over 40s)  kubelet            MountVolume.SetUp failed for volume "ssh" : [configmap "hippo-ssh-config" not found, secret "hippo-ssh" not found]
  Warning  FailedMount  8s (x7 over 40s)  kubelet            MountVolume.SetUp failed for volume "pgbackrest-config" : configmap references non-existent config key: pgbackrest_instance.conf

Not sure if this is an undetected issue in the upgrade path.

Connect to application in default namespace

I followed the documentation to create a Postgres "hippo" cluster, which is in the "postgres-operator" namespace. Is there a way to connect the users created in the PostgresCluster file to applications in the "default" namespace, since the generated pguser secrets are only in the postgres-operator namespace? I have looked through the documentation provided at https://access.crunchydata.com/documentation/postgres-operator/5.0.5/ and have been unable to find much about namespaces.
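One hedged workaround is to copy the generated connection Secret into the application's namespace; the secret name below assumes the usual <cluster>-pguser-<user> convention, and the application must still use the full service DNS name (e.g. hippo-primary.postgres-operator.svc) as the host:

kubectl create secret generic hippo-pguser-hippo -n default \
  --from-literal=host="$(kubectl get secret hippo-pguser-hippo -n postgres-operator -o jsonpath='{.data.host}' | base64 -d)" \
  --from-literal=user="$(kubectl get secret hippo-pguser-hippo -n postgres-operator -o jsonpath='{.data.user}' | base64 -d)" \
  --from-literal=password="$(kubectl get secret hippo-pguser-hippo -n postgres-operator -o jsonpath='{.data.password}' | base64 -d)"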

Undefined Variable causes Helm Template Failure

Due to the undefined variable check introduced in #64, the Helm Chart is broken.

Let me copy what I commented to the PR:

helm template . --debug
install.go:178: [debug] Original chart version: ""
install.go:199: [debug] CHART PATH: /home/alex/other/postgres-operator-examples/helm/install


Error: template: pgo/templates/manager.yaml:23:23: executing "pgo/templates/manager.yaml" at <eq .Values.debug false>: error calling eq: incompatible types for comparison
helm.go:88: [debug] template: pgo/templates/manager.yaml:23:23: executing "pgo/templates/manager.yaml" at <eq .Values.debug false>: error calling eq: incompatible types for comparison
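A hedged sketch of a fix for the template, guarding the comparison so an undefined .Values.debug is never compared to a bool:

# templates/manager.yaml (sketch)
- name: CRUNCHY_DEBUG
  {{- if hasKey .Values "debug" }}
  value: {{ .Values.debug | quote }}
  {{- else }}
  value: "true"
  {{- end }}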

Upgrade issue : "error": "no matches for kind \"PGUpgrade\" in version \"postgres-operator.crunchydata.com/v1beta1\

I had upgraded helm:
from:

version: 0.2.0
appVersion: 5.0.3

to:

version: 0.3.0
appVersion: 5.1.0
helm upgrade postgres-operator -n postgres-operator helm/install

Release "postgres-operator" has been upgraded. Happy Helming!
NAME: postgres-operator
LAST DEPLOYED: Mon May 16 10:24:24 2022
NAMESPACE: postgres-operator
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Thank you for deploying PGO v5.1.0!

Although it seems that it was not upgraded successfully.

k logs pgo-upgrade-68b4797d7f-lzjv8 -n postgres-operator
I0516 08:30:36.185914       1 request.go:655] Throttling request took 1.03163628s, request: GET:https://10.43.0.1:443/apis/rbac.authorization.k8s.io/v1?timeout=32s
2022-05-16T08:30:36.194Z        INFO    controller-runtime.metrics      metrics server is starting to listen    {"addr": ":0"}
2022-05-16T08:30:36.195Z        INFO    setup   starting manager
2022-05-16T08:30:36.195Z        INFO    controller-runtime.manager      starting metrics server {"path": "/metrics"}
2022-05-16T08:30:36.195Z        INFO    controller-runtime.manager.controller.pgupgrade Starting EventSource    {"reconciler group": "postgres-operator.crunchydata.com", "reconciler kind": "PGUpgrade", "source": "kind source: /, Kind="}
2022-05-16T08:30:38.740Z        ERROR   controller-runtime.source       if kind is a CRD, it should be installed before calling Start   {"kind": "PGUpgrade.postgres-operator.crunchydata.com", "error": "no matches for kind \"PGUpgrade\" in version \"postgres-operator.crunchydata.com/v1beta1\""}
sigs.k8s.io/controller-runtime/pkg/log.(*DelegatingLogger).Error
        sigs.k8s.io/[email protected]/pkg/log/deleg.go:144
sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start
        sigs.k8s.io/[email protected]/pkg/source/source.go:117
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
        sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:167
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
        sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:223
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1
        sigs.k8s.io/[email protected]/pkg/manager/internal.go:681
2022-05-16T08:30:38.741Z        ERROR   setup   problem running manager {"error": "no matches for kind \"PGUpgrade\" in version \"postgres-operator.crunchydata.com/v1beta1\""}
sigs.k8s.io/controller-runtime/pkg/log.(*DelegatingLogger).Error
        sigs.k8s.io/[email protected]/pkg/log/deleg.go:144
main.main
        github.com/crunchydata/priv-all-pgo/cmd/priv-all-pgo/main.go:104
runtime.main
        runtime/proc.go:255
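Because Helm only installs CRDs on the initial install (see the CRD limitations issue above), the new PGUpgrade CRD introduced with chart 0.3.0 was probably never created. A quick check, plus a manual apply (the crds/ path is an assumption based on the chart layout):

kubectl get crd pgupgrades.postgres-operator.crunchydata.com
kubectl apply --server-side -f helm/install/crds/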

Keycloak example Crashloopbackoff

I have run both the keycloak example and .\kustomize\keycloak, and also followed the Postgres tutorial where you provide a custom script for installing Keycloak. In both cases, Keycloak ends up in CrashLoopBackOff, but there is no error. The logs in the container show the following:

Keycloak - Open Source Identity and Access Management
--
Find more information at: https://www.keycloak.org/docs/latest

Usage:

kc.sh [OPTIONS] [COMMAND]

Use this command-line tool to manage your Keycloak cluster.
Make sure the command is available on your "PATH" or prefix it with "./" (e.g.:
"./kc.sh") to execute from the current folder.

Options:

-cf, --config-file <file>
Set the path to a configuration file. By default, configuration properties are
read from the "keycloak.conf" file in the "conf" directory.
-h, --help This help message.
-v, --verbose Print out error details when running this command.
-V, --version Show version information

Commands:

build Creates a new and optimized server image.
start Start the server.
start-dev Start the server in development mode.
export Export data from realms to a file or directory.
import Import data from a directory or a file.
show-config Print out the current configuration.
tools Utilities for use and interaction with the server.
completion Generate bash/zsh completion script for kc.sh.

Examples:

Start the server in development mode for local development or testing:

$ kc.sh start-dev

Building an optimized server runtime:

$ kc.sh build <OPTIONS>

Start the server in production mode:

$ kc.sh start <OPTIONS>

Enable auto-completion to bash/zsh:

$ source <(kc.sh tools completion)

Please, take a look at the documentation for more details before deploying in
production.

Use "kc.sh start --help" for the available options when starting the server.
Use "kc.sh <command> --help" for more information about other commands.

If I describe the pod I get

Name:         keycloak-5d679bc848-z42zj
Namespace:    keycloak
Priority:     0
Node:         master3/192.168.1.22
Start Time:   Fri, 15 Apr 2022 15:57:15 +0100
Labels:       app.kubernetes.io/name=keycloak
              pod-template-hash=5d679bc848
Annotations:  <none>
Status:       Running
IP:           10.42.2.146
IPs:
  IP:           10.42.2.146
Controlled By:  ReplicaSet/keycloak-5d679bc848
Containers:
  keycloak:
    Container ID:   containerd://132e4c013f388476f6770cd7c1a86a9102d034b7305db43a7194c4760f83be74
    Image:          quay.io/keycloak/keycloak:latest
    Image ID:       quay.io/keycloak/keycloak@sha256:9e7e11f0c71e6959c94bb40610f60d1b27d8a71bcfecfe8c7c714837960a6d17
    Ports:          8080/TCP, 8443/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 15 Apr 2022 15:57:37 +0100
      Finished:     Fri, 15 Apr 2022 15:57:38 +0100
    Ready:          False
    Restart Count:  2
    Readiness:      http-get http://:8080/auth/realms/master delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      DB_VENDOR:                 postgres
      DB_ADDR:                   <set to the key 'POSTGRES_HOST' in secret 'keycloak-db-secret'>      Optional: false
      DB_PORT:                   <set to the key 'POSTGRES_PORT' in secret 'keycloak-db-secret'>      Optional: false
      DB_DATABASE:               <set to the key 'POSTGRES_DATABASE' in secret 'keycloak-db-secret'>  Optional: false
      DB_USER:                   <set to the key 'POSTGRES_USERNAME' in secret 'keycloak-db-secret'>  Optional: false
      DB_PASSWORD:               <set to the key 'POSTGRES_PASSWORD' in secret 'keycloak-db-secret'>  Optional: false
      KEYCLOAK_USER:             admin
      KEYCLOAK_PASSWORD:         admin
      PROXY_ADDRESS_FORWARDING:  true
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-b7tbb (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-b7tbb:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  38s                default-scheduler  Successfully assigned keycloak/keycloak-5d679bc848-z42zj to master3
  Normal   Pulled     37s                kubelet            Successfully pulled image "quay.io/keycloak/keycloak:latest" in 519.760275ms
  Normal   Pulled     36s                kubelet            Successfully pulled image "quay.io/keycloak/keycloak:latest" in 523.734937ms
  Normal   Pulling    16s (x3 over 38s)  kubelet            Pulling image "quay.io/keycloak/keycloak:latest"
  Normal   Created    16s (x3 over 37s)  kubelet            Created container keycloak
  Normal   Started    16s (x3 over 37s)  kubelet            Started container keycloak
  Normal   Pulled     16s                kubelet            Successfully pulled image "quay.io/keycloak/keycloak:latest" in 624.331764ms
  Warning  Unhealthy  15s                kubelet            Readiness probe failed: Get "http://10.42.2.146:8080/auth/realms/master": dial tcp 10.42.2.146:8080: connect: connection refused
  Warning  BackOff    13s (x5 over 35s)  kubelet            Back-off restarting failed container

I am able to login to the Postgres database and list databases, but it does not look like a keycloak database was created.
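For what it's worth, quay.io/keycloak/keycloak:latest is now the Quarkus distribution, which ignores the legacy DB_* variables and exits after printing the usage text above when no command is given. A hedged sketch of the container section for the newer image (the pinned tag and the variable wiring are assumptions to adapt to your setup):

containers:
- name: keycloak
  image: quay.io/keycloak/keycloak:17.0.0   # pin a version; "latest" is the Quarkus distribution
  args: ["start-dev"]                       # without a command the container only prints usage and exits
  env:
  - name: KC_DB
    value: postgres
  - name: KC_DB_URL_HOST
    valueFrom: { secretKeyRef: { name: keycloak-db-secret, key: POSTGRES_HOST } }
  - name: KC_DB_URL_DATABASE
    valueFrom: { secretKeyRef: { name: keycloak-db-secret, key: POSTGRES_DATABASE } }
  - name: KC_DB_USERNAME
    valueFrom: { secretKeyRef: { name: keycloak-db-secret, key: POSTGRES_USERNAME } }
  - name: KC_DB_PASSWORD
    valueFrom: { secretKeyRef: { name: keycloak-db-secret, key: POSTGRES_PASSWORD } }
  - name: KEYCLOAK_ADMIN
    value: admin
  - name: KEYCLOAK_ADMIN_PASSWORD
    value: admin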

unable to configure the monitoring for Crunchy

Please report any bugs or feature requests specific to the PGO Examples that are in this repository. This includes anything around the examples for Kustomize and Helm.

For any bugs or feature request related to PGO itself, please visit https://github.com/CrunchyData/postgres-operator

postgres-operator-examples-main]$ ~/kubectl apply -f kustomize/monitoring/crunchy_grafana_dashboards.yml --validate=false
error: unable to decode "kustomize/monitoring/crunchy_grafana_dashboards.yml": Object 'Kind' is missing in '{"apiVersion":1,"providers":[{"disableDeletion":false,"folder":"","name":"crunchy_dashboards","options":{"path":"$GF_PATHS_PROVISIONING/dashboards"},"orgId":1,"type":"file","updateIntervalSeconds":3}]}'
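crunchy_grafana_dashboards.yml is a Grafana provisioning file rather than a Kubernetes object, so it cannot be applied on its own; applying the whole monitoring kustomization, which wraps it in a ConfigMap, is the likely intent:

kubectl apply -k kustomize/monitoring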

Can this be tagged?

I'm planning to use this postgres-operator in my FluxCD-managed Kubernetes cluster, but there's no way to do it without tagging. I'm using Renovate to track new releases and then Flux deploys the changes into the cluster automatically. I have read the "Where are the release tags for PGO v5?" FAQ, but this repository is meant to be the entry point for us to install PGO, isn't it?
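As a hedged workaround until tags exist, Flux can pin this repository to a specific commit (the SHA below is a placeholder):

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: GitRepository
metadata:
  name: postgres-operator-examples
  namespace: flux-system
spec:
  interval: 10m
  url: https://github.com/CrunchyData/postgres-operator-examples
  ref:
    branch: main
    commit: <pinned-commit-sha>   # placeholder; without tags, the pin has to be bumped manually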

kustomize example for the spec.monitoring part is missing

I looked at the examples and put a few pieces together to set up my first postgresql cluster. When I tried to configure the monitoring part, I noticed that an example for the spec.monitoring part is missing.

Personally, I ended up checking the PostgresCluster CRD and defining it myself:

monitoring:
description: The specification of monitoring tools that connect to
PostgreSQL
properties:
pgmonitor:
description: PGMonitorSpec defines the desired state of the pgMonitor
tool suite
properties:
exporter:
properties:
configuration:
description: 'Projected volumes containing custom PostgreSQL
Exporter configuration. Currently supports the customization
of PostgreSQL Exporter queries. If a "queries.yaml"
file is detected in any volume projected using this
field, it will be loaded using the "extend.query-path"
flag: https://github.com/prometheus-community/postgres_exporter#flags
Changing the values of field causes PostgreSQL and the
exporter to restart.'
items:
description: Projection that may be projected along
with other supported volume types
properties:
configMap:
description: information about the configMap data
to project
properties:
items:
description: If unspecified, each key-value
pair in the Data field of the referenced ConfigMap
will be projected into the volume as a file
whose name is the key and content is the value.
If specified, the listed keys will be projected
into the specified paths, and unlisted keys
will not be present. If a key is specified
which is not present in the ConfigMap, the
volume setup will error unless it is marked
optional. Paths must be relative and may not
contain the '..' path or start with '..'.
items:
description: Maps a string key to a path within
a volume.
properties:
key:
description: The key to project.
type: string
mode:
description: 'Optional: mode bits used
to set permissions on this file. Must
be an octal value between 0000 and 0777
or a decimal value between 0 and 511.
YAML accepts both octal and decimal
values, JSON requires decimal values
for mode bits. If not specified, the
volume defaultMode will be used. This
might be in conflict with other options
that affect the file mode, like fsGroup,
and the result can be other mode bits
set.'
format: int32
type: integer
path:
description: The relative path of the
file to map the key to. May not be an
absolute path. May not contain the path
element '..'. May not start with the
string '..'.
type: string
required:
- key
- path
type: object
type: array
name:
description: 'Name of the referent. More info:
https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion,
kind, uid?'
type: string
optional:
description: Specify whether the ConfigMap or
its keys must be defined
type: boolean
type: object
downwardAPI:
description: information about the downwardAPI data
to project
properties:
items:
description: Items is a list of DownwardAPIVolume
file
items:
description: DownwardAPIVolumeFile represents
information to create the file containing
the pod field
properties:
fieldRef:
description: 'Required: Selects a field
of the pod: only annotations, labels,
name and namespace are supported.'
properties:
apiVersion:
description: Version of the schema
the FieldPath is written in terms
of, defaults to "v1".
type: string
fieldPath:
description: Path of the field to
select in the specified API version.
type: string
required:
- fieldPath
type: object
mode:
description: 'Optional: mode bits used
to set permissions on this file, must
be an octal value between 0000 and 0777
or a decimal value between 0 and 511.
YAML accepts both octal and decimal
values, JSON requires decimal values
for mode bits. If not specified, the
volume defaultMode will be used. This
might be in conflict with other options
that affect the file mode, like fsGroup,
and the result can be other mode bits
set.'
format: int32
type: integer
path:
description: 'Required: Path is the relative
path name of the file to be created.
Must not be absolute or contain the
''..'' path. Must be utf-8 encoded.
The first item of the relative path
must not start with ''..'''
type: string
resourceFieldRef:
description: 'Selects a resource of the
container: only resources limits and
requests (limits.cpu, limits.memory,
requests.cpu and requests.memory) are
currently supported.'
properties:
containerName:
description: 'Container name: required
for volumes, optional for env vars'
type: string
divisor:
anyOf:
- type: integer
- type: string
description: Specifies the output
format of the exposed resources,
defaults to "1"
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
resource:
description: 'Required: resource to
select'
type: string
required:
- resource
type: object
required:
- path
type: object
type: array
type: object
secret:
description: information about the secret data to
project
properties:
items:
description: If unspecified, each key-value
pair in the Data field of the referenced Secret
will be projected into the volume as a file
whose name is the key and content is the value.
If specified, the listed keys will be projected
into the specified paths, and unlisted keys
will not be present. If a key is specified
which is not present in the Secret, the volume
setup will error unless it is marked optional.
Paths must be relative and may not contain
the '..' path or start with '..'.
items:
description: Maps a string key to a path within
a volume.
properties:
key:
description: The key to project.
type: string
mode:
description: 'Optional: mode bits used
to set permissions on this file. Must
be an octal value between 0000 and 0777
or a decimal value between 0 and 511.
YAML accepts both octal and decimal
values, JSON requires decimal values
for mode bits. If not specified, the
volume defaultMode will be used. This
might be in conflict with other options
that affect the file mode, like fsGroup,
and the result can be other mode bits
set.'
format: int32
type: integer
path:
description: The relative path of the
file to map the key to. May not be an
absolute path. May not contain the path
element '..'. May not start with the
string '..'.
type: string
required:
- key
- path
type: object
type: array
name:
description: 'Name of the referent. More info:
https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion,
kind, uid?'
type: string
optional:
description: Specify whether the Secret or its
key must be defined
type: boolean
type: object
serviceAccountToken:
description: information about the serviceAccountToken
data to project
properties:
audience:
description: Audience is the intended audience
of the token. A recipient of a token must
identify itself with an identifier specified
in the audience of the token, and otherwise
should reject the token. The audience defaults
to the identifier of the apiserver.
type: string
expirationSeconds:
description: ExpirationSeconds is the requested
duration of validity of the service account
token. As the token approaches expiration,
the kubelet volume plugin will proactively
rotate the service account token. The kubelet
will start trying to rotate the token if the
token is older than 80 percent of its time
to live or if the token is older than 24 hours.Defaults
to 1 hour and must be at least 10 minutes.
format: int64
type: integer
path:
description: Path is the path relative to the
mount point of the file to project the token
into.
type: string
required:
- path
type: object
type: object
type: array
image:
description: The image name to use for crunchy-postgres-exporter
containers. The image may also be set using the RELATED_IMAGE_PGEXPORTER
environment variable.
type: string
resources:
description: 'Changing this value causes PostgreSQL and
the exporter to restart. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers'
properties:
limits:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: 'Limits describes the maximum amount
of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
type: object
requests:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: 'Requests describes the minimum amount
of compute resources required. If Requests is omitted
for a container, it defaults to Limits if that is
explicitly specified, otherwise to an implementation-defined
value. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
type: object
type: object
type: object
type: object
type: object

I think it would be great to provide such an example to the community.
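A minimal sketch distilled from the schema above (the exporter image tag is an assumption -- use the one matching your PGO release, or rely on the RELATED_IMAGE_PGEXPORTER default):

spec:
  monitoring:
    pgmonitor:
      exporter:
        image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.0.5-0  # assumed tag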

pgo v5.1.0 mTLS feature on IPv6

Hi, I have tried to deploy the crunchy-postgres database on a pure IPv6 Kubernetes platform, v1.21.7.
It worked on 5.0.5 with the SSH approach.
I did some troubleshooting and I think the reason is the pgbackrest-server configuration:

pgbackrest-server.conf:
----
# Generated by postgres-operator. DO NOT EDIT.
# Your changes will not be saved.
[global]
tls-server-address = 0.0.0.0
tls-server-auth = pgbackrest@b9b04d47-418c-4a58-be32-2b9cb8d54d67=*
tls-server-ca-file = /etc/pgbackrest/conf.d/~postgres-operator/tls-ca.crt
tls-server-cert-file = /etc/pgbackrest/server/server-tls.crt
tls-server-key-file = /etc/pgbackrest/server/server-tls.key

From https://pgbackrest.org/user-guide-rhel.html#repo-host/config I got "tls-server-address=*", so I believe it should listen on all available interfaces.
I am not able to change it, because it is always overwritten by postgres-operator; using global variables also does not work.

Can you please give any hint on how we can change it? Have you ever tried a deployment in an IPv6 environment?

Thanks,
Regards

Operator deployed via Helm fails with missing PDB permissions

After deploying the postgres-operator via the Helm chart in helm/install the operator pod reports:

time="2022-01-26T12:37:41Z" level=info msg="Starting EventSource" file="sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:165" func="controller.(*Controller).Start.func1" reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster source="kind source: /, Kind=" version=5.0.4
E0126 12:37:41.306676       1 reflector.go:138] k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.PodDisruptionBudget: failed to list *v1beta1.PodDisruptionBudget: poddisruptionbudgets.policy is forbidden: User "system:serviceaccount:dev:pgo" cannot list resource "poddisruptionbudgets" in API group "policy" in the namespace "dev"
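A hedged sketch of the rule that appears to be missing from the chart's Role/ClusterRole:

- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - watch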

cannot change permissions of pgdata/pg13 no such file postgres-startup nss-wrapper-init

Hello,

Deployed the operator with Rancher UI, added the postgres-operator-examples repo main and installed pgo and postgrescluster charts. In Rancher, in the cluster under Pods:

hippo-pgbouncer-686ff789f6-jwrxr - OK
hippo-repo-host-0 - OK
pgo-dd4776866-gl9pf - OK

However... hippo-instance1-nzpg-0 has an error: Containers with incomplete status: [postgres-startup nss-wrapper-init]
and the log is:
Initializing ...
::postgres-operator: uid::26
::postgres-operator: gid::26
::postgres-operator: postgres path::/usr/pgsql-13/bin/postgres
::postgres-operator: postgres version::postgres (PostgreSQL) 13.4
::postgres-operator: config directory::/pgdata/pg13
::postgres-operator: data directory::/pgdata/pg13
install: cannot change permissions of '/pgdata/pg13': No such file or directory

registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-13.4-1

Thanks

pgbouncer - adding user does not work

I'm trying to add another user to the pgbouncer config. However, when I edit the -pgbouncer secret, it immediately overwrites my changes to the pgbouncer-users.txt key. How am I supposed to either update the pgbouncer password or add another user with access to pgbouncer?

Role label collides with metrics

postgres-operator.crunchydata.com/role label is rewritten to just role:

- source_labels: [__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_role]
  target_label: role
  replacement: '$1'

then it collides with some metrics, eg:

# HELP ccp_pg_stat_statements_total_calls_count Total number of queries run per user/database
# TYPE ccp_pg_stat_statements_total_calls_count gauge
ccp_pg_stat_statements_total_calls_count{dbname="hippo-ha",role="_crunchypgbouncer",server="localhost:5432"} 10
ccp_pg_stat_statements_total_calls_count{dbname="hippo-ha",role="hippo-ha",server="localhost:5432"} 45
ccp_pg_stat_statements_total_calls_count{dbname="hippo-ha",role="postgres",server="localhost:5432"} 2

It ends up overwriting the role label coming from the metric with the role taken from the pod labels.
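One hedged fix is to relabel into a name that does not collide with the exporter's own role label; "pod_role" below is an arbitrary choice:

- source_labels: [__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_role]
  target_label: pod_role
  replacement: '$1'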

FR ability to set storage class and affinities

Hi,

It would be great to have the possibility to set the storage class in the Helm chart values, as well as node affinities/tolerations.

I see the CRD supports affinities/tolerations. Maybe I can create an MR with these changes for the Helm chart?
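For reference, the PostgresCluster spec already accepts these fields directly; a hedged sketch of what the chart values would need to pass through (class names and values are examples only):

spec:
  instances:
  - name: instance1
    dataVolumeClaimSpec:
      storageClassName: fast-ssd            # example storage class
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
    tolerations:
    - key: dedicated
      operator: Equal
      value: postgres
      effect: NoSchedule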

Invalid XML error message

Hi there!

I was going through your tutorial, which is very well done by the way, and ran into some issues when trying out the S3 backup (non-AWS). The connection works and data arrives in the bucket, but I have errors in the operator logs and when using kubectl describe postgrescluster <NAME>.

From kubectl describe:

Warning  UnableToCreateStanzas  2m50s (x54 over 12m)  postgrescluster-controller  command terminated with exit code 29: ERROR: [029]: invalid xml

The operator logs say:

time="2021-09-29T14:38:52Z" level=debug msg=Warning file="sigs.k8s.io/[email protected]/pkg/internal/recorder/recorder.go:98" func="recorder.(*Provider).getBroadcaster.func1.1" message="command terminated with exit code 29: ERROR: [029]: invalid xml\n" object="{PostgresCluster postgres-operator hippo-s3 cbe2bd5a-e4f2-4240-b4f0-7314d1aac1bb postgres-operator.crunchydata.com/v1beta1 82678 }" reason=UnableToCreateStanzas version=5.0.2-0

I realized later that version 5.0.2, which is used by default, is not yet publicly released. But this might help us both in the long term, anyway.

If you need more info I'll be happy to help.

Postgres chart is not installable by default

When installing the postgres chart with no added values, it fails:

helm install example postgres
Error: INSTALLATION FAILED: template: postgrescluster/templates/postgres.yaml:193:14: executing "postgrescluster/templates/postgres.yaml" at <eq .Values.openshift false>: error calling eq
: incompatible types for comparison

Manually setting it with --set openshift=false fixes it and it installs fine. However, it should probably just work without it. Uncommenting the openshift value in helm/postgres/values.yaml should fix it.

My Helm version:

helm version
version.BuildInfo{Version:"v3.7.1", GitCommit:"1d11fcb5d3f3bf00dbe6fe31b8412839a96b3dc4", GitTreeState:"clean", GoVersion:"go1.16.9"}
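A hedged sketch of the values-side fix; an explicit boolean avoids comparing nil to a bool in postgres.yaml:

# helm/postgres/values.yaml (sketch)
openshift: false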

How to find a list of available docker images for pgbouncer/posgres and so on?

I found https://hub.docker.com/r/crunchydata/crunchy-postgres but this only offers 13.5 images.

Since I will need to update the images from time to time, how do I find the newest available images for

  • registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-12*
  • registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer

Also, it sounds wrong to use registry.developers.crunchydata.com for production use - are those the actual production images?

I found docs like https://access.crunchydata.com/documentation/postgres-operator/5.0.4/tutorial/update-cluster/ but still no hint on the actual image index for the available tags.

Thanks!

standby cluster with s3 bucket v5.0.5

Hi
I am trying to create a standby cluster using an S3 bucket.
It only works when I first do a backup on the active DB cluster with the S3 repo.
According to the documentation it should take archive logs from S3 and run as a standby cluster.

It seems that pgbackrest first tries to do a restore.
I am using the Helm chart and here is the values.yaml:

 name: db-cluster
 instances:
   - name: database
     replicas: 2
     dataVolumeClaimSpec:
       accessModes:
       - "ReadWriteOnce"
       resources:
         requests:
           storage: 2Gi
       storageClassName: "rook-cephfs"
     walVolumeClaimSpec:
       accessModes:
       - "ReadWriteOnce"
       resources:
         requests:
           storage: 1Gi
       storageClassName: "rook-cephfs"
     affinity:
       podAntiAffinity:
         preferredDuringSchedulingIgnoredDuringExecution:
         - weight: 1
           podAffinityTerm:
             topologyKey: kubernetes.io/hostname
             labelSelector:
               matchLabels:
                 postgres-operator.crunchydata.com/cluster: db-cluster
                 postgres-operator.crunchydata.com/instance-set: database

 patroni:
   dynamicConfiguration:
     synchronous_mode: true
     postgresql:
       parameters:
         synchronous_commit: "on"
       pg_hba:
         - "local all postgres peer"
         - "local all postgres md5"
         - "local all crunchyadm peer"
         - "host all postgres 0.0.0.0/0 md5"
         - "host replication primaryuser 0.0.0.0/0 md5"
 standby:
   enabled: true
   repoName: repo2
 pgBackRestConfig:
   repos:
   - name: repo1
     volume:
       volumeClaimSpec:
         accessModes:
         - "ReadWriteOnce"
         resources:
           requests:
             storage: 1Gi
     schedules:
       full: "0 1 * * *"
       incremental: "0 */4 * * *"
   - name: repo2
     s3:
       bucket: "S3BUCKET_NAME"
       endpoint: "S3BUCKET_ENDPOINT"
       region: "S3BUCKET_REGION"
     schedules:
       full: "0 1 * * *"
       incremental: "0 */4 * * *"
   manual:
     repoName: repo1
     options:
      - --type=full
   global:
     repo1-retention-full: "14"
     repo1-retention-full-type: time
     repo2-path: "/pgbackrest/db-cluster/repo2"
     repo2-type: "s3"
     repo2-s3-key: "S3_KEY"
     repo2-s3-key-secret: "S3_SECRET"
     repo2-s3-verify-tls: "n"
     repo2-storage-port: "S3_PORT"
     repo2-s3-uri-style: "path"
     repo2-retention-full: "14"
     repo2-retention-full-type: time
     log-level-console: "debug"
     log-level-file: "debug"
     start-fast: "y"

This is what I get from logs:

database WARN: repo2: [BackupSetInvalidError] no backup sets to restore
database ERROR: [075]: no backup set found to restore
database 2022-04-13 12:37:27,167 ERROR: Error creating replica using method pgbackrest: 'bash' '-ceu' '--' 'install --directory --mode=0700 "${PGDATA?}" && exec "$@"' '-' 'pgbackrest' 'restore' '--
database 2022-04-13 12:37:27,167 ERROR: failed to bootstrap clone from remote master None

Thanks in advance,
Regards,
Grzegorz

Latest image pull fails crunchy-postgres-ha:centos8-13.4-0

Latest image pull fails crunchy-postgres-ha:centos8-13.4-0
Error:

Failed to pull image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.4-0": rpc error: code = Unknown desc = Error response from daemon: unknown: This container version is no longer available from the Crunchy Data Developer Program. For information on accessing these containers, please contact [email protected].

Missing Helm value to disable debug logging

As of now there is no option to disable the debug logging level within the Helm Chart; it defaults to true.

I would be happy if we could disable it via an option in the Helm Chart values file.

- name: CRUNCHY_DEBUG
  value: "true"

The environment variable value is being read by the Operator here:

https://github.com/CrunchyData/postgres-operator/blob/7bb5d719dc5035a4bda2d8261503dadea83d77d7/cmd/postgres-operator/main.go#L44-L51

What do you think about it? Which option should enable/disable debug logging?
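A hedged sketch of what the wiring could look like (the value name "debug" is a suggestion):

# values.yaml
debug: false

# templates/manager.yaml
- name: CRUNCHY_DEBUG
  value: {{ .Values.debug | quote }}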

Postgresql Custom image

Hello
I want to use a custom PostgreSQL image, for instance "supabase/postgres".
For this I rebuilt the image and prepared the postgres user id (26) in the Dockerfile, like your image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-13.4-1".

Also, after starting the image I get the error: "Error: container has runAsNonRoot and image has non-numeric user (postgres), cannot verify user is non-root"

What do I need to do to start it?
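A hedged sketch of the image-side fix: the runAsNonRoot check needs a numeric UID in the image metadata, so end the custom Dockerfile with a numeric USER rather than a named one:

# Dockerfile (sketch)
FROM supabase/postgres
# ... your customizations ...
USER 26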

Fix helm image.repository to include image name

Fix helm image.repository to include image name to be consistent with relatedImages.*.repository which already includes image name. This change also allows the postgres-operator image name to be overridden which is a secondary benefit.

Drop Container Capabilities

As far as I see there is no reason to keep all the capabilities and we can drop all of them in Container Security Context:

  capabilities:
    drop:
    - ALL

What do you think about it? Please tell me if there is a specific need; otherwise I am happy to create a PR as a follow-up of #55.

pgbackrest + s3 minio Name or service not known

Version: 5.0.4-0

Hi,

When I try to use a private minio the format of the address is incorrect.

I have the following configuration:

backups:
  pgbackrest:
    image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-0
    configuration:
    - secret:
        name: pgo-s3-creds
    global:
      repo1-path: /pgbackrest/postgres-operator/dbname/repo1
      repo1-retention-full: "14"
      repo1-retention-full-type: time
    repos:
    - name: repo1
      s3:
        endpoint: "minio.example.com"
        bucket: "postgres"
        region: "whatever"
      schedules:
        full: "0 10 * * *"
        incremental: "0 * * * *"

But in the log of PGO :
level=error msg="unable to create stanza" error="command terminated with exit code 49: ERROR: [049]: unable to get address for 'postgres.minio.example.com': [-2] Name or service not known\n" file="internal/controller/postgrescluster/pgbackrest.go:2320" func="postgrescluster.(*Reconciler).reconcileStanzaCreate" name=dbname namespace=postgres-operator reconciler=pgBackRest reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.0.4-0

For what reason is the format not correct?

Thank you very much
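A hedged guess: the repo is using virtual-host-style addressing (bucket.endpoint), which private MinIO endpoints usually cannot resolve; pgBackRest's path-style option avoids that lookup:

global:
  repo1-path: /pgbackrest/postgres-operator/dbname/repo1
  repo1-s3-uri-style: path   # use path-style requests instead of resolving postgres.minio.example.com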

s3 backup job is not triggering

I am trying to take a backup of a Postgres database using the Crunchy Data postgres-operator version 5.0.1, to a bare-metal S3 MinIO bucket. When I run pgBackRest using persistent volume claims, the cluster runs a job to take a backup after the cluster is up and running, but when I try to take a backup using the S3 bucket, the job does not get triggered. In my kustomize folder I have the following:

# postgres.yml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-1
  postgresVersion: 13
  instances:
    - name: instance1
      walVolumeClaimSpec:
        accessModes:
        - "ReadWriteMany"
        resources:
          requests:
            storage: 1Gi
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: managed-nfs-storage
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.33-1
      repoHost:
        dedicated: {}        
      configuration:
      - secret:
          name: pgo-s3-creds
      global:
        repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
      repos:
      - name: repo1
        s3:
          bucket: "pgbackup"
          endpoint: "192.168.10.222:9000"
          region: "None"
      manual:
        repoName: repo1
        options:
         - --type= full
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          max_parallel_workers: 2
          max_worker_processes: 2
          shared_buffers: 1GB
          work_mem: 2MB
        log:
          dir: /pgdata/pglog/
        pg_hba:
        - host all all 0.0.0.0/0 md5
        - host replication rep_user 0.0.0.0/0 md5
        - host replication rep_user 127.0.0.1/32 md5

in s3.conf:

[global]
#repo1-s3-key=SSKCGIL8403FN3QX9TQG
#repo1-s3-key-secret=9hY5cm8aTg27LIJ3m8JOJME8SNUYLLC3GRMAyA2A
repo1-s3-key=minio
repo1-s3-key-secret=miniostorage

and in kustomize.yaml:

namespace: postgres-operator

secretGenerator:
- name: pgo-s3-creds
  files:
  - s3.conf
generatorOptions:
  disableNameSuffixHash: true
resources:
- postgres.yaml

and # kubectl -n postgres-operator describe pod hippo-repo-host-0 returns this:

Name:         hippo-repo-host-0
Namespace:    postgres-operator
Priority:     0
Node:         dfsworker1/192.168.10.113
Start Time:   Sun, 29 Aug 2021 09:23:40 +0000
Labels:       controller-revision-hash=hippo-repo-host-66f7bcd589
              postgres-operator.crunchydata.com/cluster=hippo
              postgres-operator.crunchydata.com/pgbackrest=
              postgres-operator.crunchydata.com/pgbackrest-dedicated=
              postgres-operator.crunchydata.com/pgbackrest-host=
              statefulset.kubernetes.io/pod-name=hippo-repo-host-0
Annotations:  cni.projectcalico.org/containerID: a74bf3e4affdbf527bc2c568a0a9534074d7c82fc77ca1f13c7b940ff717ba40
              cni.projectcalico.org/podIP: 192.168.78.134/32
              cni.projectcalico.org/podIPs: 192.168.78.134/32
Status:       Running
IP:           192.168.78.134
IPs:
  IP:           192.168.78.134
Controlled By:  StatefulSet/hippo-repo-host
Init Containers:
  nss-wrapper-init:
    Container ID:  docker://433e906448b50b3e709caa58aa530854baccbb9b55839b2cdb80e9db515e255d
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.33-1
    Image ID:      docker-pullable://registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest@sha256:c5ae9cd667a1b814e2154d2135b9af614f96ce242b566eb858ad7177b94eecc8
    Port:          <none>
    Host Port:     <none>
    Command:
      bash
      -c
      NSS_WRAPPER_SUBDIR=postgres CRUNCHY_NSS_USERNAME=postgres CRUNCHY_NSS_USER_DESC="postgres" /opt/crunchy/bin/nss_wrapper.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sun, 29 Aug 2021 09:23:43 +0000
      Finished:     Sun, 29 Aug 2021 09:23:43 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cxjt5 (ro)
Containers:
  pgbackrest:
    Container ID:  docker://42c780965284a65acb8733a92a4f22cdb70df76e4c9a000bee405896dc06f07e
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.33-1
    Image ID:      docker-pullable://registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest@sha256:c5ae9cd667a1b814e2154d2135b9af614f96ce242b566eb858ad7177b94eecc8
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/sbin/sshd
      -D
      -e
    State:          Running
      Started:      Sun, 29 Aug 2021 09:23:45 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       tcp-socket :2022 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      LD_PRELOAD:          /usr/lib64/libnss_wrapper.so
      NSS_WRAPPER_PASSWD:  /tmp/nss_wrapper/postgres/passwd
      NSS_WRAPPER_GROUP:   /tmp/nss_wrapper/postgres/group
    Mounts:
      /etc/pgbackrest/conf.d from pgbackrest-config (rw)
      /etc/ssh from ssh (ro)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cxjt5 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  ssh:
    Type:                Projected (a volume that contains injected data from multiple sources)
    ConfigMapName:       hippo-ssh-config
    ConfigMapOptional:   <nil>
    SecretName:          hippo-ssh
    SecretOptionalName:  <nil>
  pgbackrest-config:
    Type:                Projected (a volume that contains injected data from multiple sources)
    SecretName:          pgo-s3-creds
    SecretOptionalName:  <nil>
    ConfigMapName:       hippo-pgbackrest-config
    ConfigMapOptional:   <nil>
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  16Mi
  kube-api-access-cxjt5:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age   From               Message
  ----     ------       ----  ----               -------
  Normal   Scheduled    65s   default-scheduler  Successfully assigned postgres-operator/hippo-repo-host-0 to dfsworker1
  Warning  FailedMount  65s   kubelet            MountVolume.SetUp failed for volume "ssh" : secret "hippo-ssh" not found
  Normal   Pulled       63s   kubelet            Container image "registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.33-1" already present on machine
  Normal   Created      62s   kubelet            Created container nss-wrapper-init
  Normal   Started      62s   kubelet            Started container nss-wrapper-init
  Normal   Pulled       61s   kubelet            Container image "registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.33-1" already present on machine
  Normal   Created      60s   kubelet            Created container pgbackrest
  Normal   Started      60s   kubelet            Started container pgbackrest

Additionally,

# kubectl -n postgres-operator logs hippo-repo-host-0
Server listening on 0.0.0.0 port 2022.
Server listening on :: port 2022.
kex_exchange_identification: Connection closed by remote host
kex_exchange_identification: Connection closed by remote host
Accepted publickey for postgres from 192.168.122.18 port 39112 ssh2: ECDSA SHA256:f7En+B4/XFxR3Eu0ZqtPNIahYaVtxIn/LkmclRJqpho
kex_exchange_identification: Connection closed by remote host
kex_exchange_identification: Connection closed by remote host
Accepted publickey for postgres from 192.168.122.19 port 53796 ssh2: ECDSA SHA256:f7En+B4/XFxR3Eu0ZqtPNIahYaVtxIn/LkmclRJqpho
kex_exchange_identification: Connection closed by remote host

# kubectl -n postgres-operator get pods returns:

NAME                                      READY   STATUS    RESTARTS   AGE
hippo-instance1-ggst-0                    3/3     Running   0          20m
hippo-instance1-j67m-0                    3/3     Running   0          20m
hippo-repo-host-0                         1/1     Running   0          20m
nfs-client-provisioner-7d485f5b8d-l88ts   1/1     Running   0          3h9m
pgo-54ff6dc755-ct4bl                      1/1     Running   0          3h7m

Is there anyone who can tell me if I am misconfiguring anything regarding the MinIO configuration? My Kubernetes version is 1.22.
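One thing to check, sketched below: with spec.backups.pgbackrest.manual configured, the manual backup Job is only created after the cluster is annotated (this is the documented trigger for manual backups); the namespace and cluster name follow the manifest above:

kubectl -n postgres-operator annotate postgrescluster hippo \
  postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"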

Error on example with 2-Replica Cluster - Bootstrap from leader - Could not translate hostname

Hi,

We are evaluating this operator. I'm trying the PGO examples to understand how your HA operator works, and I added a configuration with 2 replicas.

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-13.4-1
  postgresVersion: 13
  instances:
    - name: instance1
      replicas: 2
      dataVolumeClaimSpec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.35-0
      repos:
        - name: repo1
          volume:
            volumeClaimSpec:
              accessModes:
                - "ReadWriteOnce"
              resources:
                requests:
                  storage: 1Gi

.. and followed the indications to verify self-healing as shown in Tutorial->High Availability.

By running:

kubectl -n postgres-operator get pods \
  --selector=postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/instance-set

I get this:

NAME                     READY   STATUS    RESTARTS   AGE
hippo-instance1-ksbq-0   2/3     Running   0          97m
hippo-instance1-tdxx-0   3/3     Running   0          96m

JIC, the PRIMARY POD is hippo-instance1-tdxx

My assumption is (maybe I'm completely wrong), that with replicas=2 we have an HA setup with 1 Primary / 1 Slave

Thus:
hippo-instance1-tdxx - Primary
hippo-instance1-ksbq - Slave

Before continuing with the 2 tests mentioned in the tutorial, I tried to understand the secondary pod's status. To do that, I tried opening a terminal in each pod; in the primary pod I can enter and launch psql:

bash-4.4$ psql
psql (13.4)
Type "help" for help.

postgres=# SELECT NOT pg_catalog.pg_is_in_recovery() is_primary;
 is_primary 
------------
 t
(1 row)

And once I do the same in the secondary pod, I get this:

bash-4.4$ psql
psql: error: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/postgres/.s.PGSQL.5432"?
bash-4.4$ 

Thus it seems, perhaps, that we have some issue in the secondary pod... by analyzing the pods' logs I found something strange in the secondary pod:

Primary POD hippo-instance1-tdxx:

...
2021-10-27 15:36:10,787 INFO: no action. I am (hippo-instance1-tdxx-0) the leader with the lock
2021-10-27 15:36:21,017 INFO: no action. I am (hippo-instance1-tdxx-0) the leader with the lock
2021-10-27 15:36:30,785 INFO: no action. I am (hippo-instance1-tdxx-0) the leader with the lock
2021-10-27 15:36:40,787 INFO: no action. I am (hippo-instance1-tdxx-0) the leader with the lock
...

Secondary POD hippo-instance1-ksbq:

...
2021-10-27 15:38:20,836 ERROR: Error when fetching backup: pg_basebackup exited with code=1
2021-10-27 15:38:20,836 WARNING: Trying again in 5 seconds
2021-10-27 15:38:30,780 INFO: Lock owner: hippo-instance1-tdxx-0; I am hippo-instance1-ksbq-0
2021-10-27 15:38:30,780 INFO: bootstrap from leader 'hippo-instance1-tdxx-0' in progress
2021-10-27 15:38:40,813 INFO: Lock owner: hippo-instance1-tdxx-0; I am hippo-instance1-ksbq-0
2021-10-27 15:38:40,813 INFO: bootstrap from leader 'hippo-instance1-tdxx-0' in progress
pg_basebackup: error: could not translate host name "hippo-instance1-tdxx-0.hippo-pods" to address: Name or service not known
2021-10-27 15:38:45,857 ERROR: Error when fetching backup: pg_basebackup exited with code=1
2021-10-27 15:38:45,857 ERROR: failed to bootstrap from leader 'hippo-instance1-tdxx-0'
2021-10-27 15:38:45,857 INFO: Removing data directory: /pgdata/pg13
2021-10-27 15:38:50,781 INFO: Lock owner: hippo-instance1-tdxx-0; I am hippo-instance1-ksbq-0
2021-10-27 15:38:51,024 INFO: trying to bootstrap from leader 'hippo-instance1-tdxx-0'
2021-10-27 15:39:00,779 INFO: Lock owner: hippo-instance1-tdxx-0; I am hippo-instance1-ksbq-0
2021-10-27 15:39:00,852 INFO: bootstrap from leader 'hippo-instance1-tdxx-0' in progress
2021-10-27 15:39:10,810 INFO: Lock owner: hippo-instance1-tdxx-0; I am hippo-instance1-ksbq-0
2021-10-27 15:39:10,811 INFO: bootstrap from leader 'hippo-instance1-tdxx-0' in progress
pg_basebackup: error: could not translate host name "hippo-instance1-tdxx-0.hippo-pods" to address: Name or service not known
2021-10-27 15:39:11,054 ERROR: Error when fetching backup: pg_basebackup exited with code=1
2021-10-27 15:39:11,054 WARNING: Trying again in 5 seconds
...

Thus it seems the error is related to pg_basebackup (apparently a tool to take base backups of a running database cluster), as it cannot reach the primary host:

pg_basebackup: error: could not translate host name "hippo-instance1-tdxx-0.hippo-pods" to address: Name or service not known

In my case I'm using OKE on Oracle Cloud Infrastructure (OCI), which is CNCF-certified. I also tried this with MicroK8s and see similar behavior (the replica pod shows the same error in its log).

Thus, I'd like to ask:

  1. Are my assumptions correct (with replicas=2 I have 1 Primary/Master and 1 Secondary/Slave, and both should be available)?
  2. Under ideal conditions, should the secondary pod hippo-instance1-ksbq-0 report 3/3 in the Ready column?
  3. Under ideal conditions, should both the primary and the secondary pods respond to a psql query?
  4. Based on the logs, I assume something is wrong with the networking. Are there any additional steps I might be missing, or any advice for troubleshooting this issue (I've noted the quick checks I plan to run after this list)? FYI, I just followed the steps described in the tutorial.
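
For what it's worth, these are the quick DNS checks I plan to run next. This is only a minimal sketch: the postgres-operator namespace, the hippo-pods headless Service name, and the database container name are taken from the tutorial and the error output above, and I'm assuming getent is available in the image.

# Does the headless Service backing the per-pod DNS names exist?
kubectl -n postgres-operator get svc hippo-pods

# Can the replica resolve the primary's DNS name at all?
kubectl -n postgres-operator exec -it hippo-instance1-ksbq-0 -c database -- \
  getent hosts hippo-instance1-tdxx-0.hippo-pods

# Is cluster DNS itself healthy?
kubectl -n kube-system get pods -l k8s-app=kube-dns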

Thanks in advance for your help.

Best

Javier

Allow Helm CRD management to be disabled

It is common for mainstream CRD-based workloads to allow Helm CRD management to be externalized from the main chart. One approach, used by Rancher, the Elastic Cloud on Kubernetes Operator, and others, moves the CRDs into their own chart, for example the eck-operator and eck-operator-crds charts.

Another, lighter-weight approach, adopted by jetstack/cert-manager and kyverno/kyverno to name a few, uses an "installCRDs" boolean property to allow Helm CRD management to be disabled. This approach at least provides the option to opt out of CRD management.
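
A rough sketch of that pattern follows; the value name and template path are only illustrative, since this chart does not have them today:

# values.yaml
installCRDs: true

# templates/crds.yaml
{{- if .Values.installCRDs }}
# render the CRD manifests (currently kept under /crds) here
{{- end }}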

The current approach used by this chart, which relies on the /crds directory to install CRDs but never update them, is problematic, especially for GitOps scenarios. Consider adopting one of the strategies described above.

Add an example for the multi-cluster deployment

Hi,

I am currently struggling to get my existing data into my brand-new, shiny, operator-managed cluster. After finding this answer, I would like to try it for myself. Unfortunately, the doc only uses the pgo client. Since I manage all my manifests through kustomize, a minimal example that I could use as a starting point would be very helpful.
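
I'm not sure it matches the approach from that answer, but for reference this is the kind of minimal spec fragment I was hoping to start from, based on the dataSource clone fields in the v5 CRD reference (the cluster and repo names here are placeholders of mine):

spec:
  dataSource:
    postgresCluster:
      clusterName: hippo-old   # existing cluster to copy the data from
      repoName: repo1          # pgBackRest repo of that cluster to restore from
  # ...rest of the PostgresCluster spec (instances, backups, etc.)...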

postgres replica crashes after a successful failover

I have the Crunchy Postgres Operator running on a Kubernetes cluster with 3 worker nodes deployed using kubespray (bare metal), and I have set up one replica to take over when the primary is down.
The replica was running and in sync with the Postgres primary with no lag. For test purposes, I stopped the node that the primary Postgres instance was running on; failover to the replica happened and Postgres became available again after a moment.

When I start the stopped node back up, the Postgres instance on it ends up crashed and the lag details become unknown:

Every 2.0s: patronictl list                                                                                                                                                        
+---------------------------+-----------------------------------------+---------+---------+----+-----------+
| Member                    | Host                                    | Role    | State   | TL | Lag in MB |
+ Cluster: pg-metal-ha (7075323376834977860) -------------------------+---------+---------+----+-----------+
| pg-metal-instance1-hfdp-0 | pg-metal-instance1-hfdp-0.pg-metal-pods | Replica | running |    |   unknown |
| pg-metal-instance1-zdc6-0 | pg-metal-instance1-zdc6-0.pg-metal-pods | Leader  | running |  2 |           |
+---------------------------+-----------------------------------------+---------+---------+----+-----------+

The log of the crashed instance pod shows:

psycopg2.OperationalError: FATAL:  index "pg_database_oid_index" contains unexpected zero page at block 0
HINT:  Please REINDEX it.

The hint didn't work; I can't reindex "pg_database_oid_index" using psql, and this is the output of the psql command:

bash-4.4$ psql
psql: error: FATAL:  index "pg_database_oid_index" contains unexpected zero page at block 0
HINT:  Please REINDEX it.

I redid the failover test many times with newly created Postgres clusters and got the same result each time. Is this a bug in crunchy-postgres-operator?
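
For what it's worth, the recovery path I'm considering in the meantime, assuming the data on the crashed member can simply be discarded, is to reinitialize it from the leader with patronictl from inside one of the database containers:

# re-copy the data directory of the broken replica from the current leader
patronictl reinit pg-metal-ha pg-metal-instance1-hfdp-0

But I'd still like to understand whether the corruption itself is a bug.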

k8s version:

# kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:10:45Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:04:16Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}

postgres.yaml:

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: pg-metal
  namespace: prj-metal

spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:centos8-13.6-3.0-0
  postgresVersion: 13
  users:
    - name: pg
      options: "SUPERUSER"
  instances:
    - name: instance1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: "ins-ls"
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 75Gi
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                postgres-operator.crunchydata.com/cluster: pg-metal
                postgres-operator.crunchydata.com/instance-set: instance1

Documentation for s3 pgbackrest with Helm

Hello,

I have two issues:

  1. Can I get an example of setting up a pgBackRest repo with S3 buckets using Helm instead of kustomize? Helm has worked better for me in general. (I've sketched the underlying fields after this list.)

  2. I could not get the kustomize s3 example to work. The secret does not get created.
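
For context, this is roughly the shape I expect the underlying configuration to take, based on the pgBackRest and PGO docs; the bucket, endpoint, region, and secret names below are placeholders, and I assume the Helm values would need to render something equivalent.

# s3.conf - packaged into a Secret and referenced from
# spec.backups.pgbackrest.configuration
[global]
repo1-s3-key=<YOUR_ACCESS_KEY>
repo1-s3-key-secret=<YOUR_SECRET_KEY>

# excerpt of the PostgresCluster spec
backups:
  pgbackrest:
    configuration:
    - secret:
        name: pgo-s3-creds
    repos:
    - name: repo1
      s3:
        bucket: my-bucket
        endpoint: s3.us-east-1.amazonaws.com
        region: us-east-1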

postgres-startup init container is broken

Postgres startup container is broken.

install: cannot change permissions of ‘/pgdata/pg13’: No such file or directory

I know this probably belongs in the postgres-operator repo itself.

After applying the sample cluster, the pod failed to start because of the postgres-startup initContainer.

GCS configuration in postgres helm-chart is broken (base64 encoding)

Hi,

If I try to create a cluster with the postgres Helm chart, add my GCS JSON key, and spin up the cluster, it never finishes creating, because PGO does not decode the base64-encoded secret.
I worked around the issue by removing the base64 encoding of the GCS JSON fields in the secret template, helm/postgres/templates/pgbackrest-secret.yaml.

Changing:

        {{ $repo.gcs.key | b64enc }}

to

        {{ $repo.gcs.key }}

In kustomize I see the key is passed without encoding, so that should work fine.
The same might also be the case for the AWS/Azure storage fields.

Example for LB service

Hi,

I saw in the operator project that support for exposing pgBouncer or the HA service as a LoadBalancer has been added.
I just wanted to ask whether there are already any docs that explain how to apply it?
We would be very interested in that functionality.
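
In case it helps frame the question, this is what I pieced together from the CRD reference; it's untested on our side and the field names are my reading of the v1beta1 spec:

spec:
  service:
    type: LoadBalancer       # expose the primary/HA service externally
  proxy:
    pgBouncer:
      service:
        type: LoadBalancer   # expose PgBouncer externally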

Best
Patrick

How to pass args to initdb?

[Edit: My apologies, I realize now this belongs on the main postgres-operator issues board and have posted it there instead]

Hello, I need a database created with the initdb args --lc-collate=C and --lc-ctype=C for Matrix Synapse, and it's not entirely clear from the docs how to do that. Nothing I see in the CRD reference looks right -- PostgresCluster.spec.patroni only appears to cover dynamic configuration -- and I have searched many forum and issue posts. Is there a way to get these arguments to initdb with a v5 PostgresCluster?
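
For context, in plain Patroni the settings I'm after live under bootstrap.initdb, roughly like the sketch below; what I could not figure out is whether the v5 PostgresCluster spec exposes that section at all:

bootstrap:
  initdb:
  - encoding: UTF8
  - lc-collate: C
  - lc-ctype: C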

Missing option to override Image Registry

If you want to use a proxy such as a pull-through cache, it is easier to override the registry itself and leave the repository name as it is.

As of now, we need to override the whole repository to be able to use a custom registry, which is fragile because the repository name can change.

On the other hand, this convention can be found throughout the huge Bitnami Helm chart collection.
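
A sketch of that convention; the value names are illustrative and not something this chart exposes today:

image:
  registry: my-mirror.example.com          # only this changes for a pull-through cache
  repository: crunchydata/postgres-operator
  tag: <tag>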

Helm Chart doesn't contain schedule for backups?

Thank you for the latest PGO, it's amazing, and fully Kubernetes-native, no more command-line pgo :)

I'm hoping it's just me, but I don't see anywhere in the Helm values.yaml to set a backup schedule. Did I just miss it?

I'm hoping I can add it to the template myself, but was hoping smarter Helm people than me would weigh in first.
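
For reference, the field I mean on the PostgresCluster itself looks like this (the cron expressions are just examples), so I assume the chart would only need to surface it in values.yaml:

backups:
  pgbackrest:
    repos:
    - name: repo1
      schedules:
        full: "0 1 * * 0"
        differential: "0 1 * * 1-6"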
