
Comments (10)

jmccormick2001 avatar jmccormick2001 commented on May 13, 2024

thanks, will do.

from postgres-operator.

xenophenes avatar xenophenes commented on May 13, 2024

To clarify the second question you had, according to the Configuration Document:

You will need to copy a sample to your $HOME/.pgo.yaml location. It
can be defined in the following locations:

  • . (current directory)
  • $HOME
  • /etc/pgo/

The .pgo.yaml file location is checked in that order.
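For illustration, that lookup order can be sketched as a small shell function (the function name is mine, not a real pgo command):

```shell
# Sketch of the documented lookup order for .pgo.yaml:
# current directory first, then $HOME, then /etc/pgo/.
# find_pgo_config is an illustrative name, not part of pgo itself.
find_pgo_config() {
    for dir in . "$HOME" /etc/pgo; do
        if [ -f "$dir/.pgo.yaml" ]; then
            echo "$dir/.pgo.yaml"
            return 0
        fi
    done
    return 1
}
```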

Hopefully that helps. We will ensure that we clarify both of those sections of the documentation. Thanks for bringing it to our attention!


jar349 avatar jar349 commented on May 13, 2024

Thanks! I'm sorry to say that I was never able to get the operator cli to work.


xenophenes avatar xenophenes commented on May 13, 2024

May I ask what happened? Are you still interested in getting this to work? I'm happy to help wherever possible in getting it configured.


jar349 avatar jar349 commented on May 13, 2024

Sure, I'm interested: I think that being able to stand up a cluster on demand like this opens a particularly compelling deployment story that negates the need for expand/contract of schemas... but I simply ran out of R&D time getting it to work.

For whatever reason, the pre-built pgo binary didn't work (I no longer recall why) and then I was unclear on the build-from-source instructions (which have since been clarified), and then I didn't have time to go through the configuration file and figure out what the correct settings are.

For example, I use the AWS EBS StorageClass. I don't know much more about PVCs beyond what I read in the kubernetes overview docs, so I don't know what a PVC lspvc template is. I wasn't sure what to use for STORAGE_TYPE... or why I would ever want to use ReadWriteOnce for PVC_ACCESS_MODE.

Even then, I didn't see a place in the config file to set the 'default' namespace to assume and my desire to type --namespace default every single time was, and remains, zero.

In addition, I want to set some of these things dynamically as part of my CI/CD pipeline - like usernames and passwords. It's difficult for me to know how much storage I want at CI/CD time... I basically want whatever the previously deployed cluster needed, plus around 20%. Checking in a PGO config per service isn't awful, but I don't want to have to hand-update it before each checkin to the integration branch (where CI/CD is watching for changes).

None of these things were showstoppers, and I suspect that most of the above list is super-simple to 'fix' if I just knew pgo better. It's just that figuring all this out looked like a bigger time investment than calling AWS RDS and getting a db that way. In other words, mostly my own fault and ignorance.

I think that making some sane assumptions would greatly simplify pgo: default as the namespace. Automatically looking in ~/.kube/ for config... or other known locations. Asking the kubernetes API server for the list of storage classes available and prompting the user to choose one if none is configured.

Additionally, heading down the path of a CLI flag-driven configuration rather than conf-file configuration would be much more script-friendly for CI/CD pipelines.

I offer this feedback with no ill feelings and in the understanding that half of these things - or more - are due to my own ignorance rather than a real lack in the product. I think that the idea of kubernetes operators is killer...


jar349 avatar jar349 commented on May 13, 2024

I recently got postgres-operator installed on my cluster running in AWS. I decided to add my notes here because it's a better answer to your question than my rambling paragraph above.

notes from trying to install postgres-operator

I have an HA cluster (master nodes across AZs) on AWS that was created by KOPS and I want to use postgres-operator. These are my notes of the process.

I'm reading the build doc. I'm in the Overview section. I don't want to build the images from source, so I intend to use the docker images for the operator and the pre-built binary for the client. At this point, I don't know where in the documentation to look because the text that follows is about building from source. So I just go through it, section by section.

The Kubernetes Environment section looks not applicable because I already have a production environment on AWS. But I do notice that it mentions something important about RBAC. When I deploy the postgres-operator, I will make sure to also create a postgres-operator service account, update the deployment to specify the service account, and then create the following ClusterRoleBinding:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: postgres-operator-binding
subjects:
- kind: ServiceAccount
  name: postgres-operator
  namespace: default
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
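For reference, the service account that binding refers to could be written as a manifest like the sketch below (the filename is illustrative) and created with kubectl create -f alongside the binding:

```shell
# Sketch: the ServiceAccount the ClusterRoleBinding above refers to.
# The filename is an illustrative assumption; it would be applied with
#   kubectl create -f postgres-operator-sa.yaml
# before (or alongside) the binding.
cat > postgres-operator-sa.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: postgres-operator
  namespace: default
EOF
```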

The Openshift Origin Environment section doesn't look applicable to me, so I'm skipping it.

The Create Project and Clone section doesn't look applicable because I'm not building anything from source. Skip.

The Get Packaged Dependencies section is applicable because I don't want to build from source. I don't understand why I'd want to pull the docker image to my local machine; I assume the docs do that because the author is testing against the VM they mention earlier. So I don't pull the docker images, but I do download and untar the pgo client and add pgo to my PATH.

The Build from Source section doesn't look applicable because I'm not building from source. I skip it.

The Deploy the PostgreSQL Operator section looks like the right place to jump to after reading the Overview section and deciding that you don't want to build from source. The note says that it will create and use /data on my local system for the operator that's in kubernetes. I don't want to do that on my prod cluster: I want to use EBS in AWS. There's no example for that, but I do see examples for NFS (examples/operator/create-pv-nfs.sh) that I can probably turn into EBS, like this:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: crunchy-pv0
  labels:
    type: aws-ebs
spec:
  storageClassName: gp2
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: "Retain"

Note that, for me, gp2 is created on my behalf by KOPS and uses the built-in kubernetes aws-ebs provisioner. I think that gp2 is spindle drives, so I'll one day want to look into how to create a storage class that uses EBS and SSD...
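As a sketch of that "custom StorageClass" idea, the built-in aws-ebs provisioner accepts a type parameter; the name, filename, and type value below are illustrative assumptions, not taken from the operator docs:

```shell
# Sketch: a custom StorageClass using the built-in aws-ebs provisioner.
# The name and the EBS volume type are illustrative; other types
# (e.g. io1) can be substituted in parameters.type.
cat > ebs-storageclass.yaml <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-custom
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
EOF
```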

One problem is that the nfs example uses a variable called $COUNTER, which isn't documented. It's clear that it's meant to be some kind of number, but I don't know where it should be set, or who increments it. For now, I just replace it with a hard-coded zero.

Moving past the Note in the Deploy section, the next thing is to deploy the operator via deploy.sh. Inspecting the shell script, I see that it creates a config map and then deploys the operator. Reading through the manifest templates makes me understand, finally, who is using the configuration file referenced in the config doc. I worry for a minute about how I'm going to give the operator a reference to KUBECONFIG from where I'll be deploying it on a k8s node, but then I search through the configmaps and realize that the operator never uses it and that it must be configuration for the client.

Then I go back to the config doc and realize that the server doesn't get that config file - that it's for the client. But hang on, because the template files I'm placing into config maps for the server operator reference variables in the pgo.yaml file. And the main README.md says that I can configure both the client and the server, so I'm confused. How is the server operator getting those values if the pgo.yaml is a client configuration file? Is the pgo cli sending all those values to the server? I don't know. I'm going to assume that for now.

The config doc says: That script (deploy.sh) will also create a PV named crunchy-pv using a HostPath of /data and a PVC named crunchy-pvc on your local system. This is only for testing the operator, edit or remove the PV/PVC creation to meet your requirements. But in looking at the deploy.sh script, it no longer does that (looking at source for tag 1.5.1). Probably just need to update the config doc. But it does imply that I need to create the PV manually before I try to deploy the server operator. I sort of figured that from the notes above. Later edit: I ended up configuring my client to use dynamic storage, so I never did create this PV.

Now I'm looking at the deployment file for the operator.

  • This is the place to add a reference to the postgres-operator service account.
  • I see that $CO_IMAGE_TAG is a piece of configuration in the yaml file, but I don't understand how the shell script is replacing the variable with the configuration. OH! That's why you're using envsubst -- they're expected to be environment variables! Ok, that's no problem, but they're in a yaml file, not an environment variable. Where's the documentation that shows me how you're creating these values as environment variables? For now, I'll just export CO_IMAGE_TAG=centos7-1.5.1 before I run the deploy script. Later Edit: I ended up hard-coding this because I believe it should come with the release tarball, described below.
  • a NAMESPACE environment variable for the container comes from metadata.namespace, which doesn't exist. I'll add that to the deployment metadata
  • I see that we're mounting the config map operator-conf we previously created at /operator-conf. Cool.

I'm starting to understand how this whole thing comes together. I wish that there weren't a deployment script but just a set of .yaml/.json files in a directory that I can feed to a single kubectl create -f <directory> command. That directory of manifest files should be included in the release tar.gz. That way, I download/extract the release, run the kubectl command to deploy the operator, edit the pgo.yaml config file for the client, and then start using the cli... all without needing to clone the repository.

Because I deploy a new cluster for each version of each service, I want the passwords to last forever. postgres allows this with VALID UNTIL 'infinity' but the pgo.yaml needs a number of days. Can we change that?

I really don't understand what the lspvc template/pod is for - would like some documentation on that. Also, kubernetes has deprecated the creation of a stand-alone Pod and now recommends that pods be deployed into kubernetes via a Deployment.

If you elect to use dynamic storage, there's no need to create the PersistentVolume object in kubernetes. The documentation should say something to that effect.

The logs of the postgres-operator say that it has created pg-cluster.crunchydata.com, pg-backup.crunchydata.com, pg-upgrade.crunchydata.com, pg-policy, pg-clone, and pg-policy-log, but then it goes on to say that it can't find the requested resources (note that the names in the errors don't have dashes):

E0920 18:20:55.418172       1 reflector.go:201] github.com/crunchydata/postgres-operator/operator/backup/backup.go:100: Failed to list *tpr.PgBackup: the server could not find the requested resource (get pgbackups.crunchydata.com)
E0920 18:20:55.418308       1 reflector.go:201] github.com/crunchydata/postgres-operator/operator/cluster/cluster.go:120: Failed to list *tpr.PgCluster: the server could not find the requested resource (get pgclusters.crunchydata.com)
E0920 18:20:55.425411       1 reflector.go:201] github.com/crunchydata/postgres-operator/operator/cluster/clone.go:73: Failed to list *tpr.PgClone: the server could not find the requested resource (get pgclones.crunchydata.com)
E0920 18:20:55.443308       1 reflector.go:201] github.com/crunchydata/postgres-operator/operator/cluster/policies.go:163: Failed to list *tpr.PgPolicylog: the server could not find the requested resource (get pgpolicylogs.crunchydata.com)
E0920 18:20:55.443419       1 reflector.go:201] github.com/crunchydata/postgres-operator/operator/upgrade/upgrade.go:73: Failed to list *tpr.PgUpgrade: the server could not find the requested resource (get pgupgrades.crunchydata.com)

The examples/tpr folder has a pg-database tpr that isn't created by default by the postgres-operator container. Should it be? Or is it meant for creating a single database instance for testing purposes?

30 minutes after deploying the postgres-operator, I see the following in the logs:

Sep 20 14:57:02 postgres-operator-3770555364-5n3sl postgres-operator error time="2017-09-20T18:57:02Z" level=error msg="error in major upgrade watch closed before Until timeout" 
Sep 20 14:57:59 postgres-operator-3770555364-5n3sl postgres-operator error time="2017-09-20T18:57:59Z" level=error msg="error in ProcessJobs watch closed before Until timeout" 
Sep 20 15:10:23 postgres-operator-3770555364-5n3sl postgres-operator error time="2017-09-20T19:10:23Z" level=error msg="error in ProcessPolicies watch closed before Until timeout" 
Sep 20 15:19:03 postgres-operator-3770555364-5n3sl postgres-operator error time="2017-09-20T19:19:03Z" level=error msg="erro in clone complete watch closed before Until timeout" 

The pgo client binary does not run on Mac OS X. Any plans on creating a pre-built pgo executable for OS X?


jmccormick2001 avatar jmccormick2001 commented on May 13, 2024

Thanks for all that, very useful. I'll try to incorporate some of this into the docs. The OSX binary sounds interesting: I can build one, but I have no means to test it since I don't have a Mac. I might have a solution for that if I can enlist someone to test it out for me; let me think about that possibility. Thanks for testing the Operator out, very useful feedback. A LOT more is coming in the coming weeks, which will further impact the documentation and user experience, hopefully in positive ways :^))


rimusz avatar rimusz commented on May 13, 2024

I use macOS too, so would love to have a binary for it available; right now I have to build it myself.


jar349 avatar jar349 commented on May 13, 2024

I just made the mistake of building the operator image from source on OS X and then trying to deploy it to Kubernetes. Binary format problems, obviously. I knew this would happen and just forgot. It would be great to add this to the build instructions: since Go binaries essentially have the Go runtime bundled with them, you must build on (or cross-compile for) a platform that is compatible with the target deployment platform.


xenophenes avatar xenophenes commented on May 13, 2024

@jar349 -

I updated the documentation taking into account and hopefully resolving all of your concerns. Thank you so much for the descriptive feedback! Some notes on things that were not specifically added in the pull request:

  1. “Also, kubernetes has deprecated the creation of a stand-alone Pod and now recommends that pods be deployed into kubernetes via a Deployment.”

That’s how the operator functions already; it deploys clusters via Deployments. See here for more information.

  2. “Even then, I didn't see a place in the config file to set the 'default' namespace to assume and my desire to type --namespace default every single time was, and remains, zero.”

In the Build documentation:
“Also, you can set the namespace to use for deploying the operator as follows in your .bashrc file; if not set, the namespace is set to default:”

export CO_NAMESPACE=myproject

Admittedly, this was located only under the OpenShift instructions; because of that, I moved it under both the Kubernetes and OpenShift sections.

  3. “I think that making some sane assumptions would greatly simplify pgo: default as the namespace.”

Following up from the previous mention, this behavior of pgo setting the namespace to default is defined in the setup.sh file:

if [ -z "$CO_NAMESPACE" ]; then
    echo "CO_NAMESPACE not set, using default"
    export CO_NAMESPACE=default
fi
if [ -z "$CO_CMD" ]; then
    echo "CO_CMD not set, using kubectl"
    export CO_CMD=kubectl
fi
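As an aside, the same defaulting can be written more compactly with POSIX parameter expansion; a sketch:

```shell
# Sketch: equivalent defaulting using POSIX ${VAR:-default} expansion
# instead of the explicit -z tests in setup.sh.
export CO_NAMESPACE="${CO_NAMESPACE:-default}"
export CO_CMD="${CO_CMD:-kubectl}"
```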

  4. “I really don't understand what the lspvc template/pod is for - would like some documentation on that.”

This is defined in the table defining .pgo.yaml variables, under LSPVC_TEMPLATE:

https://github.com/CrunchyData/postgres-operator/blob/master/docs/config.asciidoc#pgo-configuration-format

It is also described in the Docker image on Dockerhub; basically, think of it as 'ls pvc' - it just lists the contents of a PVC.

  5. “One problem is that the nfs example uses a variable called $COUNTER, which isn't documented. It's clear that it's meant to be some kind of number, but I don't know where it should be set, or who increments it. For now, I just replace it with a hard-coded zero.”

If you view create-pv-nfs.sh and create-pv.sh, you'll see:

echo "create the test PV and PVC using the NFS dir"
for i in {1..60}
do
    echo "creating PV crunchy-pv$i"
    export COUNTER=$i

It essentially just creates PVs crunchy-pv1 through crunchy-pv60 in those two files, and in the NFS example it chooses the next PV number that hasn't been used yet.
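That "next unused PV number" behavior can be sketched as a small shell function (the function name and the space-separated input format are my own assumptions):

```shell
# Sketch of "choose the next unused PV number": given a space-separated
# list of existing PV names, return the lowest crunchy-pvN not taken.
next_pv_number() {
    existing="$1"
    i=1
    # Pad with spaces so crunchy-pv1 doesn't match inside crunchy-pv10.
    while echo " $existing " | grep -q " crunchy-pv$i "; do
        i=$((i + 1))
    done
    echo "$i"
}
```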

  6. “I wasn't sure what to use for STORAGE_TYPE…”

The STORAGE_TYPE variables are all defined here.

https://github.com/CrunchyData/postgres-operator/blob/master/docs/config.asciidoc#persistence

Please let us know if there are other things we can update in the documentation and we'll work it in.

