Comments (3)
Hmm, I see some errors in the packageserver pod logs; perhaps this is related:
k logs -n olm packageserver-564ccbc799-2xpdd
W0604 12:37:48.936069 1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
W0604 12:37:49.940571 1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
W0604 12:37:51.253637 1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
W0604 12:37:54.222869 1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
time="2024-06-04T12:37:55Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T12:37:55Z" level=warning msg="error getting bundle stream" action="refresh cache" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused\"" source="{operatorhubio-catalog olm}"
W0604 12:37:57.995032 1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
W0604 12:38:03.871918 1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
time="2024-06-04T12:38:09Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T12:38:09Z" level=warning msg="error getting bundle stream" action="refresh cache" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused\"" source="{operatorhubio-catalog olm}"
time="2024-06-04T12:38:12Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T12:38:12Z" level=warning msg="error getting bundle stream" action="refresh cache" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused\"" source="{operatorhubio-catalog olm}"
time="2024-06-04T12:51:14Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T13:07:39Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T13:07:41Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T13:37:39Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T13:37:41Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T14:07:39Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T14:07:41Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
The service seems to be running, though:
kgs -n olm
NAME                    TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)     AGE
operatorhubio-catalog   ClusterIP   10.236.9.61   <none>        50051/TCP   107m
packageserver-service   ClusterIP   10.236.2.50   <none>        5443/TCP    107m
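A listed ClusterIP only shows that the Service object exists, not that anything behind it is accepting connections. As a minimal sketch (not part of OLM), the probe below attempts a plain TCP connection, which is the step that fails with "connection refused" in the logs above. It assumes it runs from a pod inside the cluster so that the service name resolves; the function name `can_connect` is hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False


# Example call, using the address from the packageserver logs:
#   can_connect("operatorhubio-catalog.olm.svc", 50051)
```

A `False` result here would point at the catalog pod (or its readiness) rather than at packageserver itself.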
from operator-lifecycle-manager.
I restarted the catalog and packageserver pods, and I see no errors now:
k logs -n olm packageserver-564ccbc799-d65xt
time="2024-06-04T14:33:35Z" level=info msg="Using in-cluster kube client config"
time="2024-06-04T14:33:35Z" level=info msg="Defaulting Interval to '12h0m0s'"
I0604 14:33:35.593163 1 handler.go:275] Adding GroupVersion packages.operators.coreos.com v1 to ResourceManager
time="2024-06-04T14:33:35Z" level=info msg="connection established. cluster-version: v1.28.9-gke.1000000"
time="2024-06-04T14:33:35Z" level=info msg="operator ready"
time="2024-06-04T14:33:35Z" level=info msg="starting informers..."
time="2024-06-04T14:33:35Z" level=info msg="informers started"
time="2024-06-04T14:33:35Z" level=info msg="waiting for caches to sync..."
time="2024-06-04T14:33:35Z" level=info msg="starting workers..."
time="2024-06-04T14:33:35Z" level=info msg="connecting to source" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
I0604 14:33:35.713989 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0604 14:33:35.714018 1 shared_informer.go:311] Waiting for caches to sync for RequestHeaderAuthRequestController
I0604 14:33:35.714054 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0604 14:33:35.714070 1 shared_informer.go:311] Waiting for caches to sync for RequestHeaderAuthRequestController
I0604 14:33:35.713994 1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0604 14:33:35.714131 1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0604 14:33:35.714137 1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0604 14:33:35.714154 1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0604 14:33:35.714217 1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0604 14:33:35.714226 1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0604 14:33:35.714416 1 secure_serving.go:213] Serving securely on [::]:5443
I0604 14:33:35.714500 1 dynamic_serving_content.go:132] "Starting controller" name="serving-cert::apiserver.local.config/certificates/apiserver.crt::apiserver.local.config/certificates/apiserver.key"
I0604 14:33:35.714555 1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0604 14:33:35.714567 1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0604 14:33:35.714677 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
time="2024-06-04T14:33:35Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
I0604 14:33:35.814896 1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0604 14:33:35.814971 1 shared_informer.go:318] Caches are synced for RequestHeaderAuthRequestController
I0604 14:33:35.815091 1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0604 14:33:35.815251 1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0604 14:33:35.815421 1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0604 14:33:35.815736 1 shared_informer.go:318] Caches are synced for RequestHeaderAuthRequestController
time="2024-06-04T14:33:38Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T14:37:51Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
Yet the CSV is still stuck in Installing:
k get csv packageserver -n olm
NAME            DISPLAY          VERSION   REPLACES   PHASE
packageserver   Package Server   0.28.0               Installing
The connection issues seem like a bug. All of this is installed by OLM and nothing is manual, so it seems that after install the catalog service might not be starting properly.
It is easy to work around by restarting, but how does one now get the CSV to continue?
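A possible explanation, stated as an assumption rather than something confirmed at this point in the thread: packageserver serves the aggregated API `packages.operators.coreos.com` v1 (visible in the startup log above), and the CSV may stay in Installing until the matching APIService reports `Available`. The sketch below is a hypothetical helper, not an OLM tool. It takes the object printed by `kubectl get apiservice v1.packages.operators.coreos.com -o json`, already parsed into a dict, and extracts the `Available` condition.

```python
from typing import Tuple


def apiservice_available(apiservice: dict) -> Tuple[bool, str]:
    """Return (available, detail) for an APIService object given as a dict."""
    conditions = apiservice.get("status", {}).get("conditions", [])
    for cond in conditions:
        if cond.get("type") == "Available":
            detail = f'{cond.get("reason", "")}: {cond.get("message", "")}'
            return cond.get("status") == "True", detail
    # No Available condition at all is treated as not available.
    return False, "no Available condition reported"


# Example call, with the kubectl JSON output held in the string `raw`:
#   import json
#   ok, detail = apiservice_available(json.loads(raw))
```

If the condition is not `True`, its reason and message usually say whether the API server could reach the service at all.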
I have to thank @joelanford for helping debug this over Slack. This is in fact a network issue, but one likely specific to GKE and possibly EKS (or very likely; if I recall, they also set some default firewall rules).
For GKE you have to open the port used by the packageserver service (5443/TCP):
packageserver-service ClusterIP 10.236.4.66 <none> 5443/TCP 56m
Go to the GCP firewall rules and search for an existing rule named "gke.........master"; this is where the port needs to be added :)
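As a rough sketch of the same change from the command line instead of the console: `gcloud compute firewall-rules update` takes an `--allow` list, and that list overrides the rule's current entries, so the existing ports have to be repeated alongside 5443. The helper below only builds the argument list. The rule name and the `tcp:443` / `tcp:10250` entries in the example comment are placeholders, not values from this thread, so inspect the real rule with `gcloud compute firewall-rules describe` first.

```python
from typing import Iterable, List


def build_update_command(
    rule_name: str, existing_allow: Iterable[str], new_port: int = 5443
) -> List[str]:
    """Build a gcloud command that keeps existing entries and adds tcp:<new_port>."""
    entries = list(existing_allow)
    new_entry = f"tcp:{new_port}"
    if new_entry not in entries:
        entries.append(new_entry)
    return [
        "gcloud", "compute", "firewall-rules", "update", rule_name,
        "--allow", ",".join(entries),
    ]


# Example with placeholder values:
#   build_update_command("my-gke-master-rule", ["tcp:443", "tcp:10250"])
# The resulting list can be passed to subprocess.run(..., check=True).
```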