
Comments (3)

mike-pt commented on July 22, 2024

Hmm, I see some errors in the packageserver pod logs; perhaps this is related:

k logs -n olm packageserver-564ccbc799-2xpdd

W0604 12:37:48.936069       1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
W0604 12:37:49.940571       1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
W0604 12:37:51.253637       1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
W0604 12:37:54.222869       1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
time="2024-06-04T12:37:55Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T12:37:55Z" level=warning msg="error getting bundle stream" action="refresh cache" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused\"" source="{operatorhubio-catalog olm}"
W0604 12:37:57.995032       1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
W0604 12:38:03.871918       1 logging.go:55] [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "operatorhubio-catalog.olm.svc:50051", ServerName: "operatorhubio-catalog.olm.svc:50051", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused"
time="2024-06-04T12:38:09Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T12:38:09Z" level=warning msg="error getting bundle stream" action="refresh cache" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused\"" source="{operatorhubio-catalog olm}"
time="2024-06-04T12:38:12Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T12:38:12Z" level=warning msg="error getting bundle stream" action="refresh cache" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 10.236.9.61:50051: connect: connection refused\"" source="{operatorhubio-catalog olm}"
time="2024-06-04T12:51:14Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T13:07:39Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T13:07:41Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T13:37:39Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T13:37:41Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T14:07:39Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T14:07:41Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm

The service seems to be running, though:

kgs -n olm
NAME                    TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)     AGE
operatorhubio-catalog   ClusterIP   10.236.9.61   <none>        50051/TCP   107m
packageserver-service   ClusterIP   10.236.2.50   <none>        5443/TCP    107m
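A Service object can be listed even when no ready pod backs it, so the output above does not by itself rule out the "connection refused" errors. A minimal sketch of a follow-up check, assuming `kubectl` is pointed at the affected cluster; the function name is hypothetical, and the defaults are the namespace and Service name from this thread:

```shell
# Sketch: list the backends behind the catalog Service.
# check_catalog_backends is a hypothetical helper name, not part of OLM.
check_catalog_backends() {
  ns="${1:-olm}"
  svc="${2:-operatorhubio-catalog}"
  # An empty ENDPOINTS column means no ready pod currently backs the Service.
  kubectl get endpoints "$svc" -n "$ns"
  # Pod status, restart counts, and pod IPs for the namespace.
  kubectl get pods -n "$ns" -o wide
}
```

Calling `check_catalog_backends` with no arguments inspects `operatorhubio-catalog` in `olm`; pass a namespace and Service name to check another catalog.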

from operator-lifecycle-manager.

mike-pt commented on July 22, 2024

I restarted the catalog and packageserver pods, and I see no errors now:

k logs -n olm packageserver-564ccbc799-d65xt
time="2024-06-04T14:33:35Z" level=info msg="Using in-cluster kube client config"
time="2024-06-04T14:33:35Z" level=info msg="Defaulting Interval to '12h0m0s'"
I0604 14:33:35.593163       1 handler.go:275] Adding GroupVersion packages.operators.coreos.com v1 to ResourceManager
time="2024-06-04T14:33:35Z" level=info msg="connection established. cluster-version: v1.28.9-gke.1000000"
time="2024-06-04T14:33:35Z" level=info msg="operator ready"
time="2024-06-04T14:33:35Z" level=info msg="starting informers..."
time="2024-06-04T14:33:35Z" level=info msg="informers started"
time="2024-06-04T14:33:35Z" level=info msg="waiting for caches to sync..."
time="2024-06-04T14:33:35Z" level=info msg="starting workers..."
time="2024-06-04T14:33:35Z" level=info msg="connecting to source" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
I0604 14:33:35.713989       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0604 14:33:35.714018       1 shared_informer.go:311] Waiting for caches to sync for RequestHeaderAuthRequestController
I0604 14:33:35.714054       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0604 14:33:35.714070       1 shared_informer.go:311] Waiting for caches to sync for RequestHeaderAuthRequestController
I0604 14:33:35.713994       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0604 14:33:35.714131       1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0604 14:33:35.714137       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0604 14:33:35.714154       1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0604 14:33:35.714217       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0604 14:33:35.714226       1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0604 14:33:35.714416       1 secure_serving.go:213] Serving securely on [::]:5443
I0604 14:33:35.714500       1 dynamic_serving_content.go:132] "Starting controller" name="serving-cert::apiserver.local.config/certificates/apiserver.crt::apiserver.local.config/certificates/apiserver.key"
I0604 14:33:35.714555       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0604 14:33:35.714567       1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0604 14:33:35.714677       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
time="2024-06-04T14:33:35Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
I0604 14:33:35.814896       1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0604 14:33:35.814971       1 shared_informer.go:318] Caches are synced for RequestHeaderAuthRequestController
I0604 14:33:35.815091       1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0604 14:33:35.815251       1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0604 14:33:35.815421       1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0604 14:33:35.815736       1 shared_informer.go:318] Caches are synced for RequestHeaderAuthRequestController
time="2024-06-04T14:33:38Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm
time="2024-06-04T14:37:51Z" level=info msg="updating PackageManifest based on CatalogSource changes: {operatorhubio-catalog olm}" action="sync catalogsource" address="operatorhubio-catalog.olm.svc:50051" name=operatorhubio-catalog namespace=olm

Yet the CSV is still stuck in the Installing phase:

k get csv packageserver -n olm
NAME            DISPLAY          VERSION   REPLACES   PHASE
packageserver   Package Server   0.28.0               Installing

The connection issue seems like a bug. All of this is installed by OLM and nothing is manual, so it seems that the catalog service might not be starting properly after install.

It is easy to work around by restarting, but how does one now get the CSV to continue?
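One way to narrow down a CSV that stays in Installing is to look at the aggregated API that packageserver registers and at the CSV's own conditions. This is only a sketch: the APIService name `v1.packages.operators.coreos.com` is an assumption derived from the `GroupVersion packages.operators.coreos.com v1` log line above, and the function name is hypothetical.

```shell
# Sketch: inspect why the packageserver CSV is not progressing.
# inspect_packageserver is a hypothetical helper name.
inspect_packageserver() {
  ns="${1:-olm}"
  # The AVAILABLE column shows whether the API server can reach packageserver.
  # (APIService name assumed from the GroupVersion in the logs.)
  kubectl get apiservice v1.packages.operators.coreos.com
  # Conditions and events on the CSV usually state what it is waiting for.
  kubectl describe csv packageserver -n "$ns"
}
```

If the APIService is reported as unavailable while the pod logs look clean, the problem may lie between the control plane and the pod rather than in packageserver itself.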


mike-pt commented on July 22, 2024

I have to thank @joelanford for helping debug this over Slack. This is in fact a network issue, but likely specific to GKE and possibly EKS (quite likely, since if I recall correctly they also set some default firewall rules).

For GKE you have to allow the packageserver-service port (5443/TCP) in the firewall:


packageserver-service   ClusterIP   10.236.4.66    <none>        5443/TCP    56m

Go to the GCP firewall rules and search for an existing rule named "gke.........master"; this is where the port needs to be added :)
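The same change can probably be made from the command line. The sketch below only prints the `gcloud` command so it can be reviewed before running; the function name is hypothetical, the rule name is a placeholder you must look up yourself, and the list of already-allowed ports is an assumption you should confirm with `gcloud compute firewall-rules describe RULE_NAME`, because `--allow` on `update` replaces the rule's allowed list rather than appending to it.

```shell
# Sketch: build the gcloud command that adds TCP 5443 to an existing rule.
# build_firewall_update is a hypothetical helper; it prints, it does not execute.
build_firewall_update() {
  rule_name="$1"   # placeholder: the existing GKE "master" rule in your project
  existing="$2"    # ports the rule already allows, e.g. "tcp:10250,tcp:443"
  # Keep the existing ports and append the packageserver port from this thread.
  printf 'gcloud compute firewall-rules update %s --allow %s,tcp:5443\n' \
    "$rule_name" "$existing"
}
```

For example, `build_firewall_update RULE_NAME tcp:10250,tcp:443` prints a command that keeps the two listed ports and adds 5443; run the printed command only after checking that the existing list matches what the rule currently allows.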
