Comments (7)

orsenthil commented on June 11, 2024

Is there a better way to fix this? It kind of looks like projectcalico/calico#5135, but not sure if the problem is in Calico or AWS.

Do you have both Calico and VPC CNI installed? Do you know where the specific error message is coming from?
Could you share the status of your aws-node pods and the other pods in the kube-system namespace?

  1. What does the pod log of the CrashLoopBackOff containers say?
  2. What does the IPAMD log say?

from amazon-vpc-cni-k8s.

ddl-pjohnson commented on June 11, 2024

Yep, we have both installed.

The errors are in pod logs and it's somewhat random what pods have errors. Usually they are connection refused errors connecting to the Kube API or other pods, e.g.:
Invalid Kubernetes API v1 endpoint https://172.20.0.1:443/api: Timed out connecting to server

Or connecting to another pod:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='nucleus-frontend', port=80)

The exact errors vary by things like the language used and what they're connecting to. In all cases DNS works correctly, but the packets aren't routed to the other pod/service.
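
To make the "DNS works, but packets aren't routed" distinction concrete, here is a minimal sketch of a probe that could be run from inside an affected pod. It is an illustration only, not something taken from our services; the function name `probe` and the 3-second default timeout are assumptions.

```python
import socket


def probe(host, port, timeout=3.0):
    """Return 'dns-failure', 'connect-failure', or 'ok' for host:port."""
    try:
        # Step 1: name resolution only. Nothing is sent to the target yet.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            # Step 2: a real TCP handshake to the resolved address.
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            return "ok"
        except OSError:
            # Refused, timed out, or unreachable: try the next address.
            continue
    return "connect-failure"
```

Calling `probe("172.20.0.1", 443)` or `probe("nucleus-frontend", 80)` from a failing pod would be expected to return `connect-failure` rather than `dns-failure` if the symptom matches what is described here.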

Is there a secure way to send you logs and pod statuses?

orsenthil commented on June 11, 2024

Invalid Kubernetes API v1 endpoint https://172.20.0.1:443/api: Timed out connecting to server

This is a strange error message.
Can you confirm that the API server endpoints match?

kubectl get endpoints kubernetes -o jsonpath='{.subsets[].addresses[].ip}'

I would expect the API path in the error message to be /api/v1. I am not sure why it tried to connect to /api.

You can follow this troubleshooting doc - https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/troubleshooting.md and send the logs to '[email protected]' for us to investigate.

I suspect that kube-proxy wasn't running when this error occurred, but the description of the error itself isn't typical either.
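
For context on the endpoint check: 172.20.0.1 in the error is the ClusterIP of the kubernetes service, a virtual address that kube-proxy translates to the real API server addresses listed in the Endpoints object. As a rough sketch, the same addresses can be pulled out of the JSON form of that object (`kubectl get endpoints kubernetes -o json`). The helper name `endpoint_ips` and the sample document, including its IP addresses, are made up for illustration.

```python
import json


def endpoint_ips(endpoints_json):
    """Collect every address IP from a v1 Endpoints object given as JSON text."""
    obj = json.loads(endpoints_json)
    ips = []
    for subset in obj.get("subsets") or []:
        for addr in subset.get("addresses") or []:
            ips.append(addr["ip"])
    return ips


# Hypothetical sample shaped like `kubectl get endpoints kubernetes -o json`.
# The IP addresses below are invented, not taken from any real cluster.
sample = """
{"kind": "Endpoints",
 "subsets": [{"addresses": [{"ip": "10.0.1.10"}, {"ip": "10.0.2.20"}],
              "ports": [{"name": "https", "port": 443}]}]}
"""
print(endpoint_ips(sample))
```

If the list is empty, or the addresses are not reachable from the node, traffic to the ClusterIP has nowhere to go regardless of what the CNI is doing.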

ddl-pjohnson commented on June 11, 2024

It's not just the Kubernetes API. Which services and pods can be reached is basically random: for example, one pod won't be able to connect to our rabbitmq service, while another will be able to connect to rabbitmq but not to vault, and so on.

We've fixed this by draining/cordoning the node on startup. I'll try tracking down the bundle of logs and sending them through.

orsenthil commented on June 11, 2024

We've fixed this by draining/cordoning the node on startup.

Was this node-specific behavior? If yes, perhaps something running on the node is changing iptables. Yes, logs will help.
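
One way to test the iptables hypothesis, sketched under assumptions: capture `iptables-save` output on the node at two points in time and compare the rules while ignoring timestamps and packet counters. The function names and the sample snapshots below are illustrative only and do not come from the affected cluster.

```python
import difflib
import re

# Packet/byte counters such as "[12:3456]" change constantly, so mask them.
_COUNTERS = re.compile(r"\[\d+:\d+\]")


def normalize(snapshot):
    """Drop comment lines (timestamps) and mask counters in iptables-save text."""
    lines = []
    for line in snapshot.splitlines():
        if line.startswith("#"):
            continue
        lines.append(_COUNTERS.sub("[0:0]", line))
    return lines


def rule_changes(before, after):
    """Return only the added ('+') and removed ('-') lines between two snapshots."""
    diff = difflib.unified_diff(normalize(before), normalize(after), lineterm="", n=0)
    return [
        d for d in diff
        if d.startswith(("+", "-")) and not d.startswith(("+++", "---"))
    ]


# Hypothetical snapshots: the jump to KUBE-SERVICES disappears in the second one.
before = """# Generated by iptables-save (first capture)
*nat
:PREROUTING ACCEPT [10:600]
-A PREROUTING -j KUBE-SERVICES
COMMIT
"""
after = """# Generated by iptables-save (second capture)
*nat
:PREROUTING ACCEPT [25:1500]
COMMIT
"""
print(rule_changes(before, after))
```

An empty result between captures would suggest the rules are stable and the problem lies elsewhere; a non-empty one shows which rules were added or removed.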

orsenthil commented on June 11, 2024

Closing this, as the customer was able to resolve it at the node level.

github-actions commented on June 11, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
