Comments (7)
Is there a better way to fix this? It kind of looks like projectcalico/calico#5135, but not sure if the problem is in Calico or AWS.
Do you have both Calico and VPC CNI installed? Do you know where the specific error message is coming from?
Could you share the status of your `aws-node` pods and the other pods in the `kube-system` namespace?
- What does the pod log of the CrashLoopBackOff Containers say?
- What does the IPAMD log say?
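The information requested above could be gathered roughly as follows. This is a sketch, not an official procedure: the function names are hypothetical, and the `k8s-app=aws-node` label and the ipamd log path are the usual EKS defaults, which may differ in your cluster.

```shell
# Sketch only: the label selector and log path are assumed EKS defaults.

# List the aws-node pods, then everything else in kube-system.
cni_pod_status() {
  kubectl get pods -n kube-system -l k8s-app=aws-node -o wide
  kubectl get pods -n kube-system -o wide
}

# Print the logs of the previous (crashed) instance of a pod's container.
# $1 = pod name, $2 = namespace (defaults to kube-system).
# For multi-container pods, kubectl may also need -c <container>.
crashed_container_logs() {
  kubectl logs "$1" -n "${2:-kube-system}" --previous
}

# Show the tail of the ipamd log; run this on the worker node itself.
# $1 = log file (defaults to the assumed standard location).
ipamd_log_tail() {
  tail -n 200 "${1:-/var/log/aws-routed-eni/ipamd.log}"
}
```

For example, `crashed_container_logs my-pod my-namespace` shows why a CrashLoopBackOff container last exited; the pod and namespace names here are placeholders.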
from amazon-vpc-cni-k8s.
Yep, we have both installed.
The errors are in the pod logs, and which pods have errors is somewhat random. Usually they are connection refused errors connecting to the Kube API or other pods, e.g.:
Invalid Kubernetes API v1 endpoint https://172.20.0.1:443/api: Timed out connecting to server
Or connecting to another pod:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='nucleus-frontend', port=80)
The exact errors vary by things like the language used and what they're connecting to. In all cases DNS works correctly, but the packets aren't routed to the other pod/service.
Is there a secure way to send you logs and pod statuses?
> Invalid Kubernetes API v1 endpoint https://172.20.0.1:443/api: Timed out connecting to server

This is a strange error message.
Can you confirm that the API server endpoint matches?
kubectl get endpoints kubernetes -o jsonpath='{.subsets[].addresses[].ip}'
I would expect the API path to be `/api/v1` in the error message. I am not sure why it tried to connect at `/api`.
You can follow this troubleshooting doc - https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/troubleshooting.md and send the logs to '[email protected]' for us to investigate.
I suspect that kube-proxy wasn't running when this error occurred, but the description of the error itself isn't typical either.
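One way to check the kube-proxy suspicion is sketched below. The function names are hypothetical, and the `k8s-app=kube-proxy` label is the one commonly used on EKS, so treat it as an assumption.

```shell
# Sketch only: assumes kube-proxy runs as pods labelled k8s-app=kube-proxy.

# ClusterIP of the default "kubernetes" Service. Pods reach the API server
# through this address (172.20.0.1 in the error quoted above).
api_service_ip() {
  kubectl get svc kubernetes -n default -o jsonpath='{.spec.clusterIP}'
}

# kube-proxy pods and the nodes they run on. A node without a Running
# kube-proxy pod may be missing the rules that translate the ClusterIP
# to the real API server endpoints.
kube_proxy_pods() {
  kubectl get pods -n kube-system -l k8s-app=kube-proxy -o wide
}
```

If `api_service_ip` prints the address from the error message and the affected node shows no Running kube-proxy pod in `kube_proxy_pods`, that would be consistent with the timeout.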
It's not just the Kubernetes API; it's basically random which services and pods can be reached and which can't. For example, one pod won't be able to connect to our rabbitmq service, while another will connect to rabbitmq but not to vault.
We've fixed this by draining/cordoning the node on startup. I'll try tracking down the bundle of logs and sending them through.
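The exact procedure was not shared in this thread, so the following is only one plausible reading of "draining/cordoning the node on startup". The function names are hypothetical and the node name is a placeholder.

```shell
# Sketch only: an assumed version of the workaround, not the poster's script.

# Stop new pods from landing on the node and evict the ones already there.
# $1 = node name, e.g. taken from `kubectl get nodes`.
# drain also cordons the node; the explicit cordon just makes the intent clear.
quarantine_node() {
  kubectl cordon "$1" &&
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data
}

# Allow scheduling again once node networking looks healthy.
release_node() {
  kubectl uncordon "$1"
}
```

DaemonSet pods such as `aws-node` and kube-proxy stay on the node during the drain, so they get time to set up networking before `release_node` lets workloads back on.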
> We've fixed this by draining/cordoning the node on startup.

Was this node-specific behavior? If so, perhaps something running on the node is changing iptables. Yes, logs will help.
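To get a rough picture of what is writing iptables rules on a node, the `iptables-save` output can be grouped by chain-name prefix. This is a sketch with a hypothetical function name; it assumes the usual prefixes (`KUBE-` for kube-proxy and kubelet, `AWS-` for VPC CNI, `cali-` for Calico).

```shell
# Sketch only: attributes rules to components by chain-name prefix, which is
# an approximation. Reads `iptables-save` output on stdin.
iptables_rule_prefixes() {
  awk '
    $1 == "-A" {
      if      ($2 ~ /^KUBE-/) key = "KUBE"
      else if ($2 ~ /^AWS-/)  key = "AWS"
      else if ($2 ~ /^cali-/) key = "cali"
      else                    key = "other"
      count[key]++
    }
    END {
      # Fixed output order so two runs can be compared line by line.
      n = split("KUBE AWS cali other", order, " ")
      for (i = 1; i <= n; i++) printf "%s %d\n", order[i], count[order[i]] + 0
    }
  '
}
```

Running `sudo iptables-save | iptables_rule_prefixes` on the node twice, a few seconds apart, shows whether any group of rules is changing while pods lose connectivity.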
Closing this, as the customer was able to resolve it at the node level.
This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.