Comments (16)
Aha. I had a cluster of two nodes. I had flannel, CoreDNS, NGINX Ingress controller and a simple whoami service installed. Two replicas.
I had this command active while running the upgrade:
while true; do curl 'http://whoami.anton-johansson.local'; sleep 0.1; done
from kubernetes-the-right-way.
Manifest for the whoami service:
---
kind: Namespace
apiVersion: v1
metadata:
  name: whoami
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: whoami
  namespace: whoami
  labels:
    app.kubernetes.io/name: whoami
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: whoami
  template:
    metadata:
      labels:
        app.kubernetes.io/name: whoami
    spec:
      securityContext:
        runAsUser: 1000
      containers:
        - name: whoami
          image: containous/whoami:v1.0.1
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
          args:
            - '-port'
            - '12345'
---
kind: Service
apiVersion: v1
metadata:
  name: whoami
  namespace: whoami
  labels:
    app.kubernetes.io/name: whoami
spec:
  selector:
    app.kubernetes.io/name: whoami
  ports:
    - port: 8080
      targetPort: 12345
      protocol: TCP
---
kind: Ingress
apiVersion: networking.k8s.io/v1beta1
metadata:
  name: whoami
  namespace: whoami
  labels:
    app.kubernetes.io/name: whoami
  annotations:
    kubernetes.io/ingress.class: external
spec:
  rules:
    - host: whoami.anton-johansson.local
      http:
        paths:
          - path: /
            backend:
              serviceName: whoami
              servicePort: 8080
Oh, that explains why I thought I was going crazy then. Nice find!
I've decided to perform the upgrade on each worker node individually, where I can perform pre- and post-maintenance actions, like so:
# Upgrade all masters, in an orderly fashion
$ ansible-playbook --inventory my-inventory --extra-vars "serial_all=1" --limit "etcd,masters" ~/projects/kubernetes-the-right-way/install.yml
# Upgrade each worker individually
$ kubectl cordon k8s-node-1
$ ./do-something-to-failover-metallb k8s-node-1
$ ansible-playbook --inventory my-inventory --limit "k8s-node-1" ~/projects/kubernetes-the-right-way/install.yml
$ kubectl uncordon k8s-node-1
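In principle, those manual pre- and post-steps could also live inside a small wrapper play instead of being run by hand. A rough sketch; the `upgrade` role name and the `do-something-to-failover-metallb` script are placeholders, not actual features of the repo:

```yaml
# Hypothetical wrapper play: cordon and fail over before the upgrade
# roles run on a node, uncordon afterwards. serial: 1 processes one
# node at a time.
- hosts: nodes
  serial: 1
  pre_tasks:
    - name: Cordon the node before upgrading
      command: kubectl cordon {{ inventory_hostname }}
      delegate_to: localhost
    - name: Fail over MetalLB away from this node (hypothetical script)
      command: ./do-something-to-failover-metallb {{ inventory_hostname }}
      delegate_to: localhost
  roles:
    - upgrade   # placeholder for the actual install/upgrade roles
  post_tasks:
    - name: Uncordon the node after upgrading
      command: kubectl uncordon {{ inventory_hostname }}
      delegate_to: localhost
```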
Here's an example from the pause Ansible module:
# Pause for 5 minutes to build app cache.
- pause:
    minutes: 5

# Pause until you can verify updates to an application were successful.
- pause:

# A helpful reminder of what to look out for post-update.
- pause:
    prompt: "Make sure org.foo.FooOverload exception is not present"

# Pause to get some sensitive input.
- pause:
    prompt: "Enter a secret"
    echo: no
I'm not sure how well this would deal with serial_all though.
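For what it's worth, a play that runs with serial: 1 executes all of its tasks once per batch, so a pause inside it should fire once per node, which is roughly the behaviour wanted here. A minimal sketch (hosts and task layout are made up for illustration):

```yaml
# Hypothetical play: with serial: 1, the pause prompts before
# each node's upgrade tasks, one node at a time.
- hosts: nodes
  serial: 1
  tasks:
    - pause:
        prompt: "About to upgrade {{ inventory_hostname }}. Fail over MetalLB now, then press enter"
    # ...the actual upgrade tasks for this node would follow here...
```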
I haven't been able to replicate the problem. I even tried stopping containerd completely on two nodes: one running an NGINX ingress controller, and the other a simple web app. I could still access the web app. However, I did run into problems when restarting kubelet, which seems to cause downtime. I think this is due to how service networking is provisioned through the kubelets. In either case, containers (or processes) are not restarted when kubelet or containerd is restarted. Pod connectivity is affected, unfortunately.
Hmm, you're probably right. But it's probably not kubelet. kube-proxy is more likely, as that's the component handling the network routing.
Strange that you cannot reproduce, though... How did you test your pods during the upgrade process?
That makes sense. I didn't do an upgrade, just took down containerd and kubelet. I'll do an upgrade and see what happens. Can you share any particular curl command to run during the upgrade? I would appreciate it.
Wait wait wait wait... I have already had this up for discussion before (#48) and you fixed it by restarting services in handlers and waiting for services to come back up (#52).
This issue is related to something else. I'm using MetalLB (layer2 mode) to assign external IP addresses for my services. MetalLB layer2 mode isn't active/active, so only one destination is being routed to at any given time. When I stop kube-proxy on that node, it takes about a second for MetalLB to redirect traffic to another node that has the target pod, causing downtime during that second.
I think it's safe to close this issue; however, I'm gonna have to research a way to gracefully handle these issues, by telling MetalLB that a node is going into maintenance.
Do you have any ideas or suggestions on the subject?
EDIT:
I found metallb/metallb#494, which seems to describe the issue and a possible solution. It requires quite a bit more from the upgrade process, though...
EDIT 2:
Now I'm thoroughly confused. When upgrading a cluster today at work, I realized that MetalLB was the cause of the downtime (as described above). When I ran it at home a couple of days ago, I did not have MetalLB and still experienced issues. That I cannot explain. I'll see if it happens next time, when doing the same for v1.16.7.
@amimof Since we use MetalLB and it gives us a short period of downtime, I need a way of handling this when running the playbook.
Alternative 1:
Simply stop and await user input before starting to upgrade each node. This allows me to manually and gracefully force MetalLB to announce IP addresses on another host before proceeding. This is the easy alternative, and could easily be made configurable (don't await user input by default).
Alternative 2:
Allow hooking into the playbook somehow: run a certain command before the Ansible roles of the node are executed, and also allow running a command after the Ansible roles of the node. This would allow me to automatically force MetalLB to switch host. However, I'm not sure how this would be done. Maybe two new roles, pre and post, which can somehow be configured with the inventory file?
Do you have any thoughts?
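Alternative 2 could perhaps be sketched as tiny roles that just shell out to commands defined in the inventory. All names here (pre_hook, post_hook, the role path) are made up for illustration; nothing like this exists in the repo today:

```yaml
# roles/pre/tasks/main.yml (hypothetical): run an operator-supplied
# command before this node's upgrade roles execute.
- name: Run pre-upgrade hook if one is configured
  command: "{{ pre_hook }} {{ inventory_hostname }}"
  delegate_to: localhost
  when: pre_hook is defined

# In the inventory, per group or host, something like:
# [nodes:vars]
# pre_hook=./do-something-to-failover-metallb
# post_hook=./verify-node-back-in-rotation
```

A matching post role would do the same with post_hook after the node's roles have run.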
I've tried reproducing my problem by shutting down kube-proxy, and even kubelet and containerd as well, but my services just seem to work anyway. At first, I thought they would suffer a small downtime when MetalLB switched over announcement to another node, but it just doesn't happen like it did when actually upgrading the Kubernetes version.
I think I need to dig a bit more into MetalLB to understand how everything works before proceeding here. I'll also see if it happens when upgrading our cluster to 1.16.7.
So I have been doing some testing and I managed to reproduce the problem, but only when upgrading, which led me to believe that it has something to do with how Kubernetes handles versions of various components. I upgraded a 1.15.10 cluster to 1.16.7.
Then I discovered that if kubelet is restarted with a version that is newer (or older) than the one before it, it causes the pods on that node to restart. But if kubelet is restarted without replacing the binary, nothing happens. At this stage I have no idea why this happens.
@amimof Any thoughts on how I can perform this in a "handled" way? A simple pause (wait for input) would suffice; then I could manually fail over my MetalLB IP address.
An automatic hook would be ideal, but I'm not sure how that would be built.
I can't seem to find any solution to this, unfortunately. Would it be possible to do your MetalLB failovers before the upgrade, and only run the upgrade on parts of the cluster at a time? A pause function is basically an anti-pattern, automation-wise.
Yeah, I don't see any good solution for a fully automated process here. I have some ideas for a tool that I can use to fail over (cordon the node being upgraded, delete the pod on the node if it's currently the MetalLB "master", and uncordon). MetalLB will then announce from another node and I'm free to run the upgrade.
One idea would be to allow running an executable as a pre-step before each node is upgraded? Then you can do whatever you want: sleep, ask for input, force a failover, or whatever you like.
Again, not sure how this works with anything other than serial_all=1, but still.
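One reading of that cordon/delete/uncordon idea, expressed as Ansible tasks. The metallb-system namespace and the app=metallb,component=speaker labels match MetalLB's stock manifests but may differ per installation, and whether cordoning alone moves the announcement depends on the MetalLB version, so treat this as a sketch:

```yaml
- name: Cordon the node being upgraded
  command: kubectl cordon {{ inventory_hostname }}
  delegate_to: localhost

- name: Delete the MetalLB speaker pod on this node to force a failover
  command: >
    kubectl -n metallb-system delete pod
    -l app=metallb,component=speaker
    --field-selector spec.nodeName={{ inventory_hostname }}
  delegate_to: localhost

- name: Uncordon the node again
  command: kubectl uncordon {{ inventory_hostname }}
  delegate_to: localhost
```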
I guess you could define two node groups in your inventory and run the upgrade twice. One for each group. And you could do the cordon/uncordon in between each run.
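That two-group approach might look something like this with a YAML inventory; the group and host names here are made up:

```yaml
# inventory.yml (hypothetical): split the workers so one half can be
# upgraded while MetalLB announces from the other half.
all:
  children:
    nodes_a:
      hosts:
        k8s-node-1:
    nodes_b:
      hosts:
        k8s-node-2:

# Then run the upgrade twice, handling the failover in between:
#   ansible-playbook --inventory inventory.yml --limit "nodes_a" install.yml
#   (cordon / fail over / verify, then uncordon)
#   ansible-playbook --inventory inventory.yml --limit "nodes_b" install.yml
```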