Comments (97)

chrisohaver avatar chrisohaver commented on June 1, 2024 2

The fix in #28 appears to have fixed the issue.

I was able to reproduce the issue without the #28 fix within a few seconds by hammering coredns with two parallel threads of queries. With the fix, the problem does not occur.
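
For reference, that kind of load can be approximated with two parallel query loops along these lines (a rough sketch, not the exact harness used; the name and server IP are placeholders):

for i in 1 2; do
  while true; do dig @10.233.0.3 cb-test1.default.svc.cluster.local. A +short >/dev/null; done &
done
wait

(Interrupt with Ctrl-C to stop the loops.)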


anandgubbala avatar anandgubbala commented on June 1, 2024 1

@chrisohaver one quick question on the sync time period w.r.t. multiple stanzas in the Corefile. Please clarify.
What is the resync time period with the API servers (local and remote)?
Is this sync pulled by coredns or pushed by the API server? On the TCP connection we see continuous messages sent by the API server. As these are TLS, we are not sure what these messages are.


johnbelamaric avatar johnbelamaric commented on June 1, 2024 1


bjethwan avatar bjethwan commented on June 1, 2024

It is completely unpredictable now.

$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# 
dnstools# 
dnstools# host cb-test1
cb-test1.default.svc.cluster.local has address 10.223.33.149
cb-test1.default.svc.cluster.local has address 10.223.36.104
dnstools# 
dnstools# 
dnstools# 
dnstools# 
dnstools# host nginxd
Host nginxd not found: 3(NXDOMAIN)
dnstools# 
dnstools# 
dnstools# host nginxd
nginxd.default.svc.cluster.local has address 10.233.251.220
Host nginxd.default.svc.cluster.local not found: 3(NXDOMAIN)


chrisohaver avatar chrisohaver commented on June 1, 2024

Are the "remote" services stable? That is, when you get NXDOMAIN, are their endpoints/pods ready?

Can you use a more precise tool, like dig, instead of host? Tools such as host hide details in both the output and the requests they make for you.
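
For example, something along these lines (a sketch; substitute the actual service name and your cluster DNS IP):

dig @10.233.0.3 cb-test1.default.svc.cluster.local. A

or, to exercise the search path the way host does, and show each intermediate query:

dig cb-test1 A +search +showsearch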

If you can, please also show the coredns logs during these failures.


bjethwan avatar bjethwan commented on June 1, 2024

Hi @chrisohaver

The remote services are stable and no one else is working on that k8s cluster.

This is how the pods and services look on the remote k8s.

$ k get pods
NAME                                            READY   STATUS    RESTARTS   AGE
cb-test1-0000                                   1/1     Running   0          12d
cb-test1-0001                                   1/1     Running   0          12d
couchbase-operator-7654d844cb-q6qwf             1/1     Running   0          12d
couchbase-operator-admission-7ff868f54c-xkjfm   1/1     Running   0          12d
nginxd-8fd4b98b-xr94l                           1/1     Running   0          6d1h
$ k get svc
NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                          AGE
cb-test1                       ClusterIP   None             <none>        8091/TCP,18091/TCP               12d
cb-test1-srv                   ClusterIP   None             <none>        11210/TCP,11207/TCP              12d
cb-test1-ui                    NodePort    10.233.171.114   <none>        8091:36143/TCP,18091:44380/TCP   12d
couchbase-operator-admission   ClusterIP   10.233.101.72    <none>        443/TCP                          12d
kubernetes                     ClusterIP   10.233.0.1       <none>        443/TCP                          13d
nginxd                         ClusterIP   10.233.251.220   <none>        80/TCP                           6d7h


bjethwan avatar bjethwan commented on June 1, 2024

Commands

dnstools# host cb-test1
Host cb-test1 not found: 3(NXDOMAIN)
dnstools# 
dnstools# 
dnstools# host kubernetes
kubernetes.default.svc.cluster.local has address 10.233.0.1
dnstools# 
dnstools# 
dnstools# dig cb-test1.default.svc.cluster.local

; <<>> DiG 9.11.3 <<>> cb-test1.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 8481
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 37be9998993e1b13 (echoed)
;; QUESTION SECTION:
;cb-test1.default.svc.cluster.local. IN	A

;; AUTHORITY SECTION:
cluster.local.		5	IN	SOA	ns.dns.cluster.local. hostmaster.cluster.local. 1562646782 7200 1800 86400 5

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Tue Jul 09 18:47:59 UTC 2019
;; MSG SIZE  rcvd: 168

dnstools# 

Logs

CoreDNS Pod1

2019-07-09T18:47:33.632Z [INFO] 10.240.1.43:44683 - 29705 "A IN cb-test1. udp 26 false 512" NXDOMAIN qr,rd,ra 101 0.000877701s
2019-07-09T18:47:39.664Z [INFO] 10.240.1.43:33376 - 21394 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.000152475s
2019-07-09T18:47:39.665Z [INFO] 10.240.1.43:60644 - 61597 "AAAA IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 147 0.000123909s

CoreDNS Pod2

2019-07-09T18:47:19.232Z [INFO] 10.240.1.43:53553 - 1622 "A IN cb-test1. udp 49 false 4096" NXDOMAIN qr,rd,ra 112 0.000558867s
2019-07-09T18:47:29.457Z [INFO] 10.240.1.43:43611 - 63761 "A IN cb-test1.default.svc.cluster.local. udp 52 false 512" NXDOMAIN qr,aa,rd 145 0.000143867s
2019-07-09T18:47:29.457Z [INFO] 10.240.1.43:42445 - 39010 "A IN cb-test1.svc.cluster.local. udp 44 false 512" NXDOMAIN qr,aa,rd 137 0.00012489s
2019-07-09T18:47:29.458Z [INFO] 10.240.1.43:38577 - 19558 "A IN cb-test1.cluster.local. udp 40 false 512" NXDOMAIN qr,aa,rd 133 0.000105436s
2019-07-09T18:47:29.458Z [INFO] 10.240.1.43:43389 - 17010 "A IN cb-test1.us-west-2.compute.internal. udp 53 false 512" NXDOMAIN qr,aa,rd 150 0.000185465s
2019-07-09T18:47:29.459Z [INFO] 10.240.1.43:59686 - 12056 "A IN cb-test1.compute.internal. udp 43 false 512" NXDOMAIN qr,aa,rd 140 0.000118657s
2019-07-09T18:47:35.494Z [INFO] 10.240.1.43:33028 - 44728 "MX IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 147 0.00015827s
2019-07-09T18:47:57.259Z [INFO] 10.240.1.5:39270 - 41945 "AAAA IN autoscaling.us-west-2.amazonaws.com.cluster.local. udp 67 false 512" NXDOMAIN qr,aa,rd 160 0.

Other logs (there are many such log lines)

2019-07-09T18:44:00.974Z [INFO] 10.240.1.5:33301 - 25400 "A IN autoscaling.us-west-2.amazonaws.com.compute.internal. udp 70 false 512" NXDOMAIN qr,aa,rd 167 0.000142165s
2019-07-09T18:44:00.975Z [INFO] 10.240.1.5:44984 - 16953 "AAAA IN autoscaling.us-west-2.amazonaws.com. udp 53 false 512" NOERROR qr,rd,ra 156 0.001166823s
2019-07-09T18:44:36.232Z [INFO] 10.240.2.0:52142 - 4162 "A IN public.update.core-os.net. udp 43 false 512" NOERROR qr,rd,ra 575 0.000508107s
2019-07-09T18:44:36.232Z [INFO] 10.240.2.0:52142 - 25674 "AAAA IN public.update.core-os.net. udp 43 false 512" NOERROR qr,rd,ra 233 0.000934546s
2019-07-09T18:45:01.103Z [INFO] 10.240.1.5:41312 - 36198 "AAAA IN autoscaling.us-west-2.amazonaws.com.kube-system.svc.cluster.local. udp 83 false 512" NXDOMAIN qr,aa,rd 176 0.000141301s
2019-07-09T18:45:01.104Z [INFO] 10.240.1.5:53117 - 44507 "AAAA IN autoscaling.us-west-2.amazonaws.com.cluster.local. udp 67 false 512" NXDOMAIN qr,aa,rd 160 0.000112282s
2019-07-09T18:45:01.104Z [INFO] 10.240.1.5:38164 - 29618 "A IN autoscaling.us-west-2.amazonaws.com.cluster.local. udp 67 false 512" NXDOMAIN qr,aa,rd 160 0.000124262s
2019-07-09T18:45:01.105Z [INFO] 10.240.1.5:54859 - 54413 "AAAA IN autoscaling.us-west-2.amazonaws.com.default.svc.cluster.local. udp 79 false 512" NXDOMAIN qr,aa,rd 172 0.000097898s
2019-07-09T18:45:01.105Z [INFO] 10.240.1.5:51074 - 57474 "A IN autoscaling.us-west-2.amazonaws.com.default.svc.cluster.local. udp 79 false 512" NXDOMAIN qr,aa,rd 172 0.000096969s
2019-07-09T18:45:01.106Z [INFO] 10.240.1.5:49740 - 61871 "AAAA IN autoscaling.us-west-2.amazonaws.com.us-west-2.compute.internal. udp 80 false 512" NXDOMAIN qr,aa,rd 177 0.000139668s
2019-07-09T18:45:01.106Z [INFO] 10.240.1.5:34560 - 57669 "A IN autoscaling.us-west-2.amazonaws.com.us-west-2.compute.internal. udp 80 false 512" NXDOMAIN qr,aa,rd 177 0.000182925s
2019-07-09T18:45:01.109Z [INFO] 10.240.1.5:52530 - 8611 "A IN autoscaling.us-west-2.amazonaws.com. udp 53 false 512" NOERROR qr,rd,ra 913 0.001250304s
2019-07-09T18:46:01.211Z [INFO] 10.240.1.5:33419 - 41471 "AAAA IN autoscaling.us-west-2.amazonaws.com.kube-system.svc.cluster.local. udp 83 false 512" NXDOMAIN qr,aa,rd 176 0.000156446s
2019-07-09T18:46:01.212Z [INFO] 10.240.1.5:59119 - 64093 "AAAA IN autoscaling.us-west-2.amazonaws.com.svc.cluster.local. udp 71 false 512" NXDOMAIN qr,aa,rd 173 0.000169874s
2019-07-09T18:46:01.213Z [INFO] 10.240.1.5:45080 - 50093 "AAAA IN autoscaling.us-west-2.amazonaws.com.cluster.local. udp 67 false 512" NXDOMAIN qr,aa,rd 160 0.000091638s
2019-07-09T18:46:01.213Z [INFO] 10.240.1.5:50292 - 30461 "AAAA IN autoscaling.us-west-2.amazonaws.com.default.svc.cluster.local. udp 79 false 512" NXDOMAIN qr,aa,rd 172 0.0000985s
2019-07-09T18:46:01.214Z [INFO] 10.240.1.5:37451 - 62031 "AAAA IN autoscaling.us-west-2.amazonaws.com.us-west-2.compute.internal. udp 80 false 512" NXDOMAIN qr,aa,rd 177 0.000118527s
2019-07-09T18:46:01.214Z [INFO] 10.240.1.5:49168 - 39235 "A IN autoscaling.us-west-2.amazonaws.com.us-west-2.compute.internal. udp 80 false 512" NXDOMAIN qr,aa,rd 177 0.00012768s
2019-07-09T18:46:01.215Z [INFO] 10.240.1.5:51132 - 44998 "A IN autoscaling.us-west-2.amazonaws.com.compute.internal. udp 70 false 512" NXDOMAIN qr,aa,rd 167 0.00011468s
2019-07-09T18:46:01.215Z [INFO] 10.240.1.5:40791 - 45354 "AAAA IN autoscaling.us-west-2.amazonaws.com.compute.internal. udp 70 false 512" NXDOMAIN qr,aa,rd 167 0.000127792s
2019-07-09T18:46:01.218Z [INFO] 10.240.1.5:45628 - 54834 "A IN autoscaling.us-west-2.amazonaws.com. udp 53 false 512" NOERROR qr,rd,ra 913 0.001479689s
2019-07-09T18:46:41.792Z [INFO] 10.240.1.0:54014 - 64054 "AAAA IN auth.docker.io. udp 32 false 512" NOERROR qr,rd,ra 125 0.035980896s
2019-07-09T18:46:41.794Z [INFO] 10.240.1.0:54014 - 2603 "A IN auth.docker.io. udp 32 false 512" NOERROR qr,rd,ra 792 0.037591496s
2019-07-09T18:46:42.109Z [INFO] 10.240.1.0:57233 - 23804 "AAAA IN registry-1.docker.io. udp 38 false 512" NOERROR qr,rd,ra 131 0.000874667s
2019-07-09T18:46:42.110Z [INFO] 10.240.1.0:57233 - 31989 "A IN registry-1.docker.io. udp 38 false 512" NOERROR qr,rd,ra 734 0.001293918s
2019-07-09T18:52:58.011Z [INFO] 10.240.1.5:42280 - 11649 "A IN autoscaling.us-west-2.amazonaws.com. udp 53 false 512" NOERROR qr,rd,ra 913 0.140696992s
2019-07-09T18:53:40.797Z [INFO] 10.240.0.7:42846 - 3209 "AAAA IN grafana.com.kube-system.svc.cluster.local. udp 59 false 512" NXDOMAIN qr,aa,rd 152 0.000202411s
2019-07-09T18:53:40.798Z [INFO] 10.240.0.7:53721 - 16344 "AAAA IN grafana.com.svc.cluster.local. udp 47 false 512" NXDOMAIN qr,aa,rd 149 0.000210925s
2019-07-09T18:53:40.799Z [INFO] 10.240.0.7:59264 - 48321 "A IN grafana.com.svc.cluster.local. udp 47 false 512" NXDOMAIN qr,aa,rd 149 0.000251287s
2019-07-09T18:53:40.800Z [INFO] 10.240.0.7:43284 - 53604 "AAAA IN grafana.com.compute.internal. udp 46 false 512" NXDOMAIN qr,aa,rd 143 0.000134783s
2019-07-09T18:53:40.801Z [INFO] 10.240.0.7:54951 - 1859 "A IN grafana.com.compute.internal. udp 46 false 512" NXDOMAIN qr,aa,rd 143 0.000084377s
2019-07-09T18:53:58.103Z [INFO] 10.240.1.5:58711 - 25548 "A IN autoscaling.us-west-2.amazonaws.com.kube-system.svc.cluster.local. udp 83 false 512" NXDOMAIN qr,aa,rd 176 0.000390035s


chrisohaver avatar chrisohaver commented on June 1, 2024

The logs appear to be incomplete (missing part of 2nd and all of 3rd query) ... nevertheless...
The following log entry should have shown NOERROR, if cb-test1 was ready.

2019-07-09T18:47:29.457Z [INFO] 10.240.1.43:43611 - 63761 "A IN cb-test1.default.svc.cluster.local. udp 52 false 512" NXDOMAIN qr,aa,rd 145 0.000143867s

What's the output of kubectl get endpoints?

What version of CoreDNS did you compile with, and what version of kubernetes are you using?


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver,
I am not so good with the Corefile yet.
I have a gut feeling that I might not have set up the kubernetai plugin correctly.
Request you to please check once in my first post above. This setup is using Infoblox DNS, not Route53.

k8s version: v1.14.3 (both the clusters are using the same v1.14.3 of kubernetes)
Test-1 CoreDNS Version: 1.4.0 (chrisohaver/coredns:1.4.0-6f5b294-kubernetai) (failing the same way)
Test-2 CoreDNS version: 1.5.0 (bjethwan/coredns:1.5.0-kubernetai) (failing the same way)

k get endpoints
NAME                           ENDPOINTS                                                                 AGE
cb-test1                       10.223.33.149:18091,10.223.36.104:18091,10.223.33.149:8091 + 1 more...    12d
cb-test1-srv                   10.223.33.149:11210,10.223.36.104:11210,10.223.33.149:11207 + 1 more...   12d
cb-test1-ui                    10.223.33.149:8091,10.223.36.104:8091,10.223.33.149:18091 + 1 more...     12d
couchbase-operator             <none>                                                                    14d
couchbase-operator-admission   10.223.36.48:8443                                                         12d
kubernetes                     10.223.32.35:443,10.223.33.252:443,10.223.37.183:443                      14d
nginxd                         10.223.37.189:80                                                          6d16h

I tried the following to get the logs again.

dnstools# dig cb-test1

; <<>> DiG 9.11.3 <<>> cb-test1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 34401
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;cb-test1.			IN	A

;; AUTHORITY SECTION:
.			10	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2019070901 1800 900 604800 86400

;; Query time: 99 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Wed Jul 10 04:00:54 UTC 2019
;; MSG SIZE  rcvd: 112

dnstools# 
dnstools# 
dnstools# dig cb-test1.default.svc.cluster.local

; <<>> DiG 9.11.3 <<>> cb-test1.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 37494
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: e1d9822c72f4936f (echoed)
;; QUESTION SECTION:
;cb-test1.default.svc.cluster.local. IN	A

;; AUTHORITY SECTION:
cluster.local.		5	IN	SOA	ns.dns.cluster.local. hostmaster.cluster.local. 1562646786 7200 1800 86400 5

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Wed Jul 10 04:01:22 UTC 2019
;; MSG SIZE  rcvd: 168

dnstools# 
dnstools# dig nginxd.default.svc.cluster.local

; <<>> DiG 9.11.3 <<>> nginxd.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 63907
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 8c231d5ee75a6bf4 (echoed)
;; QUESTION SECTION:
;nginxd.default.svc.cluster.local. IN	A

;; AUTHORITY SECTION:
cluster.local.		5	IN	SOA	ns.dns.cluster.local. hostmaster.cluster.local. 1562646786 7200 1800 86400 5

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Wed Jul 10 04:01:37 UTC 2019
;; MSG SIZE  rcvd: 166

dnstools# 
dnstools# dig kubernetes.default.svc.cluster.local

; <<>> DiG 9.11.3 <<>> kubernetes.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26986
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 4dd72bd7f50207f6 (echoed)
;; QUESTION SECTION:
;kubernetes.default.svc.cluster.local. IN A

;; ANSWER SECTION:
kubernetes.default.svc.cluster.local. 5	IN A	10.233.0.1

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Wed Jul 10 04:01:55 UTC 2019
;; MSG SIZE  rcvd: 129

dnstools# exit

Pod-1

2019-07-10T04:01:02.486Z [INFO] 10.240.1.5:38258 - 38006 "A IN autoscaling.us-west-2.amazonaws.com.compute.internal. udp 70 false 512" NXDOMAIN qr,aa,rd 167 0.00011861s
2019-07-10T04:01:02.489Z [INFO] 10.240.1.5:54740 - 15088 "A IN autoscaling.us-west-2.amazonaws.com. udp 53 false 512" NOERROR qr,rd,ra 907 0.002004799s
2019-07-10T04:01:26.854Z [INFO] 10.240.1.44:58454 - 37494 "A IN cb-test1.default.svc.cluster.local. udp 75 false 4096" NXDOMAIN qr,aa,rd 145 0.000173265s
2019-07-10T04:01:42.201Z [INFO] 10.240.1.44:41079 - 63907 "A IN nginxd.default.svc.cluster.local. udp 73 false 4096" NXDOMAIN qr,aa,rd 143 0.000187836s
2019-07-10T04:01:59.734Z [INFO] 10.240.1.44:46502 - 26986 "A IN kubernetes.default.svc.cluster.local. udp 77 false 4096" NOERROR qr,aa,rd 106 0.000148679s

Pod-2
2019-07-10T04:00:54.617Z [INFO] 10.240.1.44:34154 - 34401 "A IN cb-test1. udp 49 false 4096" NXDOMAIN qr,rd,ra 112 0.099616315s


chrisohaver avatar chrisohaver commented on June 1, 2024

Request you to please check once in my first post above.

I had already reviewed it when you first posted it. It looks OK. There are some issues with it, but those would not cause the issue you are experiencing, so I didn't want to muddy the water by pointing them out.

I can't explain why you are seeing this intermittent issue. Can you clarify the failure scenario again? Initially you said it would work fine for some time (how long?), and then it would start acting inconsistently. What was the rate of failure that you saw at that time?
Later you say "It's completely unpredictable now." I'm not sure what you mean by that. Did it start failing more often? In what way was it predictable before?

A couple of troubleshooting steps that may or may not help shed light:

  • If you can, try reducing the number of CoreDNS replicas to one (see the sketch after this list). It's possible that one instance is OK and the other is not, hence the unpredictable behavior. A single instance also means a single log, which is easier to troubleshoot.
  • Try disabling the cache for cluster.local to see if it changes the behavior at all.
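
For the first step, a minimal sketch (assuming the usual coredns Deployment in kube-system; note that a dns-autoscaler, if present, will scale it back up unless it is paused or removed first):

kubectl -n kube-system scale deployment coredns --replicas=1
kubectl -n kube-system get pods -o wide -l k8s-app=kube-dns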


bjethwan avatar bjethwan commented on June 1, 2024

Hi @chrisohaver

I will debug the issue and let you know.

I am relatively new to CoreDNS. Please let me know any glaring config issues, so I will learn and improve.

I called it unpredictable because, as you can see from the output below, when I run the host command in quick succession the output varies. In the case below, initially it was not returning the IPs for remote dns entries; then I deleted the coredns pods and saw it working perfectly fine using the dnstools pod.
And then, in a matter of 10-15 mins, I exited the previous dnstools pod, recreated a new one, and got the output below.

$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# 
dnstools# 
dnstools# host cb-test1
cb-test1.default.svc.cluster.local has address 10.223.33.149
cb-test1.default.svc.cluster.local has address 10.223.36.104
Host cb-test1.default.svc.cluster.local not found: 3(NXDOMAIN)
dnstools# 
dnstools# host cb-test1
cb-test1.default.svc.cluster.local has address 10.223.33.149
cb-test1.default.svc.cluster.local has address 10.223.36.104
Host cb-test1.default.svc.cluster.local not found: 3(NXDOMAIN)
Host cb-test1.default.svc.cluster.local not found: 3(NXDOMAIN)
dnstools# 
dnstools# 
dnstools# 
dnstools# host cb-test1
Host cb-test1 not found: 3(NXDOMAIN)
dnstools# 
dnstools# 
dnstools# host cb-test1
cb-test1.default.svc.cluster.local has address 10.223.33.149
cb-test1.default.svc.cluster.local has address 10.223.36.104
Host cb-test1.default.svc.cluster.local not found: 3(NXDOMAIN)
dnstools# 
dnstools# 
dnstools# 
dnstools# host cb-test1
Host cb-test1 not found: 3(NXDOMAIN)
dnstools# 
dnstools# 
dnstools# host cb-test1
Host cb-test1 not found: 3(NXDOMAIN)
dnstools# 
dnstools# 
dnstools# host cb-test1
cb-test1.default.svc.cluster.local has address 10.223.33.149
cb-test1.default.svc.cluster.local has address 10.223.36.104


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver

Just FYI: Disabling the cache didn't help.

  Corefile: |
    # Kubernetes Services (cluster.local domain)
    10.233.0.0/16 10.240.0.0/12 cluster.local {
      prometheus :9153
      errors
      log
      #cache 10
      template IN ANY net.svc.cluster.local com.svc.cluster.local org.svc.cluster.local internal.svc.cluster.local {
        rcode NXDOMAIN
        authority "{{ .Zone }} 60 IN SOA ns.coredns.cluster.local coredns.cluster.local (1 60 60 60 60)"
      }
      kubernetai {
        fallthrough
      }
      kubernetai {
        endpoint https://10.223.32.35:443
        tls /etc/k8sflashcerts/client.crt /etc/k8sflashcerts/client.key /etc/k8sflashcerts/ca.crt
        upstream
      }
    }


chrisohaver avatar chrisohaver commented on June 1, 2024

OK, so it looks like after about 10-15 minutes, it starts failing about 50% of the time.
That hints that after 10-15 minutes, one of the CoreDNS instances has a problem with its k8s API client connection to the remote cluster and starts returning NXDOMAIN. The other instance continues to work fine, hence a 50% fail rate. Dropping down to one CoreDNS instance would verify whether this is the case.

I presume that you see no errors in the CoreDNS logs, so it's failing silently, which makes it hard to debug.

There could perhaps be errors in the kubernetes API pod logs of the remote cluster.


chrisohaver avatar chrisohaver commented on June 1, 2024

How many nodes are in your local cluster?


bjethwan avatar bjethwan commented on June 1, 2024

Local k8s
1 master + 2 workers

Remote k8s
3 masters + 3 workers


chrisohaver avatar chrisohaver commented on June 1, 2024

Instead of reducing replicas to 1, you could test one pod directly using the IPs of the CoreDNS pods... e.g.

dig @coredns-pod-ip-address cb-test1.default.svc.cluster.local.


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
Thanks Chris !

I got SERVFAIL with one pod and NXDOMAIN with the other.
Let me check the API server on the other cluster. My infra team gave me an HA cluster, but they didn't give me a load balancer URL, only the IP of one of the masters (kube-apiserver). I will check and come back shortly.

dnstools# dig @10.240.2.23 cb-test1.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> @10.240.2.23 cb-test1.default.svc.cluster.local.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 41214
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 0066403744f60448 (echoed)
;; QUESTION SECTION:
;cb-test1.default.svc.cluster.local. IN	A

;; Query time: 1 msec
;; SERVER: 10.240.2.23#53(10.240.2.23)
;; WHEN: Wed Jul 10 17:23:44 UTC 2019
;; MSG SIZE  rcvd: 75

dnstools# 
dnstools# 
dnstools# dig @10.240.0.13 cb-test1.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> @10.240.0.13 cb-test1.default.svc.cluster.local.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 9269
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 6b2db5f29742f730 (echoed)
;; QUESTION SECTION:
;cb-test1.default.svc.cluster.local. IN	A

;; AUTHORITY SECTION:
cluster.local.		5	IN	SOA	ns.dns.cluster.local. hostmaster.cluster.local. 1562768686 7200 1800 86400 5

;; Query time: 0 msec
;; SERVER: 10.240.0.13#53(10.240.0.13)
;; WHEN: Wed Jul 10 17:24:36 UTC 2019
;; MSG SIZE  rcvd: 168


chrisohaver avatar chrisohaver commented on June 1, 2024

SERVFAIL is a new response in this issue. What do the coredns logs look like around that event? Any errors? API failures?

Since you have > 1 node, we should try to determine if the problems are only occurring for coredns instances running on a specific set of nodes.

You can see which nodes the coredns instances are running on with ...

kubectl -n kube-system get pods -o wide -l k8s-app=kube-dns

... which will also show the IP address of each coredns pod.

You can then test the pods individually via IP to see which ones fail, and by noting which nodes they are on, see if there is a pattern.
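
One way to script that check (a sketch, reusing the test name from earlier; adjust as needed):

for ip in $(kubectl -n kube-system get pods -l k8s-app=kube-dns -o jsonpath='{.items[*].status.podIP}'); do
  echo "== $ip =="
  dig @$ip cb-test1.default.svc.cluster.local. A +short
done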


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
Hi Chris, I am still stuck.
I was thinking that not having a load balancer in front of the multiple masters of the remote HA k8s cluster was causing an issue for kubernetai, but that's not the case: I created a single-master k8s cluster and ran into the same issue...

I have attached the logs for coredns and the new remote kube-apiserver.
Chris, would you like to take a look at this together over a webex?

logs.zip

Initially

k run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# 
dnstools# 
dnstools# 
dnstools# host bipind
bipind.default.svc.cluster.local has address 10.233.250.38
dnstools# 
dnstools# host nginxd
nginxd.default.svc.cluster.local has address 10.233.251.220
dnstools# 
dnstools# 
dnstools# 
dnstools# host cb-test1
cb-test1.default.svc.cluster.local has address 10.223.33.149
cb-test1.default.svc.cluster.local has address 10.223.36.104

2019-07-12T06:03:54.158Z [INFO] 10.240.2.26:37744 - 46566 "A IN bipind.default.svc.cluster.local. udp 50 false 512" NOERROR qr,aa,rd 98 0.000140166s
2019-07-12T06:03:54.160Z [INFO] 10.240.2.26:46398 - 43055 "MX IN bipind.default.svc.cluster.local. udp 50 false 512" NOERROR qr,aa,rd 143 0.000107953s

After around 20mins

dnstools# dig bipind

; <<>> DiG 9.11.3 <<>> bipind
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 34173
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;bipind.				IN	A

;; AUTHORITY SECTION:
.			10	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2019071200 1800 900 604800 86400

;; Query time: 1 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Fri Jul 12 06:48:20 UTC 2019
;; MSG SIZE  rcvd: 110
2019-07-12T06:36:03.935Z [INFO] 10.240.2.27:56518 - 15604 "A IN bipind.default.svc.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.000127521s
2019-07-12T06:36:03.936Z [INFO] 10.240.2.27:41577 - 16155 "A IN bipind.svc.cluster.local. udp 42 false 512" NXDOMAIN qr,aa,rd 135 0.000089537s
2019-07-12T06:36:03.938Z [INFO] 10.240.2.27:34744 - 44089 "A IN bipind.compute.internal. udp 41 false 512" NXDOMAIN qr,aa,rd 138 0.000167164s
2019-07-12T06:36:04.038Z [INFO] 10.240.2.27:60757 - 39292 "A IN bipind. udp 24 false 512" NXDOMAIN qr,rd,ra 99 0.099090649s 

Remote k8s cluster - pods in kube-system namespace

cluster-autoscaler-577797b746-jgrtf                                   1/1     Running   0          39h
coredns-56bc6b976d-2fvlh                                              1/1     Running   0          39h
coredns-56bc6b976d-tjwks                                              1/1     Running   0          39h
default-http-backend-2t8kf                                            1/1     Running   0          39h
default-http-backend-fdxrb                                            1/1     Running   0          39h
default-http-backend-jgzx5                                            1/1     Running   0          39h
dns-autoscaler-7668758848-2cs4h                                       1/1     Running   0          39h
heapster-v1.6.0-beta.1-84d5f45946-cj8dr                               4/4     Running   0          39h
kube-apiserver-ip-10-223-32-126.us-west-2.compute.internal            1/1     Running   1          39h
kube-controller-manager-ip-10-223-32-126.us-west-2.compute.internal   1/1     Running   0          39h
kube-flannel-5p8rp                                                    2/2     Running   0          39h
kube-flannel-h576d                                                    2/2     Running   0          39h
kube-flannel-t6jh9                                                    2/2     Running   0          39h
kube-proxy-5l7zn                                                      1/1     Running   0          39h
kube-proxy-rzggz                                                      1/1     Running   0          39h
kube-proxy-tfwj7                                                      1/1     Running   0          39h
kube-scheduler-ip-10-223-32-126.us-west-2.compute.internal            1/1     Running   0          39h
kubernetes-dashboard-74fd98b6f4-h48kw                                 1/1     Running   0          39h
metrics-server-6cc87dfc94-q7gpl                                       2/2     Running   0          39h
monitoring-influxdb-grafana-v4-86b577876b-5xdc4                       2/2     Running   0          39h
netdata-4btw8                                                         1/1     Running   0          39h
netdata-9swkr                                                         1/1     Running   0          39h
nginx-ingress-lb-7df6cc9c94-dlhpc                                     1/1     Running   0          39h
nginx-ingress-lb-7df6cc9c94-flqwl                                     1/1     Running   0          39h
nginx-ingress-lb-7df6cc9c94-rgpbk                                     1/1     Running   0          39h
nginx-proxy-ip-10-223-32-114.us-west-2.compute.internal               1/1     Running   0          39h
nginx-proxy-ip-10-223-36-229.us-west-2.compute.internal               1/1     Running   0          39h
overprovisioning-6dfdddb484-gs4g9                                     1/1     Running   0          39h
overprovisioning-autoscaler-5c64bd67bd-7vqhm                          1/1     Running   0          39h
placeholder-pod-s4mcd                                                 1/1     Running   0          39h
placeholder-pod-vkqhj                                                 1/1     Running   0          39h
scheduled-scaler-679447cb8b-4rz79                                     1/1     Running   0          39h
splunkuniversalforwarder-kktvd                                        1/1     Running   0          39h
splunkuniversalforwarder-n9xr8                                        1/1     Running   0          39h
tiller-deploy-7bfd798744-htr5b                                        1/1     Running   0          39h


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
Maybe you have some pre-baked docker image with enhanced logging around remote apiserver connections?


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
I noticed some interesting error conditions on remote k8s coredns pods. On second thought, had these been related, how would it have worked in the first place?

coredns-remote.log.zip

$ for p in $(kubectl get pods -n=kube-system -l k8s-app=kube-dns -o name); do kubectl logs -n=kube-system $p; done > coredns-remote.log

E0710 15:25:13.285534       1 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=35, ErrCode=NO_ERROR, debug=""
E0710 15:25:13.286560       1 reflector.go:251] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to watch *v1.Endpoints: Get https://10.233.0.1:443/api/v1/endpoints?resourceVersion=1403&timeout=6m27s&timeoutSeconds=387&watch=true: dial tcp 10.233.0.1:443: connect: connection refused
E0710 15:25:13.286583       1 reflector.go:251] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to watch *v1.Service: Get https://10.233.0.1:443/api/v1/services?resourceVersion=1141&timeout=8m23s&timeoutSeconds=503&watch=true: dial tcp 10.233.0.1:443: connect: connection refused
E0710 15:25:13.286570       1 reflector.go:251] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to watch *v1.Namespace: Get https://10.233.0.1:443/api/v1/namespaces?resourceVersion=1042&timeout=6m24s&timeoutSeconds=384&watch=true: dial tcp 10.233.0.1:443: connect: connection refused
E0710 15:25:13.286625       1 reflector.go:251] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to watch *v1.Namespace: Get https://10.233.0.1:443/api/v1/namespaces?resourceVersion=1042&timeout=6m0s&timeoutSeconds=360&watch=true: dial tcp 10.233.0.1:443: connect: connection refused
E0710 15:25:13.286585       1 reflector.go:251] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to watch *v1.Endpoints: Get https://10.233.0.1:443/api/v1/endpoints?resourceVersion=1403&timeout=6m5s&timeoutSeconds=365&watch=true: dial tcp 10.233.0.1:443: connect: connection refused
E0710 15:25:13.286667       1 reflector.go:251] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to watch *v1.Service: Get https://10.233.0.1:443/api/v1/services?resourceVersion=1141&timeout=6m48s&timeoutSeconds=408&watch=true: dial tcp 10.233.0.1:443: connect: connection refused
E0710 15:25:13.286774       1 reflector.go:251] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to watch *v1.Endpoints: Get https://10.233.0.1:443/api/v1/endpoints?resourceVersion=1403&timeout=7m51s&timeoutSeconds=471&watch=true: dial tcp 10.233.0.1:443: connect: connection refused
E0710 15:25:13.286824       1 reflector.go:251] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to watch *v1.Service: Get https://10.233.0.1:443/api/v1/services?resourceVersion=1141&timeout=9m18s&timeoutSeconds=558&watch=true: dial tcp 10.233.0.1:443: connect: connection refused
E0710 15:25:13.286935       1 reflector.go:251] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to watch *v1.Namespace: Get https://10.233.0.1:443/api/v1/namespaces?resourceVersion=1042&timeout=6m27s&timeoutSeconds=387&watch=true: dial tcp 10.233.0.1:443: connect: connection refused
E0710 15:25:16.287334       1 reflector.go:134] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "namespaces" in API group "" at the cluster scope
E0710 15:25:16.290726       1 reflector.go:134] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to list *v1.Service: services is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "services" in API group "" at the cluster scope
E0710 15:25:16.298414       1 reflector.go:134] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to list *v1.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "endpoints" in API group "" at the cluster scope
E0710 15:25:16.298449       1 reflector.go:134] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to list *v1.Service: services is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "services" in API group "" at the cluster scope
E0710 15:25:16.325539       1 reflector.go:134] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "namespaces" in API group "" at the cluster scope: RBAC: [clusterrole.rbac.authorization.k8s.io "system:public-info-viewer" not found, clusterrole.rbac.authorization.k8s.io "system:discovery" not found, clusterrole.rbac.authorization.k8s.io "system:coredns" not found, clusterrole.rbac.authorization.k8s.io "system:basic-user" not found]
E0710 15:25:16.325555       1 reflector.go:134] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "namespaces" in API group "" at the cluster scope: RBAC: [clusterrole.rbac.authorization.k8s.io "system:coredns" not found, clusterrole.rbac.authorization.k8s.io "system:basic-user" not found, clusterrole.rbac.authorization.k8s.io "system:public-info-viewer" not found, clusterrole.rbac.authorization.k8s.io "system:discovery" not found]
E0710 15:25:16.328629       1 reflector.go:134] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to list *v1.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "endpoints" in API group "" at the cluster scope: RBAC: [clusterrole.rbac.authorization.k8s.io "system:public-info-viewer" not found, clusterrole.rbac.authorization.k8s.io "system:discovery" not found, clusterrole.rbac.authorization.k8s.io "system:coredns" not found, clusterrole.rbac.authorization.k8s.io "system:basic-user" not found]
E0710 15:25:16.332994       1 reflector.go:134] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:95: Failed to list *v1.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "endpoints" in API group "" at the cluster scope: RBAC: [clusterrole.rbac.authorization.k8s.io "system:public-info-viewer" not found, clusterrole.rbac.authorization.k8s.io "system:discovery" not found, clusterrole.rbac.authorization.k8s.io "system:coredns" not found, clusterrole.rbac.authorization.k8s.io "system:basic-user" not found]


chrisohaver avatar chrisohaver commented on June 1, 2024

I noticed some interesting error conditions on remote k8s coredns pods ...

Those errors appear to show that at 15:25:13.285534, the API closed the connection (server sent GOAWAY), and afterwards the client was not able to reconnect until 15:25:16.287334, at which point it started getting permissions errors from the API. This implies that the API server restarted for some reason - the API server will send the GOAWAY message when it is shutting down. This lines up with the API logs you provided, which begin at the same time.

Maybe you have some pre-baked docker image with enhanced logging around remote apiserver connections?

I do not. I'd expect there to be reciprocal errors in the remote api log, but I don't see anything that looks different during the window of failure in the logs you sent.

It's confusing to use different query tools interchangeably during your tests, because they behave differently. To keep things simple, please stick to using one tool, preferably dig.

When using dig, please note that it does not use search domains by default. This means that a query like dig bipind will always fail. Instead you would need to do a query using the fqdn...

dig bipind.default.svc.cluster.local

or, you can tell it to use the search domains, and show the search queries ...

dig bipind +search +showsearch

So, in your example where you show that it succeeded several times, and then failed 20 minutes later, the failure is expected, because the fqdn bipind. does not exist.

Some questions ...

  • Is the remote cluster's CoreDNS working normally? Or is it failing to resolve service queries?
  • Are you continuing to see SERVFAIL responses on the local cluster CoreDNS when querying remote services.
  • Have you verified that when the CoreDNS remote service queries become unpredictable, one of the CoreDNS instances is failing and the other is OK? I am assuming this is the case, but would like to verify.
  • Have you noted any pattern to which Nodes the CoreDNS Pods are running on, and whether or not they (individually) fail?
  • If you leave CoreDNS in the 50% query failure state without restarting it, does it eventually fail for 100% of remote services? Or does it stay in a state of 50% query failure forever. i.e. is only one instance failing, or would both instances eventually fail.


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver

Hi Chris, thank you for all the info. And I am sorry for the confusion with tools (host vs dig).

Chris,
I was checking the underlying nodes.
Do these versions look too old? Would that cause an issue?

OS-IMAGE                                        KERNEL-VERSION      CONTAINER-RUNTIME
Container Linux by CoreOS 1967.6.0 (Rhyolite)   4.14.96-coreos-r1   docker://18.6.1


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver

  • Are you continuing to see SERVFAIL responses on the local cluster CoreDNS when querying remote services.

No.

  • Is the remote cluster's CoreDNS working normally? Or is it failing to resolve service queries?

Yes

  • Have you verified that when the CoreDNS remote service queries become unpredictable, one of the CoreDNS instances is failing and the other is OK? I am assuming this is the case, but would like to verify.

This local k8s has 1 master + 2 workers.
I had removed the dns autoscaler, and then scaled the coredns deployment down to a single pod (replicas=1).
Then I made sure to run it on each of the three nodes, but each time it failed after some time.

  • If you leave CoreDNS in the 50% query failure state without restarting it, does it eventually fail for 100% of remote services? Or does it stay in a state of 50% query failure forever. i.e. is only one instance failing, or would both instances eventually fail.

It works 100%, after some time goes down to 50%, and then fails completely.
And then it never recovers.

  • Is the remote cluster's CoreDNS working normally? Or is it failing to resolve service queries?

That is working perfectly fine ...no blip.

dig bipind.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> bipind.default.svc.cluster.local.
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33371
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: f4d7902cae54cffd (echoed)
;; QUESTION SECTION:
;bipind.default.svc.cluster.local. IN	A

;; ANSWER SECTION:
bipind.default.svc.cluster.local. 5 IN	A	10.233.250.38

;; Query time: 1 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Sat Jul 13 18:00:37 UTC 2019
;; MSG SIZE  rcvd: 121


bjethwan avatar bjethwan commented on June 1, 2024

I tried tailing the remote kube-apiserver logs using the "-f" option, but found nothing useful.


bjethwan avatar bjethwan commented on June 1, 2024

FYI:
There's no NetworkPolicy or PodSecurityPolicy in k8s. And the AWS NACL allows all protocols and all ports from all IPs (0.0.0.0/0), and still there are no relevant logs in the remote k8s kube-apiserver.


chrisohaver avatar chrisohaver commented on June 1, 2024

Ok, thanks for answering those questions.

So, remote connections to the K8s API are affected and local in-cluster connections are not.
Pods do not fail at the same time... which suggests that the event that causes a failure is not originating from the remote API itself (otherwise both Pods would fail at the same time, and local connections might be affected too). This could be network related, i.e. something occurring during transport between clusters. Do you see any packet loss between the clusters?
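
One quick check for loss (a sketch, run from a local-cluster node toward the remote API IP used in the earlier Corefile; substitute as appropriate):

ping -c 100 -q 10.223.32.35

A non-zero packet-loss percentage in the summary would point at the network path between the clusters rather than at CoreDNS itself.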

Can you spin up a stock CoreDNS Pod on the local cluster that uses the built-in kubernetes plugin instead of the kubernetai plugin? I'm curious to see if it fails in the same way. I expect it should, since kubernetai is mostly a wrapper around the kubernetes code, but it would be good to verify.
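
One way that test might be configured, reusing the endpoint and cert mounts from the earlier Corefile (a sketch, not a confirmed configuration):

        kubernetes cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://10.223.32.35:443
          tls /etc/k8sflashcerts/client.crt /etc/k8sflashcerts/client.key /etc/k8sflashcerts/ca.crt
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }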

Another troubleshooting step to try would be to spin up a Pod running a kubernetai compile of CoreDNS on the remote cluster, and connect to the API as if it was external (i.e. with the endpoint and tls options) to see if it fails in the same way. I expect it would not fail.

I can add verbose debugging to the api watch functions to help understand better what's going on.
I can also try updating the k8s API client library with the most recent version to see if that helps.


bjethwan avatar bjethwan commented on June 1, 2024

Hi Chris,

I will sure do what you asked me above.

Meanwhile, Anand on my infra team was helping me capture a tcpdump (attached) on the node (10.223.35.13) hosting the coredns pod, on eth0, for all the traffic going to the remote k8s kube-apiserver (10.223.32.126:443), and we could see a lot of packets being transferred.
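
Roughly, the capture was along these lines (a sketch rather than the exact command used; the output filename is arbitrary):

tcpdump -i eth0 -w kubernetai-apiserver.pcap host 10.223.32.126 and port 443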

We also caught a tcpdump of the coredns pod itself, in which we saw the NXDOMAIN (attached).

This was supposed to prove that we have network connectivity from coredns to the remote kube-apiserver. And when we scaled the coredns deployment down to 0 replicas, everything went silent.

tcpdump for kubernetai.zip

And at the same time Anand got me to change the Corefile to match my working case, as below.
As shown below, there is only one server block now.

  Corefile: |-
    .:53 {
        errors
        health
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          pods insecure
          upstream
          fallthrough cluster.local. in-addr.arpa ip6.arpa
        }
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://10.223.32.126
          tls /etc/k8sflash1certs/client.crt /etc/k8sflash1certs/client.key /etc/k8sflash1certs/ca.crt
          pods insecure
          upstream
          fallthrough cluster.local. in-addr.arpa ip6.arpa
        }
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://10.223.32.35
          tls /etc/k8sflashcerts/client.crt /etc/k8sflashcerts/client.key /etc/k8sflashcerts/ca.crt
          pods insecure
          upstream
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        loop
        cache 30
        loadbalance
        reload
    }


chrisohaver avatar chrisohaver commented on June 1, 2024

What's the reasoning behind changing the Corefile?


bjethwan avatar bjethwan commented on June 1, 2024

Hi @chrisohaver

What's the reasoning behind changing the Corefile?

We were trying to narrow our scope.
We wanted to be sure that it's not something in our custom config causing this issue.

This could be network related, i.e. something occurring during transport between clusters. Do you see any packet loss between the clusters?

Would AWS VPC Flow logs help?

I can add verbose debugging to the api watch functions to help understand better what's going on.

That would definitely help !
We can do this together over a web conference session, if you like, so we can try it out at the same time. I found it failing within 5 mins. We suspect a timeout of some kind, as it always starts working fine when we restart the coredns pod(s) on the local k8s cluster.


chrisohaver avatar chrisohaver commented on June 1, 2024

Thanks. Have you had a chance to test the two scenarios I suggested above?
Knowing how those behave will help narrow down the issue.


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver

Hi Chris,

I think I've found the issue...
I created a k8s cluster using kubespray and it defaulted the CNI to flannel, but I wanted to run with the amazon-vpc-cni-k8s CNI plugin, so I applied their yaml file. amazon-vpc-cni was working fine, as the new pods came up with VPC IP addresses. But later I saw the kube-flannel pods still hanging around, and everything started working fine when I deleted the kube-flannel daemonset.
This was tricky to debug, as there was no clue in either the kube-apiserver or coredns logs.


chrisohaver avatar chrisohaver commented on June 1, 2024

Ah - interesting - so two network plugins conflicting with each other.
Thanks for reporting back. I'll add this to my list of things to check for when debugging.


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
This issue still persists. It is still haunting us.
Compared to before, it showed up after a couple of hours (previously it was happening after 3-5 mins).

In the remote k8s kube-apiserver logs I noticed the error below:

E0722 19:06:12.715338       1 controller.go:148] Unable to remove old endpoints from kubernetes service: StorageError: key not found, Code: 1, Key: /registry/masterleases/10.223.32.238, ResourceVersion: 0, AdditionalErrorMsg: 
W0722 19:06:14.061470       1 lease.go:222] Resetting endpoints for master service "kubernetes" to [10.223.32.230 10.223.32.238]

However, I fail to understand how it can then work for hours.
When it fails, there's nothing logged in the remote k8s coredns pods.

I am going to try using an external endpoint in the remote k8s CoreDNS Corefile (as you asked in your previous post).

One quick question: do we need to recreate the coredns service after applying amazon-vpc-cni and removing flannel?


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver

To narrow down our scope, I have created another vanilla k8s cluster with the default kube-flannel.
This way we will know if it was the amazon-vpc-cni plugin creating the issue.


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
It's not the amazon-vpc-cni plugin.
I am facing the exact same issue with kube-flannel.
I am trying your suggestion of making the CoreDNS kubernetes plugin connect using the external interface.


bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver making the coredns kubernetes plugin use an endpoint is working fine

TEST RESULTS
Making CoreDNS/kubernetes plugin use external interface for connecting to local kube-apiserver

$ export KUBECONFIG=.kube/k8sflash2-flannel.conf 

$ kubectl -n kube-system edit cm coredns

      kubernetes {
        endpoint https://10.223.32.161:443
        tls /etc/k8sflash2flannelcerts/client.crt /etc/k8sflash2flannelcerts/client.key /etc/k8sflash2flannelcerts/ca.crt
        upstream
      }

Initial Test Run

$ kubectl get svc
NAME               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
k8sflash2flannel   ClusterIP   10.233.41.113   <none>        80/TCP    13h
kubernetes         ClusterIP   10.233.0.1      <none>        443/TCP   16h

$ kubectl -n kube-system get pods -l k8s-app=kube-dns
NAME                       READY   STATUS    RESTARTS   AGE
coredns-5cf8b7d498-7srlc   1/1     Running   0          51m
coredns-5cf8b7d498-xfb59   1/1     Running   0          51m

$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# 
dnstools# dig k8sflash2flannel.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> k8sflash2flannel.default.svc.cluster.local.
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14234
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 60ba6fcf5b7714df (echoed)
;; QUESTION SECTION:
;k8sflash2flannel.default.svc.cluster.local. IN A

;; ANSWER SECTION:
k8sflash2flannel.default.svc.cluster.local. 5 IN A 10.233.41.113

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Thu Jul 25 05:22:03 UTC 2019
;; MSG SIZE  rcvd: 141

After some time (this is when fetching dns records using coredns/kubernetai on the other k8s cluster had started failing), this cluster was still working:


$ kubectl -n kube-system get pods -l k8s-app=kube-dns
NAME                       READY   STATUS    RESTARTS   AGE
coredns-5cf8b7d498-7srlc   1/1     Running   0          140m
coredns-5cf8b7d498-xfb59   1/1     Running   0          140m

$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# 
dnstools# dig k8sflash2flannel.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> k8sflash2flannel.default.svc.cluster.local.
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64857
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 15381ff084f011c8 (echoed)
;; QUESTION SECTION:
;k8sflash2flannel.default.svc.cluster.local. IN A

;; ANSWER SECTION:
k8sflash2flannel.default.svc.cluster.local. 5 IN A 10.233.41.113

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Thu Jul 25 06:50:34 UTC 2019
;; MSG SIZE  rcvd: 141

And this is the result on the other cluster, which runs the coredns/kubernetai plugin to replicate the dns records from the cluster above. THIS STARTED FAILING AFTER SOME TIME.

Config

$ export KUBECONFIG=.kube/k8szen.conf 

$ kubectl -n kube-system edit cm coredns
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          pods insecure
          upstream
          fallthrough cluster.local. in-addr.arpa ip6.arpa
        }
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://10.223.32.161:443
          tls /etc/k8sflash2flannelcerts/client.crt /etc/k8sflash2flannelcerts/client.key /etc/k8sflash2flannelcerts/ca.crt
          pods insecure
          upstream
          fallthrough cluster.local. in-addr.arpa ip6.arpa
        }
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://10.223.32.230:443
          tls /etc/k8sflashnewcerts/client.crt /etc/k8sflashnewcerts/client.key /etc/k8sflashnewcerts/ca.crt
          pods insecure
          upstream
          fallthrough  in-addr.arpa ip6.arpa
        }

Initial Test

kubectl -n kube-system get pods -l k8s-app=kube-dns
NAME                       READY   STATUS    RESTARTS   AGE
coredns-68d57ff477-bkwp5   1/1     Running   1          68m



$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# 
dnstools# 

dnstools# dig k8sflash2flannel.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> k8sflash2flannel.default.svc.cluster.local.
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43907
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 620b85d368038abd (echoed)
;; QUESTION SECTION:
;k8sflash2flannel.default.svc.cluster.local. IN A

;; ANSWER SECTION:
k8sflash2flannel.default.svc.cluster.local. 5 IN A 10.233.41.113

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Thu Jul 25 05:23:02 UTC 2019
;; MSG SIZE  rcvd: 141



dnstools# dig cb-test1.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> cb-test1.default.svc.cluster.local.
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7948
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 32c919ac326220d0 (echoed)
;; QUESTION SECTION:
;cb-test1.default.svc.cluster.local. IN	A

;; ANSWER SECTION:
cb-test1.default.svc.cluster.local. 5 IN A	10.223.36.77
cb-test1.default.svc.cluster.local. 5 IN A	10.223.32.216

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Thu Jul 25 05:41:38 UTC 2019
;; MSG SIZE  rcvd: 175

—AFTER SOME TIME....STARTED FAILING

$ kubectl -n kube-system get pods -l k8s-app=kube-dns
NAME                       READY   STATUS    RESTARTS   AGE
coredns-68d57ff477-bkwp5   1/1     Running   1          159m

$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# dig k8sflash2flannel.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> k8sflash2flannel.default.svc.cluster.local.
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 20097
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: e182cd7d27a68707 (echoed)
;; QUESTION SECTION:
;k8sflash2flannel.default.svc.cluster.local. IN A

;; AUTHORITY SECTION:
cluster.local.		5	IN	SOA	ns.dns.cluster.local. hostmaster.cluster.local. 1564031026 7200 1800 86400 5

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Thu Jul 25 06:55:34 UTC 2019
;; MSG SIZE  rcvd: 176


dnstools# dig cb-test1.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> cb-test1.default.svc.cluster.local.
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 57236
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: b1564a9452659292 (echoed)
;; QUESTION SECTION:
;cb-test1.default.svc.cluster.local. IN	A

;; AUTHORITY SECTION:
cluster.local.		5	IN	SOA	ns.dns.cluster.local. hostmaster.cluster.local. 1564031026 7200 1800 86400 5

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Thu Jul 25 06:55:57 UTC 2019
;; MSG SIZE  rcvd: 168

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

I can add verbose debugging to the api watch functions to help understand better what's going on.

@chrisohaver - Can you please help to write a few log lines and we will take it from there?

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

Can you please help to write a few log lines and we will take it from there?

Where should they go?

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

I'm trying to understand the tests you ran. From your descriptions, it's hard to tell on which cluster CoreDNS was running, and to which cluster they were connecting. They don't seem to line up with the test scenarios I suggested earlier. But it could be confusion caused by the use of the terms "local" , "remote" and "other", which are ambiguous because they are relative terms. e.g. If something is "running on the remote cluster, and connects to the local api", is that the api local to the remote cluster, or the api of the local cluster, which is remote to the remote cluster?

Can you describe each test you ran referring to the clusters as "primary" and "secondary". The primary cluster would be the one you have been calling local, the secondary cluster would be the one you have called remote. I think this would be easier to understand.

Using this language, the suggested tests I intended to convey here are:

  • Create a CoreDNS Pod on the primary cluster that uses the built in kubernetes plugin to connect to the secondary cluster's API. e.g.
      kubernetes cluster.local. in-addr.arpa ip6.arpa {
        endpoint https://{secondary-api}:443
        tls /etc/k8sflash2flannelcerts/client.crt /etc/k8sflash2flannelcerts/client.key /etc/k8sflash2flannelcerts/ca.crt
      }
  • Create a CoreDNS Pod on the secondary cluster that uses a single instance of kubernetai to connect only to the secondary cluster's API as if it was external (i.e. with the endpoint and tls options). e.g.
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://{secondary-api}:443
          tls /etc/k8sflash2flannelcerts/client.crt /etc/k8sflash2flannelcerts/client.key /etc/k8sflash2flannelcerts/ca.crt
        }

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

Can you please help to write a few log lines and we will take it from there?

Where should they go?

@chrisohaver I was referring to the "api watch functions" you talked about in your comment above.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

Can you spin up a stock CoreDNS Pod on the local cluster that uses the built in kubernetes plugin instead of the kubernetai plugin? Curious to see if it fails in the same way. I expect it should, since kubernetai is mostly a wrapper around the kubernetes code. But would be good to verify.

@chrisohaver
I am sorry for spamming you with all the "test results" without proper context.
I am going to rewrite this with proper context information in a bit.

Meanwhile, I wanted to let you know that I was trying to run the same test that you had asked for in one of your previous posts. I have quoted from that post here.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver

Using the convention you provided:
"primary" is the cluster using CoreDNS/Kubernetai plugin to get the DNS records from "secondary"
Both are in the same VPC but in different subnets, using the same security-group and the Network ACLs allowing ALL network traffic among the subnets. Both k8s clusters are using kube-flannel cni.
Both use CoreDNS 1.5.0. "primary" is running a custom image that I build taking help from you with Kubernetai enabled. While secondary is using vanilla CoreDNS image.

Now, on the "secondary" k8s cluster, I changed the CoreDNS/kubernetes plugin config to use "endpoint" and "tls" to connect to its own cluster's kube-apiserver. For testing, I created an nginx-backed service called k8sflash2flannel. The dig command from a dnstools pod on this cluster always returned the correct A record, even when "primary" was reporting NXDOMAIN.

As shown below:

$ export KUBECONFIG=.kube/k8sflash2-flannel.conf 

$ kubectl -n kube-system edit cm coredns

      kubernetes {
        endpoint https://10.223.32.161:443
        tls /etc/k8sflash2flannelcerts/client.crt /etc/k8sflash2flannelcerts/client.key /etc/k8sflash2flannelcerts/ca.crt
        upstream
      }

$ kubectl get svc
NAME               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
k8sflash2flannel   ClusterIP   10.233.41.113   <none>        80/TCP    13h
kubernetes         ClusterIP   10.233.0.1      <none>        443/TCP   16h  

Now coming to the "primary" k8s cluster, which has to get the DNS records from "secondary" using the CoreDNS/kubernetai plugin. The dig command from a dnstools pod on this "primary" cluster kept returning the A record for more than 2 hours, but after that it started returning NXDOMAIN, and it is still in the same error state. To put things in perspective, below is the Corefile config used in "primary":

$ export KUBECONFIG=.kube/k8szen.conf 

$ kubectl -n kube-system edit cm coredns
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          pods insecure
          upstream
          fallthrough cluster.local. in-addr.arpa ip6.arpa
        }
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://10.223.32.161:443
          tls /etc/k8sflash2flannelcerts/client.crt /etc/k8sflash2flannelcerts/client.key /etc/k8sflash2flannelcerts/ca.crt
          pods insecure
          upstream
          fallthrough cluster.local. in-addr.arpa ip6.arpa
        }
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://10.223.32.230:443
          tls /etc/k8sflashnewcerts/client.crt /etc/k8sflashnewcerts/client.key /etc/k8sflashnewcerts/ca.crt
          pods insecure
          upstream
          fallthrough  in-addr.arpa ip6.arpa
        }

Kindly let me know if you need any further details.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

Thanks, that's more clear now. Can you test the second scenario I suggested?

Create a CoreDNS Pod on the secondary cluster that uses a single instance of kubernetai to connect only to the secondary cluster's API as if it was external (i.e. with the endpoint and tls options).

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
Sure Chris. I have made the changes on the secondary cluster (CoreDNS/kubernetes to CoreDNS/kubernetai) and will monitor it for a bit. At the same time, I have restarted the coredns pod in "primary", so I will monitor until "primary" starts reporting NXDOMAIN.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver

It failed on the "primary" k8s cluster with the same NXDOMAIN response.
It is working fine on the "secondary" k8s cluster.

"seconday"

$ export KUBECONFIG=.kube/k8sflash2-flannel.conf 
$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools

If you don't see a command prompt, try pressing enter.

dnstools# dig k8sflash2flannel.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> k8sflash2flannel.default.svc.cluster.local.
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10775
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: e2452cd457e4675f (echoed)
;; QUESTION SECTION:
;k8sflash2flannel.default.svc.cluster.local. IN A

;; ANSWER SECTION:
k8sflash2flannel.default.svc.cluster.local. 5 IN A 10.233.41.113

;; Query time: 0 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Thu Jul 25 16:34:56 UTC 2019
;; MSG SIZE  rcvd: 141

"primary"

$ export KUBECONFIG=.kube/k8szen.conf 
[dbj151v@d010108053116 ~]$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# dig k8sflash2flannel.default.svc.cluster.local.

; <<>> DiG 9.11.3 <<>> k8sflash2flannel.default.svc.cluster.local.
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 64407
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: fac2a5b405b9378d (echoed)
;; QUESTION SECTION:
;k8sflash2flannel.default.svc.cluster.local. IN A

;; AUTHORITY SECTION:
cluster.local.		5	IN	SOA	ns.dns.cluster.local. hostmaster.cluster.local. 1564068258 7200 1800 86400 5

;; Query time: 1 msec
;; SERVER: 10.233.0.3#53(10.233.0.3)
;; WHEN: Thu Jul 25 16:35:16 UTC 2019
;; MSG SIZE  rcvd: 176

from kubernetai.

johnbelamaric avatar johnbelamaric commented on June 1, 2024

So, that may suggest some shared state between kubernetai plugins that is confusing things.

While experimenting for #3026, I added some logs to the k8s setup:

diff --git a/plugin/kubernetes/setup.go b/plugin/kubernetes/setup.go
index d97bc913..b7e3b455 100644
--- a/plugin/kubernetes/setup.go
+++ b/plugin/kubernetes/setup.go
@@ -61,10 +61,12 @@ func setup(c *caddy.Controller) error {
                return plugin.Error("kubernetes", err)
        }

+
        err = k.InitKubeCache()
        if err != nil {
                return plugin.Error("kubernetes", err)
        }
+       log.Warningf("KUBERNETES: Initialized @ %v: %v", &k, k)

        k.RegisterKubeCache(c)

@@ -87,6 +89,7 @@ func (k *Kubernetes) RegisterKubeCache(c *caddy.Controller) {
                        select {
                        case <-ticker.C:
                                if k.APIConn.HasSynced() {
+               log.Warningf("KUBERNETES: APIConn is running: %v", k.APIConn)
                                        return nil
                                }
                        case <-timeout:
jbelamaric@jbelamaric:~/proj/gh/coredns/coredns$

and created a Corefile that resulted in the k8s plugin being set up multiple times. I believe these are separate plugin instances, each with their own APIConn:

jbelamaric@jbelamaric:~/proj/gh/coredns/coredns$ ./coredns
2019-07-25T10:16:22.737-07:00 [WARNING] plugin/kubernetes: KUBERNETES: Initialized @ 0xc0001ee0a0: &{<nil> [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] 0x2beb898 []    0xc0001ec780 0xc000082f00 map[] disabled false {[]} 5 {false true false <nil> <nil> <nil> <nil> [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] false} 3 0x13f7ff0 [corp.google.com. prod.google.com. prodz.google.com. google.com. svl.corp.google.com.] []}
2019-07-25T10:16:22.739-07:00 [WARNING] plugin/kubernetes: KUBERNETES: Initialized @ 0xc0001ee290: &{<nil> [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] 0x2beb898 []    0xc000153f90 0xc0003441e0 map[] disabled false {[]} 5 {false true false <nil> <nil> <nil> <nil> [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] false} 3 0x13f7ff0 [corp.google.com. prod.google.com. prodz.google.com. google.com. svl.corp.google.com.] []}
2019-07-25T10:16:22.741-07:00 [WARNING] plugin/kubernetes: KUBERNETES: Initialized @ 0xc0001ee540: &{<nil> [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] 0x2beb898 []    0xc00009d310 0xc0003454a0 map[] disabled false {[]} 5 {false true false <nil> <nil> <nil> <nil> [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] false} 3 0x13f7ff0 [corp.google.com. prod.google.com. prodz.google.com. google.com. svl.corp.google.com.] []}
2019-07-25T10:16:22.742-07:00 [WARNING] plugin/kubernetes: KUBERNETES: Initialized @ 0xc0001ee6b0: &{<nil> [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] 0x2beb898 []    0xc0002f8550 0xc00041a780 map[] disabled false {[]} 5 {false true false <nil> <nil> <nil> <nil> [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] false} 3 0x13f7ff0 [corp.google.com. prod.google.com. prodz.google.com. google.com. svl.corp.google.com.] []}
2019-07-25T10:16:23.043-07:00 [WARNING] plugin/kubernetes: KUBERNETES: APIConn is running: &{1564074982 0xc00018c000 <nil> <nil> 0xc00031ae80 <nil> 0xc00031af80 0xc00031b000 0xc00000d0e0 <nil> 0xc00000d140 0xc00000d1a0 {0 0} false 0xc000322420 [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] false}
2019-07-25T10:16:23.143-07:00 [WARNING] plugin/kubernetes: KUBERNETES: APIConn is running: &{1564074983 0xc000229b80 <nil> <nil> 0xc00048fc00 <nil> 0xc00048fc80 0xc00048fd00 0xc00044bd00 <nil> 0xc00044bd60 0xc00044bdc0 {0 0} false 0xc000322540 [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] false}
2019-07-25T10:16:23.243-07:00 [WARNING] plugin/kubernetes: KUBERNETES: APIConn is running: &{1564074983 0xc00032b900 <nil> <nil> 0xc0004cea00 <nil> 0xc0004cea80 0xc0004ceb00 0xc0002bb0e0 <nil> 0xc0002bb140 0xc0002bb1a0 {0 0} false 0xc000322660 [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] false}
2019-07-25T10:16:23.343-07:00 [WARNING] plugin/kubernetes: KUBERNETES: APIConn is running: &{1564074983 0xc000468b40 <nil> <nil> 0xc000439480 <nil> 0xc000439500 0xc000439580 0xc000442500 <nil> 0xc000442560 0xc0004425c0 {0 0} false 0xc000322780 [0.0.10.in-addr.arpa. 1.0.10.in-addr.arpa. 2.0.10.in-addr.arpa. cluster.local.] false}
cluster.local.:4000
0.0.10.in-addr.arpa.:4000
1.0.10.in-addr.arpa.:4000
2.0.10.in-addr.arpa.:4000
2019-07-25T10:16:23.344-07:00 [INFO] CoreDNS-1.5.2
2019-07-25T10:16:23.344-07:00 [INFO] linux/amd64, go1.12.4, 048987fc-dirty
CoreDNS-1.5.2
linux/amd64, go1.12.4, 048987fc-dirty

However, there is only a single connection to the API server:

jbelamaric@jbelamaric:~$ netstat -anp | grep coredns
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 100.117.29.94:36056     35.232.181.90:443       ESTABLISHED 85556/./coredns
tcp6       0      0 :::4000                 :::*                    LISTEN      85556/./coredns
udp6       0      0 :::4000                 :::*                                85556/./coredns
jbelamaric@jbelamaric:~$

That suggests that the client-go code is sharing some state between these connections....I think.
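
For reference, here is a minimal Go sketch (an illustration using client-go's public API, not the actual plugin setup code) of how two stanzas would each build an independent clientset from their own rest.Config; any connection reuse below that level would have to happen inside client-go's transport layer. The endpoints and certificate paths are the ones from the Corefile earlier in this thread.

package main

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// newClientset builds an independent clientset for one kubernetai stanza.
// Each stanza gets its own rest.Config here, so any sharing would have to
// happen inside client-go's transport/connection caching, not at this level.
func newClientset(host, cert, key, ca string) (*kubernetes.Clientset, error) {
	cfg := &rest.Config{
		Host: host,
		TLSClientConfig: rest.TLSClientConfig{
			CertFile: cert,
			KeyFile:  key,
			CAFile:   ca,
		},
	}
	return kubernetes.NewForConfig(cfg)
}

func main() {
	// Endpoints and cert paths taken from the Corefile in this thread.
	a, err := newClientset("https://10.223.32.161:443",
		"/etc/k8sflash2flannelcerts/client.crt",
		"/etc/k8sflash2flannelcerts/client.key",
		"/etc/k8sflash2flannelcerts/ca.crt")
	if err != nil {
		panic(err)
	}
	b, err := newClientset("https://10.223.32.230:443",
		"/etc/k8sflashnewcerts/client.crt",
		"/etc/k8sflashnewcerts/client.key",
		"/etc/k8sflashnewcerts/ca.crt")
	if err != nil {
		panic(err)
	}
	fmt.Printf("clientset A: %p, clientset B: %p\n", a, b)
}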

from kubernetai.

johnbelamaric avatar johnbelamaric commented on June 1, 2024

That said, if it is, then it looks like it is sharing connections with identical connection parameters, which wouldn't apply in the case here.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

So, that may suggest some shared state between kubernetai plugins that is confusing things.

Ah thanks, that's an interesting angle I hadn't considered...

That said, if it is, then it looks like it is sharing connections with identical connection parameters, which wouldn't apply in the case here.

Agreed. And it works for about an hour before it fails. So at least initially, the connections are separate and fully functioning. So if there is any sharing (attempt) going on, it's happening at some sort of reconciliation interval. Seems unusual, but we should try to confirm or rule it out...

@bjethwan, Can you create a CoreDNS Pod on the primary cluster that uses a single instance of kubernetai to connect only to the secondary cluster's API? If this doesn't fail, then there may be some undesirable interaction between concurrent client-go connections.

from kubernetai.

johnbelamaric avatar johnbelamaric commented on June 1, 2024

I also wonder about the fact that we now set resyncPeriod to zero and disable it?

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

I also wonder about the fact that we now set resyncPeriod to zero and disable it?

I have also thought of this, but couldn't figure out how it would factor in. @bjethwan tested both 1.4.0 and 1.5.0 coredns builds, and both had the same issue (resyncPeriod defaulted to 0 in 1.5.0).

Also, the working time before failure increased from "3-15 minutes" to "about 2 hours" after the pod networking layer issue was fixed (previously flannel and amazon-vpc-cni-k8s were running at the same time).

Perhaps it's a side effect of connection restoration - that is, things work fine until there is some network event that causes a disconnect, but client-go is unable to reconnect due to bug/confusion over the two concurrent connections.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

I am tied up with something urgent at the moment.
Meanwhile, my team has set up a cron job to make CoreDNS reload the same Corefile configmap.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

I'm going to spend some time today to try to reproduce the issue with a debugger.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

For what it's worth, I was unable to reproduce this issue today so I wasn't able to use the debugger to step through exactly what happens during a failure. I used the following set up:

  • Two single node k8s 1.15 clusters running local on my laptop using kind.
  • CoreDNS 1.6.0 built with kubernetai external plugin.
  • CoreDNS running locally (not in a cluster), connected to the two clusters.
  • A script sends queries to CoreDNS every 2 seconds for a service that exists only on the 2nd cluster.

The queries have been running for several hours without failure.

Of course there are many differences between @bjethwan's environment and mine:

  • CoreDNS 1.6.0 has a newer client-go, i.e. my test is using a newer version of the k8s API client.
  • Kubernetes 1.14 vs 1.15 - meaning the api server is newer in my test.
  • CoreDNS is not running from within a cluster in my test, but this should not make a difference.
  • In my test, network traffic never leaves my laptop.
  • Different network plugin: kind clusters use kindnet cni.

from kubernetai.

anandgubbala avatar anandgubbala commented on June 1, 2024

We are working with @bjethwan on this issue. Here is one observation.

We have three server blocks/stanzas in the Corefile in our environment. Accordingly, we see three TCP sessions established from coredns (kubernetai) to the three API servers (one local, two remote), as below.

It seems that CoreDNS builds its internal data structures by watching the kube-api servers. At the start, coredns (kubernetai) synchronizes with all the API servers. After some time it seems to lose sync with the remote API servers, and at this point DNS resolution fails because coredns is in sync with the local kube API server only, until an external event like a reload (configmap change) is triggered. That event causes coredns (kubernetai) to resynchronize with the remote API servers over the established TCP sessions, and remote service DNS resolution works again. As a workaround, we added a cronjob that updates the Corefile periodically and ran the test over the weekend. We did not see any issue with name resolution.

ip-10-223-35-13 ~ # cat /proc/net/nf_conntrack | grep 'src=10.240.0.58' | grep ESTABLISHED
ipv4 2 tcp 6 86398 ESTABLISHED src=10.240.0.58 dst=10.233.0.1 sport=60178 dport=443 src=10.223.35.13 dst=10.240.0.58 sport=443 dport=60178 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=0 use=2 -----local connection
ipv4 2 tcp 6 86398 ESTABLISHED src=10.240.0.58 dst=10.223.32.230 sport=45546 dport=443 src=10.223.32.230 dst=10.223.35.13 sport=443 dport=56550 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=0 use=2----remote connection
ipv4 2 tcp 6 86398 ESTABLISHED src=10.240.0.58 dst=10.223.32.161 sport=54444 dport=443 src=10.223.32.161 dst=10.223.35.13 sport=443 dport=58226 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=0 use=2----remote connection

from kubernetai.

anandgubbala avatar anandgubbala commented on June 1, 2024

Also please note we still see the TCP connection with the remote API server in the ESTABLISHED state, and tcpdump shows continuous application messages from the remote API server to coredns (kubernetai) even when DNS resolution does not work. Not sure how the reload event makes it work again.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@bjethwan, Can you create a CoreDNS Pod on the primary cluster that uses a single instance of kubernetai to connect only to the secondary cluster's API? If this doesn't fail, then there may be some undesirable interaction between concurrent client-go connections

@chrisohaver

IT'S WORKING FOR MORE THAN 7HRS !

Details:

This is from the "primary" k8s cluster (i.e. the one running a single kubernetai stanza pointing to "secondary", with no local kubernetai stanza):

  Corefile: |-
    .:53 {
        errors
        health
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://10.223.32.230:443
          tls /etc/k8sflashnewcerts/client.crt /etc/k8sflashnewcerts/client.key /etc/k8sflashnewcerts/ca.crt
          pods insecure
          upstream
          fallthrough  in-addr.arpa ip6.arpa
        }
        prometheus :9303
        forward . /etc/resolv.conf
        cache 15
        log
        loop
        loadbalance
        reload
    }

$ kubectl -n kube-system get pods
NAME                                                                 READY   STATUS    RESTARTS      AGE
coredns-68d57ff477-cg6pw                                             1/1     Running   0          **7h49m**
$ kubectl exec -it dnstools -- sh
dnstools# 
dnstools# host cb-test1
cb-test1.default.svc.cluster.local has address 10.223.36.77
cb-test1.default.svc.cluster.local has address 10.223.32.216

Thanks to @anandgubbala and @ibnsinha for setting it up.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@johnbelamaric @chrisohaver

This issue looks very close to what we are facing.
kubernetes/client-go#527

I am going to build a new CoreDNS docker image (kubernetai included) with the latest codebase (i.e. the v1.6.0 release branch) to test.

@anandgubbala @ibnsinha FYI

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@johnbelamaric @chrisohaver

Kindly excuse me if I am wrong (as I am relatively new to k8s), but I couldn't spot a commit bumping the client-go library in coredns, while I was able to spot upgrades of grpc, aws-sdk-go, dd-trace-go, protobuf, etc.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

This issue looks very close to what we are facing. kubernetes/client-go#527

It is similar sounding, but some details seem contradictory. As you have observed, a single connection is not affected (assuming that the time to failure is consistently ~2 hours). The gist of the findings in that issue suggests the problem is caused by Azure's Load Balancers, which IIUC would affect a single connection too. It doesn't suggest that upgrading client-go would resolve that issue.

I couldn't spot a commit bumping up the client-go library

The commit is here: coredns/coredns#3047.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

This issue looks very close to what we are facing. kubernetes/client-go#527

It is similar sounding, but some details seem contradictory. As you have observed, a single connection is not affected (assuming that the time to failure is consistently ~2 hours). The gist of the findings in that issue suggest the problem is caused by Azure's Load Balancers, which IIUC would affect a single connection too. It doesn't suggest that upgrading client-go would resolve that issue.

I couldn't spot a commit bumping up the client-go library

The commit is here: coredns/coredns#3047.

Thanks Chris !

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver

Hi Chris,
I created a CoreDNS image with kubernetai enabled using CoreDNS v1.6.0 tag/branch.
Then I used the new docker image to test.

The issue still persists.

"primary" k8s with two kubernetai stanza (local and remote) works fine for a few minutes and then starts failing for remote. Then when I removed the local stanza and kept just the remote one, it worked fine.

We are out of options and blocked. We need your help !

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

works fine for a few minutes ...

Odd - now it's a few minutes again? Can you please re-characterize the rate of failure? Earlier you had said the issue occurred after 2 hours, after having moved to a single CNI instead of running two at the same time. Perhaps that was just chance, and the time to failure is very random, ranging from 2 minutes to 2 hours, perhaps more. Is the single-connection test still running, and is it still OK?

from kubernetai.

ibnsinha avatar ibnsinha commented on June 1, 2024

@chrisohaver most of the time it fails in 3 to 5 minutes. With a single stanza it is still working fine.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

@chrisohaver most of the times it fails in 3 to 5 minutes. With single stanza it is still working fine

In the instance that it worked for 2 hours, was there any difference in the way it was tested or configured?

from kubernetai.

ibnsinha avatar ibnsinha commented on June 1, 2024

There was no difference. But I remember we had 2 stanzas (local and remote)

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

In the instance that it worked for 2 hours, was there any difference in the way it was tested or configured?

@chrisohaver
@ibnsinha was working with me on the day we noticed the CNI conflict, and on that day it didn't fail in 5 minutes. The only difference, I guess, was that we were chatting and checking the connection with the "host" command continually; otherwise we normally take a break.

And yes, we didn't make any config changes. And surely not without updating you on this thread.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

works fine for a few minutes ...

Odd - Now its a few minutes again? Can you please re-characterize the rate of failure? Earlier you had said the issue occurred after 2 hours after having moved to a single CNI, instead of running two at the same time. Perhaps that was just chance, and the time to failure is very random, ranging from 2 minutes to 2 hours, perhaps more. Is the single connection test still running, is it still OK?

@chrisohaver
Let me reinstate the local kubernetai stanza and monitor closely every 3 minutes. I will update you in a bit.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

What is the resync time period with API servers(local and remote)?
Is this sync a pull request from coredns or push from API server.

In 1.4.0, it defaulted to 5 hours.
In 1.5.0, it defaults to 0, which disables the feature, but can be set to something else with the option.
In 1.6.0, it is disabled, and the setting is ignored.

This client-go connection option was originally exposed as a CoreDNS setting because we misunderstood its purpose. It's a client-go connection option that is not described well in the docs, so I don't know precisely what it does. Here is a Google group discussion that sheds some light on it: https://groups.google.com/forum/#!topic/kubernetes-sig-api-machinery/PbSCXdLDno0

Quoting that thread:

A resync is different than a relist. The resync plays back all the events held in the informer cache. A relist hits the API server to re-get all the data.

IIUC, the informer cache is on the client side, so as described, resync is more about resyncing the consumer of the informer on the client side only, perhaps re-triggering client-side event callbacks (not a resync of data from the API as one might expect). If that's right, this would not do anything important for CoreDNS. It would not refresh DNS records from the informer cache, since we use the informer cache directly to create records on demand for each query.

But it is mysterious. We could be misunderstanding what it does. So, leaving no stone unturned, you could try running the 1.5.0 build and setting the resync period to something under 3 minutes.
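
To make the resync option concrete, here is a minimal client-go sketch (an illustration only, not the CoreDNS code): the resync period is just the defaultResync argument passed when the informer factory is created, and 0 disables periodic resync. Whatever the value, a resync only replays what is already in the local informer cache to the registered handlers; it does not relist from the API server. The endpoint and cert paths below are the ones from the Corefile in this thread.

package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg := &rest.Config{
		Host: "https://10.223.32.161:443", // remote API endpoint from the Corefile above
		TLSClientConfig: rest.TLSClientConfig{
			CertFile: "/etc/k8sflash2flannelcerts/client.crt",
			KeyFile:  "/etc/k8sflash2flannelcerts/client.key",
			CAFile:   "/etc/k8sflash2flannelcerts/ca.crt",
		},
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// resync = 0 disables periodic resync (the CoreDNS 1.5/1.6 behavior
	// discussed above); a non-zero value replays cached objects to the
	// handlers every interval, without hitting the API server.
	resync := time.Duration(0)
	factory := informers.NewSharedInformerFactory(client, resync)

	svcInformer := factory.Core().V1().Services().Informer()
	svcInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { fmt.Println("add:", obj.(*corev1.Service).Name) },
		UpdateFunc: func(old, new interface{}) { fmt.Println("update:", new.(*corev1.Service).Name) },
		DeleteFunc: func(obj interface{}) { fmt.Println("delete") },
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	cache.WaitForCacheSync(stop, svcInformer.HasSynced)
	time.Sleep(time.Minute) // keep the watch running for the sketch
}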

from kubernetai.

ibnsinha avatar ibnsinha commented on June 1, 2024

@chrisohaver I ran a test by changing the order of the stanzas. I added the remote stanza first and then the local stanza second, like below:

 Corefile: |-
    .:53 {
        errors
        health
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          endpoint https://10.223.32.230:443
          tls /etc/k8sflashnewcerts/client.crt /etc/k8sflashnewcerts/client.key /etc/k8sflashnewcerts/ca.crt
          pods insecure
          upstream
          fallthrough  cluster.local. in-addr.arpa ip6.arpa
        }
        kubernetai cluster.local. in-addr.arpa ip6.arpa {
          pods insecure
          upstream
          fallthrough  in-addr.arpa ip6.arpa
        }
        prometheus :9303
        forward . /etc/resolv.conf
        cache 15
        log
        loop
        loadbalance
        reload
    }

After a few minutes it started failing for the local cluster but kept working fine for the remote cluster. This proves that the issue is not with the remote cluster but with the second kubernetai stanza. It always works fine for the first stanza and fails for the rest of the stanzas after some time.

@bjethwan was not able to test as I was running this test. I think we will not be able to spend more time debugging, and we might get pulled off this project. Could you please enhance the logging for coredns? I think that would help fix this issue faster.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
I second @ibnsinha's finding. I can see for myself that, because of the ordering change, the remote entries now come up fine, but the local DNS entries (now the second stanza) started failing in a little over 10 minutes.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
Now it's a very good time that you see this for yourself once.
Would you like to join a very brief web conf to go over this together?

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

Thanks, that's a good data point. That has me thinking that it could be a race condition in how the fallthrough is implemented to be transparent to the underlying kubernetes plugin. This could explain why I didn't reproduce the issue locally in my test, because the dns query load is all serial.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

Now it's a very good time that you see this for yourself once.
Would you like to join a very brief web conf to go over this together?

No, I don't think seeing the issue first hand would help. The issue is not that I don't believe you, it's just that "seeing it happen" is not going to give me any more insight than I already have.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

That has me thinking that it could be a race condition in how the fallthrough is implemented to be transparent to the underlying kubernetes plugin.

Yes - Looking at the code, it seems that there is a race condition there. Looking into a fix...
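
To illustrate the general shape of such a race (a simplified stand-in, not the actual kubernetai code): if a handler temporarily clears a shared fallthrough setting while delegating a query and restores it afterwards, two concurrent requests can interleave so that the setting is "restored" to the already-cleared value and stays disabled for good, which would match the NXDOMAIN-after-a-while symptom.

package main

import (
	"fmt"
	"sync"
	"time"
)

// fallthroughEnabled is shared, mutable state standing in for a plugin
// instance's fallthrough configuration.
var fallthroughEnabled = true

// handle mimics a request handler that temporarily disables fallthrough
// while doing a lookup, then restores the previous value. With two
// concurrent callers, the "previous value" one of them saves can itself be
// the temporarily disabled one, so the flag never goes back to true.
func handle(id int) {
	prev := fallthroughEnabled // the other goroutine may have just written false
	fallthroughEnabled = false
	time.Sleep(time.Millisecond) // stand-in for doing the actual lookup
	fallthroughEnabled = prev
	fmt.Printf("request %d done, fallthrough=%v\n", id, fallthroughEnabled)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func(id int) { defer wg.Done(); handle(id) }(i)
	}
	wg.Wait()
	// With unlucky interleaving this prints false: the setting is lost
	// permanently, as if fallthrough had been removed from the stanza.
	fmt.Println("final fallthrough:", fallthroughEnabled)
}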

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver
Thanks Chris !
I will build an image from coredns master branch with kubernetai plugin enabled.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

I will build an image from coredns master branch with kubernetai plugin enabled.

It doesn't need to be the coredns master branch. I'd recommend building on a released tag of CoreDNS, e.g. v1.6.0. Also, #28 isn't merged yet, so you'll need to pull the PR locally and build CoreDNS using your local copy of kubernetai instead of a commit.

I'm trying to reproduce the race locally, so I can confirm the fix works before merging the PR, but I created the PR in case you wanted to test early.

from kubernetai.

anandgubbala avatar anandgubbala commented on June 1, 2024

@chrisohaver a couple of questions on the sync time period w.r.t. multiple stanzas in the Corefile.
Please clarify.
Is there a resync time period for coredns (kubernetai) with the API servers (local and remote)?
Is this sync a pull from coredns or a push from the API server?
How is the resync period related to the cache period mentioned in the stanzas?

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

How is the resync period associated with cache period mentioned in the stanzas ?

They are unrelated.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

#28 is merged now, so you should be able to build coredns with kubernetai@master and test.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

The fix in #28 appears to have fixed the issue.

I was able to reproduce the issue without the #28 fix within a few seconds by hammering coredns with two parallel thread of queries. With the fix, the problem does not occur.

@chrisohaver Thank you !

At the moment I am not sure how to make the CoreDNS build pick up kubernetai@master; let me try my luck.

from kubernetai.

johnbelamaric avatar johnbelamaric commented on June 1, 2024

@chrisohaver awesome, nice find!

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

@bjethwan, go get github.com/coredns/kubernetai@master

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver Just so I don't make a mistake (and I am new to the golang build system), here is what I have:

$ ls -la
total 24
drwxr-xr-x   5 bj151v  341438103   160 Jul  8 15:11 .
drwxr-xr-x  59 bj151v  341438103  1888 Jul  4 10:21 ..
-rw-r--r--@  1 bj151v  341438103  8196 Jul  8 20:45 .DS_Store
drwxr-xr-x  46 bj151v  341438103  1472 Jul 31 02:10 coredns
drwxr-xr-x  12 bj151v  341438103   384 Jul 31 01:52 kubernetai

Now how do I tell the CoreDNS build system to use the local directory instead of pulling the code from GitHub?

Previously, I used to follow the below steps to build a CoreDNS image.

Edit plugin.cfg file
-kubernetes:kubernetes
+kubernetai:github.com/coredns/kubernetai/plugin/kubernetai

Edit Makefile.release (this is a convenience to name the final image) 
-VERSION:=$(shell grep 'CoreVersion' coremain/version.go | awk '{ print $$3 }' | tr -d '"')
+VERSION:=$(shell grep 'CoreVersion' coremain/version.go | awk '{ print $$3 }' | tr -d '"')-kubernetai

export GO111MODULE=on

go generate 
go build 

Steps for publishing the image on docker hub
export DOCKER_LOGIN=bjethwan
export DOCKER_PASSWORD=<<feed your docker hub password>>
make -f Makefile.release DOCKER=bjethwan release
make -f Makefile.release DOCKER=bjethwan docker
make -f Makefile.release DOCKER=bjethwan docker-push

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

Now how do I tell CoreDNS build system to use local directory instead of pulling the code from github?

You don't need to tell the build to use the local directory; you can use the master head commit, because #28 is already merged into master.

Your go.mod in coredns should contain a line in the require section like ...

	github.com/coredns/kubernetai v0.0.0-20190730202139-6e48bfb54360

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

@chrisohaver

It has been working fine for the last 1 hour.

I created an image from the CoreDNS master branch: bjethwan/coredns:1.6.0-kubernetai-master
The entry in go.mod matches what you gave in the post above.

We will monitor and update you after a few more hours.

Thanks Chris !!!

from kubernetai.

anandgubbala avatar anandgubbala commented on June 1, 2024

@chrisohaver
A couple more questions. Please clarify.
Does coredns (kubernetai) create the DNS records at the start by reaching out to all the API servers (local and remote) in the Corefile?
What happens after the cache expires? Does it reach out to the API servers again to create DNS records, or are records created only when there is a query from a client?

from kubernetai.

johnbelamaric avatar johnbelamaric commented on June 1, 2024

from kubernetai.

anandgubbala avatar anandgubbala commented on June 1, 2024

Thanks John.

CoreDNS uses a watch on each API server. That is, it connects to each API server, loads all the relevant Kubernetes resources into memory, and asks the API server to send it any updates to those resources and resource types.

Is this a time-period-based watch?

from kubernetai.

anandgubbala avatar anandgubbala commented on June 1, 2024

Thanks John for the clarifications. One last query regarding the fallthrough race condition fix by Chris.
The Corefile has 2 stanzas in our environment. We see two persistent connections to the servers (local and remote).
Does the fix force coredns to reach out to both API servers to reload Kubernetes resources into memory when there is a query for either a local or remote service from the client?
The assumption here is that the initial resources in memory had a cache expiry and were flushed out.

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

FYI : 5hrs50m and still working

$ ks get pods
NAME                                                                 READY   STATUS    RESTARTS   AGE
coredns-6bd6d499d4-4zftx                                             1/1     Running   0          5h50m
 
$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest bipin-dnstools
If you don't see a command prompt, try pressing enter.
dnstools# host cb-test1
cb-test1.default.svc.cluster.local has address 10.223.32.216
cb-test1.default.svc.cluster.local has address 10.223.36.77
dnstools# 
dnstools# host bipind
bipind.default.svc.cluster.local has address 10.233.104.86
dnstools# 
dnstools# host sinhad
sinhad.default.svc.cluster.local has address 10.233.109.112
dnstools# 
dnstools# host k8sflash1amazonvpccni
k8sflash1amazonvpccni.default.svc.cluster.local has address 10.233.12.160

from kubernetai.

bjethwan avatar bjethwan commented on June 1, 2024

FYI : More than 15hr and working perfectly fine.

$ ks get pods
NAME READY STATUS RESTARTS AGE
coredns-6bd6d499d4-4zftx 1/1 Running 0 15h

$ k run -it --rm --restart=Never --image=infoblox/dnstools:latest bipin-dnstools
If you don't see a command prompt, try pressing enter.
dnstools#
dnstools# host cb-test1
cb-test1.default.svc.cluster.local has address 10.223.36.77
cb-test1.default.svc.cluster.local has address 10.223.32.216
dnstools#
dnstools# host bipind
bipind.default.svc.cluster.local has address 10.233.104.86
dnstools#
dnstools# host sinhad
sinhad.default.svc.cluster.local has address 10.233.109.112
dnstools#
dnstools# host k8sflash1amazonvpccni
k8sflash1amazonvpccni.default.svc.cluster.local has address 10.233.12.160
dnstools#

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

Thanks John for the clarifications. One last query regarding fall through race condition fix by chris.
The core file has 2 stanzas in our environment. We see two persistent connections to the servers( local and remote).
Does the fix force the coredns to reach out to both the API server's to reload kubernetes resources into memory when there is a query for either a local or remote service from the client ?

CoreDNS does not reach out to the API when a query occurs. CoreDNS uses an asynchronous API watch to update a local store of all services and endpoints defined in the cluster. This happens asynchronously from queries. IOW, CoreDNS holds in memory a copy of the data needed to create DNS records for all services and endpoints. When a query occurs, CoreDNS just looks into that local store - no API interaction occurs.
#28 fixes a race condition relating to the fall-through behavior. When the race condition occurred, the first stanza permanently had its fallthrough configuration disabled, so coredns started sending NXDOMAIN to the clients instead of falling through to the next stanza when the queried service didn't exist in the first cluster. IOW, it was as if the bug permanently removed the fallthrough option from the first stanza when the race occurred (not literally from the Corefile, but from the in-memory configuration).

Assumption here is initial resources in memory had a cache expiry and are flushed out.

No, during the failure, the watch store/cache of the second server was not expired or flushed out. It was still being maintained and updated by the api watch. It simply wasn't being used, due to the fall-through race bug.
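
As a rough illustration of that query path (simplified, not the real plugin code; real fallthrough is also zone-aware): each stanza has its own in-memory store kept up to date by its watch, and a query just walks the stanzas in order, falling through to the next one only when the name is missing and fallthrough is enabled for that stanza.

package main

import "fmt"

// stanza stands in for one kubernetai block: an in-memory record store kept
// up to date by that stanza's API watch, plus its fallthrough setting.
type stanza struct {
	records map[string]string // service FQDN -> IP, maintained by the watch
	fall    bool              // the "fallthrough" option from the Corefile
}

// resolve walks the stanzas in Corefile order. No API call happens here;
// everything is answered from the in-memory stores.
func resolve(stanzas []stanza, qname string) (string, bool) {
	for _, s := range stanzas {
		if ip, ok := s.records[qname]; ok {
			return ip, true
		}
		if !s.fall {
			return "", false // NXDOMAIN: stop instead of trying the next stanza
		}
	}
	return "", false
}

func main() {
	stanzas := []stanza{
		{records: map[string]string{}, fall: true}, // local cluster: name not present
		{records: map[string]string{ // remote cluster store
			"cb-test1.default.svc.cluster.local.": "10.223.36.77",
		}, fall: true},
	}

	if ip, ok := resolve(stanzas, "cb-test1.default.svc.cluster.local."); ok {
		fmt.Println("answer:", ip) // served from the remote stanza's store
	}

	// If the race leaves the first stanza's fallthrough disabled, the same
	// query now stops in the first stanza with NXDOMAIN even though the
	// remote store still holds the record.
	stanzas[0].fall = false
	if _, ok := resolve(stanzas, "cb-test1.default.svc.cluster.local."); !ok {
		fmt.Println("NXDOMAIN")
	}
}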

from kubernetai.

anandgubbala avatar anandgubbala commented on June 1, 2024

@chrisohaver Thanks for the detailed explanation. One last query.
When the race condition occurred, the first stanza permanently had its fallthrough configuration disabled, so coredns started sending NXDOMAIN to the clients instead of falling through to the next stanza when the queried service didn't exist in the first cluster.

Understood; when the race condition occurs, the fallthrough configuration in the first stanza is disabled. Can I ask what this race condition is and what is causing it? Please clarify.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

What is this race condition and what is causing this condition. please clarify.

It could happen when two queries get processed at the same time. Specifically, two queries for services that do not exist in the local cluster.
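
For completeness, here is a small sketch of the kind of parallel-query hammering that exercises this path (it assumes the miekg/dns client library; the server address and service name are just the ones from this thread, so adjust as needed): two goroutines concurrently querying a name that exists only on the remote cluster.

package main

import (
	"fmt"
	"sync"

	"github.com/miekg/dns"
)

func main() {
	const server = "10.233.0.3:53"                              // cluster DNS service IP used in this thread
	const qname = "k8sflash2flannel.default.svc.cluster.local." // exists only on the remote cluster

	var wg sync.WaitGroup
	for i := 0; i < 2; i++ { // two parallel query loops
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			c := new(dns.Client)
			for n := 0; n < 1000; n++ {
				m := new(dns.Msg)
				m.SetQuestion(qname, dns.TypeA)
				r, _, err := c.Exchange(m, server)
				if err != nil {
					fmt.Printf("worker %d: %v\n", id, err)
					continue
				}
				if r.Rcode == dns.RcodeNameError {
					fmt.Printf("worker %d: got NXDOMAIN after %d queries\n", id, n)
					return
				}
			}
		}(i)
	}
	wg.Wait()
}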

from kubernetai.

anandgubbala avatar anandgubbala commented on June 1, 2024

@chrisohaver Thanks Chris.

from kubernetai.

chrisohaver avatar chrisohaver commented on June 1, 2024

I'm closing this as it appears that #28 has fixed the issue. Please re-open if this issue re-occurs.

from kubernetai.
