Comments (20)

hzxuzhonghu commented on June 2, 2024

I will take a look

zirain commented on June 2, 2024

@ymesika separated into #50688

ramaraochavali commented on June 2, 2024

@hzxuzhonghu I remember you changed something related to ServiceEntry merging recently?

j2gg0s commented on June 2, 2024

Maybe it's a side effect of this PR, https://github.com/istio/istio/pull/49573/files#diff-a8e332e06b003dfbf58c29f37de893fbb0e45f642200ea3ea2faa9847fc7e1e0L1426.

I solved this by setting different ports[].name values.

hzxuzhonghu commented on June 2, 2024

@j2gg0s is right; we can verify this from debug/endpointsz.

ymesika commented on June 2, 2024

Right, with different port names there are no issues (except in 1.20.0-1.20.3, which was already resolved by #49489).

j2gg0s commented on June 2, 2024

My understanding of this: the behavior is primarily caused by changes within initServiceRegistry.

1.20.4 vs 1.20.3

#49573 modified this part of the code within initServiceRegistry.

                shards, ok := env.EndpointIndex.ShardsForService(string(s.Hostname), s.Attributes.Namespace)
                if ok {
-                       ps.ServiceIndex.instancesByPort[svcKey] = shards.CopyEndpoints(portMap)
+                       instancesByPort := shards.CopyEndpoints(portMap)
+                       for port, instances := range instancesByPort {
+                               ps.ServiceIndex.instancesByPort[svcKey][port] = instances
+                       }
                }

The precondition for the problem is: two ServiceEntries in the same namespace have the same host and port name, but different port numbers.

So in 1.20.3 we can only see one cluster.
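
To make the difference concrete, here is a minimal, self-contained Go sketch (not Istio code; the service key, ports, and addresses are invented) of the two map-update strategies the diff above swaps: plain assignment, where a second pass for the same svcKey overwrites the first, versus the per-port merge from #49573, where both ports survive.

package main

import "fmt"

func main() {
	// Two merged ServiceEntries share a hostname, so the registry is filled
	// twice for the same svcKey, each time with a different port map.
	svcKey := "httpbin.example.com/default"
	first := map[int][]string{8080: {"10.0.0.1:8080"}}  // endpoints copied on the first pass
	second := map[int][]string{9090: {"10.0.0.1:9090"}} // endpoints copied on the second pass

	// 1.20.3-style: plain assignment; the second pass overwrites the first,
	// so only one port (one cluster) remains visible.
	overwrite := map[string]map[int][]string{}
	overwrite[svcKey] = first
	overwrite[svcKey] = second
	fmt.Println("overwrite:", overwrite[svcKey]) // map[9090:[10.0.0.1:9090]]

	// 1.20.4-style (#49573): merge per port; both ports survive, so both
	// clusters show up.
	merged := map[string]map[int][]string{svcKey: {}}
	for _, pass := range []map[int][]string{first, second} {
		for port, eps := range pass {
			merged[svcKey][port] = eps
		}
	}
	fmt.Println("merged:", merged[svcKey]) // map[8080:[10.0.0.1:8080] 9090:[10.0.0.1:9090]]
}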

1.20.0 vs 1.19.8

#46329 Remove service.InstancesByPort

-               svcKey := s.Key()
-               // Precache instances
+               portMap := map[string]int{}
                for _, port := range s.Ports {
-                       if _, ok := ps.ServiceIndex.instancesByPort[svcKey]; !ok {
-                               ps.ServiceIndex.instancesByPort[svcKey] = make(map[int][]*ServiceInstance)
-                       }
-                       instances := make([]*ServiceInstance, 0)
-                       instances = append(instances, env.InstancesByPort(s, port.Port)...)
-                       ps.ServiceIndex.instancesByPort[svcKey][port.Port] = instances
+                       portMap[port.Name] = port.Port
                }

+               svcKey := s.Key()
+               if _, ok := ps.ServiceIndex.instancesByPort[svcKey]; !ok {
+                       ps.ServiceIndex.instancesByPort[svcKey] = make(map[int][]*IstioEndpoint)
+               }
+               shards, ok := env.EndpointIndex.ShardsForService(string(s.Hostname), s.Attributes.Namespace)
+               if ok {
+                       ps.ServiceIndex.instancesByPort[svcKey] = shards.CopyEndpoints(portMap)
+               }

In 1.19.8, InstancesByPort filtered by port. After that comparison logic was removed, we see two endpoints.

// InstancesByPort retrieves instances for a service on the given ports with labels that
// match any of the supplied labels. All instances match an empty tag list.
func (s *Controller) InstancesByPort(svc *model.Service, port int) []*model.ServiceInstance {
	out := make([]*model.ServiceInstance, 0)
	s.mutex.RLock()
	instanceLists := s.serviceInstances.getByKey(instancesKey{svc.Hostname, svc.Attributes.Namespace})
	s.mutex.RUnlock()
	for _, instance := range instanceLists {
		if portMatchSingle(instance, port) {
			out = append(out, instance)
		}
	}

	return out
}
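
To make the contrast concrete, a minimal Go sketch (again not Istio code; the endpoint struct and addresses are invented): filtering by port number, as the quoted InstancesByPort does, returns one endpoint per service port, while a CopyEndpoints-style copy keyed by port name pulls in every endpoint sharing that name, which is how a colliding port name yields two endpoints.

package main

import "fmt"

// endpoint is a simplified stand-in for an Istio endpoint with just the fields
// that matter here.
type endpoint struct {
	Address         string
	ServicePortName string
	Port            int
}

// Both merged ServiceEntries register an endpoint under the port name "http",
// but on different ports.
var all = []endpoint{
	{Address: "10.0.0.1", ServicePortName: "http", Port: 8080},
	{Address: "10.0.0.2", ServicePortName: "http", Port: 9090},
}

// byPortNumber mimics the old filter: only endpoints whose port matches are
// returned, so each service port sees a single endpoint.
func byPortNumber(port int) []endpoint {
	var out []endpoint
	for _, ep := range all {
		if ep.Port == port {
			out = append(out, ep)
		}
	}
	return out
}

// byPortName mimics a CopyEndpoints-style selection: every endpoint whose port
// name appears in portMap is copied under the mapped port number, so the
// colliding name "http" pulls in both endpoints.
func byPortName(portMap map[string]int) map[int][]endpoint {
	out := map[int][]endpoint{}
	for _, ep := range all {
		if port, ok := portMap[ep.ServicePortName]; ok {
			out[port] = append(out[port], ep)
		}
	}
	return out
}

func main() {
	fmt.Println("filter by port number 8080:", byPortNumber(8080))
	fmt.Println("copy by port name http->8080:", byPortName(map[string]int{"http": 8080}))
}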

I have not reproduced the issue, so I cannot guarantee that my understanding is correct.

Considering that a ServiceEntry's port.number and port.targetPort may differ, it is difficult to distinguish abnormal situations within initServiceRegistry.

hzxuzhonghu commented on June 2, 2024

We rely on the service port name in many places. In a K8s Service it must be unique within the service, but it is tricky in Istio because we support merging SEs.

It is not possible to prevent creating two SEs with the same port name, since they can be created concurrently. What can be done is for Istio to warn when duplicate port names like this exist.
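
A hypothetical sketch of such a warning (this helper does not exist in Istio; the types and field names are made up) that flags ServiceEntries reusing a port name for the same host with a different port number:

package main

import "log"

// sePort and serviceEntry are simplified stand-ins for the real API types.
type sePort struct {
	Name   string
	Number int
}

type serviceEntry struct {
	Name  string
	Hosts []string
	Ports []sePort
}

// warnDuplicatePortNames logs a warning when two ServiceEntries share a host
// and a port name but disagree on the port number.
func warnDuplicatePortNames(entries []serviceEntry) {
	seen := map[string]map[string]int{} // host -> port name -> port number
	for _, se := range entries {
		for _, host := range se.Hosts {
			if seen[host] == nil {
				seen[host] = map[string]int{}
			}
			for _, p := range se.Ports {
				if prev, ok := seen[host][p.Name]; ok && prev != p.Number {
					log.Printf("warning: ServiceEntry %q reuses port name %q on host %q with port %d (previously %d)",
						se.Name, p.Name, host, p.Number, prev)
					continue
				}
				seen[host][p.Name] = p.Number
			}
		}
	}
}

func main() {
	// The colliding case from this issue: same host, same port name, different ports.
	warnDuplicatePortNames([]serviceEntry{
		{Name: "se-a", Hosts: []string{"httpbin.example.com"}, Ports: []sePort{{Name: "http", Number: 8080}}},
		{Name: "se-b", Hosts: []string{"httpbin.example.com"}, Ports: []sePort{{Name: "http", Number: 9090}}},
	})
}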

zirain commented on June 2, 2024

Found an issue: the following configuration works in 1.18 but fails in 1.20+ (not sure about 1.19).

apiVersion: v1
kind: Service
metadata:
  name: httpbin-ext
spec:
  externalName: httpbin.default.svc.cluster.local
  ports:
    - name: http
      port: 8080
      protocol: TCP
      targetPort: 8000
  type: ExternalName
---
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: httpbin-ext
spec:
  hosts:
    - httpbin.default.svc.cluster.local
  location: MESH_EXTERNAL
  ports:
    - name: http
      number: 8000
      protocol: HTTP
  resolution: DNS

In 1.20+, pilot sends the EDS below, which the proxy rejects with the following error:

2024-04-25T12:21:22.487063Z	warning	envoy config external/envoy/source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:138	gRPC config for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment rejected: malformed IP address: httpbin.default.svc.cluster.local. Consider setting resolver_name or setting cluster type to 'STRICT_DNS' or 'LOGICAL_DNS'	thread=17
{
    "clusterName": "outbound|8000||httpbin.default.svc.cluster.local",
    "endpoints": [
        {
            "locality": {},
            "lbEndpoints": [
                {
                    "endpoint": {
                        "address": {
                            "socketAddress": {
                                "address": "httpbin.default.svc.cluster.local",
                                "portValue": 8000
                            }
                        }
                    },
                    "metadata": {
                        "filterMetadata": {
                            "istio": {
                                "workload": ";;;;"
                            }
                        }
                    },
                    "loadBalancingWeight": 1
                },
                {
                    "endpoint": {
                        "address": {
                            "socketAddress": {
                                "address": "10.244.0.86",
                                "portValue": 8080
                            }
                        }
                    },
                    "healthStatus": "HEALTHY",
                    "metadata": {
                        "filterMetadata": {
                            "envoy.transport_socket_match": {
                                "tlsMode": "istio"
                            },
                            "istio": {
                                "workload": "httpbin;default;httpbin;v1;Kubernetes"
                            }
                        }
                    },
                    "loadBalancingWeight": 1
                }
            ],
            "loadBalancingWeight": 2
        }
    ]
}
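
For what it's worth, the rejection is consistent with Envoy's error text: the first lbEndpoint's address is a DNS name rather than an IP, and the error suggests a STRICT_DNS or LOGICAL_DNS cluster for that. A tiny illustration (not Envoy code) of the distinction:

package main

import (
	"fmt"
	"net"
)

func main() {
	// The two endpoint addresses from the ClusterLoadAssignment above.
	for _, addr := range []string{"httpbin.default.svc.cluster.local", "10.244.0.86"} {
		if net.ParseIP(addr) == nil {
			fmt.Printf("%s: not an IP; a DNS name here needs a STRICT_DNS or LOGICAL_DNS cluster\n", addr)
		} else {
			fmt.Printf("%s: valid IP address\n", addr)
		}
	}
}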

hzxuzhonghu commented on June 2, 2024

Let me investigate this regression

hzxuzhonghu commented on June 2, 2024

@howardjohn I kind of remember you changed it to use the target port of ExternalName services, which may be related.

howardjohn commented on June 2, 2024

Definitely not the cause of the original issue; maybe of @zirain's issue.

zirain commented on June 2, 2024

Before (1.18):

"httpbin.default.svc.cluster.local": {
        "default": {
            "Shards": {
                "Kubernetes/Kubernetes": [
                    {
                        "Labels": {
                            "app": "httpbin",
                            "kubernetes.io/hostname": "envoy-gateway-control-plane",
                            "pod-template-hash": "86b8ffc5ff",
                            "security.istio.io/tlsMode": "istio",
                            "service.istio.io/canonical-name": "httpbin",
                            "service.istio.io/canonical-revision": "v1",
                            "topology.istio.io/cluster": "Kubernetes",
                            "topology.istio.io/network": "",
                            "version": "v1"
                        },
                        "Address": "10.244.0.121",
                        "ServicePortName": "http",
                        "ServiceAccount": "spiffe://cluster.local/ns/default/sa/httpbin",
                        "Network": "",
                        "Locality": {
                            "Label": "",
                            "ClusterID": "Kubernetes"
                        },
                        "EndpointPort": 8080,
                        "LbWeight": 0,
                        "TLSMode": "istio",
                        "Namespace": "default",
                        "WorkloadName": "httpbin",
                        "HostName": "",
                        "SubDomain": "",
                        "HealthStatus": 1,
                        "NodeName": "envoy-gateway-control-plane"
                    }
                ]
            },
            "ServiceAccounts": {
                "spiffe://cluster.local/ns/default/sa/httpbin": {}
            }
        }
    }

After (1.21):

"httpbin.default.svc.cluster.local": {
        "default": {
            "Shards": {
                "External/Kubernetes": [
                    {
                        "Labels": null,
                        "Address": "httpbin.default.svc.cluster.local",
                        "ServicePortName": "http",
                        "ServiceAccount": "",
                        "Network": "",
                        "Locality": {
                            "Label": "",
                            "ClusterID": ""
                        },
                        "EndpointPort": 8000,
                        "LbWeight": 0,
                        "TLSMode": "disabled",
                        "Namespace": "",
                        "WorkloadName": "",
                        "HostName": "",
                        "SubDomain": "",
                        "HealthStatus": 0,
                        "NodeName": ""
                    }
                ],
                "Kubernetes/Kubernetes": [
                    {
                        "Labels": {
                            "app": "httpbin",
                            "kubernetes.io/hostname": "envoy-gateway-control-plane",
                            "pod-template-hash": "86b8ffc5ff",
                            "security.istio.io/tlsMode": "istio",
                            "service.istio.io/canonical-name": "httpbin",
                            "service.istio.io/canonical-revision": "v1",
                            "topology.istio.io/cluster": "Kubernetes",
                            "topology.istio.io/network": "",
                            "version": "v1"
                        },
                        "Address": "10.244.0.117",
                        "ServicePortName": "http",
                        "ServiceAccount": "spiffe://cluster.local/ns/default/sa/httpbin",
                        "Network": "",
                        "Locality": {
                            "Label": "",
                            "ClusterID": "Kubernetes"
                        },
                        "EndpointPort": 8080,
                        "LbWeight": 0,
                        "TLSMode": "istio",
                        "Namespace": "default",
                        "WorkloadName": "httpbin",
                        "HostName": "",
                        "SubDomain": "",
                        "HealthStatus": 1,
                        "NodeName": "envoy-gateway-control-plane"
                    }
                ]
            },
            "ServiceAccounts": {
                "spiffe://cluster.local/ns/default/sa/httpbin": {}
            }
        }
    }

ymesika commented on June 2, 2024

@zirain doesn't this deserve its own separate issue?

hzxuzhonghu commented on June 2, 2024

There may be some bug with the delta cluster builder.

If I create the SEs after a sidecar has started, it will only see one cluster:

root@kurator-linux-0001:~/go/src/istio.io/istio# istioctl pc cluster sleep-7656cf8794-srf59   |grep google
news.google.com                                         8080      -          outbound      STRICT_DNS    

But if I create the SEs before the sidecar starts, it will get two clusters:

root@kurator-linux-0001:~/go/src/istio.io/istio# k get pod
NAME                       READY   STATUS    RESTARTS   AGE
httpbin-86b8ffc5ff-b5cx9   2/2     Running   0          91m
sleep-7656cf8794-srf59     2/2     Running   0          35s

howardjohn commented on June 2, 2024

There may be some bug with the delta cluster builder.

If I create the SEs after a sidecar has started, it will only see one cluster:

root@kurator-linux-0001:~/go/src/istio.io/istio# istioctl pc cluster sleep-7656cf8794-srf59   |grep google
news.google.com                                         8080      -          outbound      STRICT_DNS    

But if I create the SEs before the sidecar starts, it will get two clusters:

root@kurator-linux-0001:~/go/src/istio.io/istio# k get pod
NAME                       READY   STATUS    RESTARTS   AGE
httpbin-86b8ffc5ff-b5cx9   2/2     Running   0          91m
sleep-7656cf8794-srf59     2/2     Running   0          35s

👍 I see the same

howardjohn commented on June 2, 2024

There may be some bug with the delta cluster builder.

Good find, this is a regression in 1.22. Fixed it up in #50712

zirain commented on June 2, 2024

@howardjohn @hzxuzhonghu is this fixed by #50691?

howardjohn commented on June 2, 2024

No, by #50711. I am finishing up tests, will be done in an hour
