
pulsar-helm-chart's Issues

Zookeeper Error: Could not find or load main class #

Describe the bug
I tried to create a new cluster from scratch, but ZooKeeper won't start.
For testing I changed the component sizes to:
1 x proxy
1 x broker
2 x zookeeper
3 x bookies
1 x recovery
I configured cert-manager to provide self-signed certificates with JKS and PKCS12.

Log:

[conf/zookeeper.conf] Applying config dataDir = /pulsar/data/zookeeper
[conf/zookeeper.conf] Adding config secureClientPort = 2281
[conf/zookeeper.conf] Adding config serverCnxnFactory = org.apache.zookeeper.server.NettyServerCnxnFactory
processing /pulsar/certs/zookeeper/tls.crt : len = 1724
processing /pulsar/certs/zookeeper/tls.key : len = 3272
processing /pulsar/certs/ca/ca.crt : len = 1302
Importing keystore /pulsar/zookeeper.p12 to /pulsar/zookeeper.keystore.jks...

Warning:
The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /pulsar/zookeeper.keystore.jks -destkeystore /pulsar/zookeeper.keystore.jks -deststoretype pkcs12".
Certificate was added to keystore
processing /pulsar/zookeeper.keystore.jks : len = 3795
processing /pulsar/zookeeper.truststore.jks : len = 1038
Current server id 1
Error: Could not find or load main class #

To Reproduce
Steps to reproduce the behavior:

  1. helm install --set initialize=true pulsar -f values.yaml ./pulsar

Expected behavior
Successfully initialize the cluster with near-default values.

Screenshots
(screenshot attached to the original issue)

Desktop (please complete the following information):

  • OS: IOS
  • Kubernetes: 1.18.6
  • Helm: 3.3.1


Replicate support for logger level found in StreamNative chart

Is your feature request related to a problem? Please describe.
A feature already developed in the StreamNative chart supports changing the logger level dynamically via a deployment for individual components of the cluster. https://github.com/streamnative/charts/pull/35/commits

Describe the solution you'd like
I would like for this solution to be merged into this chart. I am happy to create a PR for it if the team feels it is necessary.

Describe alternatives you've considered
I've considered doing the work myself and not contributing.

Pulsar components not starting when user-provided ZooKeepers are used

Describe the bug
Pulsar components do not start when user-provided ZooKeepers are used.

To Reproduce
Use the following values.yaml when installing pulsar

namespace: pulsar

components:
  zookeeper: false

volumes:
  persistence: true
  local_storage: false

monitoring:
  prometheus: false
  grafana: false
  node_exporter: false
  alert_manager: false

pulsar_metadata:
  userProvidedZookeepers: "zookeeper.zookeeper:2181"

bookkeeper:
  replicaCount: 2
  volumes:
    journal:
      size: 2Gi
    ledgers:
      size: 5Gi

broker:
  replicaCount: 1

proxy:
  replicaCount: 1

Execute kubectl logs pulsar-bookie-0 -n pulsar -c pulsar-bookkeeper-verify-clusterid and see the repeating exceptions:

08:28:20.955 [main-SendThread(pulsar-zookeeper:2181)] WARN  org.apache.zookeeper.ClientCnxn - Session 0x0 for server pulsar-zookeeper:2181, unexpected error, closing socket connection and attempting reconnect
java.lang.IllegalArgumentException: Unable to canonicalize address pulsar-zookeeper:2181 because it's not resolvable
        at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:71) ~[org.apache.pulsar-pulsar-zookeeper-2.6.0.jar:2.6.0]
        at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:39) ~[org.apache.pulsar-pulsar-zookeeper-2.6.0.jar:2.6.0]
        at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1087) ~[org.apache.pulsar-pulsar-zookeeper-2.6.0.jar:2.6.0]
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1139) [org.apache.pulsar-pulsar-zookeeper-2.6.0.jar:2.6.0]
08:28:22.056 [main-SendThread(pulsar-zookeeper:2181)] ERROR org.apache.zookeeper.client.StaticHostProvider - Unable to resolve address: pulsar-zookeeper:2181
java.net.UnknownHostException: pulsar-zookeeper
        at java.net.InetAddress.getAllByName0(InetAddress.java:1281) ~[?:1.8.0_252]
        at java.net.InetAddress.getAllByName(InetAddress.java:1193) ~[?:1.8.0_252]
        at java.net.InetAddress.getAllByName(InetAddress.java:1127) ~[?:1.8.0_252]
        at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:92) ~[org.apache.pulsar-pulsar-zookeeper-2.6.0.jar:2.6.0]
        at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:147) [org.apache.pulsar-pulsar-zookeeper-2.6.0.jar:2.6.0]
        at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:375) [org.apache.pulsar-pulsar-zookeeper-2.6.0.jar:2.6.0]
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1137) [org.apache.pulsar-pulsar-zookeeper-2.6.0.jar:2.6.0]

Expected behavior
Pulsar starts and connects to provided zookeeper.

Additional context

$ helm show chart apache/pulsar
apiVersion: v1
appVersion: "1.0"
description: Apache Pulsar Helm chart for Kubernetes
home: https://pulsar.apache.org
icon: http://pulsar.apache.org/img/pulsar.svg
maintainers:
- email: [email protected]
  name: The Apache Pulsar Team
name: pulsar
sources:
- https://github.com/apache/pulsar
version: 2.6.0

Helm chart doesn't work with older Pulsar versions

Describe the bug
When using the official Helm chart to deploy a K8s-based Pulsar cluster for version 2.6.x, the broker pod is stuck in the "wait-bookkeeper-ready" check although the bookie pod is up and running without any issue; see the Pod list below:

$ kubectl -n pulsar get pod
NAME                                             READY   STATUS      RESTARTS   AGE
mytest-pulsar1-bookie-0                          1/1     Running     0          7m14s
mytest-pulsar1-bookie-init-z8gtb                 0/1     Completed   0          7m14s
mytest-pulsar1-broker-0                          0/1     Init:1/2    0          7m14s
mytest-pulsar1-grafana-7bcb854cf4-lmbmj          1/1     Running     0          7m15s
mytest-pulsar1-prometheus-6f79d5c86c-2fdvt       1/1     Running     0          7m15s
mytest-pulsar1-proxy-0                           0/1     Init:1/2    0          7m14s
mytest-pulsar1-pulsar-init-hln7v                 0/1     Completed   0          7m14s
mytest-pulsar1-pulsar-manager-6959fb64d4-tl65f   1/1     Running     0          7m15s
mytest-pulsar1-recovery-0                        1/1     Running     0          7m15s
mytest-pulsar1-toolset-0                         1/1     Running     0          7m15s
mytest-pulsar1-zookeeper-0                       1/1     Running     0          7m14s

It looks like it is stuck in the following check in the "wait-bookkeeper-ready" init container:

until bin/bookkeeper shell whatisinstanceid; do
  echo "bookkeeper cluster is not initialized yet. backoff for 3 seconds ...";
  sleep 3;
done;

When I manually run "bin/bookkeeper shell whatisinstanceid" in the "wait-bookkeeper-ready" init container, the result is as below and it looks fine to me:

...
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=5.4.0-1029-gke
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/pulsar
20:53:30.932 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.memory.free=899MB
20:53:30.933 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.memory.max=1024MB
20:53:30.933 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.memory.total=1024MB
20:53:30.940 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=mytest-pulsar1-zookeeper:2181 sessionTimeout=30000 watcher=org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase@5e4bd84a
20:53:30.948 [main] INFO  org.apache.zookeeper.common.X509Util - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
20:53:30.957 [main] INFO  org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 4194304 Bytes
20:53:30.967 [main] INFO  org.apache.zookeeper.ClientCnxn - zookeeper.request.timeout value is 0. feature enabled=
20:53:30.986 [main-SendThread(mytest-pulsar1-zookeeper:2181)] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server mytest-pulsar1-zookeeper/10.100.1.36:2181. Will not attempt to authenticate using SASL (unknown error)
20:53:30.994 [main-SendThread(mytest-pulsar1-zookeeper:2181)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established, initiating session, client: /10.100.0.168:55470, server: mytest-pulsar1-zookeeper/10.100.1.36:2181
20:53:31.007 [main-SendThread(mytest-pulsar1-zookeeper:2181)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server mytest-pulsar1-zookeeper/10.100.1.36:2181, sessionid = 0x10000b712950011, negotiated timeout = 30000
20:53:31.011 [main-EventThread] INFO  org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase - ZooKeeper client is connected now.
20:53:31.040 [main] INFO  org.apache.bookkeeper.tools.cli.commands.bookies.InstanceIdCommand - Metadata Service Uri: zk+null://mytest-pulsar1-zookeeper:2181/ledgers InstanceId: 0b091500-6750-479f-b419-25d957d5a4e0
20:53:31.147 [main-EventThread] INFO  org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x10000b712950011
20:53:31.148 [main] INFO  org.apache.zookeeper.ZooKeeper - Session: 0x10000b712950011 closed
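
Note that the until loop above retries based on the command's exit status, not its output. Given the healthy-looking log, one hypothesis (an assumption, not verified against the 2.6.x binary) is that whatisinstanceid exits non-zero on those versions even when it succeeds, so the loop never terminates. A minimal pure-shell illustration of that failure mode:

```shell
# 'until CMD' keeps looping while CMD exits non-zero, no matter what it prints.
# Simulate a command that prints healthy output but still exits 1:
check() { echo "InstanceId: 0b091500-6750-479f-b419-25d957d5a4e0"; return 1; }

tries=0
until check || [ "$tries" -ge 3 ]; do   # capped at 3 tries for this demo
  tries=$((tries + 1))
  echo "bookkeeper cluster is not initialized yet. backoff ($tries)"
  sleep 0                               # the real init container sleeps 3s
done
echo "stopped after $tries retries"     # the real loop has no cap and spins forever
```

Running bin/bookkeeper shell whatisinstanceid; echo $? inside the init container would confirm or refute this hypothesis.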

To Reproduce
Just follow the official procedure, except specify an older Pulsar version in the values.yaml file, as below:

images:
  zookeeper:
    repository: apachepulsar/pulsar-all
    tag: 2.6.1
    pullPolicy: IfNotPresent
  bookie:
    repository: apachepulsar/pulsar-all
    tag: 2.6.1
    pullPolicy: IfNotPresent
  autorecovery:
    repository: apachepulsar/pulsar-all
    tag: 2.6.1
    pullPolicy: IfNotPresent
  broker:
    repository: apachepulsar/pulsar-all
    tag: 2.6.1
    pullPolicy: IfNotPresent
  proxy:
    repository: apachepulsar/pulsar-all
    tag: 2.6.1
    pullPolicy: IfNotPresent
  functions:
    repository: apachepulsar/pulsar-all
    tag: 2.6.1

I also tested on version 2.6.2, and this time both the bookie and broker Pods got stuck in the same check.
The chart works with 2.7.0, though, without any issue.

Expected behavior
All Pulsar Pods should be up and running, as demonstrated with version 2.7.0.


Desktop (please complete the following information):

  • GKE (Ubuntu); K8s version: 1.17.14-gke.1600


broker/bookkeeper init pods fail - Error: Could not find or load main class "-Xms1g

Describe the bug
Init container pulsar-bookkeeper-verify-clusterid fails to start with the following log:

[conf/bookkeeper.conf] Applying config ledgerDirectories = /pulsar/data/bookkeeper/ledgers
[conf/bookkeeper.conf] Applying config statsProviderClass = org.apache.bookkeeper.stats.prometheus.PrometheusMetricsProvider
[conf/bookkeeper.conf] Applying config useHostNameAsBookieID = true
[conf/bookkeeper.conf] Applying config zkLedgersRootPath = /ledgers
[conf/bookkeeper.conf] Applying config zkServers = pulsar-zookeeper:2181
[conf/bookkeeper.conf] Adding config journalDirectories = /pulsar/data/bookkeeper/journal
JMX enabled by default
Error: Could not find or load main class "-Xms1g
JMX enabled by default
Error: Could not find or load main class "-Xms1g
JMX enabled by default
Error: Could not find or load main class "-Xms1g
.
.
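
This error shape usually points at a quoting problem rather than a missing class: if a memory-settings environment variable (for example BOOKIE_MEM or PULSAR_MEM; which variable is involved here is an assumption) carries literal double quotes in its value, unquoted expansion into the java command line splits it into words with the quotes intact, and java treats the token "-Xms1g as the main class name. A pure-shell illustration:

```shell
# The bug in miniature: literal quotes inside a variable's value survive
# word splitting, so the first argument java would see starts with a quote.
BOOKIE_MEM='"-Xms1g -Xmx1g"'       # value contains literal double quotes
set -- $BOOKIE_MEM                 # simulates unquoted expansion: java $BOOKIE_MEM ...
echo "java's first argument: $1"   # → java's first argument: "-Xms1g
```

Removing the quotes from the value (or quoting the expansion in the wrapper script) avoids the splitting.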

To Reproduce
Steps to reproduce the behavior:

  1. Package a chart from this repo (no change in config)
  2. broker, recovery, proxy, bookkeeper pod creation fails
  3. check init container logs kubectl logs pulsar-bookkeeper-0 -c pulsar-bookkeeper-verify-clusterid -n pulsar

Env
Azure AKS

Support multiple volumes in bookkeeper

Is your feature request related to a problem? Please describe.
No.

Describe the solution you'd like
Add an option for the user to choose whether to use multiple volumes in bookkeeper, especially when using local-storage.

Describe alternatives you've considered
I would like to open a PR for this; it works well in my cluster.

Additional context
After enabling multiple volumes, the user should manually update the Ledgers Disk Usage (Bookie Metric) query to get precise values.

Change Pulsar Manager docker image version

When activating JWT token authentication, Pulsar Manager does not connect to the broker. The manager image v0.1.0 does not support JWT configuration.

The new version, v0.2.0, adds this support.

The solution is to change the Pulsar Manager image version in the Helm chart.

The current config in Helm chart version pulsar-2.6.2-1:
(screenshot attached to the original issue)

The correction should look like:
(screenshot attached to the original issue)
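
A sketch of the corrected values.yaml (the images.pulsar_manager key and repository name follow the chart's conventions but are assumed here, not verified against pulsar-2.6.2-1):

```yaml
images:
  pulsar_manager:
    repository: apachepulsar/pulsar-manager
    tag: v0.2.0        # was v0.1.0, which lacks JWT support
    pullPolicy: IfNotPresent
```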

Bump missed-out Pulsar image tags to 2.6.0

Describe the bug
A few Pulsar image tags are still pointing to 2.5.0.

References

  1. .ci/clusters/values-pulsar-image.yaml
  2. charts/pulsar/values.yaml
  3. examples/values-one-node.yaml
  4. examples/values-pulsar.yaml

Expected behavior
Pulsar Image tags should point to 2.6.0

Decouple credentials from key secrets generation

Is your feature request related to a problem? Please describe.

As suggested here: https://pulsar.apache.org/docs/en/helm-deploy/#prepare-the-helm-release, the prepare_helm_release.sh script provided with this Helm chart can create a secret credentials resource, where:

The username and password are used for logging into Grafana dashboard and Pulsar Manager.

However, I haven't been able to make use of such a feature for a number of reasons:

  1. This secret doesn't seem to affect the pulsar-manager-deployment.yaml definition. Instead, the ./templates/pulsar-manager-admin-secret.yaml seems to be the one providing the credentials for the pulsar manager (UI) (with the added possibility to overwrite via values.yaml at pulsar_manager.admin.user/password).

  2. Using the Pulsar chart as a dependency of an umbrella chart (currently my use case) brings extra hassle that makes it very hard to have all resources follow the same naming structure, causing some resources to never deploy successfully. E.g., ./templates/grafana-deployment.yaml will complain that it cannot find the secret created by the bash script. Attempting to fix this via the -k flag passed to the script causes the JWT secret tokens to have names the broker does not expect, etc.

Describe the solution you'd like

The bash script should focus on the JWT secrets creation and nothing else.

The generation of user/password credentials for Grafana and Pulsar Manager should be delegated to individual secret resources that expose such credentials for overriding via values.yaml. This provides flexibility in credentials usage/overwrite on a per-component basis.

Please advise.

Cert-manager resources api-version is hardcoded

Describe the bug
Enabling certs.internal_issuer.enabled=true results in a failed setup because the CRDs for cert-manager's resources do not exist:

Error: unable to build kubernetes objects from release manifest: [unable to recognize "": no matches for kind "Certificate" in version "cert-manager.io/v1alpha2", unable to recognize "": no matches for kind "Issuer" in version "cert-manager.io/v1alpha2"]

cert-manager has since released a stable version with the updated apiVersion cert-manager.io/v1.
FYI: there are many cert-manager API versions: v1alpha2, v1alpha3, v1beta1.

To Reproduce
helm upgrade --install pulsar -n pulsar apache/pulsar --set certs.internal_issuer.enabled=true

Expected behavior
Make the external CRD apiVersions configurable in values.yaml.
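
One possible shape for this, sketched as a hypothetical values.yaml knob (the apiVersion key under certs.internal_issuer does not exist today; this only illustrates the request):

```yaml
certs:
  internal_issuer:
    enabled: true
    # Hypothetical: templates would render Certificate/Issuer resources
    # with this apiVersion instead of the hardcoded cert-manager.io/v1alpha2.
    apiVersion: cert-manager.io/v1
```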

Broker ClusterRole resources bug

Describe the bug
I get the following error log on k8s:

10:33:59.441 [Timer-0] ERROR org.apache.pulsar.functions.runtime.kubernetes.KubernetesRuntimeFactory - Error while trying to fetch configmap pulsar-mini-functions-worker-config at namespace default
io.kubernetes.client.openapi.ApiException: Forbidden
	at io.kubernetes.client.openapi.ApiClient.handleResponse(ApiClient.java:971) ~[io.kubernetes-client-java-api-9.0.2.jar:?]
	at io.kubernetes.client.openapi.ApiClient.execute(ApiClient.java:883) ~[io.kubernetes-client-java-api-9.0.2.jar:?]
	at io.kubernetes.client.openapi.apis.CoreV1Api.readNamespacedConfigMapWithHttpInfo(CoreV1Api.java:44821) ~[io.kubernetes-client-java-api-9.0.2.jar:?]
	at io.kubernetes.client.openapi.apis.CoreV1Api.readNamespacedConfigMap(CoreV1Api.java:44791) ~[io.kubernetes-client-java-api-9.0.2.jar:?]
	at org.apache.pulsar.functions.runtime.kubernetes.KubernetesRuntimeFactory.fetchConfigMap(KubernetesRuntimeFactory.java:369) [org.apache.pulsar-pulsar-functions-runtime-2.7.0.jar:2.7.0]
	at org.apache.pulsar.functions.runtime.kubernetes.KubernetesRuntimeFactory$1.run(KubernetesRuntimeFactory.java:358) [org.apache.pulsar-pulsar-functions-runtime-2.7.0.jar:2.7.0]
	at java.util.TimerThread.mainLoop(Timer.java:555) [?:1.8.0_275]
	at java.util.TimerThread.run(Timer.java:505) [?:1.8.0_275]

To Reproduce
Steps to reproduce the behavior:

  1. deploy pulsar on k8s

Expected behavior
No Forbidden errors: the broker's ClusterRole should grant read access to the functions-worker ConfigMap.

Desktop (please complete the following information):

  • OS: k8s

Impossible to upgrade fully from 2.5.0 to 2.6.0 using Helm

While trying to upgrade our 2.5.0 deployment to 2.6.0 we encountered an issue.

Describe the bug
When using Helm to upgrade from 2.5.0 to 2.6.0, the process fails on the Init Jobs.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy a 2.5.0 version of Pulsar using Helm
  2. Upgrade it to 2.6.0 still using Helm
  3. Get the following (nasty/cryptic) error message:
Error: UPGRADE FAILED: cannot patch "pulsar-staging-bookie-init" with kind Job: Job.batch "pulsar-staging-bookie-init" is invalid: spec.template: Invalid value: core.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:"", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"controller-uid":"1951b64b-f3ff-42cf-ad9b-ed1dea7dfd08", "job-name":"pulsar-staging-bookie-init"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:core.PodSpec{Volumes:[]core.Volume(nil), InitContainers:[]core.Container{core.Container{Name:"wait-zookeeper-ready", Image:"apachepulsar/pulsar-all:2.6.0", Command:[]string{"sh", "-c"}, Args:[]string{"until nslookup pulsar-staging-zookeeper-2.pulsar-staging-zookeeper.pulsar; do\n  sleep 3;\ndone;"}, WorkingDir:"", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource(nil), Env:[]core.EnvVar(nil), Resources:core.ResourceRequirements{Limits:core.ResourceList(nil), Requests:core.ResourceList(nil)}, VolumeMounts:[]core.VolumeMount(nil), VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), StartupProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:"/dev/termination-log", TerminationMessagePolicy:"File", ImagePullPolicy:"IfNotPresent", SecurityContext:(*core.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}}, Containers:[]core.Container{core.Container{Name:"pulsar-staging-bookie-init", Image:"apachepulsar/pulsar-all:2.6.0", Command:[]string{"sh", "-c"}, Args:[]string{"bin/apply-config-from-env.py conf/bookkeeper.conf;\nif bin/bookkeeper shell whatisinstanceid; then\n    echo \"bookkeeper cluster already 
initialized\";\nelse\n    bin/bookkeeper shell initnewcluster;\nfi\n"}, WorkingDir:"", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource{core.EnvFromSource{Prefix:"", ConfigMapRef:(*core.ConfigMapEnvSource)(0xc010895aa0), SecretRef:(*core.SecretEnvSource)(nil)}}, Env:[]core.EnvVar(nil), Resources:core.ResourceRequirements{Limits:core.ResourceList(nil), Requests:core.ResourceList(nil)}, VolumeMounts:[]core.VolumeMount(nil), VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), StartupProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:"/dev/termination-log", TerminationMessagePolicy:"File", ImagePullPolicy:"IfNotPresent", SecurityContext:(*core.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}}, EphemeralContainers:[]core.EphemeralContainer(nil), RestartPolicy:"Never", TerminationGracePeriodSeconds:(*int64)(0xc008116c80), ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:"ClusterFirst", NodeSelector:map[string]string(nil), ServiceAccountName:"", AutomountServiceAccountToken:(*bool)(nil), NodeName:"", SecurityContext:(*core.PodSecurityContext)(0xc0158c69a0), ImagePullSecrets:[]core.LocalObjectReference(nil), Hostname:"", Subdomain:"", Affinity:(*core.Affinity)(nil), SchedulerName:"default-scheduler", Tolerations:[]core.Toleration(nil), HostAliases:[]core.HostAlias(nil), PriorityClassName:"", Priority:(*int32)(nil), PreemptionPolicy:(*core.PreemptionPolicy)(nil), DNSConfig:(*core.PodDNSConfig)(nil), ReadinessGates:[]core.PodReadinessGate(nil), RuntimeClassName:(*string)(nil), Overhead:core.ResourceList(nil), EnableServiceLinks:(*bool)(nil), TopologySpreadConstraints:[]core.TopologySpreadConstraint(nil)}}: field is immutable && cannot patch "pulsar-staging-pulsar-init" with kind Job: Job.batch "pulsar-staging-pulsar-init" is invalid: spec.template: Invalid value: core.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:"", GenerateName:"", Namespace:"", SelfLink:"", 
UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"controller-uid":"f967eb35-a5ce-4a44-97b2-12d263b2df4c", "job-name":"pulsar-staging-pulsar-init"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:core.PodSpec{Volumes:[]core.Volume(nil), InitContainers:[]core.Container{core.Container{Name:"wait-zookeeper-ready", Image:"apachepulsar/pulsar-all:2.6.0", Command:[]string{"sh", "-c"}, Args:[]string{"until nslookup pulsar-staging-zookeeper-2.pulsar-staging-zookeeper.pulsar; do\n  sleep 3;\ndone;"}, WorkingDir:"", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource(nil), Env:[]core.EnvVar(nil), Resources:core.ResourceRequirements{Limits:core.ResourceList(nil), Requests:core.ResourceList(nil)}, VolumeMounts:[]core.VolumeMount(nil), VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), StartupProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:"/dev/termination-log", TerminationMessagePolicy:"File", ImagePullPolicy:"IfNotPresent", SecurityContext:(*core.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}, core.Container{Name:"pulsar-bookkeeper-verify-clusterid", Image:"apachepulsar/pulsar-all:2.6.0", Command:[]string{"sh", "-c"}, Args:[]string{"bin/apply-config-from-env.py conf/bookkeeper.conf;\nuntil bin/bookkeeper shell whatisinstanceid; do\n  sleep 3;\ndone;\n"}, WorkingDir:"", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource{core.EnvFromSource{Prefix:"", ConfigMapRef:(*core.ConfigMapEnvSource)(0xc015b2bec0), SecretRef:(*core.SecretEnvSource)(nil)}}, Env:[]core.EnvVar(nil), Resources:core.ResourceRequirements{Limits:core.ResourceList(nil), 
Requests:core.ResourceList(nil)}, VolumeMounts:[]core.VolumeMount(nil), VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), StartupProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:"/dev/termination-log", TerminationMessagePolicy:"File", ImagePullPolicy:"IfNotPresent", SecurityContext:(*core.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}}, Containers:[]core.Container{core.Container{Name:"pulsar-staging-pulsar-init", Image:"apachepulsar/pulsar-all:2.6.0", Command:[]string{"sh", "-c"}, Args:[]string{"\nbin/pulsar initialize-cluster-metadata \\\n  --cluster pulsar-staging \\\n  --zookeeper pulsar-staging-zookeeper:2181 \\\n  --configuration-store pulsar-staging-zookeeper:2181 \\\n  --web-service-url http://pulsar-staging-broker.pulsar.svc.cluster.local:8080/ \\\n  --web-service-url-tls https://pulsar-staging-broker.pulsar.svc.cluster.local:8443/ \\\n  --broker-service-url pulsar://pulsar-staging-broker.pulsar.svc.cluster.local:6650/ \\\n  --broker-service-url-tls pulsar+ssl://pulsar-staging-broker.pulsar.svc.cluster.local:6651/ || true;\n"}, WorkingDir:"", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource(nil), Env:[]core.EnvVar(nil), Resources:core.ResourceRequirements{Limits:core.ResourceList(nil), Requests:core.ResourceList(nil)}, VolumeMounts:[]core.VolumeMount(nil), VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), StartupProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:"/dev/termination-log", TerminationMessagePolicy:"File", ImagePullPolicy:"IfNotPresent", SecurityContext:(*core.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}}, EphemeralContainers:[]core.EphemeralContainer(nil), RestartPolicy:"Never", TerminationGracePeriodSeconds:(*int64)(0xc017ff6e98), ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:"ClusterFirst", 
NodeSelector:map[string]string(nil), ServiceAccountName:"", AutomountServiceAccountToken:(*bool)(nil), NodeName:"", SecurityContext:(*core.PodSecurityContext)(0xc017c5a070), ImagePullSecrets:[]core.LocalObjectReference(nil), Hostname:"", Subdomain:"", Affinity:(*core.Affinity)(nil), SchedulerName:"default-scheduler", Tolerations:[]core.Toleration(nil), HostAliases:[]core.HostAlias(nil), PriorityClassName:"", Priority:(*int32)(nil), PreemptionPolicy:(*core.PreemptionPolicy)(nil), DNSConfig:(*core.PodDNSConfig)(nil), ReadinessGates:[]core.PodReadinessGate(nil), RuntimeClassName:(*string)(nil), Overhead:core.ResourceList(nil), EnableServiceLinks:(*bool)(nil), TopologySpreadConstraints:[]core.TopologySpreadConstraint(nil)}}: field is immutable

Expected behavior
Helm upgrade completing successfully.

Additional context
After some googling, it seems that changing a container image inside a Job is not supported (trying to change the image with a command like kubectl edit jobs.batch pulsar-pulsar-init yields the same error message).

Our workaround was to delete the two Jobs before upgrading. As this seems to be a Kubernetes limitation, the upgrade doc should probably be updated to avoid confusion.
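
Concretely, the workaround looked like this (a sketch, not an official procedure; release name and namespace are taken from the error above, so adjust to your deployment):

```shell
# Delete the immutable init Jobs left over from the 2.5.0 release...
kubectl delete job pulsar-staging-bookie-init pulsar-staging-pulsar-init -n pulsar
# ...then re-run the upgrade so Helm recreates them with the new image.
helm upgrade pulsar-staging apache/pulsar -f values.yaml
```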

Ingress hostname should not be required

Describe the bug
When creating ingresses for services such as the proxy or pulsar-manager, the hostname value should be optional.

To Reproduce
Steps to reproduce the behavior:

  1. Create an ingress without a hostname specified:

proxy:
  ingress:
    enabled: true
    path: "/pulsar-manager/"

  2. Try installing or upgrading the chart

Expected behavior
An ingress should be created.

Additional context
I can submit a PR to make the hostname optional.
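
For illustration, the template could guard the host field so it is only rendered when set; a sketch of a Helm template conditional (the value path and surrounding structure are assumed, not copied from the chart):

```yaml
spec:
  rules:
    - http:
        paths:
          - path: {{ .Values.proxy.ingress.path }}
      {{- with .Values.proxy.ingress.hostname }}
      host: {{ . }}
      {{- end }}
```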

Grafana dashboards not entirely working

Describe the bug
Today @nickelozz and I created a new Pulsar install and noticed some broken Grafana dashboards.

Broken dashboards:

  • Proxy Metrics: Empty
  • Node Metrics: Empty
  • Overview: the Storage/Backlog panel fails with "only queries returning single series or tables are supported", and the Nodes section shows no data
  • Pulsar Logs: Empty

To Reproduce
Steps to reproduce the behavior:

  1. Fresh helm install from this Github repo
  2. Establish consumer & producer through proxy
  3. Open up Grafana
  4. Go to the listed dashboards

Expected behavior
Dashboards load the correct data; the chart should include an agent for the missing data, or instructions on installing one where required (for example, the node metrics).

Screenshots
(four screenshots attached to the original issue)

Additional context
We're using the latest Helm chart commit here, 06652d7.

Helm chart fails to install with TLS and Auth enabled - failed to sync secret cache: timed out waiting for the condition

Describe the bug

After enabling TLS and Authentication, the helm chart fails to install. (The pods hang in an invalid state.)
The issue preventing the pods from starting appears to be this:

MountVolume.SetUp failed for volume "zookeeper-certs" : failed to sync secret cache: timed out waiting for the condition

It is not clear why the secret cache is timing out.

To Reproduce
Here are the exact steps to reproduce this issue:

$ git clone https://github.com/apache/pulsar-helm-chart
$ cd pulsar-helm-chart
$ cat > ./examples/values-minikube.yaml
volumes:
  persistence: false
affinity:
  anti_affinity: false
components:
  autorecovery: false
zookeeper:
  replicaCount: 1
bookkeeper:
  replicaCount: 1
broker:
  replicaCount: 1
  configData:
    autoSkipNonRecoverableData: "true"
    managedLedgerDefaultEnsembleSize: "1"
    managedLedgerDefaultWriteQuorum: "1"
    managedLedgerDefaultAckQuorum: "1"
proxy:
  replicaCount: 1
tls:
  enabled: true
  bookie:
    enabled: true
  autorecovery:
    enabled: true
  toolset:
    enabled: true
  proxy:
    enabled: true
  broker:
    enabled: true
  zookeeper:
    enabled: true
auth:
  authentication:
    enabled: false
    provider: "jwt"
    jwt:
      usingSecretKey: false
  authorization:
    enabled: true
  superUsers:
    broker: "broker-admin"
    proxy: "proxy-admin"
    client: "client-admin"

(press Ctrl+D to end the file input)

$ minikube start --memory=8192 --cpus=4
$ ./scripts/pulsar/prepare_helm_release.sh -n pulsar -k pulsar-mini -c --pulsar-superusers superadmin,proxy-admin,broker-admin,client-admin
$ ./scripts/pulsar/upload_tls.sh -k pulsar-mini -d ./.ci/tls
$ helm install --values examples/values-minikube.yaml pulsar-mini apache/pulsar

$ kubectl get pods -n pulsar
shows the pods hanging in an incomplete state

$ kubectl describe pods -n pulsar
shows this issue:

Warning FailedMount 7m11s kubelet MountVolume.SetUp failed for volume "zookeeper-certs" : failed to sync secret cache: timed out waiting for the condition

Here is some additional context provided when running describe on the zookeeper pod:

Name: pulsar-mini-zookeeper-0
Namespace: pulsar
Priority: 0
Node: minikube/192.168.49.2
Start Time: Mon, 09 Nov 2020 23:45:21 -0700
Labels:       app=pulsar
              cluster=pulsar-mini
              component=zookeeper
              controller-revision-hash=pulsar-mini-zookeeper-59c4465569
              release=pulsar-mini
              statefulset.kubernetes.io/pod-name=pulsar-mini-zookeeper-0
Annotations:  prometheus.io/port: 8000
              prometheus.io/scrape: true
Status: Pending
IP:
IPs:
Controlled By: StatefulSet/pulsar-mini-zookeeper
Containers:
  pulsar-mini-zookeeper:
    Container ID:
    Image:       apachepulsar/pulsar-all:2.6.0
    Image ID:
    Ports:       2181/TCP, 2888/TCP, 3888/TCP, 2281/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      sh
      -c
    Args:
      bin/apply-config-from-env.py conf/zookeeper.conf;
      /pulsar/keytool/keytool.sh zookeeper ${HOSTNAME}.pulsar-mini-zookeeper.pulsar.svc.cluster.local false; bin/generate-zookeeper-config.sh conf/zookeeper.conf; bin/pulsar zookeeper;

    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   256Mi
    Liveness:   exec [bin/pulsar-zookeeper-ruok.sh] delay=10s timeout=1s period=30s #success=1 #failure=10
    Readiness:  exec [bin/pulsar-zookeeper-ruok.sh] delay=10s timeout=1s period=30s #success=1 #failure=10
    Environment Variables from:
      pulsar-mini-zookeeper  ConfigMap  Optional: false
    Environment:
      ZOOKEEPER_SERVERS:  pulsar-mini-zookeeper-0
    Mounts:
      /pulsar/certs/ca from ca (ro)
      /pulsar/certs/zookeeper from zookeeper-certs (ro)
      /pulsar/data from pulsar-mini-zookeeper-data (rw)
      /pulsar/keytool/keytool.sh from keytool (rw,path="keytool.sh")
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vtl5l (ro)

Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
pulsar-mini-zookeeper-data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit:
zookeeper-certs:
Type: Secret (a volume populated by a Secret)
SecretName: pulsar-mini-tls-zookeeper
Optional: false
ca:
Type: Secret (a volume populated by a Secret)
SecretName: pulsar-mini-ca-tls
Optional: false
keytool:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: pulsar-mini-keytool-configmap
Optional: false
default-token-vtl5l:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-vtl5l
Optional: false
QoS Class: Burstable
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message


Normal Scheduled 7m12s default-scheduler Successfully assigned pulsar/pulsar-mini-zookeeper-0 to minikube
Warning FailedMount 7m11s kubelet MountVolume.SetUp failed for volume "zookeeper-certs" : failed to sync secret cache: timed out waiting for the condition
Normal Pulling 7m9s kubelet Pulling image "apachepulsar/pulsar-all:2.6.0"

Expected behavior

Installing the helm chart with the provided values should start the Pulsar cluster in minikube with TLS and authentication enabled.

Environment:

😄 minikube v1.14.2 on Darwin 10.15.7
✨ Using the docker driver based on existing profile
🐳 Preparing Kubernetes v1.19.2 on Docker 19.03.8 ...
🌟 Enabled addons: storage-provisioner, default-storageclass
🏄 kubectl is configured to use "minikube" by default

bookkeeper-statefulset.yaml has indentation issue in volumeMounts array

Describe the bug
volumeMounts:
{{- if .Values.bookkeeper.volumes.useSingleCommonVolume }}
      - name: "{{ template "pulsar.fullname" . }}-{{ .Values.bookkeeper.component }}-{{ .Values.bookkeeper.volumes.common.name }}"
        mountPath: /pulsar/data/bookkeeper

The list entry is indented too deep relative to volumeMounts; it should be:

    - name: "{{ template "pulsar.fullname" . }}-{{ .Values.bookkeeper.component }}-{{ .Values.bookkeeper.volumes.common.name }}"
      mountPath: /pulsar/data/bookkeeper

To Reproduce
Steps to reproduce the behavior:
When tls.zookeeper.enabled is set to true, this indentation issue causes "helm template" to fail with a YAML-to-JSON conversion error.

Expected behavior
The chart should render valid YAML, with the volumeMounts entries correctly indented.

Error: unable to build kubernetes objects from release manifest - Validation Error

Describe the bug
Consistently get the following error stack when running helm install

Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: ValidationError(Deployment.spec.template.spec.containers[0].ports[0]): missing required field "containerPort" in io.k8s.api.core.v1.ContainerPort

Assumptions

  1. k8s setup
  2. HELM available

To Reproduce
Steps to reproduce the behavior:

  1. git clone https://github.com/apache/pulsar-helm-chart.git
  2. git checkout master
  3. cd pulsar-helm-chart
  4. scripts/pulsar/prepare_helm_release.sh -n pulsar -c
  5. helm install --generate-name --values=$PWD/examples/values-one-node.yaml charts/pulsar

Expected behavior
NAME: pulsar-******** LAST DEPLOYED: ***** NAMESPACE: default STATUS: deployed REVISION: 1 TEST SUITE: None

Screenshots
pulsar-helm-error

Desktop (please complete the following information):

  • macOS - 10.15.5
  • Docker Desktop - 2.3.0.3
  • Kubernetes - v1.16.5
  • HELM - v3.1.1
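This ValidationError usually means a port entry rendered without its containerPort value, for example because the corresponding port value in values.yaml is unset or mis-referenced by the template. A valid rendered port entry has to carry containerPort; the names and number below are illustrative, not taken from the chart:

```yaml
# A rendered container port entry must include containerPort.
# An entry rendered as only "- name: http" (value missing) triggers the
# ValidationError above.
ports:
- name: http          # illustrative name
  containerPort: 8080 # required field; must resolve to a non-empty number
```

Inspecting the output of helm template for an empty `containerPort:` line is a quick way to find which value failed to resolve.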

Pulsar manager always uses default user/password

Expected behavior

Providing non default username and password for Pulsar Admin Manager component should actually change the username and password

Actual behavior

Pulsar admin manager always uses the default username and password (not secure)

Steps to reproduce

helm upgrade --install --set pulsar_manager.admin.password=myspecialpassword --set pulsar_manager.admin.user=myspecialuser --set initialize=true --values examples/values-minikube.yaml --namespace pulsar pulsar-mini apache/pulsar

*I am aware that the default chart does not restart pods when the secret changes, but let's assume this is a fresh install (which is also the case here).

System configuration

Helm chart commit: c059ea2

Using all default values for minikube
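As a workaround sketch, the credentials could be supplied via a pre-created Secret instead of chart values; the chart presumably stores the manager credentials in a Secret, but the name and key names below are hypothetical and would need to match whatever the deployment actually mounts:

```yaml
# Hypothetical Secret sketch -- name and keys must match what
# pulsar-manager-deployment.yaml actually references.
apiVersion: v1
kind: Secret
metadata:
  name: pulsar-mini-pulsar-manager-secret   # hypothetical name
type: Opaque
stringData:
  PULSAR_MANAGER_ADMIN_USER: myspecialuser          # hypothetical key
  PULSAR_MANAGER_ADMIN_PASSWORD: myspecialpassword  # hypothetical key
```

Creating the Secret before helm install would sidestep the chart always falling back to the defaults, assuming the deployment reads credentials from that Secret.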

Pulsar python function not working with TLS enabled

Describe the bug
functionAuthProviderClassName: org.apache.pulsar.functions.auth.KubernetesSecretsTokenAuthProvider expects tlsTrustCertsFilePath: /pulsar/certs/ca/ca.crt to be mapped in functions_worker.yaml. Adding PF_tlsTrustCertsFilePath: /pulsar/certs/ca/ca.crt to the broker config map in the helm chart (when TLS is enabled) fixes the issue.

To Reproduce
  1. Deploy with tls.enabled: true in values.yaml.
  2. Create a Python function with pulsar-admin functions create ....
  3. The TLS handshake fails because ca.crt and the TLS config are not ingested into the function pod.

Expected behavior
If TLS is enabled on broker, python functions should run out of the box without the need to manually adapt helm chart.

Screenshots
"Downloaded successfully"
shardId=0
[2021-01-25 08:43:11 +0000] [INFO] python_instance_main.py: Starting Python instance with Namespace(client_auth_params=None, client_auth_plugin=None, cluster_name='neuron-dev01', dependency_repository=None, expected_healthcheck_interval=-1, extra_dependency_repository=None, function_details='{"tenant":"31000","namespace":"jwt","name":"f_dummy","className":"f_dummy.DummyFunction","logTopic":"31000/jwt/log_partition","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"31000/jwt/inputtopic":{}},"cleanupSubscription":true},"sink":{"topic":"31000/jwt/output","forwardSourceMessageProperty":true},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}', function_id='e0e084c9-62ef-4236-9d12-f79bf13633cd', function_version='e393f52d-3adb-4fd7-97a0-dc7aeae80c3f', hostname_verification_enabled=None, install_usercode_dependencies=True, instance_id='0', logging_config_file='/pulsar/conf/functions-logging/console_logging_config.ini', logging_directory='logs/func...
[2021-01-25 08:43:11 +0000] [INFO] log.py: Setting up producer for log topic 31000/jwt/log_partition
2021-01-25 08:43:11.555 INFO [139914923747136] ConnectionPool:85 | Created connection for pulsar+ssl://pulsar-broker:6651/
2021-01-25 08:43:11.558 INFO [139914856883968] ClientConnection:353 | [10.129.2.96:35334 -> 10.129.2.94:6651] Connected to broker
2021-01-25 08:43:11.564 ERROR [139914856883968] ClientConnection:411 | [10.129.2.96:35334 -> 10.129.2.94:6651] Handshake failed: certificate verify failed
2021-01-25 08:43:11.564 INFO [139914856883968] ClientConnection:1425 | [10.129.2.96:35334 -> 10.129.2.94:6651] Connection closed
2021-01-25 08:43:11.564 ERROR [139914856883968] ClientImpl:181 | Error Checking/Getting Partition Metadata while creating producer on persistent://31000/jwt/log_partition -- ConnectError
2021-01-25 08:43:11.564 INFO [139914856883968] ClientConnection:242 | [10.129.2.96:35334 -> 10.129.2.94:6651] Destroyed connection

Desktop (please complete the following information):

  • OKD 4.6

Additional context
Suggested solution in pulsar/templates/broker-configmap.yaml:

...
{{- if and .Values.tls.enabled .Values.tls.broker.enabled }}
brokerServicePortTls: "{{ .Values.broker.ports.pulsarssl }}"
webServicePortTls: "{{ .Values.broker.ports.https }}"
# TLS Settings
tlsCertificateFilePath: "/pulsar/certs/broker/tls.crt"
tlsKeyFilePath: "/pulsar/certs/broker/tls.key"
tlsTrustCertsFilePath: "/pulsar/certs/ca/ca.crt"
# For functions pods to also run TLS enabled
PF_tlsTrustCertsFilePath: "/pulsar/certs/ca/ca.crt"

{{- end }}
...

Add node_exporter template

Is your feature request related to a problem? Please describe.
The grafana dashboards should display metrics for nodes and containers. Use prometheus node_exporter to export kubernetes node metrics.

Describe the solution you'd like
Define node exporter template as defined at streamnative charts

Additional context
Although monitoring.node_exporter is enabled in the values.yml file, it seems no template for node_exporter is defined yet.
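A minimal node_exporter template could be a DaemonSet along these lines; the image tag, port, and label names are assumptions for illustration, and the StreamNative chart linked above should be the actual reference:

```yaml
# Sketch of a node_exporter DaemonSet template (assumed names/tag).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: {{ template "pulsar.fullname" . }}-node-exporter
spec:
  selector:
    matchLabels:
      component: node-exporter
  template:
    metadata:
      labels:
        component: node-exporter
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9100"
    spec:
      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.0.1   # assumed tag
        ports:
        - name: metrics
          containerPort: 9100              # node_exporter's default port
```

With the scrape annotations in place, the existing Prometheus config should pick the pods up without extra scrape jobs, assuming it honors prometheus.io/* annotations as the other components do.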

symmetric / create_namespace flags only work when passed as the last argument

Describe the bug
The symmetric / create_namespace flags only work when passed as the last argument.

To Reproduce
Run with a namespace which does not exist:
./scripts/pulsar/prepare_helm_release.sh --create-namespace --namespace pulsar --release pulsar

Expected behavior
Args work in any position
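A sketch of order-independent flag parsing that would fix this, using a while/case loop; the flag names mirror the script's documented options, but the variable names are illustrative, not the upstream implementation:

```shell
# Order-independent flag parsing sketch (illustrative variable names).
parse() {
  NAMESPACE="" RELEASE="" CREATE_NS=false SYMMETRIC=false
  while [ $# -gt 0 ]; do
    case "$1" in
      -n|--namespace) NAMESPACE=$2; shift 2 ;;
      -k|--release)   RELEASE=$2; shift 2 ;;
      -c|--create-namespace) CREATE_NS=true; shift ;;
      -s|--symmetric) SYMMETRIC=true; shift ;;
      *) echo "unknown option: $1" >&2; return 1 ;;
    esac
  done
}
# Boolean flags first, middle, or last -- every order yields the same result.
parse --create-namespace --namespace pulsar --release pulsar
echo "$NAMESPACE $RELEASE $CREATE_NS"
parse -n pulsar -c -k pulsar
echo "$NAMESPACE $RELEASE $CREATE_NS"
```

Because each iteration consumes exactly the flag (and its value, if any) it matched, position no longer matters.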

Pulsar manager mount private/public key for token generation

When JWT token authentication is activated, it would be nice to be able to mount either

  • {{ .Release.Name }}-token-asymmetric-key
    or
  • {{ .Release.Name }}-token-symmetric-key

Evolution

It would be nice to have this feature in pulsar-manager-deployment.yaml.
The attached image is an example.

Helm charts are managing namespace

Describe the bug
Helm charts are managing namespace.

Helm does not position itself as a namespace manager, since namespaces in kubernetes are considered a higher-level control structure that is not part of the application.

To Reproduce
Use helm for installation without prior syncing namespaces in chart values and installation cmd:

kubectl create ns pulsartest
helm upgrade --install pulsar -n pulsartest apache/pulsar
Error: namespaces "pulsar" not found

even with namespaceCreate: false

Expected behavior
Do not manage kubernetes namespaces in helm charts.

  • in case I failed to convince you, at least use a consistent variable to refer to the proper namespace, e.g. {{ .Release.Namespace }}

Desktop (please complete the following information):

  • Linux
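For the second point, a consistent pattern would be to drop the chart-managed namespace value entirely and let Helm supply it, e.g. in each template's metadata (a sketch; the resource name shown is illustrative):

```yaml
# Let Helm's -n/--namespace flag drive placement instead of values.yaml.
metadata:
  name: "{{ template "pulsar.fullname" . }}-{{ .Values.broker.component }}"
  namespace: {{ .Release.Namespace }}
```

With this, `helm upgrade --install pulsar -n pulsartest apache/pulsar` would target pulsartest without any namespace setting in values.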

Pulsar Proxy gets into odd state & won't come up

Describe the bug

[conf/proxy.conf] Applying config authenticationEnabled = true
[conf/proxy.conf] Applying config authenticationProviders = org.apache.pulsar.broker.authentication.AuthenticationProviderToken
[conf/proxy.conf] Applying config brokerClientAuthenticationParameters = file:///pulsar/tokens/proxy/token
[conf/proxy.conf] Applying config brokerClientAuthenticationPlugin = org.apache.pulsar.client.impl.auth.AuthenticationToken
[conf/proxy.conf] Applying config brokerServiceURL = pulsar://dga-detection-pulsar-broker:6650
[conf/proxy.conf] Applying config brokerWebServiceURL = http://dga-detection-pulsar-broker:8080
[conf/proxy.conf] Applying config httpNumThreads = 8
[conf/proxy.conf] Applying config servicePort = 6650
[conf/proxy.conf] Applying config statusFilePath = /pulsar/status
[conf/proxy.conf] Applying config tokenPublicKey = file:///pulsar/keys/token/public.key
[conf/proxy.conf] Applying config webServicePort = 80
[conf/pulsar_env.sh] Applying config PULSAR_GC = " -XX:+UseG1GC -XX:MaxGCPauseMillis=10 "

[conf/pulsar_env.sh] Applying config PULSAR_MEM = " -Xms1024m -Xmx4096m -XX:MaxDirectMemorySize=4096m -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.linkCapacity=1024 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC -XX:-ResizePLAB -XX:+ExitOnOutOfMemoryError -XX:+PerfDisableSharedMem "

[conf/pulsar_env.sh] Applying config PULSAR_GC = " -XX:+UseG1GC -XX:MaxGCPauseMillis=10 "

[conf/pulsar_env.sh] Applying config PULSAR_MEM = " -Xms1024m -Xmx4096m -XX:MaxDirectMemorySize=4096m -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.linkCapacity=1024 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC -XX:-ResizePLAB -XX:+ExitOnOutOfMemoryError -XX:+PerfDisableSharedMem "

03:23:26.784 [main] INFO  org.apache.pulsar.broker.authentication.AuthenticationService - org.apache.pulsar.broker.authentication.AuthenticationProviderToken has been loaded.
03:23:26.894 [main] INFO  org.eclipse.jetty.util.log - Logging initialized @1494ms to org.eclipse.jetty.util.log.Slf4jLog
03:23:27.015 [main] INFO  org.apache.pulsar.proxy.server.ProxyService - Started Pulsar Proxy at /0.0.0.0:6650
03:23:27.110 [main] INFO  org.eclipse.jetty.server.Server - jetty-9.4.20.v20190813; built: 2019-08-13T21:28:18.144Z; git: 84700530e645e812b336747464d6fbbf370c9a20; jvm 1.8.0_232-b09
03:23:27.134 [main] INFO  org.eclipse.jetty.server.session - DefaultSessionIdManager workerName=node0
03:23:27.134 [main] INFO  org.eclipse.jetty.server.session - No SessionScavenger set, using defaults
03:23:27.136 [main] INFO  org.eclipse.jetty.server.session - node0 Scavenging every 600000ms
03:23:27.144 [main] INFO  org.eclipse.jetty.server.handler.ContextHandler - Started o.e.j.s.ServletContextHandler@1ac85b0c{/metrics,null,AVAILABLE}
03:23:27.649 [main] INFO  org.eclipse.jetty.server.handler.ContextHandler - Started o.e.j.s.ServletContextHandler@3dd69f5a{/,null,AVAILABLE}
03:23:27.675 [main] INFO  org.eclipse.jetty.util.thread.ThreadPoolBudget - ReservedThreadExecutor@409986fe{s=0/1,p=0} requires 1 threads from WebExecutorThreadPool[etp425015667]@19553973{STARTED,8<=8<=8,i=8,q=0,ReservedThreadExecutor@409986fe{s=0/1,p=0}}
03:23:27.675 [main] INFO  org.eclipse.jetty.util.thread.ThreadPoolBudget - ClientSelectorManager@19b047fe{STARTING} requires 8 threads from WebExecutorThreadPool[etp425015667]@19553973{STARTED,8<=8<=8,i=8,q=0,ReservedThreadExecutor@409986fe{s=0/1,p=0}}
03:23:27.676 [main] WARN  org.eclipse.jetty.server.handler.ContextHandler.admin - unavailable
javax.servlet.ServletException: java.lang.IllegalStateException: Insufficient configured threads: required=9 < max=8 for WebExecutorThreadPool[etp425015667]@19553973{STARTED,8<=8<=8,i=8,q=0,ReservedThreadExecutor@409986fe{s=0/1,p=0}}
	at org.apache.pulsar.proxy.server.AdminProxyHandler.createHttpClient(AdminProxyHandler.java:138) ~[org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
	at org.eclipse.jetty.proxy.AbstractProxyServlet.init(AbstractProxyServlet.java:130) ~[org.eclipse.jetty-jetty-proxy-9.4.20.v20190813.jar:9.4.20.v20190813]
	at javax.servlet.GenericServlet.init(GenericServlet.java:244) ~[javax.servlet-javax.servlet-api-3.1.0.jar:3.1.0]
	at org.eclipse.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:656) ~[org.eclipse.jetty-jetty-servlet-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.servlet.ServletHolder.initialize(ServletHolder.java:421) ~[org.eclipse.jetty-jetty-servlet-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:746) ~[org.eclipse.jetty-jetty-servlet-9.4.20.v20190813.jar:9.4.20.v20190813]
	at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357) [?:1.8.0_232]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:483) [?:1.8.0_232]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) [?:1.8.0_232]
	at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:313) [?:1.8.0_232]
	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743) [?:1.8.0_232]
	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742) [?:1.8.0_232]
	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647) [?:1.8.0_232]
	at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:739) [org.eclipse.jetty-jetty-servlet-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:361) [org.eclipse.jetty-jetty-servlet-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:821) [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:276) [org.eclipse.jetty-jetty-servlet-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:106) [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:106) [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:106) [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.server.handler.StatisticsHandler.doStart(StatisticsHandler.java:255) [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.server.Server.start(Server.java:407) [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:106) [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.server.Server.doStart(Server.java:371) [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72) [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.apache.pulsar.proxy.server.WebServer.start(WebServer.java:202) [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
	at org.apache.pulsar.proxy.server.ProxyServiceStarter.<init>(ProxyServiceStarter.java:168) [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
	at org.apache.pulsar.proxy.server.ProxyServiceStarter.main(ProxyServiceStarter.java:172) [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
Caused by: java.lang.IllegalStateException: Insufficient configured threads: required=9 < max=8 for WebExecutorThreadPool[etp425015667]@19553973{STARTED,8<=8<=8,i=8,q=0,ReservedThreadExecutor@409986fe{s=0/1,p=0}}
	at org.eclipse.jetty.util.thread.ThreadPoolBudget.check(ThreadPoolBudget.java:156) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.thread.ThreadPoolBudget.leaseTo(ThreadPoolBudget.java:130) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.thread.ThreadPoolBudget.leaseFrom(ThreadPoolBudget.java:182) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.io.SelectorManager.doStart(SelectorManager.java:255) ~[org.eclipse.jetty-jetty-io-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.client.AbstractConnectorHttpClientTransport.doStart(AbstractConnectorHttpClientTransport.java:64) ~[org.eclipse.jetty-jetty-client-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.client.HttpClient.doStart(HttpClient.java:244) ~[org.eclipse.jetty-jetty-client-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72) ~[org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
	at org.apache.pulsar.proxy.server.AdminProxyHandler.createHttpClient(AdminProxyHandler.java:126) ~[org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
	... 39 more
03:23:27.685 [main] INFO  org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@3dd69f5a{/,null,UNAVAILABLE}
03:23:27.685 [main] INFO  org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@1ac85b0c{/metrics,null,UNAVAILABLE}
03:23:27.685 [main] INFO  org.eclipse.jetty.server.session - node0 Stopped scavenging
2020-05-28 03:23:27,686 [sun.misc.Launcher$AppClassLoader@18769467] error Uncaught exception in thread main: Failed to start HTTP server on ports [80]

To Reproduce
Steps to reproduce the behavior:

  1. Had a nodes running Pulsar
  2. Created another group of nodes
  3. Terminated nodes
  4. The proxy never recovered, even when I scaled the statefulset down to 0 and back up to 3

Expected behavior
The proxy recovers. It is unclear why the proxy requires 9 threads when httpNumThreads is set to 8.
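The stack trace shows Jetty's ThreadPoolBudget demanding required=9 against a pool capped at max=8 (httpNumThreads = 8): the AdminProxyHandler's embedded HttpClient needs selector threads on top of what the web pool reserves. A hedged workaround is simply raising the proxy's HTTP thread count; the exact values key below is an assumption about this chart's layout:

```yaml
# Assumed key path -- the proxy log above shows "httpNumThreads = 8" being
# applied to conf/proxy.conf, so a larger value should satisfy Jetty's budget.
proxy:
  configData:
    httpNumThreads: "16"
```

This does not explain why the default budget is insufficient, but it should let the admin servlet initialize instead of failing with "Insufficient configured threads".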

Template error in 2.6.0-2

Describe the bug
Template error when trying to install the newest verision.

Error: template: pulsar/templates/zookeeper-podmonitor.yaml:21:8: executing "pulsar/templates/zookeeper-podmonitor.yaml" at <$.Values.zookeeper.podMonitor.enabled>: nil pointer evaluating interface {}.enabled

To Reproduce
Steps to reproduce the behavior:

  1. download newest master release
  2. unpack it
  3. helm install --set initialize=true pulsar -f pulsar-2.6.0-2/values.yaml ./pulsar-2.6.0-2/

Expected behavior
A successful install of the helm chart.


Desktop (please complete the following information):

  • OS: IOS

Additional context
I had hoped this release would fix the init container errors from pulsar-broker.
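Until a fixed chart is released, defining the missing key in values.yaml avoids the nil pointer, since the error comes from the template evaluating $.Values.zookeeper.podMonitor.enabled on an undefined map:

```yaml
# Mirror the template's lookup path so the value is never nil.
zookeeper:
  podMonitor:
    enabled: false
```

The same pattern would apply to any other component whose podMonitor block is referenced by a template but absent from the shipped values.yaml.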

Error Reading Functions Worker ConfigMap

Describe the bug
When enabling functions, the broker throws the following error:

Error while trying to fetch configmap pulsar-functions-worker-config at namespace

To Reproduce
Steps to reproduce the behavior:

  1. Set Components.functions to True (true by default)
  2. View broker logs
  3. See error

Expected behavior
I think the role needs to be bound to the broker service account. It currently creates another service account named pulsar-functions-worker. Not sure if that SA is used by the functions, but the pulsar-broker-acct does not have access to the function worker config map.

What is the pulsar-functions-worker used for? Is it used by the functions? If so we can add read access to the pulsar-broker-acct role that is bound so it can read the ConfigMap.
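If granting the broker read access is the right fix, the rule would look roughly like this; the Role name is hypothetical, and it would still need a RoleBinding to the pulsar-broker-acct service account:

```yaml
# Hypothetical Role sketch granting ConfigMap read access to the broker SA.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pulsar-broker-configmap-reader   # hypothetical name
rules:
- apiGroups: [""]            # core API group (ConfigMaps live here)
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]
```

Binding this Role to pulsar-broker-acct would let the broker fetch pulsar-functions-worker-config regardless of which service account the function pods themselves use.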

Pods of broker/proxy/recovery init failed when enabled tls

Describe the bug
Pods of broker/proxy/recovery init failed when enabled tls

To Reproduce
Install commands:

git clone https://github.com/apache/pulsar-helm-chart.git ./
cd pulsar-helm-chart/

./scripts/cert-manager/install-cert-manager.sh
./scripts/pulsar/prepare_helm_release.sh -c -n pulsar -k pulsar

helm upgrade --install pulsar charts/pulsar \
    --set namespace=pulsar --set volumes.local_storage=true --set certs.internal_issuer.enabled=true \
    --set tls.enabled=true --set tls.proxy.enabled=true  --set tls.broker.enabled=true  --set tls.bookie.enabled=true \
    --set tls.zookeeper.enabled=true  --set tls.autorecovery.enabled=true  --set tls.toolset.enabled=true \
    --set auth.authentication.enabled=true --set auth.authorization.enabled=true -n pulsar

Unexpected behavior

Pods of broker/proxy/recovery are stuck in the Init status

kubectl get pods -n pulsar
NAME                                     READY   STATUS      RESTARTS   AGE
pulsar-bookie-0                          1/1     Running     0          46m
pulsar-bookie-1                          1/1     Running     0          46m
pulsar-bookie-2                          1/1     Running     0          46m
pulsar-bookie-3                          1/1     Running     0          46m
pulsar-bookie-init-l9zdv                 0/1     Completed   0          46m
pulsar-broker-0                          0/1     Init:0/2    0          46m
pulsar-broker-1                          0/1     Init:0/2    0          46m
pulsar-broker-2                          0/1     Init:0/2    0          46m
pulsar-grafana-5ffd75b49d-g658b          1/1     Running     0          46m
pulsar-prometheus-5f957bf77-6mj2z        1/1     Running     0          46m
pulsar-proxy-0                           0/1     Init:1/2    0          46m
pulsar-proxy-1                           0/1     Init:1/2    0          46m
pulsar-proxy-2                           0/1     Init:1/2    0          46m
pulsar-pulsar-init-mqsvt                 1/1     Running     0          46m
pulsar-pulsar-manager-767d5f5766-khpr4   1/1     Running     0          46m
pulsar-recovery-0                        0/1     Init:0/1    0          46m
pulsar-toolset-0                         1/1     Running     0          46m
pulsar-zookeeper-0                       1/1     Running     0          46m
pulsar-zookeeper-1                       1/1     Running     0          46m
pulsar-zookeeper-2                       1/1     Running     0          45m

Checking the file /pulsar/certs/broker/tls.crt failed when the init container started:

kubectl logs pulsar-broker-0 -c wait-zookeeper-ready -n pulsar | head -8
processing /pulsar/certs/broker/tls.crt : len = 0
/pulsar/certs/broker/tls.crt is empty
JMX enabled by default
Connecting to pulsar-zookeeper:2281
...

When I checked, the TLS files had been generated:

kubectl exec -it  pulsar-broker-0 -c wait-zookeeper-ready -n pulsar /bin/bash
ls -al /pulsar/certs/broker/tls.crt
lrwxrwxrwx 1 root root 14 Jun 24 10:06 /pulsar/certs/broker/tls.crt -> ..data/tls.crt

If I re-run the following command:

/pulsar/keytool/keytool.sh broker ${HOSTNAME}.pulsar-broker.pulsar.svc.cluster.local true;

the init container exits successfully and the pod starts running:

kubectl get pods -n pulsar | grep 'pulsar-broker-0'
pulsar-broker-0                          1/1     Running     0          71m
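The symptoms suggest a race: the secret symlink exists but tls.crt is still empty (len = 0) when keytool.sh first runs. A workaround sketch is to wait for the mounted file to become non-empty before invoking keytool.sh; this is an illustration, not the upstream fix:

```shell
# Wait until a mounted cert file is non-empty (bounded retries), then
# succeed/fail based on the final state.
wait_for_file() {
  f=$1
  tries=${2:-30}
  i=0
  while [ ! -s "$f" ] && [ "$i" -lt "$tries" ]; do
    sleep 1
    i=$((i+1))
  done
  [ -s "$f" ]   # succeed only if the file is now non-empty
}
# In the init container this would become, e.g.:
#   wait_for_file /pulsar/certs/broker/tls.crt && \
#     /pulsar/keytool/keytool.sh broker ${HOSTNAME}.pulsar-broker.pulsar.svc.cluster.local true
# Demo with a temporary file standing in for the mounted cert:
tmp=$(mktemp)
printf 'dummy-cert' > "$tmp"
wait_for_file "$tmp" && echo "cert ready"
```

Guarding keytool.sh this way matches the observation above that simply re-running it after the mount settles makes the init container succeed.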

Restarting pods after config map change

Is your feature request related to a problem? Please describe.
Pods do not restart when their config maps change after the values.yaml file is updated, so they must be restarted manually in order to pick up the new values from the config map.

Describe the solution you'd like
In my case, pods need to restart after the corresponding config map changes. One solution a Pulsar developer suggested to me is the checksum annotation described at https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments, which has to be added to every statefulset in the templates.
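The pattern from the linked Helm documentation, adapted here with a hypothetical configmap filename, puts a hash of the rendered configmap into the pod template so any configmap change alters the pod spec and triggers a rollout:

```yaml
# checksum changes whenever the referenced configmap template renders
# differently, forcing the StatefulSet/Deployment to roll its pods.
spec:
  template:
    metadata:
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/broker-configmap.yaml") . | sha256sum }}
```

Each workload would reference its own configmap template; the filename above is an assumption for illustration.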

Describe alternatives you've considered
As I mentioned alternative is to delete pods manually, or some other solution that I am not aware now.

Additional context
So is this conclusion correct, do pods need to be restarted after config maps change?

Bookie startup fails with JDK11 based docker image and default values.yaml JVM settings

Describe the bug

Starting the bookie fails with the following errors when a JDK11-based docker image is used (built from the apache/pulsar master branch):

Unrecognized VM option 'PrintGCTimeStamps'
Error: Could not create the Java Virtual Machine.

and

Unrecognized VM option 'PrintGCApplicationStoppedTime'
Error: Could not create the Java Virtual Machine.

and

Unrecognized VM option 'PrintHeapAtGC'
Error: Could not create the Java Virtual Machine.

and

Unrecognized VM option 'G1LogLevel=finest'
Error: Could not create the Java Virtual Machine.

and

[0.002s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:/pulsar/logs/bookie-gc.log instead.
[0.006s][warning][gc] -XX:+PrintGCDetails is deprecated. Will use -Xlog:gc* instead.


To Reproduce

  1. build docker images from apache/pulsar master branch
  2. publish to some docker repository
  3. reference the images in a custom values.yaml
  4. deploy pulsar helm chart

Expected behavior

Pulsar helm chart should work with JDK11 base images.
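A JDK11-compatible replacement for the failing flags might look like this in values.yaml; the BOOKIE_GC key is an assumption about the chart's config layout, and the unified-logging -Xlog option replaces the removed -XX:+Print*/-Xloggc options:

```yaml
# Assumed key path; the -Xlog line supersedes PrintGCTimeStamps,
# PrintGCApplicationStoppedTime, PrintHeapAtGC, G1LogLevel and -Xloggc,
# all of which were removed after JDK 8.
bookkeeper:
  configData:
    BOOKIE_GC: >
      -XX:+UseG1GC -XX:MaxGCPauseMillis=10
      -Xlog:gc*:file=/pulsar/logs/bookie-gc.log:time,uptime:filecount=5,filesize=64m
```

The broker and proxy PULSAR_GC defaults would need the same treatment for a fully JDK11-clean deployment.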

Helm chart repo not tracking latest commits

Describe the bug
The pulsar Helm chart repository is not tracking the last 2 commits from the master branch (667e634, c2f6728).

To Reproduce
Steps to reproduce the behaviour:

  1. Go to 'https://pulsar.apache.org/charts/index.yaml'
  2. See that the most up-to-date version is:
  - apiVersion: v1
    appVersion: 2.7.0
    created: "2021-01-08T05:27:15.548211942Z"
    description: Apache Pulsar Helm chart for Kubernetes
    digest: bdbb841e75b5ed1fc9e65986def399690270c00c9282e0d616ddfa66faf2c558
    home: https://pulsar.apache.org
    icon: http://pulsar.apache.org/img/pulsar.svg
    maintainers:
    - email: [email protected]
      name: The Apache Pulsar Team
    name: pulsar
    sources:
    - https://github.com/apache/pulsar
    urls:
    - https://github.com/apache/pulsar-helm-chart/releases/download/pulsar-2.7.0-1/pulsar-2.7.0-1.tgz
    version: 2.7.0-1

Alternatively, you can do helm pull apache/pulsar --version=2.7.0-1 --untar=true

Notice that the following are missing:

Expected behaviour
The latest chart version should include these commits.

annotations/`checksum/config` indentation issues

Describe the bug
With #73, annotations were introduced that would force a restart when the related configmap changes. This, however, did not always take into account the correct indentation of the annotation.

To Reproduce
See grafana-deployment, prometheus-deployment, pulsar-manager-deployment, toolset-statefulset and zookeeper-statefulset.

Expected behavior
Indentation should be correct, such that checksum/config is a field of annotations instead of metadata (where it is ignored or fails linting)
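Correctly indented, checksum/config sits two levels under metadata, inside annotations (the configmap filename here is illustrative):

```yaml
# checksum/config must be a child of annotations, not a sibling of it.
metadata:
  annotations:
    checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
```

When it lands directly under metadata instead, the API server rejects it as an unknown field or linting fails, and the intended restart-on-change behavior is silently lost.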

Bug in podAntiAffinity labels

Describe the bug
I've noticed that multiple ZooKeeper pods started on the same Kubernetes node. It looks like an issue with the labels used in the podAntiAffinity settings.
Example using zookeeper

koziorow@pl1lxl-108450:~/projects/LNS-DevEnv/repo/batch-service-infra/terraform-apache-pulsar$ kubectl describe pod pulsar-zookeeper-0
Name:         pulsar-zookeeper-0
Namespace:    default
Priority:     0
Node:         aks-default-12797261-vmss000001/10.240.0.115
Start Time:   Mon, 13 Jul 2020 12:22:02 +0200
Labels:       app=pulsar
              cluster=pulsar
              component=zookeeper
              controller-revision-hash=pulsar-zookeeper-7d87bf67f9
              release=pulsar
              statefulset.kubernetes.io/pod-name=pulsar-zookeeper-0

In chart:
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: "app"
        operator: In
        values:
        - "{{ template "pulsar.name" . }}-{{ .Values.zookeeper.component }}"
      - key: "release"
        operator: In
        values:
        - {{ .Release.Name }}
      - key: "component"
        operator: In
        values:
        - {{ .Values.zookeeper.component }}
    topologyKey: "kubernetes.io/hostname"

Expected behavior
ZooKeeper pods should start on different nodes. The problem also exists in other components.
BTW, maybe it would be useful to use the recommended Helm labels? https://helm.sh/docs/chart_best_practices/labels/
Also, I'm not sure, but there could be a copy-paste issue in https://github.com/apache/pulsar-helm-chart/blob/master/charts/pulsar/templates/autorecovery-statefulset.yaml as well, where the bookkeeper component is referenced:

              - key: "component"
                operator: In
                values:
                - {{ .Values.bookkeeper.component }}
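A corrected selector would match the labels the pods actually carry (app=pulsar, component=zookeeper) rather than a concatenated app value that never matches. The following is a sketch, assuming the chart's existing label helpers; only the app match expression changes:

podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: "app"
        operator: In
        values:
        - "{{ template "pulsar.name" . }}"   # chart name only, no component suffix
      - key: "release"
        operator: In
        values:
        - {{ .Release.Name }}
      - key: "component"
        operator: In
        values:
        - {{ .Values.zookeeper.component }}
    topologyKey: "kubernetes.io/hostname"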

Updating Helm chart to support GCP cert provider for TLS

Linking from Apache/Pulsar repo: apache/pulsar#8457
(I discovered that the issue probably should be filed here.)

In most production environments, using self-signed certs is not acceptable for TLS. Certs are expected to be backed by a CA for security reasons. It appears that the Pulsar Helm charts currently only support self-signed certificates.
The doc https://pulsar.apache.org/docs/en/helm-overview/ seems to suggest that the Helm chart also supports Let's Encrypt, but the Helm chart template appears to only accept "selfsigning" as a parameter: https://github.com/apache/pulsar-helm-chart/blob/master/charts/pulsar/templates/tls-cert-internal-issuer.yaml#L21

It would be helpful to also support GCP as a cert provider for TLS. This article has some information on using cert-manager with GCP: https://cert-manager.io/docs/configuration/acme/dns01/google/

Helm chart dependencies do not follow best practices

Is your feature request related to a problem? Please describe.
The chart does not follow the common guideline: templates for dependent services are copied into the main chart instead of being declared as Helm dependencies.

Describe the solution you'd like
Move services like prometheus, alert-manager, and zookeeper to chart dependencies. That would simplify Helm chart support and usage.

Describe alternatives you've considered
I do not see other options, since the issue is that the chart structure is inconsistent and does not follow best practices.

Additional context
As a reference for a chart with many dependencies, the GitLab Helm chart could be used.

Add The Ability To Set Log Level

Is your feature request related to a problem? Please describe.
Setting the PULSAR_LOG_ROOT_LEVEL and PULSAR_ROOT_LOGGER environment variables via configData does not change the log level, since the log configuration is read from conf/log4j2.yaml.

Describe the solution you'd like
Ideally we would be able to use only environment variables, but that will not be possible until Pulsar no longer depends on the log4j2.yaml file. To get it working in its current state, we could create a ConfigMap with the contents of the log file, then mount it as a volume and set PULSAR_LOG_CONF to point to our mounted file instead of conf/log4j2.yaml.
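The approach above could be sketched as follows. This is an untested illustration: the ConfigMap name, mount path, and the idea of wiring it into the broker StatefulSet are all assumptions, not existing chart features:

apiVersion: v1
kind: ConfigMap
metadata:
  name: pulsar-log4j2          # hypothetical name
data:
  log4j2.yaml: |
    # full contents of conf/log4j2.yaml, edited to the desired root level

# In the component's StatefulSet template (sketch):
#   env:
#   - name: PULSAR_LOG_CONF
#     value: /pulsar/conf-overrides/log4j2.yaml
#   volumeMounts:
#   - name: log4j2
#     mountPath: /pulsar/conf-overrides
#   volumes:
#   - name: log4j2
#     configMap:
#       name: pulsar-log4j2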

Pulsar proxy fails to start with pulsar Docker image that uses non-root user

Problem

There's a permission issue in pulsar-proxy when using a Docker image that uses a non-root user.
The 2.8.0-SNAPSHOT Pulsar docker images use a non-root user. This change was made in apache/pulsar#8796 .

The default configuration uses port 80, and now that the default user is the non-root "pulsar" user, the process cannot bind to port 80.

This is the error log in pulsar-proxy-0 pod:

14:38:21.477 [main] ERROR org.apache.pulsar.proxy.server.ProxyServiceStarter - Failed to start pulsar proxy service. error msg Failed to start HTTP server on ports [80]
java.io.IOException: Failed to start HTTP server on ports [80]
at org.apache.pulsar.proxy.server.WebServer.start(WebServer.java:246) ~[org.apache.pulsar-pulsar-proxy-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
at org.apache.pulsar.proxy.server.ProxyServiceStarter.<init>(ProxyServiceStarter.java:177) [org.apache.pulsar-pulsar-proxy-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
at org.apache.pulsar.proxy.server.ProxyServiceStarter.main(ProxyServiceStarter.java:186) [org.apache.pulsar-pulsar-proxy-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
Caused by: java.net.SocketException: Permission denied
at sun.nio.ch.Net.bind0(Native Method) ~[?:1.8.0_282]
at sun.nio.ch.Net.bind(Net.java:461) ~[?:1.8.0_282]
at sun.nio.ch.Net.bind(Net.java:453) ~[?:1.8.0_282]
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:222) ~[?:1.8.0_282]
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85) ~[?:1.8.0_282]
at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:345) ~[org.eclipse.jetty-jetty-server-9.4.35.v20201120.jar:9.4.35.v20201120]
at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:310) ~[org.eclipse.jetty-jetty-server-9.4.35.v20201120.jar:9.4.35.v20201120]
at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80) ~[org.eclipse.jetty-jetty-server-9.4.35.v20201120.jar:9.4.35.v20201120]
at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:234) ~[org.eclipse.jetty-jetty-server-9.4.35.v20201120.jar:9.4.35.v20201120]
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[org.eclipse.jetty-jetty-util-9.4.35.v20201120.jar:9.4.35.v20201120]
at org.eclipse.jetty.server.Server.doStart(Server.java:401) ~[org.eclipse.jetty-jetty-server-9.4.35.v20201120.jar:9.4.35.v20201120]
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[org.eclipse.jetty-jetty-util-9.4.35.v20201120.jar:9.4.35.v20201120]
at org.apache.pulsar.proxy.server.WebServer.start(WebServer.java:224) ~[org.apache.pulsar-pulsar-proxy-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
... 2 more
2021-03-19 02:38:21,480 [sun.misc.Launcher$AppClassLoader@7ef20235] error Uncaught exception in thread main: java.io.IOException: Failed to start HTTP server on ports [80]

The problem goes away after changing the port to 8080 in values.yaml:

proxy:
  ports:
    http: 8080

Please add support for Tiered Storage in the Helm Chart

Is your feature request related to a problem? Please describe.
We'd love to see easier integration with Pulsar Tiered Storage including S3 bucket etc. and AWS credentials

Describe the solution you'd like
Please provide standard template / values for S3 integration (with appropriate AWS credentials / STS role)

Describe alternatives you've considered
From Addison in Slack https://apache-pulsar.slack.com/archives/CJ0FMGHSM/p1602085159071600

To enable it with the official charts you can just set the following properties:
    # enable tiered storage
    managedLedgerOffloadDriver: aws-s3
    s3ManagedLedgerOffloadBucket: <bucket name>
    s3ManagedLedgerOffloadRegion: <region>
    managedLedgerOffloadAutoTriggerSizeThresholdBytes: "262144000"

Additional context
It appears that Kafkaesque Helm chart currently supports it https://github.com/kafkaesque-io/pulsar-helm-chart#tiered-storage but we'd rather be on the "official" variant
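The properties above would typically be placed under the broker's configData in values.yaml. This is a sketch only: the exact key placement is an assumption, the bucket and region values are hypothetical, and AWS credentials would still need to come from an IAM role or a mounted secret:

broker:
  configData:
    managedLedgerOffloadDriver: "aws-s3"
    s3ManagedLedgerOffloadBucket: "my-offload-bucket"   # hypothetical bucket name
    s3ManagedLedgerOffloadRegion: "us-west-2"           # hypothetical region
    managedLedgerOffloadAutoTriggerSizeThresholdBytes: "262144000"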

Pulsar fails to properly recover & deletes all ledgers

Describe & Reproduce
Steps to reproduce the behavior:

  1. Create another node group
  2. Attempt a graceful node transfer
  3. Multiple nodes terminated, impacting large parts of Pulsar and probably breaking the pod disruption budget
  4. Bring Pulsar back up; all storage mounts, but while other bookies come back online the cluster decides those ledgers are abandoned and apparently marks them for garbage collection

Expected behavior
When the cluster goes into an unstable state, it should not take it upon itself to begin deleting ledgers.

Here are the logs of the bookie that looked like it was up during the unstableness
delete.txt

pulsar deploy in k8s zookeeper error

Describe the bug
A Pulsar 2.7.1 cluster was deployed on Kubernetes using the officially provided Helm chart, but the ZooKeeper cluster could not complete initialization. Because of the ZooKeeper problem, pulsar-proxy, pulsar-broker, pulsar-bookie, and pulsar-recovery are all stuck in a waiting state and cannot start.

image

Updating existing pulsar helm setup results in an error

Describe the bug
Updating an existing pulsar Helm setup (e.g. changing the replica count) results in an error.

To Reproduce

helm upgrade --install pulsar -n pulsar -f values.yaml apache/pulsar

Error: UPGRADE FAILED: cannot patch "pulsar-bookie-init" with kind Job: Job.batch "pulsar-bookie-init" is invalid: spec.template:
... : field is immutable

Expected behavior
Pulsar setup is updated


Desktop (please complete the following information):

  • OS: Linux

Additional context
Jobs should probably be handled with Helm hooks, e.g.:

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
  labels:
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
  annotations:
    helm.sh/hook: post-install
    helm.sh/hook-delete-policy: hook-succeeded
spec: ...

add variable to conf/zookeeper.conf

Describe the bug
By default, Zookeeper listens on the pod IP address for communication between servers. Istio and other service meshes require 0.0.0.0 to be the address to listen on.
There is a configuration parameter that can be used to change this default behavior: quorumListenOnAllIPs. This option allows Zookeeper to listen on all addresses. Set this parameter to true by using the following command where $ZK_CONFIG_FILE is your Zookeeper configuration file.

ref: https://istio.io/latest/faq/applications/#zookeeper

As a workaround:
I had to disable recovery, then install pulsar with Helm using the default settings.
I exported the ZooKeeper StatefulSet with: kubectl get statefulset pulsar-zookeeper -o yaml > statefulset-zk.yaml,
added echo "quorumListenOnAllIPs=true" >> conf/zookeeper.conf; to the container args, applied the file to upgrade the StatefulSet, then deleted the 3 ZooKeeper pods to restart them.

A second way is to pull the chart and modify the zookeeper-statefulset.yaml template of pulsar.

I wonder whether there is another way to add quorumListenOnAllIPs to the ZooKeeper config file when installing with Helm?
I tried to use a ConfigMap, but it does not seem to work.
Thanks
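One possibility, hedged and untested: if the chart pipes zookeeper.configData through Pulsar's apply-config-from-env.py (as it does for other components), a key with the PULSAR_PREFIX_ prefix would be appended to the config file, which might make this work without editing templates:

zookeeper:
  configData:
    PULSAR_PREFIX_quorumListenOnAllIPs: "true"

Whether the ZooKeeper StatefulSet actually applies configData this way is an assumption that needs verifying against the chart version in use.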

Missing arguments in prepare_helm_release.sh script

Expected behavior

As of the official documentation for 2.6.0, prepare_helm_release.sh should accept the following options:

  • control-center-admin
  • control-center-password

Actual behavior

The result when running the script is:

unknown option: --control-center-admin

Steps to reproduce

System configuration

Pulsar version: 2.6.0

Grafana Ingress requires deprecated values

Hi all, a small problem with the chart and grafana ingress.

Describe the bug
To be able to deploy the ingress rules of the Grafana component, we need to set a deprecated option to true.

To Reproduce
Steps to reproduce the behavior:

  1. Go to values.yaml (L117): monitoring: false has to be set to true to have the Grafana ingress deployed, but this value is part of the extra section, which has been deprecated.
  2. Check grafana-ingress.yaml (L20, just after the licence): it checks .Values.extra.monitoring

Expected behavior
Do not require deprecated extra values to deploy the ingress for Grafana
(i.e. remove L20 and L50 of grafana-ingress.yaml).

I can do the changes and a PR if needed.

Param pulsar_manager.admin.user/pulsar_manager.admin.password used in the wrong place

Describe the bug
I am trying to set pulsar_manager.admin.user/pulsar_manager.admin.password as the login for the pulsar-manager UI, but they are actually used as the account/password for the external PostgreSQL instance.

To Reproduce
helm upgrade --install pulsar charts/pulsar --set pulsar_manager.admin.user=pulsarpulsar --set pulsar_manager.admin.password=pulsarpulsar -n pulsar

org.postgresql.util.PSQLException: FATAL: role "pulsarpulsar" does not exist
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440) ~[postgresql-42.2.5.jar:42.2.5]
	at org.postgresql.core.v3.QueryExecutorImpl.readStartupMessages(QueryExecutorImpl.java:2559) ~[postgresql-42.2.5.jar:42.2.5]
	at org.postgresql.core.v3.QueryExecutorImpl.<init>(QueryExecutorImpl.java:133) ~[postgresql-42.2.5.jar:42.2.5]
	at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:250) ~[postgresql-42.2.5.jar:42.2.5]
	at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49) ~[postgresql-42.2.5.jar:42.2.5]
	at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:195) ~[postgresql-42.2.5.jar:42.2.5]
	at org.postgresql.Driver.makeConnection(Driver.java:454) ~[postgresql-42.2.5.jar:42.2.5]

TLS Authentication in Kubernetes, Pulsar 2.6.1 - Broker crash loop on startup due to 401 in WorkerService.start(..)

Copying from the Apache/Pulsar Github issue (apache/pulsar#8536):

Describe the bug
After configuring TLS Authentication in Pulsar 2.6.1 with this helm chart: https://github.com/devinbost/pulsar-helm-chart/tree/tls-auth
the broker gets stuck in a restart loop due to the WorkerService crashing with:

21:24:45.025 [pulsar-web-48-8] WARN org.apache.pulsar.broker.web.AuthenticationFilter - [10.244.0.9] Failed to authenticate HTTP request: Client unable to authenticate with TLS certificate
21:24:45.042 [pulsar-web-48-8] INFO org.eclipse.jetty.server.RequestLog - 10.244.0.9 - - [17/Nov/2020:21:24:44 +0000] "PUT /admin/v2/persistent/public/functions/assignments HTTP/1.1" 401 0 "-" "Pulsar-Java-v2.6.1" 63
21:24:45.042 [pulsar-web-48-1] INFO org.eclipse.jetty.server.RequestLog - 10.244.0.7 - - [17/Nov/2020:21:24:44 +0000] "GET /metrics HTTP/1.1" 302 0 "-" "Prometheus/2.17.2" 63
21:24:45.098 [AsyncHttpClient-64-1] WARN org.apache.pulsar.client.admin.internal.BaseResource - [http://pulsar-ci-broker-0.pulsar-ci-broker.pulsar.svc.cluster.local:8080/admin/v2/persistent/public/functions/assignments] Failed to perform http put request: javax.ws.rs.NotAuthorizedException: HTTP 401 Unauthorized
21:24:45.115 [main] ERROR org.apache.pulsar.functions.worker.WorkerService - Error Starting up in worker
org.apache.pulsar.client.admin.PulsarAdminException$NotAuthorizedException: HTTP 401 Unauthorized

during the WorkerService.start(..) method execution.

Edit:
After debugging, the issue is that the data is still unreadable after the decrypt step, so something is misconfigured with the certs.

To Reproduce
Steps to reproduce the behavior:

  1. Clone the tls-auth branch of my fork of the Pulsar helm chart by running:
     git clone https://github.com/devinbost/pulsar-helm-chart.git
     git checkout tls-auth

  2. Start minikube with an appropriate number of CPUs:
     minikube start --memory=8192 --cpus=6 --cni=bridge

  3. Run the following commands to set up the Kubernetes environment, tokens, certs, and keys:
     ./scripts/cert-manager/install-cert-manager.sh
     ./scripts/pulsar/prepare_helm_release.sh -n pulsar -k pulsar-ci -c --pulsar-superusers superadmin,proxy-admin,broker-admin,client-admin,admin

  4. Install the local helm chart with the values file specified:
     helm install --values examples/values-minikube-with-tls-and-jwt.yaml pulsar-ci ./charts/pulsar/

  5. After waiting for a time, get logs from the broker:
     kubectl -n pulsar logs pulsar-ci-broker-0

The logs should demonstrate the problem.

Expected behavior
Decryption should be happening correctly, resulting in the correct auth headers passing when we execute a PUT on the function/assignments topic during broker start.

Environment

  • minikube v1.14.2 on Darwin 10.15.7
  • Kubernetes v1.19.2 on Docker 19.03.8 ...
  • Enabled addons: storage-provisioner, default-storageclass
  • kubectl is configured to use "minikube"

Code involved (edited)

When we create the brokerAdmin client, we use the pulsarWebServiceUrl: https://github.com/apache/pulsar/blob/master/pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/WorkerService.java#L146

The first PUT on the function assignment topic uses the brokerAdmin client here: https://github.com/apache/pulsar/blob/master/pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/WorkerService.java#L169

There must be a cert misconfiguration issue.
