nuodb / nuodb-helm-charts
NuoDB Helm Chart for Kubernetes & OpenShift
License: BSD 3-Clause "New" or "Revised" License
UPGRADE FAILED: StatefulSet.apps "admin-o4z7ys-nuodb-cluster0" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden
Hi,
When deploying nuodb helm admin/database charts we override/set (among other things):
admin.fullnameOverride=nuodbsvc-admin
database.fullnameOverride=nuodbsvc-inventory-database
admin.domain=nuodbsvc
We would like the pods to be prefixed with the admin domain name in order to segregate pods that belong to the same admin domain by prefix. This makes them easy to spot among other Kubernetes pods that don't belong to NuoDB, and integrates well with the way we deploy NuoDB in our system.
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
nuodbsvc-admin-0 1/1 Running 0 1h
nuodbsvc-inventory-database-sm-0 1/1 Running 0 2m
nuodbsvc-inventory-database-te-6d476c6c6-5fltd 1/1 Running 0 2m
nuodbsvc-job-lb-policy-nearest-nzdmd 0/1 Completed 0 1h
Currently:
The SM statefulset name is defined as follows:
name: sm-{{ template "database.fullname" . }}
The TE deployment name:
name: te-{{ template "database.fullname" . }}
Would it be okay to change the above to:
name: {{ template "database.fullname" . }}-sm
name: {{ template "database.fullname" . }}-te
respectively, in order to get the desired pod names? Opinions?
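As a sketch, the proposed change would look like this in the two database-chart templates (the file paths are assumptions; the template helper names are copied from the snippets above):

```yaml
# statefulset.yaml in the database chart (path assumed)
metadata:
  name: {{ template "database.fullname" . }}-sm

# deployment.yaml in the database chart (path assumed)
metadata:
  name: {{ template "database.fullname" . }}-te
```

With database.fullnameOverride=nuodbsvc-inventory-database, this would yield pod names like nuodbsvc-inventory-database-sm-0, matching the kubectl output shown above.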
Thanks
NuoDB version: 4.0.7
NuoDB Helm charts: 2.4.0
Failure Description: NuoDB admin KAA module does not start
Log:
2020-08-27T14:48:33.553+0000 WARN io.fabric8.kubernetes.client.informers.cache.Controller informer-controller-DaemonSet Reflector list-watching job exiting because the thread-pool is shutting down
java.util.concurrent.RejectedExecutionException: Error while starting ReflectorRunnable watch
at io.fabric8.kubernetes.client.informers.cache.Reflector.listAndWatch(Reflector.java:85)
at io.fabric8.kubernetes.client.informers.cache.Controller.run(Controller.java:112)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.RejectedExecutionException: Error while doing ReflectorRunnable list
at io.fabric8.kubernetes.client.informers.cache.Reflector.getList(Reflector.java:73)
at io.fabric8.kubernetes.client.informers.cache.Reflector.reListAndSync(Reflector.java:94)
at io.fabric8.kubernetes.client.informers.cache.Reflector.listAndWatch(Reflector.java:80)
... 2 common frames omitted
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://10.96.0.1/apis/apps/v1/namespaces/testadminscaledown-riqkzt/daemonsets. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. daemonsets.apps is forbidden: User "system:serviceaccount:testadminscaledown-riqkzt:nuodb" cannot list resource "daemonsets" in API group "apps" in the namespace "testadminscaledown-riqkzt".
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:505)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:412)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.listRequestHelper(BaseOperation.java:166)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:640)
at io.fabric8.kubernetes.client.informers.SharedInformerFactory$1.list(SharedInformerFactory.java:161)
at io.fabric8.kubernetes.client.informers.SharedInformerFactory$1.list(SharedInformerFactory.java:154)
at io.fabric8.kubernetes.client.informers.cache.Reflector.getList(Reflector.java:67)
... 4 common frames omitted
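The error above is an RBAC failure: the nuodb service account is not allowed to list daemonsets in the apps API group. A hedged sketch of the missing grant (resource, verb, service account, and namespace are copied from the log message; the role name is hypothetical):

```yaml
# Hypothetical Role/RoleBinding sketch granting the access the KAA module
# needs; the actual charts may define this differently.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: nuodb-kaa            # hypothetical name
  namespace: testadminscaledown-riqkzt
rules:
  - apiGroups: ["apps"]
    resources: ["daemonsets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: nuodb-kaa            # hypothetical name
  namespace: testadminscaledown-riqkzt
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nuodb-kaa
subjects:
  - kind: ServiceAccount
    name: nuodb
    namespace: testadminscaledown-riqkzt
```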
The database chart should have values options to set topologySpreadConstraints, which is a better mechanism for managing the spread of pods across availability zones.
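A hedged sketch of what such a value could look like in the database chart's values.yaml (the key name, placement, and label selector are assumptions, not the chart's actual schema; the constraint fields mirror the standard Kubernetes pod spec):

```yaml
database:
  te:
    topologySpreadConstraints:        # assumed key
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: nuodb                # assumed label
```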
Currently, the YCSB chart hard-codes the number of threads to 2. We want to make it customizable.
We should enable users of the YCSB demo/workload to choose either Read Committed or Consistent Read (the default) via a Helm parameter.
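Both requests could surface in the YCSB chart's values.yaml along these lines (the key names and values are hypothetical, for illustration only):

```yaml
ycsb:
  threads: 2                       # hypothetical key; currently hard-coded to 2
  isolationLevel: consistent_read  # hypothetical key; or read_committed
```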
Describe the bug
The bootstrapServers label and nuodb.com/bootstrap-servers annotation, which allow an initial membership containing an arbitrary number of members to be defined, break upgrades to 4.0.7 with 2.4.0 of the nuodb-helm-charts from deployments prior to 2.4.0 of nuodb-helm-charts, or from NuoDB releases prior to 4.0.7.
Context
NuoDB Version: 4.0.6 to 4.0.7
Helm Charts Version: 2.4.0
Kubernetes Version: any
Environment: any
To Reproduce
Steps to reproduce the behavior:
Expected behavior
NuoDB upgrades as expected
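For context, a sketch of how the initial membership might be declared via chart values (the bootstrapServers name is taken from the label mentioned above; its exact placement in values.yaml is an assumption):

```yaml
admin:
  replicas: 3
  bootstrapServers: 3   # assumed placement; size of the initial Raft membership
```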
This property in the values.yaml of the database chart is misleading, as it only applies to SMs.
A comment noting that it defines the archive volume would also be useful; IMHO it is not obvious what it is for, given its current location.
This seems to be the reason why TestUpgradeHelmFullDB in the continuous tests is broken. None of the past versions of the NuoDB Helm charts are available in the NuoDB repository:
$ helm repo add nuodb http://storage.googleapis.com/nuodb-charts
"nuodb" has been added to your repositories
$ helm search repo nuodb
NAME CHART VERSION APP VERSION DESCRIPTION
nuodb/admin 3.0.0 4.0.0 Administration tier for NuoDB.
nuodb/database 3.0.0 4.0.0 NuoDB distributed SQL database.
nuodb/restore 3.0.0 4.0.0 On-demand restore a NuoDB SQL database.
nuodb/storage-class 3.0.0 4.0.0 Storage classes for NuoDB.
nuodb/transparent-hugepage 3.0.0 4.0.0 Disable disables transparent_hugepage on Linux ...
$ helm install nuodb/admin --version=2.4.1 --generate-name
Error: failed to download "nuodb/admin" (hint: running `helm repo update` may help)
$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "nuodb" chart repository
Update Complete. ⎈ Happy Helming!⎈
$ helm install nuodb/admin --version=2.4.1 --generate-name
Error: failed to download "nuodb/admin" (hint: running `helm repo update` may help)
This paragraph is misleading:
If you already have an OpenShift project you want to use, select it as follows:
oc project ${TILLER_NAMESPACE}
I think it should be:
If you already have an OpenShift project you want to use, select it as follows:
export TILLER_NAMESPACE=<your-existing-namespace>
oc project ${TILLER_NAMESPACE}
This is a request for a standalone howto document for the Vault integration.
This section assumes you have OpenShift and the oc utility. You do not need oc to use Helm, and we should show how to set up and remove Tiller on generic Kubernetes, which doesn't have Projects either.
When you delete the disk (PV/PVC) holding the RAFT log of admin-0 in a multi-admin domain, the newly recreated admin pod will not join the existing RAFT domain. Instead it creates a fresh domain consisting only of itself.
The admin helm chart creates two containers. One disables THP; the other runs watch on the THP settings file every 10 minutes.
Why is the watch container useful? What is it for?
The watch container does not mount /sys, so it cannot access the THP settings file, and the log contains file not found errors.
Describe the bug
After a few seconds the admin gives up waiting for the engine processes to reconnect. Logging looks like this:
2021-08-12T18:06:23.390+0000 INFO [admin-31wu94-nuodb-cluster0-0:processManagerScheduled29-1] ProcessManager Removing process with connectKey=1647498425067523578, startId=0, reason=Timed out (60000ms) awaiting connection from process: connectedState=PENDING_RECONNECT, removeAction=NONE, exitCode=null
2021-08-12T18:06:23.394+0000 INFO [admin-31wu94-nuodb-cluster0-0:processManagerScheduled29-1] ProcessManager Removing process with connectKey=1026677207646886664, startId=1, reason=Timed out (60000ms) awaiting connection from process: connectedState=PENDING_RECONNECT, removeAction=NONE, exitCode=null
Or
2021-08-12T18:06:23.484+0000 WARN [:RetryingRaftClientExec4-1] DomainProcessStateMachine admin-31wu94-nuodb-cluster0-0: No entry found for process with startId=1
Also
2021-08-12T18:07:15.393+0000 WARN [admin-31wu94-nuodb-cluster0-0:tagServerExecutor31-2] ProcessManager Evicting unknown reconnecting process EngineNode{databaseName=demo, address=te-database-klkfbv-nuodb-cluster0-demo-69d99ff6bd-fkn7j, port=48006, type=TE, pid=44, state=RUNNING, nodeId=2, version=4.2.1.vee-3, ipAddress=10.1.5.157, hostname=te-database-klkfbv-nuodb-cluster0-demo-69d99ff6bd-fkn7j}
java.lang.NullPointerException: Expected to find durable process with startId=1
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:987)
at com.nuodb.host.ProcessManager$AdminConnectionImpl.setAltAddress(ProcessManager.java:952)
at com.nuodb.host.ProcessManager$AdminConnectionImpl.reconnect(ProcessManager.java:910)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at com.nuodb.server.TagMessageDispatcher.dispatch(TagMessageDispatcher.java:74)
at com.nuodb.server.TagMessageDispatcher.dispatch(TagMessageDispatcher.java:44)
at com.nuodb.server.BoundMessageDispatcher.dispatch(BoundMessageDispatcher.java:28)
at com.nuodb.host.requests.RequestDispatcher.dispatch(RequestDispatcher.java:97)
at com.nuodb.host.requests.RequestDispatcher.dispatch(RequestDispatcher.java:34)
at com.nuodb.server.BoundMessageDispatcher.dispatch(BoundMessageDispatcher.java:28)
at com.nuodb.server.Server.consumeMessages(Server.java:152)
at com.nuodb.server.Server.lambda$acceptConnections$1(Server.java:121)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at com.nuodb.util.Threading$1.lambda$wrapTarget$0(Threading.java:29)
at java.base/java.lang.Thread.run(Thread.java:829)
Context
NuoDB Version: any
Helm Charts Version: any
Kubernetes Version: 1.15+
Environment: any
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The engine reconnects successfully.
Additional context
Add any other context about the problem here.
External Reference
ZenDesk/JIRA numbers if available
JIRA: DB-33760
Change readinessTimeoutSeconds to 5s (currently 1) to match the same value in the database charts.
Teaching Temenos today we hit exactly this issue. Extending the timeout fixed it.
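The proposed change as a values.yaml sketch (the readinessTimeoutSeconds key name comes from the issue itself; its placement under admin is an assumption):

```yaml
admin:
  readinessTimeoutSeconds: 5   # was 1; matches the database charts
```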
Describe the bug
All the examples use helm 2 syntax.
The helm 3 syntax is helm install [NAME] [CHART] [flags]. Your examples are all of the form helm install [CHART] ... and use --name to specify the installation name (which is helm 2 syntax).
Context
NuoDB Version: any 4.x
Helm Charts Version: all - I have no idea when (or if) these charts are supposed to target helm 3.
Kubernetes Version: any
Environment: any
To Reproduce
Just look at any of the README.md documents, in particular the README.md in the root of the project and the README.md in each chart. Several charts have multiple example commands.
Expected behavior
Either explicitly mention that you are using helm 2 syntax, upgrade to helm 3 syntax, or show both. Your call.
External Reference
Is a Jira required?
## Database-wide options.
# These are applied using --database-options on the startup command.
# Change these to values appropriate for this database.
# These options are applied to all processes in the database.
options:
  ping-timeout: 60
  max-lost-archives: 0
K8s v1.15.5 running in minikube.
tgates@tgu19:~/nuodb/nuodb-helm-operator/deploy$ kc get pods
NAME READY STATUS RESTARTS AGE
example-admin-nuodb-cluster0-0 1/1 Running 0 98s
job-lb-policy-nearest-6wlwc 0/1 Error 0 73s
job-lb-policy-nearest-rgnjx 0/1 Error 0 98s
job-lb-policy-nearest-xxsl9 0/1 Completed 0 63s
nuodb-helm-operator-7cf7c858d5-462mj 1/1 Running 0 2m58s
tgates@tgu19:~/nuodb/nuodb-helm-operator/deploy$
One of the failed tries:
tgates@tgu19:~/nuodb/nuodb-helm-operator/deploy$ kc logs job-lb-policy-nearest-rgnjx
Unable to connect to https://nuodb.nuodb.svc:8888: HTTPSConnectionPool(host='nuodb.nuodb.svc', port=8888): Max retries exceeded with url: /api/1/databases/loadBalancerPolicy/nearest (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7efd2d1ad410>: Failed to establish a new connection: [Errno -2] Name or service not known',))
tgates@tgu19:~/nuodb/nuodb-helm-operator/deploy$
Maybe there is some kind of race condition before the service is created?
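If it is a race with service creation, one illustrative mitigation would be an init container that waits for the admin service DNS name to resolve before the job body runs (the service name nuodb.nuodb.svc is taken from the log above; the init container itself is a hypothetical sketch, not the chart's actual code):

```yaml
# Hypothetical addition to the job's pod spec
initContainers:
  - name: wait-for-admin
    image: busybox
    # Block until the admin service name resolves, polling every 2 seconds
    command: ["sh", "-c", "until nslookup nuodb.nuodb.svc; do sleep 2; done"]
```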
Enabling the SM backup pod without providing any backup SMs causes the initial backup job to fail and be rescheduled instantly, so hundreds of failed jobs pile up. This is probably not what we want.
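One way to cap the pile-up, sketched with standard Kubernetes Job fields (the metadata, image, and command here are placeholders, not the chart's actual backup job):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: backup-example        # hypothetical name
spec:
  backoffLimit: 3             # stop retrying after 3 failed attempts
  template:
    spec:
      restartPolicy: Never    # let the Job controller count failures
      containers:
        - name: backup
          image: nuodb/nuodb-ce:4.0.7   # version taken from this page
          command: ["nuobackup"]        # placeholder command
```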