nuodb-helm-charts's People

Contributors

acabrele, adriansuarez, agoelzer, akiros001, butson, dependabot[bot], jleslie85, kmabda, kontaras, mkysel, nexflare, niktj777, paulc4, rbuck, rkampurath, silentrangerr, sivanov-nuodb

nuodb-helm-charts's Issues

TE/SM statefulset/deployment names

Hi,

When deploying nuodb helm admin/database charts we override/set (among other things):
admin.fullnameOverride=nuodbsvc-admin
database.fullnameOverride=nuodbsvc-inventory-database
admin.domain=nuodbsvc
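
For reference, the three overrides above correspond to the following nested values when written as a values file. This is only a sketch: it assumes the overrides are passed through a file (the file name is hypothetical) and that the keys nest exactly as their dotted paths suggest.

```yaml
# nuodbsvc-values.yaml (hypothetical file name)
admin:
  fullnameOverride: nuodbsvc-admin   # prefix for the admin pods
  domain: nuodbsvc                   # admin domain name
database:
  fullnameOverride: nuodbsvc-inventory-database   # base name for the SM/TE workloads
```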

We would like the pods to be prefixed with the admin domain name. That would group pods belonging to the same admin domain by prefix, make them easy to spot among other pods in Kubernetes that do not belong to NuoDB, and fit the way we deploy NuoDB in our system. The desired result would look like this:

$ kubectl get pod
NAME READY STATUS RESTARTS AGE
nuodbsvc-admin-0 1/1 Running 0 1h
nuodbsvc-inventory-database-sm-0 1/1 Running 0 2m
nuodbsvc-inventory-database-te-6d476c6c6-5fltd 1/1 Running 0 2m
nuodbsvc-job-lb-policy-nearest-nzdmd 0/1 Completed 0 1h

Currently:
The SM statefulset name is defined as follows:
name: sm-{{ template "database.fullname" . }}
TE deployment name:
name: te-{{ template "database.fullname" . }}

Would it be okay to change the above to:
name: {{ template "database.fullname" . }}-sm
name: {{ template "database.fullname" . }}-te

respectively, in order to get the desired pod names? Opinions welcome.

Thanks

Forbidden!Configured service account doesn't have access. Service account may have been revoked. daemonsets.apps is forbidden

NuoDB version: 4.0.7
NuoDB Helm charts: 2.4.0

Failure Description: NuoDB admin KAA module does not start

Log:

2020-08-27T14:48:33.553+0000 WARN  io.fabric8.kubernetes.client.informers.cache.Controller informer-controller-DaemonSet Reflector list-watching job exiting because the thread-pool is shutting down
java.util.concurrent.RejectedExecutionException: Error while starting ReflectorRunnable watch
        at io.fabric8.kubernetes.client.informers.cache.Reflector.listAndWatch(Reflector.java:85)
        at io.fabric8.kubernetes.client.informers.cache.Controller.run(Controller.java:112)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.RejectedExecutionException: Error while doing ReflectorRunnable list
        at io.fabric8.kubernetes.client.informers.cache.Reflector.getList(Reflector.java:73)
        at io.fabric8.kubernetes.client.informers.cache.Reflector.reListAndSync(Reflector.java:94)
        at io.fabric8.kubernetes.client.informers.cache.Reflector.listAndWatch(Reflector.java:80)
        ... 2 common frames omitted
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://10.96.0.1/apis/apps/v1/namespaces/testadminscaledown-riqkzt/daemonsets. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. daemonsets.apps is forbidden: User "system:serviceaccount:testadminscaledown-riqkzt:nuodb" cannot list resource "daemonsets" in API group "apps" in the namespace "testadminscaledown-riqkzt".
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:505)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:412)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.listRequestHelper(BaseOperation.java:166)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:640)
        at io.fabric8.kubernetes.client.informers.SharedInformerFactory$1.list(SharedInformerFactory.java:161)
        at io.fabric8.kubernetes.client.informers.SharedInformerFactory$1.list(SharedInformerFactory.java:154)
        at io.fabric8.kubernetes.client.informers.cache.Reflector.getList(Reflector.java:67)
        ... 4 common frames omitted
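
The last exception in the trace says that the nuodb service account cannot list daemonsets in the apps API group. A minimal sketch of a Role and RoleBinding that would grant that access follows. It assumes a namespaced Role is acceptable, the Role name is hypothetical, and this is not necessarily how the charts address the problem.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: nuodb-daemonset-reader          # hypothetical name
  namespace: testadminscaledown-riqkzt  # namespace taken from the log above
rules:
  - apiGroups: ["apps"]
    resources: ["daemonsets"]
    verbs: ["get", "list", "watch"]     # the informer needs list and watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: nuodb-daemonset-reader          # hypothetical name
  namespace: testadminscaledown-riqkzt
subjects:
  - kind: ServiceAccount
    name: nuodb                         # service account named in the error
    namespace: testadminscaledown-riqkzt
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nuodb-daemonset-reader
```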

NuoDB admin does not start up with "Shutting down server due to mismatching initial membership" during upgrade to NuoDB 4.0.7

Describe the bug
The bootstrapServers label and the nuodb.com/bootstrap-servers annotation, which allow an initial membership containing an arbitrary number of members to be defined, break upgrades to NuoDB 4.0.7 with nuodb-helm-charts 2.4.0 from deployments that use a nuodb-helm-charts release prior to 2.4.0 or a NuoDB release prior to 4.0.7.

Context
NuoDB Version: 4.0.6 to 4.0.7
Helm Charts Version: 2.4.0
Kubernetes Version: any
Environment: any

To Reproduce
Steps to reproduce the behavior:

  1. Create a domain with NuoDB 4.0.7 and Helm charts 2.3.1
  2. Upgrade to newer Helm charts

Expected behavior
NuoDB upgrades as expected

database.persistence should be database.sm.persistence

This property in the values.yaml of the database chart is misleading, as it applies only to SMs.

A comment stating that it defines the archive volume would also be useful; given its current location, it is not obvious (IMHO) what it is for.
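
A rough sketch of the proposed layout in values.yaml. The key under persistence and its value are illustrative assumptions, not copied from the chart; the point is only the move under sm and the explanatory comment.

```yaml
database:
  sm:
    # Archive volume used by the Storage Managers (SMs).
    # Proposed new location; currently this block sits at database.persistence.
    persistence:
      size: 20Gi   # illustrative key and value (assumption)
```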

Helm chart repository seems to be broken

This seems to be the reason why TestUpgradeHelmFullDB in the continuous tests is broken. None of the past versions of the NuoDB Helm charts are available in the NuoDB repository:

$ helm repo add nuodb http://storage.googleapis.com/nuodb-charts
"nuodb" has been added to your repositories

$ helm search repo nuodb
NAME                      	CHART VERSION	APP VERSION	DESCRIPTION                                       
nuodb/admin               	3.0.0        	4.0.0      	Administration tier for NuoDB.                    
nuodb/database            	3.0.0        	4.0.0      	NuoDB distributed SQL database.                   
nuodb/restore             	3.0.0        	4.0.0      	On-demand restore a NuoDB SQL database.           
nuodb/storage-class       	3.0.0        	4.0.0      	Storage classes for NuoDB.                        
nuodb/transparent-hugepage	3.0.0        	4.0.0      	Disable disables transparent_hugepage on Linux ...

$ helm install nuodb/admin --version=2.4.1 --generate-name 
Error: failed to download "nuodb/admin" (hint: running `helm repo update` may help)

$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "nuodb" chart repository
Update Complete. ⎈ Happy Helming!⎈ 

$ helm install nuodb/admin --version=2.4.1 --generate-name 
Error: failed to download "nuodb/admin" (hint: running `helm repo update` may help)

Small change to README

This paragraph is misleading:

If you already have an OpenShift project you want to use, select it as follows:

oc project ${TILLER_NAMESPACE}

I think it should be:

If you already have an OpenShift project you want to use, select it as follows:

export TILLER_NAMESPACE=<your-existing-namespace>
oc project ${TILLER_NAMESPACE}

What about non OpenShift users?

This section assumes you have OpenShift and the oc utility.

You do not need oc to use Helm, and we should show how to set up and remove Tiller for generic Kubernetes, which doesn't have Projects either.

Error in admin helm chart?

The admin helm chart creates two containers. One disables THP; the other runs watch on the THP settings file every 10 minutes:

  1. Why is the watch container useful? What is it for?

  2. The watch container does not mount /sys so it cannot access the THP settings file and the log contains file not found errors.
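
If the watch container is kept, one possible fix for the second point is to mount the host's /sys into it. The fragment below is a sketch under assumptions: the container name, image, and command are hypothetical and not taken from the chart. Only the THP settings path is the standard Linux location.

```yaml
# Fragment of a pod spec; all names are hypothetical.
containers:
  - name: thp-watch
    image: busybox
    # Re-read the THP setting every 600 seconds (10 minutes).
    command: ["sh", "-c", "watch -n 600 cat /sys/kernel/mm/transparent_hugepage/enabled"]
    volumeMounts:
      - name: sys
        mountPath: /sys
        readOnly: true    # the watcher only needs to read the setting
volumes:
  - name: sys
    hostPath:
      path: /sys          # expose the node's /sys to the container
```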

Engine does not reconnect to a restarted admin container

Describe the bug
After a few seconds the admin gives up waiting for the engine processes to reconnect. The logging looks like this:

2021-08-12T18:06:23.390+0000 INFO  [admin-31wu94-nuodb-cluster0-0:processManagerScheduled29-1] ProcessManager Removing process with connectKey=1647498425067523578, startId=0, reason=Timed out (60000ms) awaiting connection from process: connectedState=PENDING_RECONNECT, removeAction=NONE, exitCode=null
2021-08-12T18:06:23.394+0000 INFO  [admin-31wu94-nuodb-cluster0-0:processManagerScheduled29-1] ProcessManager Removing process with connectKey=1026677207646886664, startId=1, reason=Timed out (60000ms) awaiting connection from process: connectedState=PENDING_RECONNECT, removeAction=NONE, exitCode=null

Or

2021-08-12T18:06:23.484+0000 WARN  [:RetryingRaftClientExec4-1] DomainProcessStateMachine admin-31wu94-nuodb-cluster0-0: No entry found for process with startId=1

Also

2021-08-12T18:07:15.393+0000 WARN  [admin-31wu94-nuodb-cluster0-0:tagServerExecutor31-2] ProcessManager Evicting unknown reconnecting process EngineNode{databaseName=demo, address=te-database-klkfbv-nuodb-cluster0-demo-69d99ff6bd-fkn7j, port=48006, type=TE, pid=44, state=RUNNING, nodeId=2, version=4.2.1.vee-3, ipAddress=10.1.5.157, hostname=te-database-klkfbv-nuodb-cluster0-demo-69d99ff6bd-fkn7j}
java.lang.NullPointerException: Expected to find durable process with startId=1
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:987)
	at com.nuodb.host.ProcessManager$AdminConnectionImpl.setAltAddress(ProcessManager.java:952)
	at com.nuodb.host.ProcessManager$AdminConnectionImpl.reconnect(ProcessManager.java:910)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at com.nuodb.server.TagMessageDispatcher.dispatch(TagMessageDispatcher.java:74)
	at com.nuodb.server.TagMessageDispatcher.dispatch(TagMessageDispatcher.java:44)
	at com.nuodb.server.BoundMessageDispatcher.dispatch(BoundMessageDispatcher.java:28)
	at com.nuodb.host.requests.RequestDispatcher.dispatch(RequestDispatcher.java:97)
	at com.nuodb.host.requests.RequestDispatcher.dispatch(RequestDispatcher.java:34)
	at com.nuodb.server.BoundMessageDispatcher.dispatch(BoundMessageDispatcher.java:28)
	at com.nuodb.server.Server.consumeMessages(Server.java:152)
	at com.nuodb.server.Server.lambda$acceptConnections$1(Server.java:121)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at com.nuodb.util.Threading$1.lambda$wrapTarget$0(Threading.java:29)
	at java.base/java.lang.Thread.run(Thread.java:829)

Context
NuoDB Version: any
Helm Charts Version: any
Kubernetes Version: 1.15+
Environment: any

To Reproduce
Steps to reproduce the behavior:

  1. Install the NuoDB admin
  2. Install the NuoDB database
  3. Restart one or more admin pods

Expected behavior
The engine reconnects successfully.

External Reference
ZenDesk/JIRA numbers if available
JIRA: DB-33760

Longer readiness timeout in admin chart

Change readinessTimeoutSeconds to 5 seconds (it is currently 1) to match the value used in the database chart.

We hit exactly this issue today while teaching Temenos; extending the timeout fixed it.
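
Until the default changes, the value could be overridden at install time. A sketch, assuming readinessTimeoutSeconds sits under the admin key of the admin chart and is rendered into the probe's timeoutSeconds field (both are assumptions about the chart):

```yaml
# Values override for the admin chart (nesting under "admin" is an assumption).
admin:
  readinessTimeoutSeconds: 5
---
# Probe field the value is presumably rendered into.
readinessProbe:
  timeoutSeconds: 5   # the Kubernetes default is 1
```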

Example commands are using Helm 2 syntax

Describe the bug

All the examples are using helm 2 syntax.

The helm 3 syntax is helm install [NAME] [CHART] [flags]. Your examples are all of the form helm install [CHART] ... and use --name to specify the installation name (which is helm 2 syntax).

Context
NuoDB Version: any 4.x
Helm Charts Version: all - I have no idea when (or if) these charts are supposed to target helm 3.
Kubernetes Version: any
Environment: any

To Reproduce
Just look at any of the README.md documents. In particular the README.md in the root of project and the README.md in each chart. Several charts have multiple example commands.

Expected behavior
Either explicitly mention that you are using helm 2 syntax, or upgrade to helm 3 syntax or show both. Your call.

External Reference
Is a Jira required?

Database-wide options are missing in the database chart

  ## database-wide options.
  # These are applied using the --database-options on the startup command
  # change these to values appropriate for this database
  # these options are applied to all processes in the database.
  options:
    ping-timeout: 60
    max-lost-archives: 0

At startup, "job-lb-policy-nearest-*" pods sometimes fail the first few tries

K8s v1.15.5 running in minikube.

tgates@tgu19:~/nuodb/nuodb-helm-operator/deploy$ kc get pods 
NAME                                   READY   STATUS      RESTARTS   AGE
example-admin-nuodb-cluster0-0         1/1     Running     0          98s
job-lb-policy-nearest-6wlwc            0/1     Error       0          73s
job-lb-policy-nearest-rgnjx            0/1     Error       0          98s
job-lb-policy-nearest-xxsl9            0/1     Completed   0          63s
nuodb-helm-operator-7cf7c858d5-462mj   1/1     Running     0          2m58s
tgates@tgu19:~/nuodb/nuodb-helm-operator/deploy$ 

One of the failed tries:

tgates@tgu19:~/nuodb/nuodb-helm-operator/deploy$ kc logs job-lb-policy-nearest-rgnjx
Unable to connect to https://nuodb.nuodb.svc:8888: HTTPSConnectionPool(host='nuodb.nuodb.svc', port=8888): Max retries exceeded with url: /api/1/databases/loadBalancerPolicy/nearest (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7efd2d1ad410>: Failed to establish a new connection: [Errno -2] Name or service not known',))
tgates@tgu19:~/nuodb/nuodb-helm-operator/deploy$

Maybe there is some kind of race condition before the service is created?
