
neo4j-helm's Introduction


DEPRECATED

From version 4.3, Neo4j publishes a productized and supported version of Helm Charts, which have been written using the experience generated by the neo4j-helm Labs project. The new repository is here: https://github.com/neo4j/helm-charts.

For Neo4j standalone, productized Helm charts are available for Neo4j 4.3 and above.
For Neo4j Causal Cluster, productized Helm charts are available for Neo4j 4.4 and above.

That is the recommended way to run Neo4j in Kubernetes.
Full details are in the Kubernetes section of the Neo4j operations manual.

The neo4j-helm Labs Helm charts described here will continue to be updated for 4.4.x, but updates will stop with the next major release, 5.0.

Neo4j-Helm

This repository contains a Helm chart that starts Neo4j >= 4.0 Enterprise Edition clusters in Kubernetes.

Full Documentation can be found here

Quick Start

Check the releases page and copy the URL of the tgz package. Make sure to note the correct version of Neo4j.

Standalone (single server)

$ helm install mygraph RELEASE_URL --set core.standalone=true --set acceptLicenseAgreement=yes --set neo4jPassword=mySecretPassword

Causal Cluster (3 cores, 0 read replicas)

$ helm install mygraph RELEASE_URL --set acceptLicenseAgreement=yes --set neo4jPassword=mySecretPassword

When you're done: helm uninstall mygraph.

Documentation

The User Guide contains all the documentation for this helm chart.

The Neo4j Community Site is a great place to go for discussion and questions about Neo4j & Kubernetes.

Additional instructions, general documentation, and operational facets are covered in a set of companion articles.

Helm Testing

This chart contains a standard set of helm chart tests, which can be run once a deployment is ready, like this:

helm test mygraph

Local Testing & Development

Template Expansion

To see what helm will actually deploy based on the templates:

helm template --name-template tester --set acceptLicenseAgreement=yes --set neo4jPassword=mySecretPassword . > expanded.yaml

Full-Cycle Test

The following mini-script will provision a test cluster, monitor its rollout, run the tests, report the results, and tear down the cluster, destroying its PVCs.

Provision K8S Cluster

Please use tools/test/provision-k8s.sh, customizing the Google Cloud project ID for your environment.

Standalone

A standalone instance forms faster, so we manually lower the liveness/readiness timeouts.

export NAME=a
export NAMESPACE=default
helm install $NAME . -f deployment-scenarios/ci/standalone.yaml && \
kubectl rollout status --namespace $NAMESPACE StatefulSet/$NAME-neo4j-core --watch && \
helm test $NAME --logs | tee testlog.txt
helm uninstall $NAME
sleep 20
for idx in 0 1 2 ; do
  kubectl delete pvc datadir-$NAME-neo4j-core-$idx ;
done

Causal Cluster

export NAME=a
export NAMESPACE=default
helm install $NAME . -f deployment-scenarios/ci/cluster.yaml && \
kubectl rollout status --namespace $NAMESPACE StatefulSet/$NAME-neo4j-core --watch && \
helm test $NAME --logs | tee testlog.txt
helm uninstall $NAME
sleep 20
for idx in 0 1 2 ; do
  kubectl delete pvc datadir-$NAME-neo4j-core-$idx ;
done

Internal Tooling

This repo contains internal tooling containers for backup, restore, and test of the helm chart.

Building the Containers

If you want to push your own docker containers, make sure the registry in the Makefile is set to one you have push permissions for.

cd tools
make docker_build
make docker_push


neo4j-helm's Issues

minor: NOTES.txt no space between "secrets" and secret name

The generated output after a helm chart install does not put a space between kubectl get secrets and the secret name, so the two strings are squashed together.

As per

export NEO4J_PASSWORD=$(kubectl get secretsneo4j-single-neo4j-secrets --namespace default -o yaml | grep password | sed 's/.*: //' | base64 -d)
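For reference, the corrected command only needs a space between secrets and the secret name:

export NEO4J_PASSWORD=$(kubectl get secrets neo4j-single-neo4j-secrets --namespace default -o yaml | grep password | sed 's/.*: //' | base64 -d)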

PVC names should be defined as variables in the Neo4j chart

We need PVC names to be provided as variables. The current chart assumes a specific naming convention, which is not practical for production implementations.

We can make custom modifications, but that defeats the purpose of using a vendor-supported implementation.

Thanks

Inject arbitrary secrets into the environment

Add support for injecting secrets into the environment, just as is possible today for a ConfigMap via core.configMap.

The use case where I need this:
I have a custom plugin that exposes a procedure that, when called, puts a message onto a RabbitMQ queue. The procedure is called in an APOC trigger for various events, and the RabbitMQ credentials are pulled from the environment.
Right now those credentials would have to be put into a ConfigMap to support this use case, but since they are sensitive, sourcing them from a Secret would be better.
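A minimal sketch of what such support could look like, assuming a hypothetical core.envFrom value that the chart would pass through unchanged to the container spec; Kubernetes' envFrom with a secretRef exposes every key of the Secret as an environment variable:

core:
  envFrom:
    - secretRef:
        name: rabbitmq-credentials   # hypothetical Secret holding the RabbitMQ credentials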

Provide single core resources

For simpler setups it might already be helpful to deploy a standalone, single-core instance of Neo4j, but with full manageability, i.e. persistent volumes, backup jobs, etc.

I've forked the chart and removed all services and discovery settings that are not required in single mode into the following branch: https://github.com/sdaschner/neo4j-helm/tree/single-instance

I also wrote the following post that aims to explain what's required step-by-step: https://blog.sebastian-daschner.com/entries/neo4j-single-core-managed-k8s

WDYT? Happy to provide a PR to a separate branch/directory or elsewhere, if you think that might be helpful for more folks (I've used this example in a real-world project, btw).

Mirror Neo4j Operations Manual in Documentation

The following is the table of Contents of the Neo4j Operations Manual. In order for the helm template to be operable for people on Kubernetes, some k8s-specific translations or considerations are going to be necessary for each section.

This ticket is suggesting a revamp of the existing user guide to parallel the structure of the operations manual, omitting sections or just referring back to it where the information is the same.

For example, instructions on how to use Neo4j via Docker are not relevant for the helm chart because you're already there. The clustering section does not need to be repeated because everything in the ops manual applies here. But the backup, restore, and upgrade sections need particular attention.

Chapter 1, Introduction — Introduction of Neo4j Community and Enterprise Editions.
Chapter 2, Installation — Instructions on how to install Neo4j in different deployment contexts.
Chapter 3, Cloud deployments — Information on how to deploy Neo4j on cloud platforms.
Chapter 4, Docker — Instructions on how to use Neo4j on Docker.
Chapter 5, Configuration — Instructions on how to configure certain parts of Neo4j.
Chapter 6, Manage databases — Instructions on how to manage multiple active databases with Neo4j.
Chapter 7, Clustering — Comprehensive descriptions of Neo4j Causal Clustering.
Chapter 8, Fabric — Instructions on how to configure and use Neo4j Fabric.
Chapter 9, Upgrade — Instructions on upgrading Neo4j.
Chapter 10, Backup — Instructions on setting up Neo4j backups.
Chapter 11, Authentication and authorization — Instructions on user management and role-based access control.
Chapter 12, Security — Instructions on server security.
Chapter 13, Monitoring — Instructions on setting up Neo4j monitoring.
Chapter 14, Performance — Instructions on how to go about performance tuning for Neo4j.
Chapter 15, Tools — Description of Neo4j tools.

Backup methods

Can Velero (PVC snapshots) be used instead of Neo4j's own backup? Does it pose any risk of data corruption?

Unable to add custom pod/deployment/statefulset labels or annotations

There is no functionality in the chart to add your own custom annotations/labels at either the core or read-replica level, in either the pods, the deployments, or the statefulsets. They can, however, be added to the service definitions, so they appear in some places but not all.

Add ability to specify:

  • annotations
  • labels

To the:

  • pods
  • deployments
  • replicasets/statefulsets
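A hypothetical values.yaml shape for this request (these keys do not exist in the chart today; the names are illustrative):

core:
  podLabels:
    team: data-platform
  podAnnotations:
    prometheus.io/scrape: "true"
readReplica:
  podLabels: {}
  podAnnotations: {}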

Chartify the external exposure instructions

Right now it's a mixture of yaml and markdown and manual instructions. Better would be to wrap it up into its own mini-helm chart so that you could do for example:

helm install neo4j-exposure --set ip0=whatever --set deployment=my-graph
helm install my-graph --(cluster settings)

Not able to authenticate after changing password

Hey guys, I don't know if I am doing something wrong!

I am installing the chart in a Kubernetes Cluster using the following command:

helm install mygraph RELEASE_URL --set core.standalone=true --set acceptLicenseAgreement=yes --set neo4jPassword=mySecretPassword

The deploy goes well and I can see the following message:

Changed password for user 'neo4j'.
Remote interface available at http://graphdb-neo4j-core-0.graphdb-neo4j.default.svc.cluster.local:7474

But then, when we try to authenticate to the db in our node.js service using user=neo4j and password=mySecretPassword, we see the following message returned by the db:

Failed authentication attempt for 'neo4j' from 100.101.66.28

I have also tried doing a curl inside the neo4j container:

curl  http://neo4j:mySecretPassword@graphdb-neo4j-core-0.graphdb-neo4j.default.svc.cluster.local:7474/user/neo4j
{
  "errors" : [ {
    "code" : "Neo.ClientError.Security.Unauthorized",
    "message" : "Invalid username or password."
  } ]

Am I doing something wrong ?

PS. When I set authEnabled: false in the chart, everything works as expected :)

Get the namespace at runtime for the core init scripts

Currently we use the namespace at template-creation time to modify the startup script of the core statefulsets and set the discovery advertised address.

You can get the namespace at runtime using cat /var/run/secrets/kubernetes.io/serviceaccount/namespace

This line in the core startup command could be changed to make use of this:

export DISCOVERY_HOST="discovery-{{ template "neo4j.fullname" . }}-${core_idx}.{{ .Release.Namespace }}.svc.{{ .Values.clusterDomain }}"
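A sketch of the suggested runtime variant, assuming core_idx is already set by the surrounding startup script; the only change is reading the namespace from the service-account mount instead of baking in .Release.Namespace at template time:

# resolve the namespace the pod is actually running in
NAMESPACE=$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace)
export DISCOVERY_HOST="discovery-{{ template "neo4j.fullname" . }}-${core_idx}.${NAMESPACE}.svc.{{ .Values.clusterDomain }}"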

Missing `fullNameOverride` common helm chart functionality

Almost all commonly used helm charts support a fullNameOverride value in values.yaml, which allows the end user to deploy components without the helm release name embedded in their names (so you don't end up with a deployment named neo4j-neo4j when all you want is neo4j).

Request:

  • Expand the naming in the _helpers.tpl to be able to use this value.
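For illustration, a sketch of the conventional _helpers.tpl pattern most charts use for this; the chart's actual helper will differ in its details:

{{- define "neo4j.fullname" -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}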

Additional Volume values - map or list

In values.yaml we have

core:
  additionalVolumes: []
  additionalVolumeMounts: []

readReplica:
  additionalVolumes: {}
  additionalVolumeMounts: {}

Can we be consistent about whether we use a list or a dict?
I suggest changing the readReplica values to lists.

Neo4j pod is not able to deploy on another node after the hosting node goes down

Hi neo4j experts,
We are running an HA test against Neo4j installed from the latest helm chart https://github.com/neo4j-contrib/neo4j-helm/releases/download/4.1.0-2/neo4j-4.1.0-2.tgz
One test case shuts down the hosting node and checks whether the cluster can recover onto another healthy node.

After the node was shut down, the pods were stuck in "Terminating" and no new pods were created.
I tried other helm charts and they were able to create new pods, which means the k8s cluster itself is working correctly.

The k8s-mk-3 node was shut down and you can see the neo4j pods are stuck:
root@k8s-master-mk:~# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-neo4j-core-0 1/1 Terminating 0 44m 10.42.0.1 k8s-mk-3
algotest-neo4j-core-1 1/1 Running 0 44m 10.47.128.12 k8s-mk-2
test-neo4j-core-2 1/1 Terminating 0 44m 10.42.0.2 k8s-mk-3
test-neo4j-replica-0 1/1 Running 0 44m 10.47.128.13 k8s-mk-2
test-neo4j-replica-1 1/1 Terminating 0 44m 10.42.0.3 k8s-mk-3

For Prometheus, you can see that even though the pods on k8s-mk-3 were stuck, new pods were started on other nodes:
root@k8s-master-mk:~# kubectl get pods -o wide | grep prometheus
prometheus-alertmanager-6cc564fddc-ghv57 2/2 Terminating 0 48m 10.42.0.5 k8s-mk-3
prometheus-alertmanager-6cc564fddc-tz9kx 2/2 Running 0 20m 10.46.0.15 k8s-mk-1
prometheus-kube-state-metrics-7b4b4b7b7f-zfvfz 1/1 Running 0 48m 10.47.128.14 k8s-mk-2
prometheus-node-exporter-2vwvc 1/1 Running 0 48m 10.176.20.193 k8s-mk-2
prometheus-node-exporter-8w652 1/1 Running 0 48m 10.176.21.120 k8s-mk-3
prometheus-node-exporter-gn4fz 1/1 Running 0 48m 10.176.20.192 k8s-mk-1
prometheus-pushgateway-7b644975cb-n8p5g 1/1 Terminating 0 48m 10.42.0.4 k8s-mk-3
prometheus-pushgateway-7b644975cb-vf4z4 1/1 Running 0 20m 10.46.0.14 k8s-mk-1

prometheus-server-56784fbdcb-48mz8 2/2 Running 0 20m 10.47.128.15 k8s-mk-2
prometheus-server-56784fbdcb-4zwvf 2/2 Terminating 0 48m 10.42.0.6 k8s-mk-3

The same result applies to grafana:
root@k8s-master-mk:# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
g-grafana-57bbbdbf77-5gn5g 1/1 Terminating 0 10m 10.42.0.14 k8s-mk-3
g-grafana-57bbbdbf77-znkk9 1/1 Running 0 3m42s 10.46.0.3 k8s-mk-1

Could you help to look into this issue?
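For context, this matches documented Kubernetes behavior rather than a chart bug: when a node becomes unreachable, pods managed by Deployments (like the Prometheus and Grafana pods above) are replaced on healthy nodes, but a StatefulSet will not start a replacement until it can confirm the old pod has stopped, preserving its at-most-one-pod-per-identity guarantee. A common manual workaround, once you are sure the node is really down, is to force-delete the stuck pods, e.g.:

kubectl delete pod test-neo4j-core-0 --grace-period=0 --force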

Neither Quick Start nor other approaches with neo4j >= 4 are working

When using the Quick Start Guide like this:

helm install mygraph https://github.com/neo4j-contrib/neo4j-helm/releases/download/4.1.0-2/neo4j-4.1.0-2.tgz --set core.standalone=true --set acceptLicenseAgreement=yes --set neo4jPassword=mySecretPassword

the container starts but remains in CrashLoopBackOff with the following error message:

Configuration override prefix = mygraph_neo4j_core_0
Starting Neo4j CORE 0 on mygraph-neo4j-core-0.mygraph-neo4j.mw-internal.svc.cluster.local
Changed password for user 'neo4j'.
APOC couln't set a URLStreamHandlerFactory since some other tool already did this (e.g. tomcat). This means you cannot use s3:// or hdfs:// style URLs in APOC. This is caused by a limitation of the JVM which we cannot fix.
Fetching versions.json for Plugin 'apoc' from https://neo4j-contrib.github.io/neo4j-apoc-procedures/versions.json
Installing Plugin 'apoc' from https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/4.1.0.0/apoc-4.1.0.0-all.jar to /plugins/apoc.jar
Applying default values for plugin apoc to neo4j.conf
/var/lib/neo4j/bin/neo4j: line 183: /var/lib/neo4j/conf/neo4j.conf: Permission denied

Disabling apoc leaves only the last part of the error log:

Changed password for user 'neo4j'.
/var/lib/neo4j/bin/neo4j: line 183: /var/lib/neo4j/conf/neo4j.conf: Permission denied

The only way I could fix this issue was by changing the imageTag like this:
helm install mygraph https://github.com/neo4j-contrib/neo4j-helm/releases/download/4.1.0-2/neo4j-4.1.0-2.tgz --set imageTag="3.4.5" --set core.standalone=true --set acceptLicenseAgreement=yes --set neo4jPassword=mySecretPassword

I think the repository is supposed to support neo4j >= 4, so the 3.4.5 image should not work but the neo4j 4 versions should; neo4j 3 already worked with the older helm chart.
It seems the initial neo4j.conf changes are made by a user with more permissions than the user that starts the database.

Unable to specify additional volume mounts

The chart currently does not allow the end user to specify additional volume mounts for the pods, this would provide a means of easily injecting files such as apoc.conf or initialisation scripts into the imports directory so they can be used by the configuration option.

apoc.initializer.cypher=CALL apoc.cypher.runSchemaFile("file:///indexes.cypher")

Cluster cannot connect to extra core, replica after scaling up

Hi!

We used the command kubectl scale statefulsets testzyz2-neo4j-core --replicas=5 to scale the original cluster from 3 cores to 5 cores.

After scaling, the new cores' pods are running but never become ready. We saw this log in an old core's logs/debug.log: "failed because of java.net.UnknownHostException: discovery-my-neo4j-neo4j-3.default.svc.cluster.local".

It seems the extra cores created by the scale command didn't get a corresponding discovery service (ClusterIP). A colleague manually wrote the yaml for the missing discovery service, and after creating it the cluster worked (a sketch of such a service follows below).

Did we make a mistake in the configuration?
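For reference, a minimal sketch of the kind of per-pod discovery Service that had to be created by hand. The name and selector are assumed from the logs, and the port list follows the ports the chart exposes elsewhere in this page; the chart's actual template may differ:

apiVersion: v1
kind: Service
metadata:
  name: discovery-testzyz2-neo4j-3
spec:
  type: ClusterIP
  selector:
    statefulset.kubernetes.io/pod-name: testzyz2-neo4j-core-3
  ports:
    - name: discovery
      port: 5000
      targetPort: 5000
    - name: transaction
      port: 6000
      targetPort: 6000
    - name: raft
      port: 7000
      targetPort: 7000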

Support for Istio Service Mesh

I am currently implementing Neo4j as part of our service mesh. Upon analysis, there are many ports that do not follow Istio's naming conventions. See https://istio.io/latest/docs/reference/config/analysis/ist0118/

Running istioctl analyze -n my-namespace produces the following information.

Info [IST0118] (Service discovery-my-namespace-neo4j-0.my-namespace) Port name discovery (port: 5000, targetPort: 5000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-0.my-namespace) Port name jmx (port: 3637, targetPort: 3637) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-0.my-namespace) Port name prometheus (port: 2004, targetPort: 2004) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-0.my-namespace) Port name raft (port: 7000, targetPort: 7000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-0.my-namespace) Port name transaction (port: 6000, targetPort: 6000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-1.my-namespace) Port name discovery (port: 5000, targetPort: 5000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-1.my-namespace) Port name jmx (port: 3637, targetPort: 3637) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-1.my-namespace) Port name prometheus (port: 2004, targetPort: 2004) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-1.my-namespace) Port name raft (port: 7000, targetPort: 7000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-1.my-namespace) Port name transaction (port: 6000, targetPort: 6000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-2.my-namespace) Port name discovery (port: 5000, targetPort: 5000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-2.my-namespace) Port name jmx (port: 3637, targetPort: 3637) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-2.my-namespace) Port name prometheus (port: 2004, targetPort: 2004) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-2.my-namespace) Port name raft (port: 7000, targetPort: 7000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-2.my-namespace) Port name transaction (port: 6000, targetPort: 6000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-3.my-namespace) Port name discovery (port: 5000, targetPort: 5000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-3.my-namespace) Port name jmx (port: 3637, targetPort: 3637) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-3.my-namespace) Port name prometheus (port: 2004, targetPort: 2004) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-3.my-namespace) Port name raft (port: 7000, targetPort: 7000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-3.my-namespace) Port name transaction (port: 6000, targetPort: 6000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-4.my-namespace) Port name discovery (port: 5000, targetPort: 5000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-4.my-namespace) Port name jmx (port: 3637, targetPort: 3637) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-4.my-namespace) Port name prometheus (port: 2004, targetPort: 2004) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-4.my-namespace) Port name raft (port: 7000, targetPort: 7000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-4.my-namespace) Port name transaction (port: 6000, targetPort: 6000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-5.my-namespace) Port name discovery (port: 5000, targetPort: 5000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-5.my-namespace) Port name jmx (port: 3637, targetPort: 3637) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-5.my-namespace) Port name prometheus (port: 2004, targetPort: 2004) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-5.my-namespace) Port name raft (port: 7000, targetPort: 7000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service discovery-my-namespace-neo4j-5.my-namespace) Port name transaction (port: 6000, targetPort: 6000) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service my-namespace-neo4j.my-namespace) Port name backup (port: 6362, targetPort: 6362) doesn't follow the naming convention of Istio port.
  Info [IST0118] (Service my-namespace-neo4j.local-preview) Port name bolt (port: 7687, targetPort: 7687) doesn't follow the naming convention of Istio port.

How to fix

If you know the protocol the service port is serving, rename the port using the <protocol>[-<suffix>] format.

For example:
Before

  ports:
    - name: prometheus
      port: 2004
      targetPort: 2004

After

  ports:
    - name: tcp-prometheus
      port: 2004
      targetPort: 2004

Use `Local` externalTrafficPolicy in external access example

using "Cluster" gives better load balancing at the expense of extra network hops. Since we route directly to individual cores there is no reason to try and get better load balancing - we should use "Local" to reduce latency.

service.spec.externalTrafficPolicy - denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints. There are two available options: Cluster (default) and Local. Cluster obscures the client source IP and may cause a second hop to another node, but should have good overall load-spreading. Local preserves the client source IP and avoids a second hop for LoadBalancer and NodePort type services, but risks potentially imbalanced traffic spreading.
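Concretely, the change in the example Service manifest would be a single field. A minimal sketch, with an illustrative name and selector:

apiVersion: v1
kind: Service
metadata:
  name: neo4j-core-0-external
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    statefulset.kubernetes.io/pod-name: mygraph-neo4j-core-0
  ports:
    - name: tcp-bolt
      port: 7687
      targetPort: 7687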

No jar URL found for version '4.1.0' - GDS on 4.1

Hi - first, thanks for the work on this chart :)
I wasn't sure whether to open this in the GDS repository or not, but since the chart's default tag is 4.1 it seems appropriate.
GDS's versions.json doesn't seem to have any entry for 4.1, so our pod logs show:

Error: No jar URL found for version '4.1.0' in versions.json from 'https://s3-eu-west-1.amazonaws.com/com.neo4j.graphalgorithms.dist/graph-data-science/versions.json'

Our chart is a fork which adds plugins via NEO4JLABS_PLUGINS

{{- if .Values.useAPOC }}
NEO4JLABS_PLUGINS: "[\"apoc\",\"graph-data-science\", \"n10s\"]"
{{- end }}

Not distributing resource load equally with each replica

I have a Causal Cluster of 3 instances to which I send a batch of heavy queries for load testing.

When I do kubectl top pods I can see the CPU peaking on ONE instance, while the rest of the instances take little load. Later on, one more pod starts receiving CPU load.

In summary, the pods are not receiving equal CPU load. Is this a known behavior? Is there any way to configure it?


Type of Queries: ALL READ QUERIES

Code Snippet:

from neo4j import GraphDatabase, basic_auth

# NEO4J_HOST, NEO4J_BOLT_PORT, NEO4J_USERNAME, NEO4J_PASSWORD are defined elsewhere

def get_driver():
    """
    Initialize a Driver and return it
    """
    return GraphDatabase.driver(
        f"bolt+routing://{NEO4J_HOST}:{NEO4J_BOLT_PORT}",
        auth=basic_auth(NEO4J_USERNAME, NEO4J_PASSWORD),
    )

DRIVER = get_driver()

class ExecuteQuery:
    @staticmethod
    def _execute_query(tx, query):
        result = tx.run(query)
        return result.data()

    @staticmethod
    def do_read(query):
        with DRIVER.session() as session:
            result = session.read_transaction(ExecuteQuery._execute_query, query)
        return result

Then I invoke do_read(query) with multiple parallel connections (via Python Celery)


Version: Latest Helm Version

Error: No jar URL found for version '4.1.1' in versions.json

After installing the Helm chart with the command below, I get this error:
helm install neo4jgraph https://github.com/neo4j-contrib/neo4j-helm/releases/download/4.1.1-2/neo4j-4.1.1-2.tgz --set core.standalone=true --set acceptLicenseAgreement=yes --set neo4jPassword=developer

Configuration override prefix = neo4jgraph_neo4j_core_0
Starting Neo4j CORE 0 on neo4jgraph-neo4j-core-0.neo4jgraph-neo4j.default.svc.cluster.local
Warning: Some files inside "/data" are not writable from inside container. Changing folder owner to neo4j.
Changed password for user 'neo4j'.
Fetching versions.json for Plugin 'apoc' from https://neo4j-contrib.github.io/neo4j-apoc-procedures/versions.json
Error: No jar URL found for version '4.1.1' in versions.json from 'https://neo4j-contrib.github.io/neo4j-apoc-procedures/versions.json'
stream closed

Enable users to set static NodePort

Hey there,

thank you for this terrific chart! I hope you can help me with this;

I just wanted to deploy a standalone Neo4j instance with it and stumbled upon the fact that one can set core.service.type to NodePort, but there is no option to set a specific port when doing so.

I'd create a PR, but I'm not quite sure how to handle the multiple ports declared in https://github.com/neo4j-contrib/neo4j-helm/blob/master/templates/core-dns.yaml yet.

An example for a parameterized NodePort can be found in the Sentry Helm chart: https://github.com/sentry-kubernetes/charts/blob/develop/sentry/templates/service-sentry.yaml#L25
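One possible shape, sketched with a hypothetical core.service.nodePort value applied to the bolt port in the service template; neither the value nor the template change exists in the chart today, and Kubernetes only accepts nodePort values inside the cluster's service-node-port-range (30000-32767 by default):

  ports:
    - name: bolt
      port: 7687
      targetPort: 7687
      {{- if .Values.core.service.nodePort }}
      nodePort: {{ .Values.core.service.nodePort }}
      {{- end }}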

Error: No jar URL found for version '4.0.5' in versions.json

Hi

I run

helm install -n mygraph https://github.com/neo4j-contrib/neo4j-helm/releases/download/4.0.5-1/neo4j-4.0.5-1.tgz --set acceptLicenseAgreement=yes --set neo4jPassword=nanana

And I get this error:

Configuration override prefix = mygraph_neo4j_core_2
Starting Neo4j CORE 2 on mygraph-neo4j-core-2.mygraph-neo4j.neo4j.svc.cluster.local
Changed password for user 'neo4j'.
Fetching versions.json for Plugin 'apoc' from https://neo4j-contrib.github.io/neo4j-apoc-procedures/versions.json
Error: No jar URL found for version '4.0.5' in versions.json from 'https://neo4j-contrib.github.io/neo4j-apoc-procedures/versions.json'

allow users to set their own `terminationGracePeriodSeconds`

It's great that we default terminationGracePeriodSeconds: 300 as a good baseline for allowing checkpointing etc. to happen.

However users with large stores might need to allow more than 5 minutes - it would be useful to make this a configurable variable in the helm chart.
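A sketch of what this could look like, using a hypothetical core.terminationGracePeriodSeconds value that falls back to the current default:

# values.yaml (hypothetical key)
core:
  terminationGracePeriodSeconds: 600

# core statefulset template
terminationGracePeriodSeconds: {{ .Values.core.terminationGracePeriodSeconds | default 300 }}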

Add environment variables to the values.yaml file

Hey,
I have a custom plugin that I would like to run. This plugin pulls 3 variables from the environment, but I can't seem to find a way to add environment variables to the helm values.yaml file. Am I missing something or is this not supported?
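A sketch of the requested feature, assuming a hypothetical core.extraVars list that the statefulset template would render into the container's env section (neither exists in the chart today):

# values.yaml (hypothetical key)
core:
  extraVars:
    - name: MY_PLUGIN_ENDPOINT
      value: "https://example.internal/api"

# container spec in the core statefulset template
env:
{{- toYaml .Values.core.extraVars | nindent 2 }}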

Add support for nginx-ingress

This issue has been outlined fairly well here: https://community.neo4j.com/t/cannot-connect-to-cluster-using-k8s-ingress/15476/7

The intent is to be able to expose the services via an nginx-ingress wherein connections to bolt can be made from outside the cluster. The current external access documentation here: https://github.com/neo4j-contrib/neo4j-helm/blob/master/tools/external-exposure/EXTERNAL-EXPOSURE.md has a few shortcomings:

  • Additional static IP address are required instead of using an existing load balancer that nginx-ingress manages
  • Changing A records for the static IPs could become a nuisance in environments where resources are created and destroyed frequently
  • In my particular case, I am using an internal ELB with nginx-ingress. AWS does not allow creating private static IPs without the use of an ENI AFAIK. This would make this solution only possible with public IP addresses

Issue with svc.cluster.local

Hello there,

I have a question about the svc.cluster.local suffix: can we remove it and keep only $workloadName.$namespace?

Somehow I am unable to remove svc.cluster.local, even when I set up NEO4J_causal__clustering_initial__discovery__members and set the discovery advertised address via NEO4J_causal__clustering_discovery__advertised__address:-$HOST:5000

The $HOST comes from export HOST=${HOST:-$(hostname -f)}; echo $HOST, which yields, for example, c-neo4j-core-0.c-neo4j.oss-prod-ns.svc.xxx.xxx.io

Thanks for any advice!

adding additional configuration options

Hi there, thanks for this very nice helm chart, it works really well out-of-the-box!

I'm trying to configure the neosemantics plugin with neo4j - I've added it to the plugins list and it gets installed fine, but when I want to access the /rdf endpoint it's not there. This is because a line has to be added to the neo4j config file. I can't find any place in the helm chart that would allow me to do this. Is there a mechanism for this already built in that I am missing?

I could use a post-install hook of some sort, but I don't see a volume dedicated to the config that I could easily point the job to. How do you envision modifications to the config being made?
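One possible avenue, assuming the chart lets you pass arbitrary environment variables through to the container (for example via the core.configMap mechanism mentioned elsewhere on this page): the official Neo4j Docker image translates variables named NEO4J_<setting> into neo4j.conf entries, with dots written as single underscores and literal underscores doubled. The neosemantics line dbms.unmanaged_extension_classes=n10s.endpoint=/rdf would therefore become:

NEO4J_dbms_unmanaged__extension__classes: "n10s.endpoint=/rdf"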

Restore failed using neo4j restore.sh script

Hi, based on my understanding of the code in restore.sh, I just want to confirm:

  1. Whether this is a logic mistake or I have misunderstood the code
  2. How to use this script correctly

My understanding (maybe misunderstanding 😂):

Here, I think restore.sh already requires that the source database backup file name end with 'tar.gz'; otherwise the restore_database function returns early. (https://github.com/neo4j-contrib/neo4j-helm/blob/master/tools/restore/restore.sh#L70-L81)

But there is also restore logic for source backups that do not end with 'tar.gz/gzip' (https://github.com/neo4j-contrib/neo4j-helm/blob/master/tools/restore/restore.sh#L93-L130), so it confused me whether that part of the code ever takes effect in this script.

Most importantly, I ran into an error when using this script. 😂
I exported several env vars as below:
export BUCKET=gs://bucket_name_x/dir_name_y
export DATABASE=testdb
export GOOGLE_APPLICATION_CREDENTIALS=xxx
export FORCE_OVERWRITE=true

And in the bucket gs://bucket_name_x/dir_name_y, to match the logic of the code, there is a tar file called testdb-latest.tar.gz which came from a database backup called testdb. I think the cause is here (the "data" in the middle of $RESTORE_FROM, https://github.com/neo4j-contrib/neo4j-helm/blob/master/tools/restore/restore.sh#L108). Below is the error message I got:

=== RESTORE testdb
You have an existing graph database at /data/databases/testdb
We will be force-overwriting any data present
Making restore directory
Copying gs://gcs_bucket_neo4j_poc_tpx/backup_in_highlimit_tared/testdb-latest.tar.gz -> /data/.restore
Copying gs://gcs_bucket_neo4j_poc_tpx/backup_in_highlimit_tared/testdb-latest.tar.gz...
- [1/1 files][ 29.1 MiB/ 29.1 MiB] 100% Done
Operation completed over 1 objects/29.1 MiB.
Backup size pre-uncompress:
30M	/data/.restore
total 29752
-rw-r--r-- 1 root root 30463507 Sep  3 10:20 testdb-latest.tar.gz
Untarring backup file
testdb/
... several line...
...
testdb/neostore.nodestore.db.labels
testdb/neostore.propertystore.db
BACKUP_SET_DIR was not specified, so I am assuming this backup set was formatted by my backup utility
BACKUP_FILENAME=testdb-latest.tar.gz
UNTARRED_BACKUP_DIR=testdb
UNZIPPED_BACKUP_DIR=
RESTORE_FROM=/data/.restore/data/testdb
Set to restore from /data/.restore/data/testdb - size on disk:
du: cannot access '/data/.restore/data/testdb': No such file or directory
Dry-run command
neo4j-admin restore --from=/data/.restore/data/testdb --database=testdb --force --verbose
Volume mounts and sizing
Filesystem      Size  Used Avail Use% Mounted on
none             97G   36G   62G  37% /
tmpfs           119G     0  119G   0% /dev
tmpfs           119G     0  119G   0% /sys/fs/cgroup
/dev/sda1        97G   36G   62G  37% /etc/hosts
shm              64M     0   64M   0% /dev/shm
tmpfs           119G     0  119G   0% /proc/acpi
tmpfs           119G     0  119G   0% /sys/firmware
tmpfs           119G     0  119G   0% /proc/scsi
Now restoring
neo4j 4.1.0
VM Name: OpenJDK 64-Bit Server VM
VM Vendor: Debian
VM Version: 11.0.6+10-post-Debian-1bpo91
JIT compiler: HotSpot 64-Bit Tiered Compilers
VM Arguments: [-XX:+UseParallelGC, -Dfile.encoding=UTF-8]
java.lang.IllegalArgumentException: Source directory does not exist [/data/.restore/data/testdb]
	at com.neo4j.restore.RestoreDatabaseCommand.execute(RestoreDatabaseCommand.java:49)
	at com.neo4j.restore.RestoreDatabaseCli.execute(RestoreDatabaseCli.java:59)
	at org.neo4j.cli.AbstractCommand.call(AbstractCommand.java:67)
	at org.neo4j.cli.AbstractCommand.call(AbstractCommand.java:33)
	at picocli.CommandLine.executeUserObject(CommandLine.java:1783)
	at picocli.CommandLine.access$900(CommandLine.java:145)
	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2150)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2144)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2108)
	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:1975)
	at picocli.CommandLine.execute(CommandLine.java:1904)
	at org.neo4j.cli.AdminTool.execute(AdminTool.java:77)
	at org.neo4j.cli.AdminTool.main(AdminTool.java:58)
Restore process failed; will not continue
Failed to restore testdb
All finished

Readiness check failed because of extra discovery service in cluster with pod anti-affinity enabled

Hi neo4j experts,

I was trying to set up a neo4j cluster on GCP with Kubernetes. The core and replica pods come up but get stuck at the readiness check.

Steps to reproduce issue

I am using neo4j-helm version 4.1.0-3 and below is the command used to create the cluster:
helm install jstest https://github.com/neo4j-contrib/neo4j-helm/releases/download/4.1.0-3/neo4j-4.1.0-3.tgz --set core.numberOfServers=3,readReplica.numberOfServers=2 --set acceptLicenseAgreement=yes --set neo4jPassword=passw0rd -f /root/juguo/values.yaml

The core pods then get stuck at the readiness check. I have attached a log dump from one of the cores. I believe the key exception is:
2020-07-24 04:13:30.497+0000 ERROR [a.e.DummyClassForStringSources] Outbound message stream to [akka://cc-discovery-actor-system@discovery-jstest-neo4j-3.default.svc.cluster.local:5000] failed. Restarting it. Tcp command [Connect(discovery-jstest-neo4j-3.default.svc.cluster.local:5000,None,List(),Some(10000 milliseconds),true)] failed because of java.net.UnknownHostException: discovery-jstest-neo4j-3.default.svc.cluster.local Tcp command [Connect(discovery-jstest-neo4j-3.default.svc.cluster.local:5000,None,List(),Some(10000 milliseconds),true)] failed because of java.net.UnknownHostException: discovery-jstest-neo4j-3.default.svc.cluster.local
akka.stream.StreamTcpException: Tcp command [Connect(discovery-jstest-neo4j-3.default.svc.cluster.local:5000,None,List(),Some(10000 milliseconds),true)] failed because of java.net.UnknownHostException: discovery-jstest-neo4j-3.default.svc.cluster.local
Caused by: java.net.UnknownHostException: discovery-jstest-neo4j-3.default.svc.cluster.local
at akka.io.Dns$Resolved.addr(Dns.scala:55)
at akka.io.TcpOutgoingConnection$$anonfun$resolving$1.$anonfun$applyOrElse$2(TcpOutgoingConnection.scala:81)
at akka.io.TcpOutgoingConnection.akka$io$TcpOutgoingConnection$$reportConnectFailure(TcpOutgoingConnection.scala:50)
at akka.io.TcpOutgoingConnection$$anonfun$resolving$1.applyOrElse(TcpOutgoingConnection.scala:81)
at akka.actor.Actor.aroundReceive(Actor.scala:539)
at akka.actor.Actor.aroundReceive$(Actor.scala:537)
at akka.io.TcpConnection.aroundReceive(TcpConnection.scala:31)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:610)
at akka.actor.ActorCell.invoke(ActorCell.scala:579)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268)
at akka.dispatch.Mailbox.run(Mailbox.scala:229)
at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1407)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
core-0.log

Manual workaround

The fix in #48 initializes 5 discovery services at cluster startup. Since we only have 3 cores, I manually deleted the two extra discovery services with kubectl delete svc <svc name>. After that, the cluster status became normal within a few minutes.

Could you help check why the extra discovery services block the readiness check when pod anti-affinity is enabled? And is there a fix so that we can avoid deleting the services manually?

Look into I/O Throttling

From Jake:

version: "2"
​
x-core-common:
  &core-common
  NEO4J_AUTH: neo4j/secret
  NEO4J_dbms_mode: CORE
  NEO4J_ACCEPT_LICENSE_AGREEMENT: "yes"
  NEO4J_causal__clustering_minimum__core__cluster__size__at__formation: 3
  NEO4J_causal__clustering_minimum__core__cluster__size__at__runtime: 3
  NEO4J_causal__clustering_initial__discovery__members: core1:5000,core2:5000,core3:5000
services:
  core1:
    container_name: core1
    image: neo4j:4.0-enterprise
    cap_add:
      - NET_ADMIN
    ports:
      - 7474:7474
    blkio_config:
      device_read_bps:
        - path: /dev/mapper/mint--vg-root
          rate: '30mb'
      device_read_iops:
        - path: /dev/mapper/mint--vg-root
          rate: '1920'
      device_write_bps:
        - path: /dev/mapper/mint--vg-root
          rate: '30mb'
      device_write_iops:
        - path: /dev/mapper/mint--vg-root
          rate: '1920'
    environment:
      <<: *core-common
      NEO4J_dbms_connector_bolt_advertised__address: core1:7687
      NEO4J_dbms_connector_http_listen__address: :7474
      NEO4J_causal__clustering_discovery__advertised__address: core1:5000
  core2:
    container_name: core2
    image: neo4j:4.0-enterprise
    cap_add:
      - NET_ADMIN
    ports:
      - 7475:7475
    blkio_config:
      device_read_bps:
        - path: /dev/mapper/mint--vg-root
          rate: '30mb'
      device_read_iops:
        - path: /dev/mapper/mint--vg-root
          rate: '1920'
      device_write_bps:
        - path: /dev/mapper/mint--vg-root
          rate: '30mb'
      device_write_iops:
        - path: /dev/mapper/mint--vg-root
          rate: '1920'
    environment:
      <<: *core-common
      NEO4J_dbms_connector_bolt_advertised__address: core2:7687
      NEO4J_dbms_connector_http_listen__address: :7475
      NEO4J_causal__clustering_discovery__advertised__address: core2:5000
  core3:
    container_name: core3
    image: neo4j:4.0-enterprise
    cap_add:
      - NET_ADMIN
    ports:
      - 7476:7476
    blkio_config:
      device_read_bps:
        - path: /dev/mapper/mint--vg-root
          rate: '30mb'
      device_read_iops:
        - path: /dev/mapper/mint--vg-root
          rate: '1920'
      device_write_bps:
        - path: /dev/mapper/mint--vg-root
          rate: '30mb'
      device_write_iops:
        - path: /dev/mapper/mint--vg-root
          rate: '1920'
    environment:
      <<: *core-common
      NEO4J_dbms_connector_bolt_advertised__address: core3:7687
      NEO4J_dbms_connector_http_listen__address: :7476
      NEO4J_causal__clustering_discovery__advertised__address: core3:5000

Reasoning for hard-coding LoadBalancer service for discovery?

Hi @moxious,

While deploying this new chart, I've noticed a LoadBalancer service is used for Core server discovery. Is there a specific reason to use this instead of, e.g., a ClusterIP?

Being forced to use a LoadBalancer per core pod is quite costly, at least on AWS. Also, depending on the VPC setup, this exposes the pods to the outside world: LoadBalancers usually get deployed in public subnets unless specified otherwise through the proper annotations.

I'm currently running a test with ClusterIP and that seems to work ok. I'd be happy to submit a PR to open the choice of service type.

DBMS memory heap and pagecache configurations are ignored

I'm trying to deploy Neo4j via Helm using the latest chart version (4.1.0), but it seems the dbms configuration is simply ignored.
Here is the configuration in my values.yaml:

dbms:
  memory:
    use_memrec: false
    heap:
      initial_size: "8G"
      max_size: "8G"
    pagecache:
      size: "8G"

I've checked conf/neo4j.conf and also ran a Cypher query to get the current configuration; in both cases nothing had changed, it remains the initial neo4j configuration. Here is the Cypher query I ran:

CALL dbms.listConfig()
YIELD name, value
WHERE name STARTS WITH 'dbms'
RETURN name, value
ORDER BY name;

I've also looked at the templates and didn't find anything that reads this property and applies it to the config.

Seems like the implementation of additionalVolumeMounts and additionalVolumes might be broken

Hello guys!

I have issues when using these two properties in values.yml, like this:


  ## specify additional volumes to mount in the core container, this can be used
  ## to specify additional storage of material or to inject files from ConfigMaps
  ## into the running container
  additionalVolumes:
  - name: restore-service-key
    secret:
      secretName: neo4j-cronjob-backup
      defaultMode: 0555
  - name: restore-script
    configMap:
      name: "neo4j-cronjob-script"
      defaultMode: 0777
  - name: "neo4j-tls"
    secret:
      secretName: "neo4j-tls"

  ## specify where the additional volumes are mounted in the core container
  additionalVolumeMounts:
  - name: "neo4j-tls"
    mountPath: "/certs/bolt/public.crt"
    subPath: "tls.crt"
    readOnly: true
  - name: "neo4j-tls"
    mountPath: "/certs/bolt/private.key"
    subPath: "tls.key"
    readOnly: true

I receive these warnings:

coalesce.go:196: warning: cannot overwrite table with non table for additionalVolumeMounts (map[])
coalesce.go:196: warning: cannot overwrite table with non table for additionalVolumes (map[])

If I replace the curly brackets with square brackets in values.yaml, everything works as expected:

sed -i 's/additionalVolumes: {}/additionalVolumes: []/g' $MYDIR/neo4j-helm/values.yaml
sed -i 's/additionalVolumeMounts: {}/additionalVolumeMounts: []/g' $MYDIR/neo4j-helm/values.yaml

Results: 

  ## specify additional volumes to mount in the core container, this can be used
  ## to specify additional storage of material or to inject files from ConfigMaps
  ## into the running container
  additionalVolumes: []

  ## specify where the additional volumes are mounted in the core container
  additionalVolumeMounts: []

We started a discussion about this in the merge request (unfortunately it was already closed): #54

So this issue would complement (or fix) this one #46

Thanks @moxious for hearing me out 🥇

apoc.import.* failing even after adding NEO4J_apoc_import_file_enabled: "true" to external configmap

[screenshot of the APOC error message]

I deployed a causal cluster in K8s and tried to upload a sample graphml file, but it just won't let me do it.

I can see the entries are successfully added to neo4j.conf, as shown below:
apoc.import.file.use_neo4j_config=true
apoc.import.file.enabled=true
dbms.security.procedures.unrestricted=apoc.*

I don't find the apoc.conf file mentioned by the error anywhere in the container.

Please help me understand what I'm missing.
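One assumption worth checking: since APOC 4.x, apoc.* settings are read from a dedicated apoc.conf file in the conf directory rather than from neo4j.conf. If that is what the error refers to, mounting a minimal apoc.conf (for example via the chart's additionalVolumes/additionalVolumeMounts) might resolve it; this is a sketch based on that assumption, not a confirmed fix:

apoc.import.file.enabled=true
apoc.import.file.use_neo4j_config=true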
