Comments (6)
@srinathganesh1 this is probably not related to neo4j-helm, but has to do with the way you generate load, and what your client connection strategy is to Neo4j.
What you describe makes it sound like whatever is generating the load is sending all queries to the Neo4j leader. If the load is all writes, that's going to happen no matter what you do - because only the leader can service writes. If it's a mixture of reads and writes, then the reads should be getting spread out to the other cluster members.
In your client code or whatever's generating the load, pay particular attention to whether you're using "autocommit transactions" or explicit read/write transactions. Autocommit transactions generally go to the leader, which would cause exactly what you're describing.
I recommend this article to understand what's happening: https://medium.com/neo4j/querying-neo4j-clusters-7d6fde75b5b4
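To make the distinction concrete, here is a minimal sketch against the Neo4j Python driver's session API. The helper names `run_autocommit`, `run_read`, and `run_write` are made up for illustration, and each one expects an already-open session. It assumes a driver version that exposes `read_transaction` and `write_transaction`; in driver 5.x the equivalent methods are named `execute_read` and `execute_write`.

```python
def _collect(tx, query):
    # Transaction function: run the query and materialize the records
    # before the transaction closes.
    return tx.run(query).data()


def run_autocommit(session, query):
    # Autocommit: the query carries no explicit read/write intent, so in a
    # cluster it generally ends up on the leader.
    return session.run(query).data()


def run_read(session, query):
    # Explicit read transaction: eligible for routing to non-leader members.
    return session.read_transaction(_collect, query)


def run_write(session, query):
    # Explicit write transaction: only the leader can service it.
    return session.write_transaction(_collect, query)
```

If the load generator only ever calls something like `run_autocommit`, a hot leader and idle followers is the expected picture, even when every query is a read.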
from neo4j-helm.
Hi @moxious, I have updated the original post with a snippet of my code (it's Python code that runs many read queries in parallel).
To rule out any issues with my code, I also ran a load test with another script; the details are below.
Observations from kubectl top pods after running this load-testing script: https://github.com/moxious/graph-workload. Please suggest any other load-testing script I should use.
node src/run-workload.js -a SERVICE_NAME -u neo4j -p PASSWORD --ms 5 --query "match (a)-[*1..3]-(b) return count(a);" --concurrency 10000
kubectl top pods
# Random Workload 1
core-0 288m 546Mi
core-1 17m 550Mi
core-2 321m 549Mi
# Random Workload 2
core-0 274m 556Mi
core-1 433m 559Mi
core-2 635m 560Mi
# Random Workload 3
core-0 308m 562Mi
core-1 257m 562Mi
core-2 10m 560Mi
# Random Workload 4
core-0 153m 559Mi
core-1 113m 562Mi
core-2 96m 561Mi
# Start of Big Workload
core-0 388m 571Mi
core-1 381m 571Mi
core-2 11m 569Mi
# During Big Workload
core-0 1198m 572Mi
core-1 1200m 572Mi
core-2 1198m 582Mi
There's a conceptual problem here. The graph workload tool is fine, but when you use --query, the tool doesn't know whether you're doing reads or writes. So it chooses to do a writeTransaction for you even if the query is a read (it doesn't parse Cypher). This in turn means all of your queries get routed to the leader.
If you like, you can open an issue on that workload tool repo and I'll fix it when I can. You need the ability to pass a "mode" flag like this:
node src/run-workload.js -a SERVICE_NAME -u neo4j -p PASSWORD --ms 5 --query "match (a)-[*1..3]-(b) return count(a);" --concurrency 10000 --mode READ
This would tell the tool to run read transactions, which would spread them out across your cluster and more evenly utilize CPU. Right now I think you're just beating up the leader.
So you have a couple of options to have tight control over what you want:
- Open a ticket, we can add a mode flag, and then you can do what you're doing with one extra flag when I can get that change in.
- Implement your own "Strategy" in that workload tool to specify exactly what you want run in your scenario (there are lots of examples in the repo)
- Write your own small client app that simulates exactly the workflow you want.
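For the last option, a small client could look roughly like the sketch below. It is not taken from the workload tool; `run_read_workload` and its helpers are hypothetical names. It assumes `driver` is a routing driver created with `GraphDatabase.driver(...)` from the Neo4j Python driver, and that the driver version exposes `session.read_transaction`.

```python
from concurrent.futures import ThreadPoolExecutor


def _count_rows(tx, query):
    # Consume the result inside the transaction and return the row count.
    return len(tx.run(query).data())


def _one_read(driver, query):
    # One session per task; sessions should not be shared across threads.
    with driver.session() as session:
        return session.read_transaction(_count_rows, query)


def run_read_workload(driver, query, total=100, concurrency=10):
    """Issue `total` read transactions using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(_one_read, driver, query) for _ in range(total)]
        # Collect results in submission order; any query error is raised here.
        return [f.result() for f in futures]
```

Because every query goes through an explicit read transaction, the driver is free to send it to a member other than the leader. Swapping `read_transaction` for `write_transaction` in `_one_read` would reproduce the leader-only pattern for comparison.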
I did try a Python-based test in which the READ/WRITE mode is set explicitly for each query.
Sample code:
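from neo4j import GraphDatabase, basic_auth

# NEO4J_HOST, NEO4J_BOLT_PORT, NEO4J_USERNAME and NEO4J_PASSWORD are
# assumed to be defined elsewhere (they are not shown in the original post).


def get_driver():
    """
    Initialize a Driver and return it
    """
    return GraphDatabase.driver(
        f"bolt+routing://{NEO4J_HOST}:{NEO4J_BOLT_PORT}",
        auth=basic_auth(NEO4J_USERNAME, NEO4J_PASSWORD),
    )


DRIVER = get_driver()


class ExecuteQuery:
    @staticmethod
    def _execute_query(tx, query):
        # Transaction function: run the query and materialize the records.
        result = tx.run(query)
        return result.data()

    @staticmethod
    def do_read(query):
        # Explicit read transaction, so the driver can route it to a follower.
        with DRIVER.session() as session:
            result = session.read_transaction(ExecuteQuery._execute_query, query)
        return result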
With this code I am also seeing a similar load imbalance.
I will try out the changes in your reply too.
@srinathganesh1 v0.5.1 of graph-workload, which adds a --mode flag, is now available. If you do what you were doing but include --mode READ with the latest code, it should distribute reads across all of your followers.
https://github.com/moxious/graph-workload/releases/tag/v0.5.1
I'm going to close this for now as I'm pretty sure this issue is unrelated to helm and kubernetes. But I really recommend you read this article I linked to understand what's happening & why: https://medium.com/neo4j/querying-neo4j-clusters-7d6fde75b5b4
Keep in mind - in a 3 node causal cluster, if you send thousands of reads to your cluster, they will typically be distributed amongst the 2 followers. If you send thousands of writes to your cluster, they'll all go to the leader. This means that if you truly want to balance the CPU of all 3 machines in the cluster, you need a mixed read/write workload, which you can generate by running the workload tool twice concurrently.
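The arithmetic behind that can be sketched with a toy model. This is only an illustration: it assumes writes always hit the leader and reads rotate evenly over the followers, which simplifies how a real driver picks servers, and it counts queries rather than CPU. The function name `simulate_routing` and the choice of core-2 as leader are made up for the example.

```python
from collections import Counter


def simulate_routing(members, leader, reads, writes):
    """Count queries per member: writes go to the leader, reads rotate
    round-robin over the followers (simplified model)."""
    followers = [m for m in members if m != leader]
    targets = followers or [leader]  # single-member case: leader serves reads
    load = Counter({m: 0 for m in members})
    load[leader] += writes
    for i in range(reads):
        load[targets[i % len(targets)]] += 1
    return dict(load)


members = ["core-0", "core-1", "core-2"]

# Read-only workload: the leader stays idle.
print(simulate_routing(members, "core-2", reads=1000, writes=0))
# → {'core-0': 500, 'core-1': 500, 'core-2': 0}

# Mixed workload: all three members receive queries.
print(simulate_routing(members, "core-2", reads=1000, writes=500))
# → {'core-0': 500, 'core-1': 500, 'core-2': 500}
```

Under this model a read-only test can never load all three members, which is why one idle pod in kubectl top pods during a pure read workload is the expected result rather than a fault.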
Keep in mind - in a 3 node causal cluster, if you send thousands of reads to your cluster, they will typically be distributed amongst the 2 followers
ok thank you