The ArangoDB-cuGraph Adapter exports graphs from ArangoDB, the multi-model database for graph & beyond, into RAPIDS cuGraph, a collection of GPU-accelerated graph algorithms, and vice versa.
RAPIDS cuGraph offers an API and a set of graph algorithms similar to NetworkX, but runs on the GPU. For large graphs in particular, this makes cuGraph significantly faster than NetworkX. Note that cuGraph does not currently support storing node attributes. Running cuGraph requires a CUDA-capable NVIDIA GPU.
Prerequisites: A CUDA-capable GPU
conda install -c arangodb adbcug-adapter
conda install -c rapidsai -c nvidia -c numba -c conda-forge "cugraph>=21.12" "cudatoolkit>=11.2"
pip install git+https://github.com/arangoml/cugraph-adapter.git
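After installing, it can help to confirm that the packages are visible to the interpreter before running the examples. The following is a minimal sketch, not part of the adapter: the helper name `check_installation` is hypothetical, and the module names are taken from the import statements in the usage example. It only checks that each module can be located; it does not verify that a GPU is present or that the imports succeed.

```python
from importlib.util import find_spec

# Top-level module names, taken from the imports in the usage example.
REQUIRED_MODULES = ("cudf", "cugraph", "arango", "adbcug_adapter")


def check_installation(modules=REQUIRED_MODULES):
    """Return {module_name: bool}, True if the module can be located."""
    return {name: find_spec(name) is not None for name in modules}


# Report the status of each required module in the current environment.
for name, available in check_installation().items():
    print(f"{name}: {'found' if available else 'missing'}")
```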
import cudf
import cugraph
from arango import ArangoClient # Python-Arango driver
from adbcug_adapter import ADBCUG_Adapter
# Let's assume that the ArangoDB "fraud detection" dataset is imported to this endpoint
db = ArangoClient(hosts="http://localhost:8529").db("_system", username="root", password="")
adbcug_adapter = ADBCUG_Adapter(db)
# 1.1: ArangoDB to cuGraph via Graph name
cug_g = adbcug_adapter.arangodb_graph_to_cugraph("fraud-detection")
# 1.2: ArangoDB to cuGraph via Collection names
cug_g = adbcug_adapter.arangodb_collections_to_cugraph(
    "fraud-detection",
    {"account", "bank", "branch", "Class", "customer"},  # Vertex collections
    {"accountHolder", "Relationship", "transaction"},  # Edge collections
)
# 2.1: cuGraph Homogeneous graph to ArangoDB
edges = [("Person/A", "Person/B"), ("Person/B", "Person/C")]
cug_g = cugraph.MultiGraph(directed=True)
cug_g.from_cudf_edgelist(cudf.DataFrame(edges, columns=["src", "dst"]), source="src", destination="dst", renumber=False)
edge_definitions = [
    {
        "edge_collection": "knows",
        "from_vertex_collections": ["Person"],
        "to_vertex_collections": ["Person"],
    }
]
adb_g = adbcug_adapter.cugraph_to_arangodb("Knows", cug_g, edge_definitions) # Also try it with `keyify_nodes=True` !
# 2.2: cuGraph Heterogeneous graph to ArangoDB with ArangoDB node IDs
edges = []
for i in range(1, 101):
    for j in range(1, 101):
        if j % i == 0:
            # Notice that the cuGraph node IDs follow the ArangoDB _id format (i.e. `collection_name/node_key`)
            edges.append((f"numbers_j/{j}", f"numbers_i/{i}", j / i))
cug_g = cugraph.MultiGraph(directed=True)
cug_g.from_cudf_edgelist(cudf.DataFrame(edges, columns=["src", "dst", "quotient"]), source="src", destination="dst", edge_attr="quotient", renumber=False)
edge_definitions = [
    {
        "edge_collection": "is_divisible_by",
        "from_vertex_collections": ["numbers_j"],
        "to_vertex_collections": ["numbers_i"],
    }
]
adb_g = adbcug_adapter.cugraph_to_arangodb("Divisibility", cug_g, edge_definitions, keyify_nodes=True)
# 2.3: cuGraph Heterogeneous graph to ArangoDB with non-ArangoDB node IDs
edges = [
    ('student:101', 'lecture:101'),
    ('student:102', 'lecture:102'),
    ('student:103', 'lecture:103'),
    ('student:103', 'student:101'),
    ('student:103', 'student:102'),
    ('teacher:101', 'lecture:101'),
    ('teacher:102', 'lecture:102'),
    ('teacher:103', 'lecture:103'),
    ('teacher:101', 'teacher:102'),
    ('teacher:102', 'teacher:103')
]
cug_g = cugraph.MultiGraph(directed=True)
cug_g.from_cudf_edgelist(cudf.DataFrame(edges, columns=["src", "dst"]), source='src', destination='dst')
### Learn how this example is handled in Colab: https://colab.research.google.com/github/arangoml/cugraph-adapter/blob/master/examples/ArangoDB_cuGraph_Adapter.ipynb#scrollTo=nuVoCZQv6oyi
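Example 2.3 uses node IDs such as `student:101`, which do not follow the ArangoDB `collection_name/node_key` format used in 2.1 and 2.2. The sketch below shows one way such IDs could be translated and how edge definitions in the same dictionary shape as above could be derived from them. It is plain Python and does not use the adapter: the helpers `to_adb_id` and `infer_edge_definitions`, and the `<from>_to_<to>` edge collection naming, are assumptions made for illustration only. See the Colab notebook linked above for how the adapter itself handles this case.

```python
def to_adb_id(node_id, sep=":"):
    """Turn 'student:101' into 'student/101' (ArangoDB `collection/key` form)."""
    collection, key = node_id.split(sep, 1)
    return f"{collection}/{key}"


def infer_edge_definitions(adb_edges):
    """Build edge definitions from (src, dst) pairs of ArangoDB-style IDs.

    Each edge is assigned to a hypothetical edge collection named
    '<from_collection>_to_<to_collection>'.
    """
    definitions = {}
    for src, dst in adb_edges:
        from_col = src.split("/", 1)[0]
        to_col = dst.split("/", 1)[0]
        name = f"{from_col}_to_{to_col}"
        definitions[name] = {
            "edge_collection": name,
            "from_vertex_collections": [from_col],
            "to_vertex_collections": [to_col],
        }
    # Sort by edge collection name so the result is deterministic.
    return [definitions[name] for name in sorted(definitions)]


# A subset of the edges from example 2.3.
edges = [
    ("student:101", "lecture:101"),
    ("student:103", "student:101"),
    ("teacher:101", "lecture:101"),
    ("teacher:101", "teacher:102"),
]
adb_edges = [(to_adb_id(src), to_adb_id(dst)) for src, dst in edges]
edge_definitions = infer_edge_definitions(adb_edges)

for definition in edge_definitions:
    print(definition)
```

Deriving the collection names from the IDs keeps the edge list as the single source of truth, at the cost of fixing the naming scheme for edge collections.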
Prerequisites: arangorestore, CUDA-capable GPU
git clone https://github.com/arangoml/cugraph-adapter.git
cd cugraph-adapter
- (create virtual environment of choice)
conda install -c rapidsai -c nvidia -c numba -c conda-forge "cugraph>=21.12" "cudatoolkit>=11.2"
conda run pip install -e ".[dev]"
- (create an ArangoDB instance with method of choice)
pytest --url <> --dbName <> --username <> --password <>
Note: A pytest parameter can be omitted if the endpoint is using its default value:
def pytest_addoption(parser):
    parser.addoption("--url", action="store", default="http://localhost:8529")
    parser.addoption("--dbName", action="store", default="_system")
    parser.addoption("--username", action="store", default="root")
    parser.addoption("--password", action="store", default="")
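The fallback to defaults can be tried without a running test suite. The sketch below is a stand-in that uses the standard library's `argparse` rather than pytest itself, on the understanding that pytest's `addoption` accepts the same attributes as `argparse`'s `add_argument`. The helper `build_parser` and the sample password are hypothetical; the option names and defaults are copied from the function above.

```python
import argparse


def build_parser():
    """Mirror the options and defaults declared in pytest_addoption."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--url", action="store", default="http://localhost:8529")
    parser.add_argument("--dbName", action="store", default="_system")
    parser.add_argument("--username", action="store", default="root")
    parser.add_argument("--password", action="store", default="")
    return parser


# Only --password is supplied; the other three options fall back to their defaults.
args = build_parser().parse_args(["--password", "s3cret"])
print(args.url, args.dbName, args.username)
```

By the same logic, an invocation that passes only a non-default `--password` would leave the URL, database name, and username at their defaults.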