
amazon-neptune-tools's Introduction

Amazon Neptune Tools

Utilities to enable loading data and building graph applications with Amazon Neptune.

Examples

You may also be interested in the Neptune Samples github repository, which includes samples and example code.

GraphML 2 CSV

This is a utility to convert graphml files into the Neptune CSV format.

Neptune Export

Exports Amazon Neptune data to CSV for Property Graph or Turtle for RDF graphs.

You can use neptune-export to export an Amazon Neptune database to the bulk load CSV format used by the Amazon Neptune bulk loader for Property Graph or Turtle for RDF graphs. Alternatively, you can supply your own queries to neptune-export and unload the results to CSV or Turtle.

Export Neptune to Elasticsearch

Backfills Elasticsearch with data from an existing Amazon Neptune database.

The Neptune Full-text Search CloudFormation templates provide a mechanism for indexing all new data that is added to an Amazon Neptune database in Elasticsearch. However, there are situations in which you may want to index existing data in a Neptune database prior to enabling the full-text search integration.

You can use this export Neptune to Elasticsearch solution to index existing data in an Amazon Neptune database in Elasticsearch.

Neo4j to Neptune

A command-line utility for migrating data to Neptune from Neo4j.

Glue Neptune

glue-neptune is a Python library for AWS Glue that helps you write data to Amazon Neptune from Glue jobs (see the sketch after the list below). With glue-neptune you can:

  • Get Neptune connection information from the Glue Data Catalog
  • Create label and node/edge ID columns in DynamicFrames, named in accordance with the Neptune CSV bulk load format for property graphs
  • Write from DynamicFrames directly to Neptune
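
The following is a minimal sketch of a Glue job step that uses glue-neptune to write a DynamicFrame to Neptune. It follows the GlueGremlinClient pattern that appears in the issues further down this page; the endpoint, region, and the 'MyNodes' label are placeholder values, not part of the library's documentation.

# Minimal sketch: write a DynamicFrame to Neptune with glue-neptune.
# Endpoint, region, and the 'MyNodes' label are placeholders.
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from neptune_python_utils.endpoints import Endpoints
from neptune_python_utils.glue_gremlin_client import GlueGremlinClient

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Columns named in accordance with the Neptune CSV bulk load format.
rows = [('n-1', 'Alice'), ('n-2', 'Bob')]
df = spark.createDataFrame(rows, ['~id', 'name'])
nodes = DynamicFrame.fromDF(df, glue_context, 'nodes')

endpoints = Endpoints(
    neptune_endpoint='my-cluster.cluster-xxxxxxxx.us-east-1.neptune.amazonaws.com',
    neptune_port=8182,
    region_name='us-east-1')
gremlin_client = GlueGremlinClient(endpoints)

# Upsert each partition of the DataFrame as 'MyNodes' vertices, in batches.
nodes.toDF().foreachPartition(
    gremlin_client.upsert_vertices('MyNodes', batch_size=100))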

Neptune CSV to RDF

If you're interested in converting Neptune's CSV format to RDF, see amazon-neptune-csv-to-rdf-converter.

Neptune CSV to Gremlin

csv-gremlin is a tool that can turn Amazon Neptune format CSV files into Gremlin steps, allowing them to be loaded into different Apache TinkerPop-compliant stores (including Amazon Neptune) using Gremlin queries. The tool also tries to validate that the CSV files do not contain errors and can be used to inspect CSV files prior to starting a bulk load.

CSV to Neptune Bulk Format CSV

csv-to-neptune-bulk-format is a utility that identifies nodes and edges in source CSV data file(s) and generates Amazon Neptune Gremlin load data format files. A JSON configuration file defines the source and target files, the node/edge definitions, and the selection logic. The script interprets one or more configuration files; the generated files can then be loaded into a Neptune database.

neptune-gremlin-js

A JavaScript SDK for querying Neptune with Gremlin.

License

This library is licensed under the Apache 2.0 License.

amazon-neptune-tools's People

Contributors

abhishekpradeepmishra, afreeland, beebs-systap, blakedunson, cole-greer, deepsaxe, dependabot[bot], devansh-amazon, ericzbeard, gokhaled89, gopuneet, hughblayney, hyandell, iansrobinson, krlawrence, martynscn, mattnaku, metawilm-aws, michaelnchin, nkalupahana, rawalpuneet, richiec, ronaks548, soorajb, spmallette, triggan, vivgoyal-aws, xiazcy


amazon-neptune-tools's Issues

pythongremlin errors

When trying to connect to Neptune, I get the following error:

TypeError: 'LazyHttpHeaders' object is not iterable

----- Sample code below ---
from neptune_python_utils.gremlin_utils import GremlinUtils

GremlinUtils.init_statics(globals())

gremlin_utils = GremlinUtils()

conn = gremlin_utils.remote_connection()
g = gremlin_utils.traversal_source(connection=conn)

print(g.V().limit(10).valueMap().toList())

conn.close()

[glue-neptune] Add batching within each Gremlin Traversal for inserts/upserts to improve throughput

The current implementation inserts each vertex and edge one at a time, along with their properties. Based on Neptune best practices [1], each Gremlin traversal for inserts/upserts should apply 50-100 objects (an object being a unique vertex, a unique edge, or a vertex/edge property).

[1] https://github.com/aws-samples/aws-dbs-refarch-graph/tree/master/src/writing-from-amazon-kinesis-data-streams
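
For illustration only (this is not glue-neptune's current code), the sketch below folds a batch of vertex inserts into a single gremlinpython traversal so that one round trip applies the whole batch; the endpoint is a placeholder and upsert logic is omitted.

# Illustrative sketch: batch several vertices and their properties into one
# Gremlin traversal (placeholder endpoint, insert-only for brevity).
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.traversal import T
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

def insert_vertices_in_one_traversal(g, vertices):
    """vertices: list of (vertex_id, label, properties_dict) tuples."""
    if not vertices:
        return
    t = g
    for vertex_id, label, properties in vertices:
        t = t.addV(label).property(T.id, vertex_id)
        for key, value in properties.items():
            t = t.property(key, value)
    t.iterate()  # a single round trip applies the whole batch

conn = DriverRemoteConnection(
    'wss://my-cluster.cluster-xxxxxxxx.us-east-1.neptune.amazonaws.com:8182/gremlin', 'g')
g = traversal().withRemote(conn)
insert_vertices_in_one_traversal(g, [
    ('v-1', 'person', {'name': 'Alice'}),
    ('v-2', 'person', {'name': 'Bob'}),
])
conn.close()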

Add an option to csv-gremlin to allow type identifiers on numeric values

It would be useful if values in the generated Gremlin, such as doubles, could have type identifiers added.

For example, instead of g.addV('building').property('latitude', 52.12345), there could be an option to generate g.addV('building').property('latitude', 52.12345d). This would prevent Gremlin Groovy from converting the values into BigDecimal objects when loading into a TinkerGraph or other graph backend using the standard Gremlin Groovy script parser, which is not desirable in all cases.

A command-line option to turn on this addition of type specifiers would be useful.
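
Purely as an illustration of the idea (this is a hypothetical helper, not csv-gremlin's actual code), emitting a literal with an optional type suffix could look like this:

# Hypothetical helper: format a Python value as a Gremlin literal, optionally
# appending the 'd' suffix so Gremlin Groovy keeps doubles as Double.
def gremlin_literal(value, add_type_suffix=False):
    if isinstance(value, bool):          # check bool before int: True is an int too
        return 'true' if value else 'false'
    if isinstance(value, float):
        return f'{value}d' if add_type_suffix else str(value)
    if isinstance(value, int):
        return str(value)
    return "'" + str(value).replace("'", "\\'") + "'"

print(gremlin_literal(52.12345, add_type_suffix=True))   # 52.12345d
print(gremlin_literal(52.12345, add_type_suffix=False))  # 52.12345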

NeptuneIAMConnection Fails | gremlinpython no longer uses tornado

Traceback (most recent call last):
  File "osm_ground_truth.py", line 26, in <module>
    from neptune_python_utils.neptune_iam import NeptuneIAMConnection
  File "/usr/local/lib/python3.6/site-packages/neptune_python_utils/neptune_iam.py", line 1, in <module>
    from neptune_python_utils.gremlin_utils import GremlinUtils
  File "/usr/local/lib/python3.6/site-packages/neptune_python_utils/gremlin_utils.py", line 31, in <module>
    from tornado.httpclient import HTTPError
ModuleNotFoundError: No module named 'tornado'

String representation of dates changed unexpectedly in neptune-export tool

Hey there!

My team uses the neptune-export JAR for exporting our Neptune database to S3 for further ingestion by other systems. Our ingestion jobs started failing on 2021-09-01 because the format of our timestamp columns changed.

  • Format before 2021-09-01: 2021-07-17T19:57:46.063Z
  • Format since 2021-09-01: Tue Jul 13 03:35:47 UTC 2021

As far as I know there isn't anything in our codebase that would influence which format is used. I did see that there were some changes related to printing dates as strings on August 31st; I'm not sure if it is related:

https://github.com/awslabs/amazon-neptune-tools/blame/fdff97506b5e5d9bd3a29f4c49dab84622968759/neptune-export/src/main/java/com/amazonaws/services/neptune/propertygraph/schema/DataType.java#L315-L318

neptune-export compilation error

While building with the latest version (41cccfc), there's a problem with the jackson dependency:

<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>[2.9.8,)</version>
</dependency>

In my case, it was retrieving jackson-core-3.0.0-20190424.231835-446.jar, where the class JsonFactory has moved from the package com.fasterxml.jackson.core to com.fasterxml.jackson.core.json, which causes this build error:

[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /git/amazon-neptune-tools/neptune-export/src/main/java/com/amazonaws/services/neptune/propertygraph/io/Format.java:[17,34] cannot find symbol
  symbol:   class JsonFactory
  location: package com.fasterxml.jackson.core
[INFO] 1 error

To fix it, in my case, I simply forced the version: <version>2.9.8</version>

neptune-python-utils: BulkLoad failed: HTTP Error 400: Bad Request

Hi,
I'm trying to make Neptune (IAM auth enabled) load a CSV file from an S3 bucket, but bulkload.load_async returns:
urllib.error.HTTPError: HTTP Error 400: Bad Request
Here is my code:

def neptuneLoad(csvFile, endpoint):                                                                                                                                                                                                            
    fileUri = 's3://' + NEPTUNEBUCKET + '/' + os.path.basename(csvFile)
    bulkload = BulkLoad(source=fileUri,
                        update_single_cardinality_properties=False,
                        role=NEPTUNES3IAMROLEARN,
                        region=AWS_DEFAULT_REGION,
                        fail_on_error=True,
                        mode='NEW',
                        endpoints=endpoint,
                        parallelism='OVERSUBSCRIBE',
                        format='csv')
    load_status = bulkload.load_async()
    status, json = load_status.status(details=True, errors=True)
    logging.info(json)
    load_status.wait()

output:

curl -X POST \
    -H 'Content-Type: application/json' \
    https://<myneptuneurl>.eu-west-1.neptune.amazonaws.com:8182/loader -d '{
    "source": "s3://<mybucket>/objRef-2021-08-18T15:46:38.csv",
    "format": "csv",
    "iamRoleArn": "arn:aws:iam::<myaccountid>:role/arel-dev",
    "mode": "NEW",
    "region": "eu-west-1",
    "failOnError": "TRUE",
    "parallelism": "OVERSUBSCRIBE",
    "parserConfiguration": {
        "baseUri": "http://aws.amazon.com/neptune/default",
        "namedGraphUri": "http://aws.amazon.com/neptune/vocab/v01/DefaultNamedGraph"
    },
    "updateSingleCardinalityProperties": "FALSE"
}'
Traceback (most recent call last):
  File "ingest.py", line 376, in <module>
    objLoadId = neptuneLoad('objRef-2021-08-18T15:46:38.csv', endpoint)
  File "ingest.py", line 142, in neptuneLoad
    load_status = bulkload.load_async()
  File "/home/me/src/rrmt-backend-integration/src/ingest/neptune_python_utils/bulkload.py", line 106, in load_async
    load_id = self.__load(loader_endpoint, json_payload)
  File "/home/me/src/rrmt-backend-integration/src/ingest/neptune_python_utils/bulkload.py", line 97, in __load
    raise exc_info[0].with_traceback(exc_info[1], exc_info[2])
  File "/home/me/src/rrmt-backend-integration/src/ingest/neptune_python_utils/bulkload.py", line 89, in __load
    response = urllib.request.urlopen(req)
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

My previous code using requests to POST /loader was working.
Gremlin requests are working properly.

Mixed int / double property types causes errors in export-pg-from-queries

When exporting elementMaps of vertices that have a property whose values are of int type on some nodes and double type on others, I get the following error:

java.util.concurrent.ExecutionException: java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.Double (java.lang.Integer and java.lang.Double are in module java.base of loader 'bootstrap')
	at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
	at com.amazonaws.services.neptune.propertygraph.io.QueryJob.export(QueryJob.java:93)
	at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:41)
	at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:34)
	at com.amazonaws.services.neptune.propertygraph.io.QueryJob.execute(QueryJob.java:52)
	at com.amazonaws.services.neptune.ExportPropertyGraphFromGremlinQueries.lambda$run$0(ExportPropertyGraphFromGremlinQueries.java:114)
	at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:41)
	at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:34)
	at com.amazonaws.services.neptune.ExportPropertyGraphFromGremlinQueries.run(ExportPropertyGraphFromGremlinQueries.java:88)
	at com.amazonaws.services.neptune.export.NeptuneExportRunner.run(NeptuneExportRunner.java:53)
	at com.amazonaws.services.neptune.NeptuneExportCli.main(NeptuneExportCli.java:48)
Caused by: java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.Double (java.lang.Integer and java.lang.Double are in module java.base of loader 'bootstrap')
	at com.amazonaws.services.neptune.propertygraph.schema.DataType$8.printTo(DataType.java:222)
	at com.amazonaws.services.neptune.propertygraph.io.JsonPropertyGraphPrinter.printProperty(JsonPropertyGraphPrinter.java:151)
	at com.amazonaws.services.neptune.propertygraph.io.JsonPropertyGraphPrinter.printProperty(JsonPropertyGraphPrinter.java:121)
	at com.amazonaws.services.neptune.propertygraph.io.JsonPropertyGraphPrinter.printProperties(JsonPropertyGraphPrinter.java:79)
	at com.amazonaws.services.neptune.propertygraph.io.QueryWriter.handle(QueryWriter.java:30)
	at com.amazonaws.services.neptune.propertygraph.io.QueryWriter.handle(QueryWriter.java:18)
	at com.amazonaws.services.neptune.propertygraph.io.QueryTask$ResultsHandler.handle(QueryTask.java:201)
	at com.amazonaws.services.neptune.propertygraph.io.QueryTask$ResultsHandler.handle(QueryTask.java:158)
	at com.amazonaws.services.neptune.propertygraph.io.QueryTask$StatusHandler.handle(QueryTask.java:222)
	at com.amazonaws.services.neptune.propertygraph.io.QueryTask.lambda$executeQuery$5(QueryTask.java:143)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
	at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
	at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
	at com.amazonaws.services.neptune.propertygraph.io.QueryTask.executeQuery(QueryTask.java:141)
	at com.amazonaws.services.neptune.propertygraph.io.QueryTask.lambda$call$1(QueryTask.java:89)
	at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:41)
	at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:34)
	at com.amazonaws.services.neptune.propertygraph.io.QueryTask.call(QueryTask.java:87)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
	at java.base/java.lang.Thread.run(Thread.java:832)
An error occurred while exporting from Neptune: java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.Double (java.lang.Integer and java.lang.Double are in module java.base of loader 'bootstrap')

This seems to only happen when exporting to json, and occurs both with or without the --two-pass-analysis flag.

Minimal Example

A minimal example can be recreated by creating two nodes

v = g.addV('test1').property('testproperty', 0).id().next()
u = g.addV('test2').property('testproperty', 0.5).id().next()

and running an export that looks like

./bin/neptune-export.sh export-pg-from-queries -q result="g.V(<v_id>, <u_id>).elementMap()" ... --format json

From my perspective, it would be ideal if the query exporter didn't require a single type for values of one property, especially since Neptune itself doesn't enforce this constraint.

Relatedly: I'm using the JSON exporter since I can't figure out how to export column headings (the property keys) when exporting to CSV. I'd be very grateful if you could point me in the right direction; that would hopefully be a good short-term solution to this problem.

Thank you for your time!

IAM authentication enabled on Neptune

Hello,
I was able to use this when IAM auth is not enabled on Neptune, but if it is enabled, can glue-neptune handle that? Are there any plans to include that option, or is there another way to authenticate from Glue when IAM auth is enabled?

Thank you.

Gremlin to neptune ES converter

Hi there,

I have a query that uses textContains, Ex:

{"gremlin":"g.V().has('props:category_simple',textContains('Project')).limit(5).toList()"}

I understand that to make this type of query, elastic search should be called.
I have successfully streamed my data to ES.

I'm trying to execute the following query

{"gremlin":"g.withSideEffect('Neptune#fts.endpoint',
                 'https://vpc-vpc-neptunestream-XXXXXXXXXXXXX.us-east-1.es.amazonaws.com')
  .withSideEffect('Neptune#fts.queryType', 'query_string')
  .E().has('props:category_simple','Neptune#fts Project~').values('props:category_simple').limit(5)"}

but I'm getting

{ "requestId": "893187e9-793a-419b-a1db-b63ca138f395", "code": "InvalidParameterException", "detailedMessage": "body could not be parsed" }

I have tried with https, http, and just the plain endpoint, but I get the same error.

Is there a gremlin (Janus) to Neptune (ES) translator for text search?

Please advise.

How to set up neptune export?

I haven't found any documentation on setting up neptune-export on a standalone EC2 instance that fires queries against a given Neptune cluster endpoint.

Can someone point me towards that?

Timeout passed at the query level in a String query is not honored by the Neptune Export job.

When a query is submitted to the export job with a query-level timeout parameter (-q edges='g.with("scriptEvaluationTimeout",300000L).V()'), it is not honored. From the AWS docs I can see that for string-based query submission, the query-level timeout should be passed as an additional argument:

final RequestMessage msg = RequestMessage.build("eval")
                                           .addArg(Tokens.ARGS_EVAL_TIMEOUT, 100L)
                                           .addArg(Tokens.ARGS_GREMLIN, "g.V().count()")
                                           .create();
  final List<ResponseMessage> responses = client.submit(msg);

I also did some experiments to check how the query-level timeout works, and found that it fails for String queries. (Note: the export job submits queries in a similar way.)

Experiment 1:
"neptune_query_timeout" is set to 120000L, scriptEvaluationTimeout = 500L

final String query = "g.V().V().V()";
final RequestMessage.Builder request = RequestMessage.build(Tokens.OPS_EVAL)
                                                                     .add(Tokens.ARGS_GREMLIN, query)
                                                                     .add(Tokens.ARGS_BATCH_SIZE, 64)
                                                                     .add(Tokens.ARGS_SCRIPT_EVAL_TIMEOUT, 500L);
CompletableFuture<ResultSet> future =
                        client.submitAsync(client.buildMessage(request).create());
                future.get().all().get()

Query timed out after 550 ms

Experiment 2:

"neptune_query_timeout" is set to 120000L, scriptEvaluationTimeout = 500L

final GraphTraversalSource g = EmptyGraph.instance().traversal().withRemote(DriverRemoteConnection.using(cluster));
                 g.with("scriptEvaluationTimeout", 300000L).V().V().V().iterate();

Query timed out after 570 ms.

Experiment 3:

"neptune_query_timeout" is set to 120000L, scriptEvaluationTimeout = 300000L

Query timed out after 300 seconds.

Experiment 4:

"neptune_query_timeout" is set to 120000L, scriptEvaluationTimeout = 500L
final String query = "g.with("scriptEvaluationTimeout", 500L).V().V().V()";
client.submit(query);

Query timed out after 120 seconds, even though we passed the query-level timeout in the String query.

Support for virtual graphs

Presently we can specify a custom dataset for a SPARQL query with FROM and FROM NAMED declarations, whose default graph is the merge of the graphs declared with FROM, and whose named graphs are the individual graphs declared with FROM NAMED.

This pattern works fine as long as we only have to search across the merge of a small number of named graphs, by enumerating all of them with FROM. But if we have to search across the merge of thousands of small named graphs (for example, those trusted by an authority, extracted from specific resources, or having certain provenance), then mentioning all the constituent graphs of the custom default graph in every SPARQL query won't scale.

Blazegraph allows for such virtual views through its support for Virtual Graphs.

Virtuoso supports it through Graph Groups.

Is there any way we can do this in Neptune?

Thanks for your work, @beebs-systap, @iansrobinson, and all.

Lambda setup help

Regarding the Lambda setup, can you provide some high-level details for setting up the Lambda function? I have uploaded the JAR, but I do not know what to set the handler name to. Any help is appreciated.

Support standard packaging for neptune-python-utils

Rather than requiring us to build neptune-python-utils manually and watch this repository to catch updates, consider publishing it as a package on PyPI.

Also, if you removed the non-standard packaging with dependencies obtained via build.sh and instead provided a requirements.txt, we would be able to consume the package from GitHub in our requirements.txt using

git+https://github.com/awslabs/[email protected]#egg=neptune_python_utils&subdirectory=neptune-python-utils/neptune_python_utils

[glue-neptune] Add support for single cardinality properties

The glue-neptune library currently uses the header formats from the Neptune bulk load API as a means of providing schema/column headers for the DataFrame used for upserting into Neptune. As of Neptune release 366, the bulk loader API allows for defining single-cardinality properties (https://docs.aws.amazon.com/neptune/latest/userguide/engine-releases-1.0.1.0.200366.0.html). We need to extend glue-neptune to support this same format and then also issue the upserts as properties(Cardinality.single,,) when used.
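
For reference, writing a single-cardinality property from Python uses TinkerPop's standard Cardinality enum; the sketch below shows the plain gremlinpython form (not glue-neptune's implementation), with a placeholder endpoint and vertex id.

# Sketch: single-cardinality property write with gremlinpython.
# Cardinality.single replaces existing values instead of appending to a set.
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.traversal import Cardinality
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

conn = DriverRemoteConnection(
    'wss://my-cluster.cluster-xxxxxxxx.us-east-1.neptune.amazonaws.com:8182/gremlin', 'g')
g = traversal().withRemote(conn)

g.V('v-1').property(Cardinality.single, 'age', 30).iterate()
conn.close()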

How to set neptune_query_timeout while running an export?

As mentioned in the documentation, I tried to run the command below to perform the export. I am getting a timeout exception. I know that I have to set the neptune_query_timeout parameter, but I'm not sure how to pass it.

bin/neptune-export.sh export-pg -e neptunedbcluster-xxxxxxxxxxxx.cluster-yyyyyyyyyyyy.us-east-1.neptune.amazonaws.com -d /home/ec2-user/output

gremlin-client dependency cannot be resolved in maven project

I'm trying to add the gremlin-client dependency to my Maven project, but while resolving the software.amazon.neptune:gremlin-client:jar:1.0.0 dependency, it also tries to download software.amazon.neptune:neptune-gremlin-client:pom:1.0.0, which does not exist.

[ERROR] Failed to execute goal on project ...: Could not resolve dependencies for project ...: Failed to collect dependencies at software.amazon.neptune:gremlin-client:jar:1.0.0: Failed to read artifact descriptor for software.amazon.neptune:gremlin-client:jar:1.0.0: Could not find artifact software.amazon.neptune:neptune-gremlin-client:pom:1.0.0 in maven-central (https://repo1.maven.org/maven2/software) -> [Help 1]

Any help is appreciated. Thanks.

RE: MemoryLimitExceededException while exporting Neptune data

When I try to export a user-defined query, I get the error below.

Query:
bin/neptune-export.sh export-pg-from-queries -e neptunedbcluster-xxxxxxxxxxxx.cluster-yyyyyyyyyyyy.us-east-1.neptune.amazonaws.com -d /home/ec2-user/output -q tapestryVisitMetadataVertex="g.V().hasLabel('tapestryVisitMetaData').order().by('lastVisitedDate').limit(20).valueMap(true)" --concurrency 4

Error:
{"requestId":"50474e09-c5d6-43bd-9e67-e98ec8855b88","code":"MemoryLimitExceededException","detailedMessage":"Query cannot be completed due to memory limitations."}

Kindly help me with this issue.

Cannot build neptune export

Compilation error:

error reading /Users/andries/.m2/repository/org/codehaus/groovy/groovy/2.5.11/groovy-2.5.11-indy.jar; zip file is empty

Tests fail as well.

create-pg-config: is it really timing out?

Running:

java -jar neptune-export.jar create-pg-config -e 127.0.0.1 -d output --log-level debug

Results in it going through all the nodes; then, near the end, I get this:

[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp
[pool-3-thread-3] INFO com.amazonaws.services.neptune.propertygraph.io.LabelWriters - Closing file: NoOp

An error occurred while writing all nodes as CSV to devnull. Elapsed time: 180 seconds

java.util.concurrent.ExecutionException: java.util.concurrent.CompletionException: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: {"detailedMessage":"A timeout occurred
within the script or was otherwise cancelled directly during evaluation of [46619e71-51ad-4373-9b33-cd8080adacb9]","code":"TimeLimitExceededException","requestId":"46619e71-51ad-4373-9b
33-cd8080adacb9"}

Is this really a timeout issue?

neptune-export max-content-length error on default value

When running create-pg-config I encountered the same Max frame length error as described in #68.

java.util.concurrent.CompletionException: io.netty.handler.codec.http.websocketx.CorruptedWebSocketFrameException: Max frame length of 65536 has been exceeded.

It was fixed by setting --max-content-length 2147483647 as mentioned in #68. Does it make sense to change the default value of --max-content-length to avoid this issue? Additionally, what is the best practice for setting this value? When should a lower value such as 65536 be used?

Got _pickle.PicklingError when upserting vertices in Glue

I'm following this guide to insert vertices into Neptune in a Glue ETL job.

I'm getting _pickle.PicklingError: Could not serialize object: TypeError: can't pickle SSLContext objects when calling selectfields1.toDF().foreachPartition(gremlin_client.upsert_vertices('MyTest', batch_size=100))

Below is a simple Glue job that reproduces the issue:

import sys
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.transforms import RenameField, SelectFields
from io import BytesIO, StringIO
import pandas as pd
import numpy as np
import boto3
from urllib.parse import urlparse
from neptune_python_utils.gremlin_utils import GremlinUtils
from neptune_python_utils.endpoints import Endpoints
from neptune_python_utils.glue_gremlin_client import GlueGremlinClient

sc = SparkContext.getOrCreate()
sc.setLogLevel("INFO")
glueContext = GlueContext(sc)
logger = glueContext.get_logger()

logger.info(f'Before resolving options...')

args = getResolvedOptions(sys.argv,
                          ['region',
                           'neptune_endpoint',
                           'neptune_port'])

logger.info(f'Resolved options are: {args}')

spark = glueContext.spark_session
logger.info(f'###########Create df of pyshark')
data = [
    ('2987000',1.8356905714924256,19.0),
    ('2987001',1.462397997898956,0.0),
]
dataColumns = ['~id','TransactionAmt','dist1']
dataDF = spark.createDataFrame(data=data, schema = dataColumns)
dDF = DynamicFrame.fromDF(dataDF, glueContext, 'Data')
logger.info(f'###########Schema is {dDF.schema()}')
logger.info(f'###########Iterate the dataset')
def printRows(rows):
    for row in rows:
        print(f'Processing row is {row}')
dDF.toDF().foreachPartition(printRows)


GremlinUtils.init_statics(globals())
endpoints = Endpoints(neptune_endpoint=args['neptune_endpoint'], neptune_port=args['neptune_port'], region_name=args['region'])
logger.info(f'Initializing gremlin client to Neptune ${endpoints.gremlin_endpoint()}.')
gremlin_client = GlueGremlinClient(endpoints)
logger.info(f'#####TESTING gremlin conn')
gremlin_utils = GremlinUtils(endpoints)
conn = gremlin_utils.remote_connection()
g = gremlin_utils.traversal_source(connection=conn)
logger.info(f'Gremlin vertices: {g.V().limit(10).valueMap().toList()}')
conn.close()
logger.info(f'#####Gremlin conn test is successful')
logger.info(f'Initializing gremlin client to Neptune ${endpoints.gremlin_endpoint()}.')
selectfields1 = SelectFields.apply(frame = dDF, paths = dataColumns, transformation_ctx = "selectfields1")
selectfields1.toDF().foreachPartition(gremlin_client.upsert_vertices('MyTest', batch_size=100))

The issue can be reproduced with both Glue 1.0 + Python 3 and Glue 2.0. The neptune_python_utils lib zip is built from the latest source of this repo using the build.sh script.

Any suggestions are appreciated.
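
One pattern that usually avoids this class of error is to build the Gremlin connection inside the partition function, so that Spark only has to pickle plain strings rather than an object holding an SSLContext. The sketch below illustrates that idea, reusing selectfields1 and args from the job above and the same neptune_python_utils classes; it is a possible workaround, not the documented glue-neptune approach, and the upsert logic is reduced to a plain insert for brevity.

# Sketch of a workaround: create the Gremlin connection inside the partition
# function so no SSLContext is captured by the pickled closure.
from neptune_python_utils.endpoints import Endpoints
from neptune_python_utils.gremlin_utils import GremlinUtils

def write_partition(rows, neptune_endpoint, neptune_port, region):
    endpoints = Endpoints(neptune_endpoint=neptune_endpoint,
                          neptune_port=neptune_port,
                          region_name=region)
    gremlin_utils = GremlinUtils(endpoints)
    conn = gremlin_utils.remote_connection()
    g = gremlin_utils.traversal_source(connection=conn)
    for row in rows:
        # Insert one vertex per row (id handling and upsert logic omitted).
        g.addV('MyTest').property('TransactionAmt', row['TransactionAmt']).iterate()
    conn.close()

selectfields1.toDF().foreachPartition(
    lambda rows: write_partition(rows, args['neptune_endpoint'],
                                 args['neptune_port'], args['region']))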

Unable to pass AWS region when configuring the refresh agent with the GetEndpointsFromNeptuneManagementApi fetch strategy

Please add a way to pass the AWS region from services using the gremlin-client JAR with the address fetch strategy GetEndpointsFromNeptuneManagementApi.
I'm getting this error: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.

Even if we set a local AWS_REGION environment variable we still get the error, so I think it would be better if we could configure the AmazonNeptuneClient in GetEndpointsFromNeptuneManagementApi's getAddresses method as AmazonNeptuneClientBuilder.standard().withRegion(region).build().
PR: #99

neptune-python-utils: not able to authenticate with IAM DB Authentication

Hi, I'm using neptune_python_utils.gremlin_utils, Release 1.0.0 (#145)

I'm using a role with permission ("neptune-db:*") to access my Neptune cluster.
Running the code from the README:

from neptune_python_utils.gremlin_utils import GremlinUtils
GremlinUtils.init_statics(globals())
gremlin_utils = GremlinUtils()
conn = gremlin_utils.remote_connection()
g = gremlin_utils.traversal_source(connection=conn)
print(g.V().limit(10).valueMap().toList())
conn.close()

I get:
Exception: Failed to connect to server: HTTP Error code 403 - Forbidden
The endpoint is given by the env vars $NEPTUNE_CLUSTER_ENDPOINT and $NEPTUNE_CLUSTER_PORT.
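
For comparison, the explicit Endpoints-based construction used elsewhere in this repository looks like the sketch below (placeholder endpoint and region); whether it resolves the 403 depends on the credentials available to the environment.

# Sketch: construct GremlinUtils from an explicit Endpoints object, following
# the pattern in the Glue examples above. Endpoint and region are placeholders.
from neptune_python_utils.endpoints import Endpoints
from neptune_python_utils.gremlin_utils import GremlinUtils

endpoints = Endpoints(
    neptune_endpoint='my-cluster.cluster-xxxxxxxx.us-east-1.neptune.amazonaws.com',
    neptune_port=8182,
    region_name='us-east-1')
GremlinUtils.init_statics(globals())
gremlin_utils = GremlinUtils(endpoints)
conn = gremlin_utils.remote_connection()
g = gremlin_utils.traversal_source(connection=conn)
print(g.V().limit(1).valueMap().toList())
conn.close()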

Any help would be appreciated.
Thanks

An error occurred while writing all nodes as JSON (Neptune Streams format) to stream.

I have run export-neptune-to-elasticsearch on two non-prod Neptune clusters, dev and staging, and I keep getting these errors:

An error occurred while writing all nodes as JSON (Neptune Streams format) to stream.
An error occurred while exporting all nodes.
An error occurred while exporting property graph.
and then multiple of these:
java.util.concurrent.ExecutionException: java.util.concurrent.CompletionException: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: {
"code": "TimeLimitExceededException",
"requestId": "f5304c39-0000-0000-0000-000000000000",
"detailedMessage": "A timeout occurred within the script during evaluation."

The batch job reports as completed successfully; however, only 2.2M of the 64M records have been exported into OpenSearch.

I have set the neptune_query_timeout parameter to its maximum, 2147483647, but that has made no difference.

Is there any way to troubleshoot these errors?

Java export command does not return an error code in case of a failure

We are using neptune-export.jar to run an export job against our AWS Neptune database, and in case of a failure during execution the command returns exit code 0 instead of an error code, which makes it quite difficult to track failures during our pipeline execution.

Ways to reproduce:

Run java -jar neptune-export.jar to perform an export.

In case of an error, example:

An error occurred while creating property graph config. Elapsed time: 32 seconds
java.lang.IllegalStateException: org.apache.tinkerpop.gremlin.process.remote.RemoteConnectionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persist 

The return code is 0; I would expect to receive something different since the command has failed.

We are using the latest version - https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-export/bin/neptune-export.jar

Bug on Neptune Export: exception prevents cloned cluster from being deleted

We are using Neptune Export to generate exports of our graph data. Everything works perfectly most days. The clone cluster gets created, is used, and then is automatically deleted by the tool.

Sometimes, when an internal exception occurs (for example, a connection error), the tool halts and exits without deleting the clone cluster. This causes expensive machines to be kept running until someone manually deletes it. There should be some checks in the tool to perform the deletion in exception scenarios.

Example expected log output when the cluster is correctly deleted: (see attached screenshot)

Example log output when there's an exception and the cluster is not deleted: (see attached screenshot)

Temporary creds expire

Hi,

I'm using the Neptune Python module (IAM enabled) to load large volumes of data multiple times in a single Glue job. It looks like the credentials expire after an hour; can they be refreshed automatically?

Thanks ,
DInesh k.

[neptune-export] Better handling for SPARQL query or request errors

I run the following command to export an RDF Graph:

java -Xms16g -Xmx16g -jar neptune-export.jar export-rdf -e orpheus-6-instance-1.cfm103hnhdrl.us-east-2.neptune.amazonaws.com -p 8182 --output files -d /home/ec2-user/neptune-export --region us-east-2 --format neptuneStreamsJson --use-ssl --use-iam-auth

After running for around 4 minutes or so the export terminates (with a partial export) with the following stack trace:

java.lang.RuntimeException: org.eclipse.rdf4j.query.QueryEvaluationException: Unkown record type: 123
at com.amazonaws.services.neptune.rdf.NeptuneSparqlClient.executeQuery(NeptuneSparqlClient.java:167)
at com.amazonaws.services.neptune.rdf.io.ExportRdfGraphJob.lambda$execute$0(ExportRdfGraphJob.java:34)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:41)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:34)
at com.amazonaws.services.neptune.rdf.io.ExportRdfGraphJob.execute(ExportRdfGraphJob.java:31)
at com.amazonaws.services.neptune.ExportRdfGraph.lambda$run$0(ExportRdfGraph.java:63)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:41)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:34)
at com.amazonaws.services.neptune.ExportRdfGraph.run(ExportRdfGraph.java:55)
at com.amazonaws.services.neptune.export.NeptuneExportRunner.run(NeptuneExportRunner.java:44)
at com.amazonaws.services.neptune.NeptuneExportCli.main(NeptuneExportCli.java:48)
Caused by: org.eclipse.rdf4j.query.QueryEvaluationException: Unkown record type: 123
at org.eclipse.rdf4j.repository.sparql.query.SPARQLTupleQuery.evaluate(SPARQLTupleQuery.java:59)
at com.amazonaws.services.neptune.rdf.NeptuneSparqlClient.executeQuery(NeptuneSparqlClient.java:127)
... 10 more
Caused by: java.io.IOException: Unkown record type: 123
at org.eclipse.rdf4j.query.resultio.binary.BinaryQueryResultParser.parse(BinaryQueryResultParser.java:188)
at org.eclipse.rdf4j.query.resultio.AbstractTupleQueryResultParser.parseQueryResult(AbstractTupleQueryResultParser.java:48)
at org.eclipse.rdf4j.http.client.SPARQLProtocolSession.getTupleQueryResult(SPARQLProtocolSession.java:699)
at org.eclipse.rdf4j.http.client.SPARQLProtocolSession.sendTupleQuery(SPARQLProtocolSession.java:369)
at org.eclipse.rdf4j.repository.sparql.query.SPARQLTupleQuery.evaluate(SPARQLTupleQuery.java:56)
... 11 more
An error occurred while exporting from Neptune: org.eclipse.rdf4j.query.QueryEvaluationException: Unkown record type: 123

Neptune export-pg generates edge files with missing headers

I am trying to replicate the database from one AWS region to another and am using this utility to export the data from the master DB.

The utility runs fine, but when I try to load the files into the other Neptune DB using the Neptune bulk loader, the edge inserts fail with errors:

"errorCode" : "PARSING_ERROR",
"errorMessage" : "Record has more columns than header",
"fileName" : "s3://{edge file location}.csv",
"recordNum" : 60

Steps to replicate issue:

  1. Export the data using
    bin/neptune-export.sh export-pg --log-level error -e {endpoint} -d ~/Downloads/

  2. Run bulk loader command for other DB:

 curl -X POST     -H 'Content-Type: application/json'     http://{endpoint}:8182/loader -d '
             {
                "source" : "s3://{csv files folder location}",
                "format" : "csv",
                "iamRoleArn" : "{iam role}",
                "region" : "us-west-2",
                "failOnError" : "FALSE"
              }'
  3. Check the status of the load.
  4. Output:
            {
                "fullUri" : "s3://{s3 bcuket}/sync/edges/{filename}.csv",
                "runNumber" : 1,
                "retryNumber" : 2,
                "status" : "LOAD_FAILED",
                "totalTimeSpent" : 0,
                "startTime" : 1564090152,
                "totalRecords" : 80497,
                "totalDuplicates" : 0,
                "parsingErrors" : 80265,
                "datatypeMismatchErrors" : 0,
                "insertErrors" : 232
            }
        ],
        "errors" : {
            "startIndex" : 1,
            "endIndex" : 3,
            "loadId" : "{loadid}",
            "errorLogs" : [
                {
                    "errorCode" : "PARSING_ERROR",
                    "errorMessage" : "Record has more columns than header",
                    "fileName" : "{file location}",
                    "recordNum" : 60
                },
                {
                    "errorCode" : "PARSING_ERROR",
                    "errorMessage" : "Record has more columns than header",
                    "fileName" : "{file location}",
                    "recordNum" : 61
                },
                {
                    "errorCode" : "PARSING_ERROR",
                    "errorMessage" : "Record has more columns than header",
                    "fileName" : "{file location}",
                    "recordNum" : 62
                }

Fixes tried:
1. Provide the config file.
2. One observation was that there were hundreds of newline inputs in the exported edge file. I removed them, but it still failed.


Our sample edge data looks like:
Headers: ~id,~label,~from,~to,createdBy:string,createdTimestamp:date,weight:double,updatedBy:string,endDate:date,updatedTimestamp:date

Data:
edgeId_1,fromVertexId,toVertexId,"batch",2019-06-25T08:45:54Z,1.0,"batch",2019-07-08T00:00:00Z,2019-07-08T08:40:57Z

The export utility never created 3 of the headers (updatedBy:string, endDate:date, updatedTimestamp:date); I added them later to fix the issue. Not all rows of data have values for these columns.

Size of data:
there are 80323 edges in the data for this label.

[export-neptune-to-elasticsearch] Error exporting nodes

When using the Neptune-to-Elasticsearch solution, I found that the Elasticsearch index appeared to be missing a lot of data. Going back through the logs, I see that the export Neptune batch job succeeded but contained the following stack trace:


[main] INFO com.amazonaws.services.neptune.propertygraph.RangeFactory - Limit: 319015, Size: 159508

[pool-6-thread-2] INFO com.amazonaws.services.neptune.propertygraph.NodesClient - __.V().range(0L,159508L).project("id","label","properties").by(T.id).by(T.label).by(__.valueMap())

[pool-6-thread-1] INFO com.amazonaws.services.neptune.propertygraph.NodesClient - __.V().range(159508L,319015L).project("id","label","properties").by(T.id).by(T.label).by(__.valueMap())

java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer

	at com.amazonaws.services.neptune.propertygraph.metadata.DataType$5.printTo(DataType.java:81)

	at com.amazonaws.services.neptune.propertygraph.io.NeptuneStreamsJsonPropertyGraphPrinter.printRecord(NeptuneStreamsJsonPropertyGraphPrinter.java:133)

	at com.amazonaws.services.neptune.propertygraph.io.NeptuneStreamsJsonPropertyGraphPrinter.printRecord(NeptuneStreamsJsonPropertyGraphPrinter.java:114)

	at com.amazonaws.services.neptune.propertygraph.io.NeptuneStreamsJsonPropertyGraphPrinter.printProperties(NeptuneStreamsJsonPropertyGraphPrinter.java:71)

	at com.amazonaws.services.neptune.propertygraph.io.NodeWriter.handle(NodeWriter.java:36)

	at com.amazonaws.services.neptune.propertygraph.io.NodeWriter.handle(NodeWriter.java:18)

	at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.handle(ExportPropertyGraphTask.java:91)

	at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask$CountingHandler.handle(ExportPropertyGraphTask.java:132)

	at com.amazonaws.services.neptune.propertygraph.NodesClient.lambda$queryForValues$1(NodesClient.java:89)

	at org.apache.tinkerpop.gremlin.process.traversal.Traversal.forEachRemaining(Traversal.java:272)

	at com.amazonaws.services.neptune.propertygraph.NodesClient.queryForValues(NodesClient.java:87)

	at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.run(ExportPropertyGraphTask.java:71)

	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

	at java.lang.Thread.run(Thread.java:748)

[kpl-daemon-0003] INFO com.amazonaws.services.kinesis.producer.LogInputStreamReader - [2020-08-17 22:49:35.527651] [0x00000022][0x00007f67a76887c0] [info] [kinesis_producer.cc:200] Created pipeline for stream "neptune-export-cdde6d20"

[kpl-daemon-0003] INFO com.amazonaws.services.kinesis.producer.LogInputStreamReader - [2020-08-17 22:49:35.527741] [0x00000022][0x00007f67a76887c0] [info] [shard_map.cc:87] Updating shard map for stream "neptune-export-cdde6d20"

[kpl-daemon-0003] INFO com.amazonaws.services.kinesis.producer.LogInputStreamReader - [2020-08-17 22:49:35.552757] [0x00000022][0x00007f67a1e7b700] [info] [shard_map.cc:148] Successfully updated shard map for stream "neptune-export-cdde6d20" found 8 shards

[gremlin-driver-loop-1] ERROR org.apache.tinkerpop.gremlin.driver.Handler$GremlinResponseHandler - Could not process the response

io.netty.handler.codec.http.websocketx.CorruptedWebSocketFrameException: Max frame length of 65536 has been exceeded.

	at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.protocolViolation(WebSocket08FrameDecoder.java:426)

	at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.decode(WebSocket08FrameDecoder.java:286)

	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)

	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)

	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)

	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475)

	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1224)

	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1271)

	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)

	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)

	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)

	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)

	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)

	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)

	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)

	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)

	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)

	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)

	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

	at java.lang.Thread.run(Thread.java:748)

java.util.concurrent.CompletionException: io.netty.handler.codec.http.websocketx.CorruptedWebSocketFrameException: Max frame length of 65536 has been exceeded.

	at java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:375)

	at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1947)

	at org.apache.tinkerpop.gremlin.driver.ResultSet.one(ResultSet.java:119)

	at org.apache.tinkerpop.gremlin.driver.ResultSet$1.hasNext(ResultSet.java:171)

	at org.apache.tinkerpop.gremlin.driver.ResultSet$1.next(ResultSet.java:178)

	at org.apache.tinkerpop.gremlin.driver.ResultSet$1.next(ResultSet.java:165)

	at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal$TraverserIterator.next(DriverRemoteTraversal.java:146)

	at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal$TraverserIterator.next(DriverRemoteTraversal.java:131)

	at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal.nextTraverser(DriverRemoteTraversal.java:112)

	at org.apache.tinkerpop.gremlin.process.remote.traversal.step.map.RemoteStep.processNextStart(RemoteStep.java:80)

	at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)

	at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)

	at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:205)

	at org.apache.tinkerpop.gremlin.process.traversal.Traversal.forEachRemaining(Traversal.java:272)

	at com.amazonaws.services.neptune.propertygraph.NodesClient.queryForValues(NodesClient.java:87)

	at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.run(ExportPropertyGraphTask.java:71)

	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

	at java.lang.Thread.run(Thread.java:748)

Caused by: io.netty.handler.codec.http.websocketx.CorruptedWebSocketFrameException: Max frame length of 65536 has been exceeded.

	at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.protocolViolation(WebSocket08FrameDecoder.java:426)

	at io.netty.handler.codec.http.websocketx.WebSocket08FrameDecoder.decode(WebSocket08FrameDecoder.java:286)

	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)

	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)

	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)

	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475)

	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1224)

	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1271)

	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)

	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)

	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)

	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)

	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)

	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)

	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)

	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)

	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)

	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)

	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)

	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

	... 1 more

At the end of the log I can also see the following which says that most of the nodes of the graph were not exported:

Config file : /neptune/results/1597704571888/config.json
--
 Source:
  Nodes: 319015
  Edges: 739118
Export:
  Nodes: 74
  Edges: 739118

Can you advise on how I could solve this issue?

'illegal multibyte sequence' error on long list of numbers as string

In my Neo4j database I have nodes with GIS coordinates stored as a property. These are long lists of pairs of ~13 digit numbers separated by commas. Neo4j stores these as strings, I believe in UTF-8 format.

I used the APOC command to export the graph in GraphML format, and now I'm trying to convert it to CSV for upload into Neptune using this script. I get an 'illegal multibyte sequence' error:

UnicodeDecodeError('cp932', b'5.4397040833, 133.328841248 35.439539551, 133.32875060...3757508892 35', 332, 333, 'illegal multibyte sequence')

A couple of weird things occur at the end: (1) the single quote after "35" and (2) the "332, 333", which are not in the data, so I guess those are error codes (but I really have no idea). My first guess is that the real problem is that the list is just too long (because the error occurs at numbers rather than at the Asian script text, but I really don't know).

Any information on what could be generating this error and how to avoid it?
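
For what it's worth, the cp932 in the error suggests the GraphML is being decoded with a Windows default code page rather than UTF-8. The following is only an illustration of forcing the encoding when reading the file (hypothetical file name, not the script's actual code):

# Illustration: parse the GraphML with an explicit UTF-8 decoding instead of
# relying on the platform default (cp932 on Japanese Windows).
import xml.etree.ElementTree as ET

with open('export.graphml', encoding='utf-8') as f:
    tree = ET.parse(f)

print(tree.getroot().tag)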

[Export Neptune to ElasticSearch] StartExportCommand does nothing.

When I run the command from README

  --function-name arn:aws:lambda:eu-west-1:000000000000:function:export-neptune-to-kinesis-xxxx \
  --region eu-west-1 \
  /dev/stdout

I do successfully get a

{"jobName": "export-neptune-to-kinesis-xxxxxxxx-0000000000000, "jobId": "ac34f6eb-ad3c-4d9b-8019-ba24da15b5a0"}{
    "ExecutedVersion": "$LATEST", 
    "StatusCode": 200
}

However, when I check CloudWatch for the Lambda function, I just see the following log, and no data is loaded into my Elasticsearch instance.

[INFO]	2021-10-28T17:32:30.140Z	17dbdf72-137b-4ea0-80c6-72063ae2af51	Command: df -h && rm -rf neptune-export.jar && wget https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-export/bin/neptune-export.jar -nv && export SERVICE_REGION="us-west-2" && java -Xms16g -Xmx16g -jar neptune-export.jar export-pg -e dev-cluster.cluster-xxxxxxxxxxxx.us-west-2.neptune.amazonaws.com -p 8182 -d /neptune/results --output stream --stream-name neptune-export-xxxxxxxxx --region us-west-2 --format neptuneStreamsJson --use-ssl --use-iam-auth --concurrency 2 --scope all

I have resolved all the visible errors, but it seems like nothing is being loaded into the cluster. Is there any other way I can debug this?

Inconsistent indent

The indentation in the Python script graphml2csv.py is inconsistent: tabs and spaces are mixed, which causes the code to fail to run.

Neptune Export doesn't escape newline character (\n)

We are using Neptune Export to generate exports of our graph data and we've noticed recently that it isn't able to escape the newline character \n.

For example, if we were exporting two nodes that had the following properties:
First Node: { 'name': ['John\n'], 'country': ['US'] }
Second Node: { 'name': ['James'], 'country': ['UK'] }

The exported CSV would end up like so (with four lines instead of just three):
name:string,country:string
"John
","US"
"James","UK"

neptune-export.sh Unable to access jarfile export-pg

Hello,

I was trying to run this command (of course with my Neptune cluster name): bin/neptune-export.sh export-pg -e neptunedbcluster-xxxxxxxxxxxx.cluster-yyyyyyyyyyyy.us-east-1.neptune.amazonaws.com -d /home/ec2-user/output, and I am getting the error message Unable to access jarfile export-pg.
What I did was:

  1. Clone this project from git to a local directory.
  2. Run the command above.

Anyone have any idea why I am getting this error?

graphml2csv - "'NoneType' object has no attribute 'encode'"

If a vertex/node in GraphML has a key defined but no value, the Python script errors with the message:
"'NoneType' object has no attribute 'encode'".

The following node can't be processed:

 <node id="#26:2">
      <data key="labelV">Document</data>
      <data key="searchname"/>
      <data key="sendername"/>
      <data key="startdate"/>
      <data key="deliverydate"/>
      <data key="search"/>
      <data key="documentid"/>
      <data key="id"/>
      <data key="idfrom"/>
      <data key="transmittimestamp"/>
      <data key="searchlocationcode"/>
      <data key="documentdatefrom"/>
      <data key="tagid"/>
      <data key="idto"/>
      <data key="format"/>
      <data key="deliverydateto"/>
      <data key="documenttype">Order</data>
      <data key="deliverydatefrom"/>
      <data key="receivername"/>
      <data key="filegennumber"/>
      <data key="documentdateto"/>
      <data key="senderid"/>
      <data key="enddate"/>
      <data key="receiverid"/>
      <data key="searchcode"/>
      <data key="documentnumber"/>
      <data key="documentdate"/>
      <data key="searchdeliverydate"/>
    </node>
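
Empty <data> elements like the ones above have no text node, so element.text is None in ElementTree/lxml, and calling .encode() on it fails. A minimal sketch of a defensive guard, assuming the script encodes the element's text (the function name is hypothetical):

# Sketch of a guard for empty <data> elements; 'data_element' stands in for
# whatever ElementTree/lxml element the script is iterating over.
def encoded_value(data_element):
    text = data_element.text
    if text is None:              # e.g. <data key="searchname"/> has no text node
        return None               # caller can skip writing this property
    return text.encode("utf-8")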

Date parsing and timeout errors

Hello, I am working through two different bugs. I am not sure if they are related; I can break them out into separate issues for better tracking purposes if desired.

The first is an issue with handling datetimes. I stored dates as strings within the graph, but they vary from ISO 8601 format to various strftime/datetime layouts, and maybe even some epoch timestamps. The first error I encounter is related to that. Is there a way to override this processing, either on the Elasticsearch side or within the job?

[ERROR] BulkIndexError: ('34 document(s) failed to index.', [{'update': {'_index': 'amazon_neptune', '_type': '_doc', '_id': 'c14df75bfb7c90c770ea0bf841169329', 'status': 400, 'error': {'type': 'mapper_parsing_exception', 'reason': "failed to parse field [predicates.CreatedAt.value] of type [date] in document with id 'c14df75bfb7c90c770ea0bf841169329'. Preview of field's value: '2019-05-24 10:54:53.529000+00:00'", 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'failed to parse date field [2019-05-24 10:54:53.529000+00:00] with format [strict_date_optional_time||epoch_millis]', 'caused_by': {'type': 'date_time_parse_exception', 'reason': 'date_time_parse_exception: Failed to parse with all enclosed parsers'}}}
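
The mapper_parsing_exception above means Elasticsearch dynamically mapped predicates.CreatedAt.value as a date with the default strict_date_optional_time||epoch_millis formats, which do not accept a space-separated timestamp like 2019-05-24 10:54:53.529000+00:00. One possible workaround, assuming you can recreate the index before backfilling, is to pre-create it with dynamic date detection disabled (or with these value fields mapped as keyword) so indexing never attempts date parsing. A sketch using requests; the endpoint URL is a placeholder, and an Amazon Elasticsearch/OpenSearch domain may additionally require request signing:

# Sketch: create the index up front with date detection disabled and predicate
# values forced to keyword. Index name taken from the error message above.
import json
import requests

mapping = {
    "mappings": {
        "date_detection": False,
        "dynamic_templates": [
            {
                "predicate_values_as_keyword": {
                    "path_match": "predicates.*.value",
                    "mapping": {"type": "keyword"},
                }
            }
        ],
    }
}

resp = requests.put("https://my-es-endpoint/amazon_neptune", json=mapping)   # placeholder host
print(resp.status_code, json.dumps(resp.json(), indent=2))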

The next issues I run into appear to be transient timeouts. I am unsure whether they are related to any of the data types within the graph itself: we do have lists (which we write in as strings) as well as JSON documents (json.dumps) that we write into the graph, again as strings. Sometimes the lists are written as a stringified empty list ([]); I'm unsure whether that affects it as well. Is there a way to configure the export utility to treat everything as a string and stop trying to validate / transform the types?


[kpl-daemon-0003] INFO com.amazonaws.services.kinesis.producer.LogInputStreamReader - [2021-03-15 19:29:07.463988] [0x00000023][0x00007f78c988d700] [info] [processing_statistics_logger.cc:129] (neptune-export-ee598750) Average Processing Time: 34.363128 ms
An error occurred while writing all nodes as JSON (Neptune Streams format) to stream. Elapsed time: 172 seconds
An error occurred while exporting all nodes. Elapsed time: 195 seconds
[main] INFO org.apache.tinkerpop.gremlin.driver.ConnectionPool - Signalled closing of connection pool on Host{address=neptunedbcrg.cluster-calvzb374vjl.us-east-1.neptune.amazonaws.com/10.199.1.175:8182, hostUri=wss://neptunedbcrg.cluster-calvzb374vjl.us-east-1.neptune.amazonaws.com:8182/gremlin} with core size of 5
An error occurred while exporting property graph. Elapsed time: 198 seconds
java.util.concurrent.ExecutionException: java.util.concurrent.CompletionException: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: {     "detailedMessage": "A timeout occurred within the script or was otherwise cancelled directly during evaluation of [24658221-460c-4835-a24e-0e5e36ff04c4]",     "code": "TimeLimitExceededException",     "requestId": "24658221-460c-4835-a24e-0e5e36ff04c4" }
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphJob.updateFileSpecificLabelSchemas(ExportPropertyGraphJob.java:135)
at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphJob.lambda$export$1(ExportPropertyGraphJob.java:114)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:41)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:34)
at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphJob.export(ExportPropertyGraphJob.java:89)
at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphJob.lambda$execute$0(ExportPropertyGraphJob.java:64)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:74)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:67)
at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphJob.execute(ExportPropertyGraphJob.java:63)
at com.amazonaws.services.neptune.ExportPropertyGraph.lambda$run$0(ExportPropertyGraph.java:109)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:41)
at com.amazonaws.services.neptune.util.Timer.timedActivity(Timer.java:34)
at com.amazonaws.services.neptune.ExportPropertyGraph.run(ExportPropertyGraph.java:85)
at com.amazonaws.services.neptune.export.NeptuneExportRunner.run(NeptuneExportRunner.java:44)
at com.amazonaws.services.neptune.NeptuneExportCli.main(NeptuneExportCli.java:48)
Caused by: java.util.concurrent.CompletionException: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: {     "detailedMessage": "A timeout occurred within the script or was otherwise cancelled directly during evaluation of [24658221-460c-4835-a24e-0e5e36ff04c4]",     "code": "TimeLimitExceededException",     "requestId": "24658221-460c-4835-a24e-0e5e36ff04c4" }
at java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:375)
at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1947)
at org.apache.tinkerpop.gremlin.driver.ResultSet.one(ResultSet.java:119)
at org.apache.tinkerpop.gremlin.driver.ResultSet$1.hasNext(ResultSet.java:171)
at org.apache.tinkerpop.gremlin.driver.ResultSet$1.next(ResultSet.java:178)
at org.apache.tinkerpop.gremlin.driver.ResultSet$1.next(ResultSet.java:165)
at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal$TraverserIterator.next(DriverRemoteTraversal.java:146)
at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal$TraverserIterator.next(DriverRemoteTraversal.java:131)
at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal.nextTraverser(DriverRemoteTraversal.java:112)
at org.apache.tinkerpop.gremlin.process.remote.traversal.step.map.RemoteStep.processNextStart(RemoteStep.java:80)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:205)
at org.apache.tinkerpop.gremlin.process.traversal.Traversal.forEachRemaining(Traversal.java:273)
at com.amazonaws.services.neptune.propertygraph.NodesClient.queryForValues(NodesClient.java:103)
at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.call(ExportPropertyGraphTask.java:86)
at com.amazonaws.services.neptune.propertygraph.io.ExportPropertyGraphTask.call(ExportPropertyGraphTask.java:29)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: {     "detailedMessage": "A timeout occurred within the script or was otherwise cancelled directly during evaluation of [24658221-460c-4835-a24e-0e5e36ff04c4]",     "code": "TimeLimitExceededException",     "requestId": "24658221-460c-4835-a24e-0e5e36ff04c4" }
at org.apache.tinkerpop.gremlin.driver.Handler$GremlinResponseHandler.channelRead0(Handler.java:259)
at org.apache.tinkerpop.gremlin.driver.Handler$GremlinResponseHandler.channelRead0(Handler.java:198)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at org.apache.tinkerpop.gremlin.driver.Handler$GremlinSaslAuthenticationHandler.channelRead0(Handler.java:124)
at org.apache.tinkerpop.gremlin.driver.Handler$GremlinSaslAuthenticationHandler.channelRead0(Handler.java:68)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at org.apache.tinkerpop.gremlin.driver.handler.WebSocketClientHandler.channelRead0(WebSocketClientHandler.java:89)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1526)
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1275)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1322)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
... 1 more
An error occurred while exporting from Neptune: java.util.concurrent.CompletionException: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: {     "detailedMessage": "A timeout occurred within the script or was otherwise cancelled directly during evaluation of [24658221-460c-4835-a24e-0e5e36ff04c4]",     "code": "TimeLimitExceededException",     "requestId": "24658221-460c-4835-a24e-0e5e36ff04c4" }
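
The TimeLimitExceededException is Neptune's own query timeout rather than an error in the export tool. One possible mitigation, assuming you can modify the cluster's parameter group (the group name and timeout value below are placeholders), is to raise neptune_query_timeout for the duration of the export and lower the export concurrency. A boto3 sketch:

# Sketch: raise the Neptune query timeout (milliseconds) on the cluster
# parameter group used by the export target; revert it after the export.
import boto3

neptune = boto3.client("neptune", region_name="us-east-1")    # placeholder region
neptune.modify_db_cluster_parameter_group(
    DBClusterParameterGroupName="my-neptune-cluster-params",  # placeholder name
    Parameters=[
        {
            "ParameterName": "neptune_query_timeout",
            "ParameterValue": str(30 * 60 * 1000),            # 30 minutes
            "ApplyMethod": "immediate",
        }
    ],
)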
