ldbc / gcore-spark
Implementation of the G-CORE graph query language on Spark
License: Apache License 2.0
Introduce a new optional IN subclause of GROUP in CONSTRUCT, written "IN v1". With it, v1 and v2 belong to the same domain: v1 and v2 are the same whenever the expressions foo and bar are the same, so we can potentially also construct a self-referencing edge here. Hannes Voigt championed this feature for the graph summarization use case.
This work should start, and be defined, once the system works stably and we start testing it for trajectory storage and analysis applications.
For example, if we want to select paths from a repo of stored paths that pass through two nodes, cut them out and then feed them into GROUP BY paths we would at least need:
Describe the bug
The following query does not work: Construct () match ()
The problem seems to be the empty node.
For an expression of the form: construct (n) match (n) where n.name = "";
the parser assumes that n.name = n."":
| + MatchClause
| | + CondMatchClause
| | | + SimpleMatchClause
| | | | + GraphPattern
| | | | | + Vertex
| | | | | | + Reference [n]
| | | | | | + ObjectPattern
| | | | | | | + True$ [true, GcoreBoolean$]
| | | | | | | + True$ [true, GcoreBoolean$]
| | | | + DefaultGraph$
| | | + PropertyRef [n.""]
G-CORE must provide a command-line tool to execute operations on a database.
In an RPQ, the reverse symbol can only be applied to a single label, not to a parenthesized expression.
When a graph is saved, the application shows an error and does not save the graph to disk.
This means creating the path dataframes, which contain a src_id, dst_id and edge_list. Currently, these are not created yet in CONSTRUCT.
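As a rough illustration of the row shape described above, the following sketch builds one path row with the fields src_id, dst_id and edge_list from an ordered chain of edges. It is plain Python rather than Spark code, and the function name `path_row` and the `(edge_id, src_id, dst_id)` tuple layout are assumptions made for this example, not part of gcore-spark.

```python
def path_row(edges):
    """Build one path row from an ordered chain of edges.

    `edges` is a list of (edge_id, src_id, dst_id) tuples; this layout
    is a hypothetical one chosen for the sketch.
    """
    if not edges:
        raise ValueError("a path needs at least one edge")
    # Each edge must start where the previous one ended.
    for (_, _, prev_dst), (_, next_src, _) in zip(edges, edges[1:]):
        if prev_dst != next_src:
            raise ValueError("edges do not form a chain")
    return {
        "src_id": edges[0][1],                 # source of the first edge
        "dst_id": edges[-1][2],                # destination of the last edge
        "edge_list": [e[0] for e in edges],    # edge ids in traversal order
    }


row = path_row([(10, 1, 2), (11, 2, 3)])
```

In a Spark setting, rows of this shape would presumably become the records of the path dataframe, with edge_list stored as an array column.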
This feature applies a selection after grouping on the binding table for a construct pattern.
Expression types match for binary expressions (if possible to check)
(?) An edge is between two distinct vertices
Variable bindings in SimpleMatchClause have different names. Note: Bindings can be re-used across multiple SimpleMatchClauses. HOWEVER, edges should not be reused.
Ambiguous labeling of entities. For example, in queries such as (v1:L1)->(v2)<-(v3), (v1:L2), v1 is labeled differently in the two patterns.
Eliminate similar queries? For example, in (v1:L1)->(v2), (v3:L1)->(v4), v1 and v3 are the same vertex, v2 and v4 are the same vertex, and their edges are the same too. It is a repeated query, which we translate into a join over the two edges.
Validate that all keys in edgeRestriction (GraphSchema) are present in the graph. Also validate that all values in edgeRestrictions are present in the graph.
ALL PATHS can only be used with stored paths
(?) Throw error or warn the user if, after label inference, an entity can have more than one label. This translates into a UNION ALL of all labels for that entity.
Each match variable must be matched on only one graph. Validation should go in MapBindingToGraph.
All variables in an edge or path pattern should be part of the same graph. Validation should go in MapBindingToGraph.
(?) An EXISTS subquery must have at least one common variable with the main MATCH clause.
UnionAll operator is applied on two relations with the same header.
Property exists for given label, or exists for given entity type.
Property types match with Expression types.
(?) No two Tables contain the same id.
A constructed entity must be of the same type as its matched counterpart, if they are the same variable (this can be checked in CreateGroupingSets).
A specific GROUP clause can only be used with unbound variables. This can be checked in CreateGroupingSets
Check that each variable in CONSTRUCT ends up with at most one label. If the label is missing, we can create a new one in VertexCreate. This check can be done in CreateGroupingSets.
Are aggregate expressions allowed in the MATCH’s WHERE clause?
An entity can only be GROUP-ed once, or else the GROUP-ings must be combined.
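Two of the checks listed above, the ambiguous-labeling rule and the UnionAll header rule, can be sketched in a few lines. This is a minimal illustration in plain Python, not the gcore-spark implementation: the function names, the `(variable, label)` pair representation, and the choice to compare headers as unordered column sets are all assumptions made for the example.

```python
from collections import defaultdict


def find_ambiguous_labels(bindings):
    """Return variables that carry more than one distinct label.

    `bindings` is a list of (variable, label) pairs collected from all
    patterns; label is None when the pattern gives no label.
    (Hypothetical representation, chosen for this sketch.)
    """
    labels_by_var = defaultdict(set)
    for var, label in bindings:
        if label is not None:
            labels_by_var[var].add(label)
    return {
        var: sorted(labels)
        for var, labels in labels_by_var.items()
        if len(labels) > 1
    }


def same_header(left_columns, right_columns):
    """UnionAll check: both relations must expose the same columns.

    Column order is ignored here, which is an assumption of the sketch.
    """
    return set(left_columns) == set(right_columns)


# (v1:L1)->(v2)<-(v3), (v1:L2)
ambiguous = find_ambiguous_labels(
    [("v1", "L1"), ("v2", None), ("v3", None), ("v1", "L2")]
)
```

A validator could raise an error when `find_ambiguous_labels` returns a non-empty result, or fall back to the UNION ALL over labels mentioned in the list.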
We need an option to access labels of objects, like where n.label = "Label"
The current grammar does not allow a WHERE clause for the entire MATCH clause, when OPTIONAL patterns are included.
https://stackoverflow.com/questions/31639059/how-to-add-license-to-an-existing-github-project
We should also add at the top of each file a copyright and license notice.
We need to wait with this until it is clear that we agree on Apache as the license (and possibly on a copyright transfer).
For some RPQ expression using parentheses, the query crashes.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A successful query execution.
If a node variable x is created in a CONSTRUCT pattern, then a subsequent pattern (to the right) could use it, e.g. CONSTRUCT (x GROUP y.foo), (x)->(y)
This can be implemented as a rewrite:
GRAPH VIEW tmp as CONSTRUCT…
MATCH … ON tmp
In an RPQ, the negation symbol can only be applied to a single label, not to a parenthesized expression.
Wrap the code in a gcore-spark module, so that a single import statement initializes the gcore-spark subsystem, reads the default catalog, and is then ready to execute queries by adding a gcore(string) : Graph method to the sparkSession.
Define syntax and semantics for paths operations.
Support CREATE GRAPH x, which indicates that graph x is persistent, and that the catalog has to be changed also in a persistent way.
Also support DROP GRAPH x, which indicates that a graph x that is persistent has to be deleted from Spark and the catalog.
CREATE GRAPH x should have default semantics (for example, there should be a default rule that indicates where graph x is stored and in which format).
CREATE GRAPH x should also give the possibility to the user to specify some parameters such as directory where x is going to be stored, format for x, …
It could be something like CREATE GRAPH x (directory="/foo", format="parquet")
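To make the proposed statement shape concrete, here is a small sketch that parses `CREATE GRAPH x (directory="/foo", format="parquet")` and falls back to defaults when no parameters are given. It is a regex-based illustration in Python, not the project's grammar: the function name `parse_create_graph` and the default values in `DEFAULTS` are hypothetical.

```python
import re

# Hypothetical defaults; the real default rule is still to be decided.
DEFAULTS = {"directory": "/tmp/gcore", "format": "parquet"}

_STATEMENT = re.compile(
    r"\s*CREATE\s+GRAPH\s+(\w+)\s*(?:\((.*)\))?\s*", re.IGNORECASE
)
_OPTION = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_create_graph(statement):
    """Return (graph_name, options) for a CREATE GRAPH statement.

    Raises ValueError when the statement does not have the form
    CREATE GRAPH <name> [(key="value", ...)].
    """
    match = _STATEMENT.fullmatch(statement)
    if match is None:
        raise ValueError("expected: CREATE GRAPH <name> [(key=\"value\", ...)]")
    name, raw_options = match.group(1), match.group(2)
    options = dict(DEFAULTS)  # start from the defaults
    if raw_options:
        # User-supplied parameters override the defaults.
        options.update(_OPTION.findall(raw_options))
    return name, options


name, options = parse_create_graph(
    'CREATE GRAPH x (directory="/foo", format="parquet")'
)
```

Because the pattern requires the GRAPH keyword, a statement that omits it is rejected with an error instead of being accepted silently.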
G-CORE must provide a textual format to represent the output of a query.
Describe the bug
The following query is allowed: "CREATE 'nuevo' CONSTRUCT (n) MATCH (n)"
To Reproduce
Expected behavior
Show a parser error, because it must be "CREATE GRAPH"
Implement RPQ with KleeneBounds
Ex. MATCH (n:Person)-/ALL p<:knows*{2}>/->(m:Person)
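The effect of a bounded Kleene star can be illustrated with a small reachability sketch: starting from the edges of one label, collect every (source, destination) pair connected by a walk whose length lies between a lower and an upper bound. This is plain Python, not the Spark or GraphX evaluation strategy, and it assumes that `{2}` means exactly two hops; the function name `bounded_reach` is hypothetical. Note that it follows walks, so nodes and edges may repeat.

```python
def bounded_reach(edges, lower, upper):
    """Pairs (src, dst) joined by a walk of length in [lower, upper].

    `edges` is a set of (src, dst) pairs for a single label, e.g. knows.
    """
    successors = {}
    nodes = set()
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)
        nodes.update((src, dst))

    current = {(n, n) for n in nodes}          # walks of length 0
    result = set(current) if lower == 0 else set()
    for length in range(1, upper + 1):
        # Extend every walk of length-1 hops by one more edge.
        current = {
            (src, nxt)
            for (src, dst) in current
            for nxt in successors.get(dst, ())
        }
        if length >= lower:
            result |= current
    return result


knows = {("a", "b"), ("b", "c"), ("c", "d")}
two_hops = bounded_reach(knows, 2, 2)
```

Keeping only (src, dst) pairs is enough for reachability; returning the paths themselves, as ALL p would require, needs the edge lists to be carried along as well.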
MATCH … -/ pat*/- … ON (
CONSTRUCT g,(src)-[pat]-(dst)
MATCH (src)-..pattern…-(dst),.. )
The rewrite for weighted PATH pat = (src)-...pattern..-(dst),...COST ..Y.. used in MATCH … -/ pat* COST x/- … is therefore:
MATCH … -/ pat* COST x/- … ON (
CONSTRUCT g,(src)-[pat {weight:=..Y..}]-(dst)
MATCH (src)-..pattern…-(dst),.. )
Again, we need to ensure that COST x is now filled not with the hop count, but with SUM(weight). The GraphX implementation already has some support for this, but it needs to be triggered. This extra information is probably best kept as a property attached to the algebra tree nodes, so that the GraphX code generation can pick it up and generate the appropriate code.
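The difference between the two cost definitions can be shown with a short sketch: the same search returns the hop count when every edge costs 1, and SUM(weight) when the edge weights are used. This uses Dijkstra's algorithm in plain Python as a stand-in, assuming non-negative weights. It is not the GraphX code path, and the function name `path_cost` and the `weighted` flag are hypothetical.

```python
import heapq


def path_cost(edges, src, dst, weighted):
    """Cheapest cost from src to dst, or None if dst is unreachable.

    `edges` is a list of (src, dst, weight) triples with weight >= 0.
    With weighted=False every edge costs 1, so the result is a hop count.
    """
    adjacency = {}
    for s, d, w in edges:
        adjacency.setdefault(s, []).append((d, w if weighted else 1))

    best = {src: 0}
    heap = [(0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, w in adjacency.get(node, []):
            new_cost = cost + w
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None


edges = [("a", "b", 5), ("a", "c", 1), ("c", "b", 1)]
hops = path_cost(edges, "a", "b", weighted=False)
total_weight = path_cost(edges, "a", "b", weighted=True)
```

In this example the direct edge wins on hop count, while the two-edge route wins on summed weight, which is why the cost mode has to be known when the code is generated.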
Describe the solution you'd like
When a user enters a command, the command is stored. Then, when the user presses the up or down arrow key, the system shows the most recent commands.
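The requested behavior can be sketched as a small history buffer with a cursor. The class below is an illustration in plain Python and leaves out the terminal key handling; the class name `CommandHistory` and its methods are hypothetical.

```python
class CommandHistory:
    """Stores entered commands and lets the user step through them."""

    def __init__(self):
        self._commands = []
        self._cursor = 0  # index one past the newest command

    def add(self, command):
        """Store a command and reset the cursor to the end."""
        self._commands.append(command)
        self._cursor = len(self._commands)

    def up(self):
        """Up arrow: move to the previous (older) command."""
        if self._cursor > 0:
            self._cursor -= 1
        return self._commands[self._cursor] if self._commands else ""

    def down(self):
        """Down arrow: move to the next (newer) command, or an empty line."""
        if self._cursor < len(self._commands):
            self._cursor += 1
        if self._cursor == len(self._commands):
            return ""
        return self._commands[self._cursor]


history = CommandHistory()
history.add("CONSTRUCT (n) MATCH (n)")
history.add("CREATE GRAPH x")
```

Pressing up repeatedly stops at the oldest command, and pressing down past the newest one returns an empty line, which mirrors the behavior of common shells.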