GDBViews

Code for views in a graph database system (Neo4j), written as part of my Master's thesis, along with some instructions and notes.

main.Main: The main program, which opens a simple terminal for creating and using views with the extended language. Contains the node and edge tables.

main.QueryParser: The parser for the tree generated from View.g4. It sets up meta-data and updates the dependency table when parsing a query. It also uses main.TableEntry and main.EntryData, which make up the dependency table.

main.Neo4jGraphConnector: The interface for the connection between the application and the local Neo4j database. A config file is required to create the database. Contains methods which return node and edge identifiers.

Test files: All test methods are included inside the main program. Materialized views are rewritten automatically using node identifiers. Non-materialized views are not rewritten automatically; the rewriting is described in text. The baseline queries are likewise hard-coded.

jess: All methods and files that use or mention Jess come from an exploration of discrimination networks (rule-based systems) for handling materialized views, which offered incremental view updates. This includes GraphEngine, NodeEnum, EdgeEnum, and any rule files in the /jess/ directory. The space complexity was too high, so we abandoned the idea.

gen: This is the output destination for View.g4. View.g4 is the language extension used for the project; for the rest of the code to work, its output must be in gen. If you change the language, you must recompile the grammar. Because main.QueryParser.java extends one of the generated classes (ViewBaseListener), you will need to update it accordingly. See the ANTLR documentation for how to do this.

Instructions for running:

There are no command-line arguments. The database sizes are set in the code itself (in main, change the size variable for whichever database you are using). A config file is required for the database sizes (see ../test/config for an example). The reference to this config file can be found in main.Neo4jGraphConnector.java.

To test view creation and usage easily, a set of files is required. You can still test manually with the enabled terminal, but if you have a set of view definitions or a set of view-use queries, it is easier to run them automatically. See /test/initFileExample.txt for an example of a view creation file, and ViewUses2.txt for an example of a view usage file. Note that * is the comment character in these files. The method initFile2 has an option to write the identifiers to disk (for later use when testing view usage and maintenance): this writes to a directory within /test/SIZE/, where SIZE is the size of your database (you may need to create the directory first).
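As a rough illustration of the comment convention above, here is a minimal sketch of how such a file could be filtered before its queries are run. It is not the code used in main.Main: the class and method names are hypothetical, and it assumes one query per line, with a line counted as a comment only when it starts with *.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the repository.
class ViewFileReaderSketch {

    // Keeps the non-blank lines that do not start with the '*' comment character.
    // Assumption: a '*' only marks a comment at the start of a line, so a query
    // that contains '*' elsewhere is left untouched.
    static List<String> stripComments(List<String> lines) {
        List<String> queries = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("*")) {
                continue;
            }
            queries.add(trimmed);
        }
        return queries;
    }

    // Reads an init or use file from disk and returns its queries.
    static List<String> readQueries(Path file) throws IOException {
        return stripComments(Files.readAllLines(file));
    }

    public static void main(String[] args) {
        // Placeholder lines; real files contain statements in the extended language.
        List<String> sample = Arrays.asList(
                "* this line is a comment",
                "",
                "first view definition",
                "second view definition");
        for (String query : stripComments(sample)) {
            System.out.println(query);
        }
    }
}
```

The resulting list could then be passed, one entry at a time, to whatever method executes a single statement.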

To load the data back, use the loadTablesFromFiles(size) method to re-populate the node and edge tables. To re-populate the dependency tables, use createMetaInfoFromQueries(path_to_init_file) (an example is commented out within the main method). Then you can run testUses(size); you may need to modify it to adjust the paths to your init/use files.

The Neo4j Enterprise 4.0.4 libraries should be in /lib/. You also need the Jess libraries (jess.jar and jsr94.jar) and the ANTLR library (currently antlr-4.8-complete.jar).

Data structure info:

main.Main: nodeTable and edgeTable are keyed by view name; each value is the set of node or edge identifiers returned by that view. pathTable is still referenced, but I do not think it does anything anymore.
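The shape of these two tables can be sketched as follows. This is only an illustration under assumptions: the class and method names are hypothetical, and identifiers are modelled as Long values, which may not match the types used in main.Main.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the node and edge tables, not the code in main.Main.
class ViewTablesSketch {

    // View name -> identifiers of the nodes returned by that view.
    final Map<String, Set<Long>> nodeTable = new HashMap<>();
    // View name -> identifiers of the edges returned by that view.
    final Map<String, Set<Long>> edgeTable = new HashMap<>();

    // Stores a copy of the identifiers so later changes by the caller do not leak in.
    void storeNodes(String viewName, Set<Long> ids) {
        nodeTable.put(viewName, new HashSet<>(ids));
    }

    void storeEdges(String viewName, Set<Long> ids) {
        edgeTable.put(viewName, new HashSet<>(ids));
    }

    // Returns the stored identifiers, or an empty set for an unknown view.
    Set<Long> nodesOf(String viewName) {
        return nodeTable.getOrDefault(viewName, Collections.emptySet());
    }

    Set<Long> edgesOf(String viewName) {
        return edgeTable.getOrDefault(viewName, Collections.emptySet());
    }

    public static void main(String[] args) {
        ViewTablesSketch tables = new ViewTablesSketch();
        // "exampleView" and the identifiers are made-up values.
        tables.storeNodes("exampleView", new HashSet<>(Arrays.asList(1L, 2L, 3L)));
        System.out.println(tables.nodesOf("exampleView").size());
    }
}
```

When a view is used in a query, a lookup of this kind is what lets the stored identifiers stand in for re-evaluating the view definition.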

main.QueryParser: This walks down the tree and executes enter/exit methods based on which components are entered. See the ANTLR documentation for details. I set up meta-data during these enter/exit methods and keep the dependencyTable updated if the query is a view creation. For view usage, variables are set up (symbols used for the view use, set of conditions) so that main.Main knows which set of identifiers it can pull from the node or edge tables. For view updates, the dependency table is referenced and a set of outdated views is returned to main.Main. Most steps are commented with details and logic.

main.DependencyTable is a hashtable with a graph component label as the key (Person, PARENT_OF, Post, etc.) and a main.TableEntry object as the value. A main.TableEntry contains a list of the main.EntryData objects associated with it. For instance, the main.TableEntry for :Post may have several main.EntryData, which differ in their sets of conditions. A main.EntryData contains a condition list (which uniquely identifies it) and a list of views which depend on it. More information on these structures is in their Java files, mainly in TableEntry.
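The nesting described above can be sketched as follows. This is a simplified illustration, not the code in main.TableEntry or main.EntryData: all class, field, and method names are hypothetical, conditions are modelled as plain strings, and two condition lists count as the same only if they are equal element by element. The lookup is deliberately coarse and returns every view registered under a label, whereas the real update logic may also take the conditions into account.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Hashtable;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for main.EntryData: a condition list plus dependent views.
class EntryDataSketch {
    final List<String> conditions;
    final List<String> views = new ArrayList<>();

    EntryDataSketch(List<String> conditions) {
        this.conditions = new ArrayList<>(conditions);
    }
}

// Hypothetical stand-in for main.TableEntry: all EntryData for one label.
class TableEntrySketch {
    final List<EntryDataSketch> entries = new ArrayList<>();

    // Reuses the entry with an equal condition list, or creates a new one.
    EntryDataSketch findOrCreate(List<String> conditions) {
        for (EntryDataSketch entry : entries) {
            if (entry.conditions.equals(conditions)) {
                return entry;
            }
        }
        EntryDataSketch created = new EntryDataSketch(conditions);
        entries.add(created);
        return created;
    }
}

// Hypothetical stand-in for main.DependencyTable: label -> TableEntry.
class DependencyTableSketch {
    final Map<String, TableEntrySketch> table = new Hashtable<>();

    // Records that viewName depends on the given label under the given conditions.
    void register(String label, List<String> conditions, String viewName) {
        EntryDataSketch data = table
                .computeIfAbsent(label, key -> new TableEntrySketch())
                .findOrCreate(conditions);
        if (!data.views.contains(viewName)) {
            data.views.add(viewName);
        }
    }

    // Coarse lookup: every view registered under the label, regardless of conditions.
    Set<String> viewsDependingOn(String label) {
        Set<String> result = new LinkedHashSet<>();
        TableEntrySketch entry = table.get(label);
        if (entry != null) {
            for (EntryDataSketch data : entry.entries) {
                result.addAll(data.views);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        DependencyTableSketch deps = new DependencyTableSketch();
        // View names and conditions below are made-up placeholders.
        deps.register("Post", Arrays.asList("cond1"), "viewA");
        deps.register("Post", Arrays.asList("cond2"), "viewB");
        System.out.println(deps.viewsDependingOn("Post"));
    }
}
```

With this layout, a change to a :Post node only requires looking up one key to find the candidate views to refresh.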

All other structures used in main.QueryParser.java have relevant commenting in the code itself.
