gdbviews's Introduction

GDBViews

Code for views in a graph database system (Neo4j), written as part of my Master's thesis, along with some instructions and notes.

main.Main: The main program, which opens a simple terminal for creating and using views with the extended language. Contains the node and edge tables.

main.QueryParser: The listener that walks the parse tree generated from View.g4. It sets up meta-data and updates the dependency table when parsing a query. It also uses main.TableEntry and main.EntryData, which make up the dependency table.

main.Neo4jGraphConnector: The interface for the connection between the application and the local Neo4j database. A config file is required to create the database. Contains methods which return node and edge identifiers.

Test files: All test methods are included in the main program. For materialized views, queries are rewritten automatically using node identifiers. For non-materialized views there is no automatic rewriting, but the rewriting is described in text. Likewise, the baseline queries are hard-coded.

jess: All methods and files that use or mention Jess date from when we explored discrimination networks (rule-based systems) as a way to handle materialized views, since they offer incremental view updates. This includes GraphEngine, NodeEnum, EdgeEnum, and any rule files in the /jess/ directory. However, the space complexity was too high and we abandoned the idea.

gen: This is the output destination for View.g4. View.g4 defines the language extension used for the project, and its output must be in gen for the rest of the code to work. If you change the language you must recompile the grammar, and because main.QueryParser.java extends one of the generated classes (ViewBaseListener), you will need to update it accordingly. Check the ANTLR docs on how to do this.

Instructions for running:

No command-line arguments are used. I was lazy and changed the database sizes in the code itself (in main, change the size variable for whichever database you are using). A config file for the database sizes is required (see ../test/config for an example). The reference to this config file can be found in main.Neo4jGraphConnector.java.

To test view creation and usage easily, a set of files is required. You can still test manually with the terminal, but if you have a set of view definitions or a set of view-use queries, it is easier to run them automatically. See /test/initFileExample.txt for an example of a view creation file, and ViewUses2.txt for a view usage file. Note that * is the comment character in these files. The method initFile2 has an option to write the identifiers to disk (for later use when testing view usage and maintenance): this writes to a directory within /test/SIZE/, where SIZE is the size of your database (you may need to create the directory first).
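As a rough illustration of the comment convention, the sketch below filters an init or use file down to its query lines. It is not the repository's code: the class InitFileSketch and the method readQueries are hypothetical names, and it assumes one query per line and that a comment line starts with * (after optional leading whitespace).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the repository.
final class InitFileSketch {

    // Keeps every line that is neither blank nor a comment.
    // Assumption: a comment is a line whose first non-blank character is '*'.
    static List<String> readQueries(List<String> lines) {
        List<String> queries = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("*")) {
                continue; // skip blank lines and comments
            }
            queries.add(trimmed);
        }
        return queries;
    }

    // Convenience overload that reads the lines from a file on disk.
    static List<String> readQueries(Path file) throws IOException {
        return readQueries(Files.readAllLines(file));
    }
}
```

Calling InitFileSketch.readQueries(Path.of(...)) on an init file would then give the list of statements to feed to the parser one at a time.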

To load the data back, use the loadTablesFromFiles(size) method to re-populate the node and edge tables. To re-populate the dependency tables, use createMetaInfoFromQueries(path_to_init_file) (an example is commented out within the main method). Then you can run testUses(size), which you may need to modify to adjust the paths to your init/use files.

Neo4j Enterprise 4.0.4 libraries should be in /lib/. You also need the Jess libraries (jess.jar and jsr94.jar) along with ANTLR (currently using antlr-4.8-complete.jar).

Data structure info:

main.Main: nodeTable and edgeTable use the view name as the key and store the set of node or edge identifiers returned by that view as the value. pathTable is used but ... I do not think it does anything anymore.
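The shape of these two tables can be sketched as follows. This is an assumption-laden illustration rather than the actual fields in main.Main: the class ViewTablesSketch and its methods are hypothetical, and identifiers are assumed to be Long values.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the node and edge tables, not the repository's code.
final class ViewTablesSketch {
    // view name -> identifiers returned by that view (Long is an assumption)
    final Map<String, Set<Long>> nodeTable = new HashMap<>();
    final Map<String, Set<Long>> edgeTable = new HashMap<>();

    // Records (or overwrites) the node identifiers materialized for a view.
    void storeNodes(String viewName, Set<Long> ids) {
        nodeTable.put(viewName, new HashSet<>(ids));
    }

    // Records (or overwrites) the edge identifiers materialized for a view.
    void storeEdges(String viewName, Set<Long> ids) {
        edgeTable.put(viewName, new HashSet<>(ids));
    }

    // Returns the stored node identifiers, or an empty set for an unknown view.
    Set<Long> nodesOf(String viewName) {
        return nodeTable.getOrDefault(viewName, Collections.emptySet());
    }

    // Returns the stored edge identifiers, or an empty set for an unknown view.
    Set<Long> edgesOf(String viewName) {
        return edgeTable.getOrDefault(viewName, Collections.emptySet());
    }
}
```

A query that uses a view can then look up the view name and restrict itself to the stored identifiers instead of re-evaluating the view definition.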

main.QueryParser: This walks down the tree and executes enter/exit methods based on which components are entered. See the ANTLR documentation for details. I set up meta-data during these enter/exit methods and keep the dependencyTable updated if the query is a view creation. For view usage, variables are set up (the symbols used for the view use and the set of conditions) so that main.Main knows which set of identifiers it can pull from the node or edge tables. For view updates, the dependency table is referenced and a set of outdated views is returned to main.Main. Most steps are commented with details and logic.

main.DependencyTable is a hashtable with a graph component label as the key (Person, PARENT_OF, Post, etc.) and a main.TableEntry object as the value. main.TableEntry contains a list of the main.EntryData objects associated with it. For instance, the main.TableEntry for :Post may have several main.EntryData objects, which differ in their set of conditions. main.EntryData contains a condition list (which uniquely identifies it) and a list of views which depend on it. More info on these structures is in their Java files, mainly in TableEntry.
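The nesting described above can be sketched roughly as below. This is a simplified stand-in, not the code in main.TableEntry or main.EntryData: all field and method names are hypothetical, conditions are assumed to be plain strings compared in order, and the lookup at the end is a coarse approximation that ignores conditions entirely.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the dependency table, not the repository's code.
final class DependencySketch {

    // One condition list on a label, plus the views that depend on it.
    static final class EntryData {
        final List<String> conditions;               // identifies this entry
        final List<String> views = new ArrayList<>(); // dependent view names

        EntryData(List<String> conditions) {
            this.conditions = new ArrayList<>(conditions);
        }
    }

    // All EntryData objects recorded for a single label.
    static final class TableEntry {
        final List<EntryData> entries = new ArrayList<>();

        // Reuses the entry with the same condition list, or adds a new one.
        EntryData findOrCreate(List<String> conditions) {
            for (EntryData e : entries) {
                if (e.conditions.equals(conditions)) {
                    return e;
                }
            }
            EntryData created = new EntryData(conditions);
            entries.add(created);
            return created;
        }
    }

    // label (e.g. Person, PARENT_OF, Post) -> TableEntry
    final Map<String, TableEntry> table = new Hashtable<>();

    // Records that viewName depends on label under the given conditions.
    void register(String label, List<String> conditions, String viewName) {
        EntryData data = table.computeIfAbsent(label, k -> new TableEntry())
                              .findOrCreate(conditions);
        if (!data.views.contains(viewName)) {
            data.views.add(viewName);
        }
    }

    // Coarse lookup: every view that depends on the label, whatever its conditions.
    Set<String> viewsDependingOn(String label) {
        Set<String> result = new LinkedHashSet<>();
        TableEntry entry = table.get(label);
        if (entry != null) {
            for (EntryData d : entry.entries) {
                result.addAll(d.views);
            }
        }
        return result;
    }
}
```

For example, registering v1 and v2 on Post with the same (made-up) condition and v3 on Post with no conditions yields one TableEntry for Post holding two EntryData objects. A change to a Post node would then flag v1, v2, and v3 as candidates for being outdated under this coarse lookup.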

All other structures used in main.QueryParser.java have relevant commenting in the code itself.

gdbviews's People

Contributors

yutingy

gdbviews's Issues

Certain view names cause bugs

For testing purposes, give the views names that are easy to distinguish (v1, v2, etc.). Do not use names like "match" or "return", because they cause bugs when the query string is split during query rewriting.
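One way to guard against this in test scripts is to reject such names up front. The sketch below is purely illustrative and not part of the repository: ViewNameCheckSketch and isSafeViewName are hypothetical names, the keyword list is a small hand-picked subset of Cypher keywords, and only exact case-insensitive matches are checked, which may not cover every name that trips the string splitting.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical guard for view names, not part of the repository.
final class ViewNameCheckSketch {
    // Assumption: a small, incomplete subset of Cypher keywords.
    static final Set<String> RESERVED = new HashSet<>(Arrays.asList(
            "match", "return", "where", "with", "create", "delete", "set", "merge"));

    // True if the name is non-empty and is not one of the listed keywords.
    static boolean isSafeViewName(String name) {
        return name != null
                && !name.isEmpty()
                && !RESERVED.contains(name.toLowerCase(Locale.ROOT));
    }
}
```

With this check, names such as v1 or v2 pass, while match and RETURN are rejected.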
