32de-python's Introduction

MetaExp: Interactive Explanation and Exploration of Large Knowledge Graphs

MetaExp is an open-source, interactive framework for graph exploration that can automatically discover hidden knowledge in large graph databases. It incorporates the domain knowledge of the user to define a unique and personalized view on the graph.

Contact and citation

Behrens, F., Bischoff, S., Ladenburger, P., Rückin, J., Seidel, L., Stolp, F., Vaichenker, M., Ziegler, A., Mottin, D., Aghaei, F., Müller, E., Preusse, M., Müller, N. & Hunger, M. (2018). MetaExp: Interactive Explanation and Exploration of Large Knowledge Graphs. WWW

Description

We present MetaExp, a system that assists the user during the exploration of large knowledge graphs, given two sets of initial nodes. At its core, MetaExp presents a small set of meta-paths to the user, which are sequences of relationships among node types. Such meta-paths do not overwhelm the user with complex structures, yet they preserve semantically rich relationships in a graph. MetaExp engages the user in an interactive procedure, which involves simple meta-path evaluations to infer a user-specific similarity measure. This similarity measure incorporates the domain knowledge and the preferences of the user, overcoming the fundamental limitations of previous methods based on local node neighborhoods or statically determined similarity scores. Our system provides a user-friendly interface for searching initial nodes and guides the user towards progressive refinements of the meta-paths. The system is demonstrated on three datasets: Freebase, a movie database, and a biological network.

Installation

To deploy our system, including Neo4j, the Neo4j graph algorithms component, the UI, and our server, install Docker on your system and run deployment/docker-deployment.sh. This installs a clean version from the alpha-dev and master branches and does not include your local code changes. To serve the API over SSL, set the environment variable METAEXP_HTTPS to true and provide api.crt and api.key in the https folder.

Development

We provide a collection of helper scripts that you might want to use when developing for this project.

Scripts

Build and run

To build your own local code, use deployment/build-*.sh /path/to/code (e.g. deployment/build-server.sh .). To run a single container, use deployment/run-*.sh [PORT]. By default, the Neo4j browser listens on port 7474, Bolt is available on port 7687, and our server listens on port 8000 on all hosts. If you start the additional Neo4j containers with run-neo4j-helmholtz.sh and run-neo4j-commerzbank.sh, they listen on the default ports +10 for the Helmholtz data and +20 for the Commerzbank data. All Neo4j containers are based on the neo4j-graph-algorithms image. To change the default port, specify the PORT parameter when running deployment/run-*.sh [PORT].

We use Redis to store our meta-paths. Start the container by executing deployment/run-redis.sh. Once the Redis container is running, open localhost:8000/test-import in your browser. This request fills the Redis store with the Helmholtz meta-paths.

Updating files in containers

If you want to update files in your container, use deployment/copy-to-container.sh [CONTAINER] [PATH/IN/CONTAINER]. The first parameter is the name or ID of the container to copy your updated files to. The second parameter is optional; specify it only if you have changed the path to the project files in the container. WARNING: This will overwrite any changes made in the container.

Tutorials for installing Docker: Mac, Windows and Ubuntu.

Usage

This is the server component of the MetaExp system.

Development

Logging Guideline

Use the MetaExp logger. For example, to equip the module Example with a logger, create a child logger with logging.getLogger('MetaExp.Example'). To use a logger for each class, define it as self.logger = logging.getLogger('MetaExp.{}'.format(__class__.__name__)).
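The guideline can be sketched as follows. This is a minimal illustration rather than code from the repository: the class Example is hypothetical, and only the logger name 'MetaExp' and the two getLogger expressions come from the guideline.

```python
import logging

# Parent logger; the name 'MetaExp' is taken from the guideline.
root_logger = logging.getLogger('MetaExp')
root_logger.setLevel(logging.DEBUG)

# Module-level child logger for a module called Example.
module_logger = logging.getLogger('MetaExp.Example')


class Example:
    """Hypothetical class, used only to illustrate the per-class logger."""

    def __init__(self):
        # Inside a method, __class__ refers to the class that defines the method.
        self.logger = logging.getLogger('MetaExp.{}'.format(__class__.__name__))

    def run(self):
        self.logger.debug('running')
```

Because logger names are hierarchical, both child loggers inherit the level and handlers configured on 'MetaExp'.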

Contributors

Freya Behrens, Sebastian Bischoff, Pius Ladenburger, Julius Rückin, Laurenz Seidel, Fabian Stolp, Michael Vaichenker and Adrian Ziegler.

License

This work is licensed under MIT License.

32de-python's Issues

Fix CORS origins check

Fix the check of the request origin for the API. At the moment it is a *, which is not secure:

CORS(app, supports_credentials=True, resources={r"/*": {"origins": "*"}})

Corresponding code was written but doesn't work.
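One possible shape for such a check is an explicit allow-list of origins. The sketch below uses only the standard library and is not the code referred to in this issue; the entries in ALLOWED_ORIGINS and both function names are hypothetical, since the real UI origins are not stated here.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; replace with the real UI origins.
ALLOWED_ORIGINS = {"http://localhost:3000", "https://metaexp.example.org"}

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_origin(origin):
    """Return scheme://host[:port] in lowercase, or None if it cannot be parsed."""
    try:
        parts = urlsplit(origin)
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return None
    if not parts.scheme or not parts.hostname:
        return None
    if port is None or port == DEFAULT_PORTS.get(parts.scheme):
        return "{}://{}".format(parts.scheme, parts.hostname)
    return "{}://{}:{}".format(parts.scheme, parts.hostname, port)


def is_allowed_origin(origin, allowed=ALLOWED_ORIGINS):
    """Check the value of an Origin header against the allow-list."""
    if not origin:
        return False
    normalized = normalize_origin(origin)
    return normalized is not None and normalized in allowed
```

A request whose Origin header fails this check would then not receive the CORS headers.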

Add test set in preference classification

Add an option to specify whether a test set (on which the performance of the preference classifier is measured) is used in the DomainScoring class.

  • Separate test examples in _extract_training_data_labels method
  • Do we also need a validation set?
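A hold-out split along these lines could look like the sketch below. The function split_train_test and its parameters are hypothetical and do not exist in DomainScoring; how the split would be wired into _extract_training_data_labels is left open.

```python
import random


def split_train_test(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold out a fraction of them as a test set."""
    if not 0.0 <= test_fraction < 1.0:
        raise ValueError("test_fraction must be in [0, 1)")
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Usage with placeholder (features, label) pairs.
rated = [([i, i + 1], i % 2) for i in range(10)]
train_examples, test_examples = split_train_test(rated, test_fraction=0.2)
```

A validation set could be carved out of the training part in the same way if hyperparameters need tuning.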

Development dependencies

We might want to think about adding development dependencies as some libraries might not be required in deployment (e.g. pytest-cov for test coverage). Therefore we would need to separate the dev and deploy environments slightly.

Parameterised neo4j bash scripts

Add parameters like path to data, memory settings etc. to the bash script starting neo4j. This is necessary when our system is used on other infrastructures.

Sync Docker Container with Code

AS A Developer
I WANT to be able to sync my changed project files into the docker container
SO THAT I don't have to rebuild the whole image to test every small change.

This could be done either via docker cp which simply copies the files into the container (this is not working for me on Mac) or by mounting a volume into the Docker container.

TypeError on login

[2018-04-04 07:39:10,274] ERROR in app: Exception on /login [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.5/dist-packages/flask_cors/extension.py", line 161, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.5/dist-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/32de-python/api/server.py", line 53, in login
    meta_path_loader = MetaPathLoaderDispatcher().get_loader(session['dataset'])
  File "/32de-python/util/meta_path_loader_dispatcher.py", line 50, in get_loader
    return self.dataset_to_loader[dataset]
TypeError: unhashable type: 'dict'
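The last frame shows a dict being used as a dictionary key. The sketch below reproduces that failure in isolation and shows one possible guard. The lookup table is a hypothetical stand-in for dataset_to_loader, and the assumption that the session value carries the dataset name under a 'name' key is not confirmed by the traceback.

```python
# Hypothetical stand-in for the dispatcher's lookup table.
dataset_to_loader = {"Helmholtz": "HelmholtzLoader"}


def get_loader(dataset):
    """Look up a loader, accepting either a dataset name or a dict describing it."""
    if isinstance(dataset, dict):
        # Assumption: the dict stores the dataset name under 'name'.
        dataset = dataset["name"]
    return dataset_to_loader[dataset]


# Reproduce the error from the traceback: dicts are unhashable,
# so they cannot be used as dictionary keys.
try:
    dataset_to_loader[{"name": "Helmholtz"}]
except TypeError as error:
    message = str(error)

loader = get_loader({"name": "Helmholtz"})
```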

Use semaphore for timing

Use a semaphore when calculating the time a user needs to rate one batch of meta-paths, instead of using the time and time_old entries in the session dict. The current approach can lead to the usual concurrency problems.
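A minimal sketch of this idea, assuming a single server process with multiple threads: a binary semaphore guards the read and update of the last timestamp so that two concurrent requests cannot interleave. The class BatchTimer is hypothetical and not part of the code base.

```python
import threading
import time


class BatchTimer:
    """Measures the time between consecutive rating batches."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # Binary semaphore: only one thread may read and update the timestamp.
        self._guard = threading.Semaphore(1)
        self._last = clock()

    def lap(self):
        """Return the seconds elapsed since the previous lap or since creation."""
        with self._guard:
            now = self._clock()
            elapsed = now - self._last
            self._last = now
            return elapsed
```

With several server processes, the state would have to live in a shared store instead, which this sketch does not cover.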

Use Docker Volumes

Use volumes for rated_datasets and notebooks so that our changes are persistent.

Reuse meta-path ratings

As a user I want to be able to reuse meta-path ratings and config settings
in order to compare different input sets based on the same user-defined similarity.

This could be done by using user accounts.

  • Login
  • Customized DB-settings
  • Relate meta-path-ratings to account

TypeError in similarity test

________________ SimilarityTestCase.test_similarity_calculation ________________

self = <tests.similarity_score_test.SimilarityTestCase testMethod=test_similarity_calculation>

    def test_similarity_calculation(self):
        test_paths = \
            [MetaPathRating(MetaPath(nodes=[1, 2], edges=[3]), 3, 1), MetaPathRating(MetaPath(nodes=[1, 2, 1], edges=[3, 4]), 1, 3),
             MetaPathRating(MetaPath(nodes=[1, 3, 2], edges=[1, 1]), 5, 2), MetaPathRating(MetaPath(nodes=[1, 3, 2], edges=[1, 1]), 4, 7),
             MetaPathRating(MetaPath(nodes=[1, 3, 2], edges=[1, 1]), 6, 3)]
    
>       similarity = SimilarityScore.calculate_similarity(test_paths)
E       TypeError: calculate_similarity() missing 1 required positional argument: 'meta_path_ratings'

tests/similarity_score_test.py:18: TypeError

@juliusrueckin That's from your changes, isn't it?
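One plausible cause of this message is that calculate_similarity is an instance method that the test calls on the class, so the list of ratings is bound to self. The definition of SimilarityScore is not shown in this issue, so the class below is a hypothetical stand-in that only reproduces the pattern.

```python
class SimilarityScoreSketch:
    """Hypothetical stand-in; the real SimilarityScore may look different."""

    def calculate_similarity(self, meta_path_ratings):
        # Placeholder computation: just count the ratings.
        return len(meta_path_ratings)


ratings = [3, 1, 5]

# Calling the method on the class binds the list to 'self',
# so 'meta_path_ratings' is reported as missing.
try:
    SimilarityScoreSketch.calculate_similarity(ratings)
except TypeError as error:
    message = str(error)

# Calling it on an instance works.
count = SimilarityScoreSketch().calculate_similarity(ratings)
```

If this is the cause, either the test needs an instance or the method needs to be declared as a static or class method.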

Setup Coveralls

I have tried setting up Coveralls using the Python coveralls plugin – but it seems to have some difficulties running in Docker. Maybe someone can look over it.

There is a branch coveralls which should already contain some configuration. The repository is already connected to Coveralls.io. Should you have trouble seeing the Repo/Organization there try to make your membership in the Organization public (More on how to solve this here – especially around the part where the issue was originally closed).

Setup and use logging for server

The server does not support debug logging yet. This should be implemented.

AS A Developer I WOULD like to be able to view debug outputs in a log file of a running server SO THAT access to the stdout is not necessary and debugging is possible.
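A minimal sketch of what this could look like with the standard logging module. The function configure_file_logging, the log format, and the file path are assumptions; only the logger name 'MetaExp' is taken from the logging guideline of this project.

```python
import logging


def configure_file_logging(log_path, level=logging.DEBUG):
    """Attach a file handler to the 'MetaExp' logger and return the handler."""
    logger = logging.getLogger('MetaExp')
    logger.setLevel(level)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(name)s %(levelname)s %(message)s'))
    logger.addHandler(handler)
    return handler
```

Calling configure_file_logging('server.log') once at server start-up would route the debug output of all 'MetaExp.*' child loggers to that file, so access to stdout is not needed.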
