OWL2Vec-Star

OWL2Vec*: Embedding OWL ontologies

Features

OWL2Vec* v0.2.0 exposes a CLI with two subcommands after installation, one for each of its two main programs. You can also run the two original Python programs without installing the package (see the requirements in setup.py).

Installation command:

$ make install

Standalone

This command embeds a single ontology. It is configured through the configuration file default.cfg; see the examples and comments in default.cfg for usage.

Running command:

$ owl2vec_star standalone --config_file default.cfg

Running program:

$ python OWL2Vec_Standalone.py --config_file default.cfg

Note: Unlike the experimental code, the standalone command implements all OWL ontology-related procedures in Python with Owlready, but it also lets the user supply pre-calculated annotations/axioms/entities/projection to generate the corpus.

Standalone Multi

This command embeds multiple ontologies into one embedding model by merging the documents from all of them. One example use case is embedding all the conference-related ontologies of the OAEI conference track at once.

Running command:

$ owl2vec_star standalone-multi --config_file default_multi.cfg

Running program:

$ python OWL2Vec_Standalone_Multi.py --config_file default_multi.cfg

Note: Unlike the standalone command, this command for multiple ontologies does NOT allow pre-calculated or external annotations/axioms/entities/projection.

Accessing Embeddings

The embedding model is saved in $embedding_dir (or $cache_dir/output if $embedding_dir is not set). The IRI vector of a class can be accessed as follows:

>> import gensim
>> from owlready2 import *
>> model = gensim.models.Word2Vec.load(word2vec_embedding_file)
>> onto = get_ontology(onto_file).load()
>> classes = list(onto.classes())
>> c = classes[0]
>> c.iri in model.wv.index_to_key
>> iri_v = model.wv.get_vector(c.iri)
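
The save-location rule mentioned at the start of this section can be sketched with the standard-library configparser. This is only an illustration: the section names BASIC and DOCUMENT and the helper resolve_embedding_path are assumptions made for this example, so check default.cfg for the actual layout.

```python
import configparser
import os


def resolve_embedding_path(cfg_text):
    """Return embedding_dir if set, otherwise <cache_dir>/output.

    The section names BASIC and DOCUMENT are assumptions for this sketch.
    """
    config = configparser.ConfigParser()
    config.read_string(cfg_text)
    # An absent or empty embedding_dir triggers the cache_dir fallback.
    embedding_dir = config.get("BASIC", "embedding_dir", fallback="").strip()
    if embedding_dir:
        return embedding_dir
    cache_dir = config.get("DOCUMENT", "cache_dir")
    return os.path.join(cache_dir, "output")


# Hypothetical configuration in which embedding_dir is not set.
example_cfg = """
[BASIC]
[DOCUMENT]
cache_dir = ./cache
"""
embedding_path = resolve_embedding_path(example_cfg)
```

The resulting path is what would be passed to gensim.models.Word2Vec.load in the listing above.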

The class word vector (the average of the vectors of the words in the class label, defined by e.g. rdfs:label) can be computed in a similar way:

>> import re
>> import numpy as np
>> from nltk import word_tokenize
>> label = c.label[0]
>> text = ' '.join([re.sub(r'https?:\/\/.*[\r\n]*', '', w, flags=re.MULTILINE) for w in label.lower().split()])
>> words = [token.lower() for token in word_tokenize(text) if token.isalpha()]
>> n = 0
>> word_v = np.zeros(model.vector_size)
>> for word in words:
       if word in model.wv.index_to_key:
           word_v += model.wv.get_vector(word)
           n += 1
>> word_v = word_v / n if n > 0 else word_v

Note: the class IRI vector and the class word vector can be used independently or concatenated.
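
The averaging and concatenation described above can be wrapped in two small helpers. This is a minimal sketch and not part of the OWL2Vec* package: the function names average_word_vector and combine_vectors are hypothetical, and vectors is assumed to be any mapping that supports `in` and `[]` lookups, such as a plain dict or gensim's model.wv. The toy vectors stand in for a trained model.

```python
import numpy as np


def average_word_vector(words, vectors, dim):
    """Average the vectors of the words found in `vectors`.

    Returns a zero vector of length `dim` when no word is known.
    """
    known = [np.asarray(vectors[w], dtype=float) for w in words if w in vectors]
    if not known:
        return np.zeros(dim)
    return np.mean(known, axis=0)


def combine_vectors(iri_v, word_v):
    """Concatenate the IRI vector and the word vector into one feature vector."""
    return np.concatenate([iri_v, word_v])


# Toy stand-in for model.wv (assumption: real vectors come from the trained model).
toy_vectors = {
    "healthy": np.array([1.0, 0.0]),
    "food": np.array([0.0, 1.0]),
}
word_v = average_word_vector(["healthy", "food", "unknown"], toy_vectors, dim=2)
iri_v = np.array([0.5, 0.5])  # placeholder for model.wv.get_vector(c.iri)
combined = combine_vectors(iri_v, word_v)
```

With a trained model, vectors would be model.wv and dim would be model.vector_size; the combined vector has the summed length of its two parts.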

Publications

Main Reference

  • Jiaoyan Chen, Pan Hu, Ernesto Jimenez-Ruiz, Ole Magnus Holter, Denvar Antonyrajah, and Ian Horrocks. OWL2Vec*: Embedding of OWL ontologies. Machine Learning, Springer, 2021. [PDF] [@Springer] [Collection] [Codes in package or folder]

Applications with OWL2Vec*

  • Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks, Denvar Antonyrajah, Ali Hadian, Jaehun Lee. Augmenting Ontology Alignment by Semantic Embedding and Distant Supervision. European Semantic Web Conference, ESWC 2021. [PDF] [LogMap Matcher work]
  • Ashley Ritchie, Jiaoyan Chen, Leyla Jael Castro, Dietrich Rebholz-Schuhmann, Ernesto Jiménez-Ruiz. Ontology Clustering with OWL2Vec*. DeepOntoNLP ESWC Workshop 2021. [PDF]

Preliminary Publications

  • Ole Magnus Holter, Erik Bryhn Myklebust, Jiaoyan Chen and Ernesto Jimenez-Ruiz. Embedding OWL ontologies with OWL2Vec. International Semantic Web Conference. Poster & Demos. 2019. [PDF]
  • Ole Magnus Holter. Semantic Embeddings for OWL 2 Ontologies. MSc thesis, University of Oslo. 2019. [PDF] [GitLab]

Case Studies

Data and code for class membership prediction on the Healthy Lifestyles (HeLis) ontology, and for class subsumption prediction on the food ontology FoodOn and the Gene Ontology (GO), are under the folder case_studies/.

Credits

Code under owl2vec_star/rdf2vec/, which mainly implements walking strategies over RDF graphs, is derived from pyRDF2Vec (version 0.0.3, last accessed 03/2020) with revisions.

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template. Many thanks to Vincenzo Cutrona for preparing this package.
