Introducing RGCNExplainer: The relations that make a difference in a relational world

RGCNExplainer is an extension of GNNExplainer tailored to relational graphs. Its primary purpose is to explain predictions made by an RGCN on the node classification task. This work also includes experiments on injecting knowledge into explanations through various mask-initialization methods.

Installation

To set up the necessary conda environment and install the required dependencies, follow these steps:

  1. Create a conda environment and install the dependencies (make sure setup.py and the kgbench folder are located in the root directory of your project):

conda create -n RGCNExplainer python=3.9.16
conda activate RGCNExplainer
pip install -r requirements.txt
pip install .

  2. If you intend to run hyperparameter-tuning experiments with WANDB (Weights and Biases), export your API key and log in:

export WANDB_API_KEY='YOUR_API_KEY'
wandb login

Training your RGCN model

To train your RGCN model, execute the following command, providing the name of the knowledge graph dataset as an argument:

python3 RGCNExplainer/rgcn.py 'aifb'

RGCN-Explainer

The pipeline of RGCNExplainer:

[Figure: RGCNExplainer_model, the RGCNExplainer pipeline]

To explain the RGCN prediction for one or more nodes, provide the following arguments:

  1. Name of the dataset (in the given examples: 'aifb', 'amplus', 'dbo_gender', 'mdgenre').
  2. Mask initialization (choose from: 'normal', 'overall_frequency', 'relative_frequency', 'inverse_relative_frequency', 'domain_frequency', 'range_frequency', 'Domain_Knowledge').
  3. If the mask initialization method is 'Domain_Knowledge', provide a relation ID as an integer.
  4. If using 'Domain_Knowledge' with baseline domain knowledge, choose between 'forward' and 'backward'.
  5. Explain all nodes: --explain_all.
  6. Explain one node: --explain_one (if True, explain a random node).
  7. Explain a stratified random sample of nodes: --random_sample (if True, explain a stratified per-class random sample of nodes).
  8. If using --random_sample, specify the number of samples per class with --num_samples_per_class int.
  9. If you want to sweep over the different possible hyperparameters: --sweep
  10. If you want to exclude the most frequent relation (typically 'rdf:type') from the explanation: --kill_most_freq_rel.
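As a rough illustration of what a frequency-based mask initialization does, the sketch below gives edges of more frequent relation types larger initial mask values. The function name and tensor layout are hypothetical, not the repository's actual implementation:

```python
import numpy as np

def relative_frequency_init(edge_types, num_relations):
    # Hypothetical sketch: assign each edge an initial mask value equal
    # to the relative frequency of its relation type in the subgraph.
    counts = np.bincount(edge_types, minlength=num_relations)
    freq = counts / counts.sum()
    return freq[edge_types]  # one initial mask value per edge

# Toy subgraph: relation 0 occurs three times, relation 1 twice, relation 2 once.
edge_types = np.array([0, 0, 0, 1, 1, 2])
mask = relative_frequency_init(edge_types, num_relations=3)
print(mask)
```

Under this reading, 'inverse_relative_frequency' would flip these values so that edges of rarer relations start with larger mask entries.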

For example, to obtain an explanation for one node:

python RGCNExplainer/mainRGCN_explainer.py 'aifb' 'normal' --explain_one

Or to get explanations for a stratified sample of nodes:

python RGCNExplainer/mainRGCN_explainer.py 'aifb' 'normal' --random_sample --num_samples_per_class 5  

Hyperparameter Configuration

To adjust hyperparameter settings, refer to the configuration file:

RGCNExplainer/config.py

Result Analysis

For in-depth analysis, including a table of explanation metrics and a barplot comparing the relation distribution of the full graph with that of the explanation subgraph, use the following commands:

For a single explanation:

python RGCNExplainer/Result_analysis_one_node.py 'aifb' --node_idx 5731

For analysis of explanation results at the class level:

python RGCNExplainer/Result_analysis_per_classes.py 'aifb'
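To make the relation-distribution comparison concrete, here is a minimal, self-contained sketch. The `relation_distribution` helper and the toy data are hypothetical; the actual scripts read the explanation results produced by the explainer:

```python
from collections import Counter

def relation_distribution(edge_types):
    # Relative frequency of each relation type in a list of edges.
    counts = Counter(edge_types)
    total = sum(counts.values())
    return {rel: n / total for rel, n in counts.items()}

# Hypothetical relation-type lists for the full neighbourhood of a node
# and for the explanation subgraph selected by the mask.
full = [0, 0, 0, 1, 1, 2, 2, 2]
explanation = [1, 2, 2]

print(relation_distribution(full))
print(relation_distribution(explanation))
```

Comparing the two dictionaries (or plotting them side by side) shows which relation types the explainer over- or under-selects relative to the full graph.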

Relation Attribution

Another method introduced in this work is relation attribution, which measures the impact of each relation type on RGCN model performance. Two modalities are explored: 'forward' predicts the node class using only edges of a single relation type, while 'backward' excludes one relation type at a time and predicts using the edges of all remaining relation types.
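A minimal sketch of the edge selection behind the two modalities (the helper name and array layout are assumptions, not the interface of Relation_Attribution.py):

```python
import numpy as np

def edges_for_modality(edge_types, relation, modality):
    # Boolean mask over edges for relation attribution:
    # 'forward'  -> keep only edges of the given relation type;
    # 'backward' -> drop that relation type, keep all other edges.
    if modality == "forward":
        return edge_types == relation
    if modality == "backward":
        return edge_types != relation
    raise ValueError("modality must be 'forward' or 'backward'")

edge_types = np.array([0, 1, 1, 2, 0])
print(edges_for_modality(edge_types, 1, "forward"))
print(edges_for_modality(edge_types, 1, "backward"))
```

The model is then evaluated on each reduced edge set; the change in performance per relation indicates how much that relation contributes to the prediction.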

To perform experiments with the relation attribution method, run the code with the dataset name and chosen modality as arguments. For example:

python3 RGCNExplainer/Relation_Attribution.py 'aifb' 'backward'

DATASET

The experiments conducted in this work utilized datasets introduced in KGBENCH.

To use RGCNExplainer with a different knowledge graph, the dataset must be converted to the KGBENCH format following the instructions found in:

datasets-conversion/scripts/README.md

A preliminary analysis of the dataset can be conducted by using the script in:

RGCNExplainer/statistics_datasets.ipynb

PAPER

The Master Thesis associated with this repository is available as RGCNExplainer.pdf.

For any inquiries or further information, please refer to the associated paper and feel free to open an Issue or contact the author.

Contributors

traopia
