
QAmp

Svitlana Vakulenko, Javier D. Fernandez, Axel Polleres, Maarten de Rijke and Michael Cochez. Message Passing for Complex Question Answering over Knowledge Graphs. CIKM. 2019

Requirements

  • Python 3.6

  • tensorflow==1.11.0

  • keras==2.2.4

  • pyHDT (for accessing the DBpedia Knowledge Graph)

  • elasticsearch==5.5.3 (for indexing entities and predicate labels of the Knowledge Graph)

  • pymongo (for storing the LC-QuAD dataset)

  • flask (for the API)

Datasets

  • LC-QuAD: 5,000 pairs of questions and SPARQL queries

Setup

Setting up the environment is not trivial. You need to:

  1. Create a virtual environment and install all dependencies (to install CUDA, TensorFlow, Keras and friends, follow https://medium.com/@naomi.fridman/install-conda-tensorflow-gpu-and-keras-on-ubuntu-18-04-1b403e740e25):
conda create -n kbqa python=3.6 pip
conda activate kbqa
pip install -r requirements.txt
  2. Install the HDT API:
git clone https://github.com/webdata/pyHDT.git
cd pyHDT/
./install.sh
  3. Download the DBpedia 2016-04 English HDT file and its index from http://www.rdfhdt.org/datasets/
  4. Follow the instructions in https://github.com/svakulenk0/hdt_tutorial to extract the lists of entities (dbpedia201604_terms.txt) and predicates
  5. Index the entities and predicates into ElasticSearch
  6. Download the LC-QuAD dataset from http://lc-quad.sda.tech
  7. Import the LC-QuAD dataset into MongoDB:
sudo service mongod start
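The MongoDB import step can be sketched with pymongo roughly as follows. The file name, database name, and collection name below are assumptions for illustration, not the repository's actual choices:

```python
import json


def load_lcquad(path):
    """Read an LC-QuAD JSON dump (a list of question records) from disk."""
    with open(path) as f:
        return json.load(f)


def import_lcquad(questions, db_name="kbqa", collection_name="lcquad"):
    """Insert the question records into a local MongoDB instance.

    Requires `mongod` to be running (see the step above).
    """
    from pymongo import MongoClient
    coll = MongoClient("localhost", 27017)[db_name][collection_name]
    coll.insert_many(questions)
    return coll.count_documents({})
```

With a local MongoDB running, something like `import_lcquad(load_lcquad("train-data.json"))` would load the downloaded split (the file name depends on which LC-QuAD dump you fetched).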
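The "Index entities and predicates into ElasticSearch" step can be sketched with the elasticsearch-py bulk helper. The index name, type name, label-derivation rule, and document layout below are assumptions for illustration; the repository's util/index.py defines the actual mapping:

```python
def label_actions(terms_path, index_name="dbpedia201604e", doc_type="terms"):
    """Yield one bulk-index action per URI in the extracted terms file,
    deriving a searchable label from the URI's last path segment."""
    with open(terms_path) as f:
        for i, line in enumerate(f):
            uri = line.strip().strip("<>")
            if not uri:
                continue
            # e.g. http://dbpedia.org/resource/Barack_Obama -> "barack obama"
            label = uri.rsplit("/", 1)[-1].replace("_", " ").lower()
            yield {"_index": index_name, "_type": doc_type, "_id": i,
                   "_source": {"id": uri, "label": label}}


def index_terms(terms_path):
    """Push all actions into a local ElasticSearch 5.x instance."""
    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import bulk
    es = Elasticsearch(["localhost:9200"])
    return bulk(es, label_actions(terms_path))
```

The same pattern would apply to the predicates list, with a separate index.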

Run

See the notebooks in this repository for usage examples.

Benchmark

python final_benchmark_results.py

Citation

@inproceedings{DBLP:conf/cikm/VakulenkoGPRC19,
  author    = {Svitlana Vakulenko and
               Javier David Fernandez Garcia and
               Axel Polleres and
               Maarten de Rijke and
               Michael Cochez},
  title     = {Message Passing for Complex Question Answering over Knowledge Graphs},
  booktitle = {Proceedings of the 28th {ACM} International Conference on Information
               and Knowledge Management, {CIKM} 2019, Beijing, China, November 3-7,
               2019},
  pages     = {1431--1440},
  year      = {2019},
  url       = {https://doi.org/10.1145/3357384.3358026},
  doi       = {10.1145/3357384.3358026},
  timestamp = {Mon, 04 Nov 2019 11:09:32 +0100}
}


kbqa's Issues

Prediction on a custom dataset

Hi,
I would like to ask whether the above solution can be effectively leveraged for a custom-built dataset. If yes, could you share the pipeline to follow to build a custom query-based KG?
Thanks

'hdt.HDTDocument' object has no attribute 'configure_hops'

Hi,
In the notebook 2_entities_KBQA I got errors when trying to run this function:

def evaluate_entity_ranking(_e_spans, indices, top_n):
    '''
    Estimate ranking accuracy:
    _e_spans <list> entity spans extracted for each question
    indices <iterable> indices of the sample questions pool
    top_n <int> threshold for the number of top entities
    '''
    n_correct_entities, n_correct_entities_1hop = 0, 0
    n_correct_answers_1hop = 0
    # match entities
    for i in indices:
        top_e_ids = []
        
        # entities index lookup
        for span in _e_spans[i]:
            for match in e_index.match_label(span, top=top_n):
                top_e_ids.append(match['_source']['id'])
        
        if set(correct_entities_ids[i]).issubset(set(top_e_ids)):
            n_correct_entities += 1
        
        # extract a subgraph for top entities
        kg = HDTDocument(hdt_path+hdt_file)
        # all predicates: 1 hop
        kg.configure_hops(1, [], namespace, True)
        entities, _, _ = kg.compute_hops(top_e_ids)
        if set(correct_entities_ids[i]).issubset(set(entities)):
            n_correct_entities_1hop += 1
        if set(correct_answers_ids[i]).issubset(set(entities)):
            n_correct_answers_1hop += 1
        kg.remove()

The HDTDocument class from the hdt package doesn't have the attributes configure_hops and compute_hops.
Could you please provide some information about these two methods?
Thank you.
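For context, `e_index.match_label` in the snippet above is a helper defined in the repository's notebooks that queries the ElasticSearch label index. A rough sketch of what such a lookup might do (the query shape, index name, and function signature are guesses for illustration, not the repository's actual code):

```python
def build_label_query(span, top=20):
    """Build an ElasticSearch query ranking indexed labels against a question span."""
    return {"size": top, "query": {"match": {"label": span}}}


def match_label(es, span, top=20, index="dbpedia201604e"):
    """Return the top-ranked label documents for a text span.

    `es` is an elasticsearch-py client connected to the label index.
    """
    res = es.search(index=index, body=build_label_query(span, top))
    return res["hits"]["hits"]
```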

Indexing procedure

Hi! What is the procedure for the 'Index entities and predicates into ElasticSearch' step? util/index.py requires an entity-frequency file; is that something we're supposed to create from dbpedia2016-04en.hdt and then feed into it? Thank you.
