
convex's People

Contributors

kartavyakothari, philippchr, romainclaret


convex's Issues

Where to download the dataset

Hi,

I have downloaded the Wikidata2018_09_11 dump. Where can I download the ConvQuestions question-answer pairs? I clicked the buttons shown in the screenshot below, but they did not work.
[Screenshot: Screen Shot 2022-01-26 at 1:02:16 pm]

Different knowledge bases

Hi,

I played a bit with your benchmark. I am getting exactly the same per-question results across all domain options, even with a different number of frontiers and, in particular, with different knowledge bases from http://www.rdfhdt.org/datasets/.

I tried with Python 2.7 and Python 3.7.

Additionally, I am not getting the same results as in your paper :-(

Thank you for your time,
Best regards

Custom knowledge graph

Hi,

We have built a custom knowledge graph. How can I use this repository with our own graph? Currently we have it stored in Neo4j.
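Since this repository reads the knowledge graph from an HDT file, a Neo4j graph would first need to be exported as RDF triples and then compiled to HDT (e.g. with the rdf2hdt tool from hdt-cpp). A minimal sketch of serializing exported (subject, predicate, object) rows into N-Triples; the example IRIs and the `rows` data are made up for illustration:

```python
def to_ntriples(rows):
    """Serialize (subject, predicate, object) tuples into N-Triples lines.

    Each element is assumed to already be a full IRI; literal objects would
    need their own quoting and datatype handling, which this sketch omits.
    """
    lines = []
    for s, p, o in rows:
        lines.append(f"<{s}> <{p}> <{o}> .")
    return "\n".join(lines)

# Hypothetical rows, e.g. fetched from Neo4j via a Cypher query
rows = [
    ("http://example.org/e/Q1", "http://example.org/p/P1", "http://example.org/e/Q2"),
]
print(to_ntriples(rows))
```

The resulting .nt file can then be converted to HDT and pointed to from settings.json in place of the Wikidata dump.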

pip install hdt does not work with Python 3.7 on macOS

When I run "pip install hdt" in my terminal, I get the following error (pybind11 is already installed):

Collecting hdt
  Using cached hdt-2.3.tar.gz (229 kB)
Collecting pybind11==2.2.4
  Using cached pybind11-2.2.4-py2.py3-none-any.whl (145 kB)
Building wheels for collected packages: hdt
  Building wheel for hdt (setup.py) ... error
  ERROR: Command errored out with exit status 1:
command: /usr/local/anaconda3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/dc/dd3sxplx0h97kmd_xhrqx2h40000gn/T/pip-install-zbfuwsf7/hdt/setup.py'"'"'; __file__='"'"'/private/var/folders/dc/dd3sxplx0h97kmd_xhrqx2h40000gn/T/pip-install-zbfuwsf7/hdt/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/dc/dd3sxplx0h97kmd_xhrqx2h40000gn/T/pip-wheel-73maa_ez
       cwd: /private/var/folders/dc/dd3sxplx0h97kmd_xhrqx2h40000gn/T/pip-install-zbfuwsf7/hdt/
  Complete output (45 lines):
  /usr/local/anaconda3/lib/python3.7/site-packages/setuptools/dist.py:724: UserWarning: Module pybind11 was already imported from /usr/local/anaconda3/lib/python3.7/site-packages/pybind11/__init__.py, but /private/var/folders/dc/dd3sxplx0h97kmd_xhrqx2h40000gn/T/pip-install-zbfuwsf7/hdt/.eggs/pybind11-2.2.4-py3.7.egg is being added to sys.path
    pkg_resources.working_set.add(dist, replace=True)

JSON files in data folder

Hi Philipp,

Besides the Wikidata dump, could you please provide the entity and predicate dictionaries, i.e. the following JSON files?

import json

# identifier_predicates
with open("data/identifier_predicates.json", "r") as data:
    identifier_predicates = json.load(data)
# label_dict
with open("data/label_dict.json", "r") as data:
    label_dict = json.load(data)
# predicate_frequencies_dict
with open("data/predicate_frequencies_dict.json", "r") as data:
    predicate_frequencies_dict = json.load(data)
# entity_frequencies_dict
with open("data/entity_frequencies_dict.json", "r") as data:
    entity_frequencies_dict = json.load(data)
# statements_dict
with open("data/statements_dict.json", "r") as data:
    statements_dict = json.load(data)

Best regards
Sirui

Running problem

Hi,
I ran into a problem while running the code and could not obtain the results.txt file.
I am very confused by the following output. Could you provide some hints?

Predicate Bitmap in 1 min 59 sec 342 ms 10 us
Count predicates in 7 min 47 sec 408 ms 825 us
Count Objects in 8 min 20 sec 819 ms 140 us Max was: 543914579
Bitmap in 16 sec 259 ms 55 us
Bitmap bits: 7230127313 Ones: 1190479283

Best Regards,

Results not changing

Hi,

My question is related to #2, but as it is closed I have to ask again. I get the same results regardless of the domain and the parameters set in settings.json. As you suggested, I removed the cached data (by deleting the files from the data folder), but the code does not run without those files. So I removed the parts of the code that load them. Now the results differ from before, but they still stay the same when I change parameters. I was wondering if you could help me resolve this issue.

Thanks in advance!

Node Embedding

Hi, in Equation (2), CONVEX initialises the word embeddings of a question using word2vec, but how does CONVEX initialise the node embeddings?
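The section above does not spell this out. One common convention, not confirmed as CONVEX's actual choice, is to initialise a node's embedding as the average of the word2vec vectors of its label tokens. A sketch under that assumption; the tiny `word_vectors` table is made up for illustration:

```python
import numpy as np

# Hypothetical word-vector lookup; in CONVEX's setting this would be word2vec.
word_vectors = {
    "led": np.array([0.1, 0.2]),
    "zeppelin": np.array([0.3, 0.0]),
}

def node_embedding(label, vectors, dim=2):
    """Average the word vectors of a node label's tokens.

    Tokens without a vector are skipped; an all-out-of-vocabulary label
    falls back to the zero vector.
    """
    vecs = [vectors[t] for t in label.lower().split() if t in vectors]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

print(node_embedding("Led Zeppelin", word_vectors))  # mean of the two vectors
```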

About settings.json

In the file settings.json, the parameter tagMe_token is an empty string. How can I obtain a value for it? Thanks!
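The token is a TagMe API key; as far as the public TagMe documentation describes, it is issued after registering at the D4Science/SoBigData portal. A hedged sketch of assembling an annotation request once a token is available (the endpoint URL and parameter names are taken from the TagMe docs as I recall them; verify against the current documentation):

```python
TAGME_ENDPOINT = "https://tagme.d4science.org/tagme/api/tag"

def build_tagme_params(token, text, lang="en"):
    """Assemble the query parameters for a TagMe annotation request."""
    return {"gcube-token": token, "text": text, "lang": lang}

params = build_tagme_params("YOUR_TOKEN", "who is the vocalist of led zeppelin")
# The actual call would be e.g.:
#   requests.get(TAGME_ENDPOINT, params=params).json()
print(params["text"])
```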

The existentials question

Hello, after reading your paper and code I have a question. There are many existential questions in your dataset, such as "Is the rain song and immigrant song there?". How did you obtain the answers to this kind of question?
I found the following code in your repository:
ranked_answers = [{'answer': "yes", 'answer_score': 1.0, 'rank': 1}, {'answer': "no", 'answer_score': 0.5, 'rank': 2}]
I do not understand it. Could you explain it? Thanks.

About requirements

Hi,

Are you running on Linux? I cannot install the HDT package on Windows; do you have a solution? In addition, when I run python -m spacy download en_vectors_web_lg, the download does not succeed: either there is no response for a long time, or I get requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host.', None, 10054, None)).

Count Questions

Hi, it seems that the model only selects a single entity from the knowledge graph as the answer. How does it handle count questions, e.g. "Led Zeppelin had how many band members?"
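For context, a count question is generally answered by the cardinality of the retrieved answer set rather than by any single entity. A minimal illustration of that idea (this is a generic sketch, not CONVEX's actual logic, and the member list is just example data):

```python
def answer_count_question(answers):
    """For a count question, the answer is the size of the retrieved entity set."""
    return len(set(answers))

# Example answer set a system might retrieve for the band-members question
members = ["Jimmy Page", "Robert Plant", "John Paul Jones", "John Bonham"]
print(answer_count_question(members))  # 4
```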

A question about ConvQuestions

Hello, I have looked carefully at your ConvQuestions dataset and found that not every question's answer can be found in Wikidata.

For example, the fifth dialogue in the books domain of the training set contains the question "What character joins Harry Potter after being saved by him?". Although I found the answer Hermione Granger among Harry Potter's characters, I did not find any relevant "being saved by" information.
The sixth dialogue in the tv_series domain of the training set contains the question "What's the name of the two copies?"; I did not find any information related to it either.
Besides these two, there are other unanswerable questions.
Is this because Wikidata has been updated and the required information deleted, does the dataset itself contain unanswerable questions, or can the answers be found in Wikidata and I simply missed them?
Thanks

Predicate index

Hi,

Why is there a "predicate_index"? Thank you!

if not qualifier_predicate_nodes.get(qualifier_statement['qualifier_predicate']['id']):
	# the qualifier_predicate did not occur yet => index 0 and new entry
	qualifier_predicate_nodes[qualifier_statement['qualifier_predicate']['id']] = 1
	predicate_index = 0
else:
	# the qualifier_predicate already occurred => fetch the next index available and increase the saved one
	predicate_index = qualifier_predicate_nodes[qualifier_statement['qualifier_predicate']['id']]
	qualifier_predicate_nodes[qualifier_statement['qualifier_predicate']['id']] += 1
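The branch above amounts to keeping a running counter per qualifier-predicate id: the first occurrence of a predicate gets index 0, the next occurrence 1, and so on, so repeated uses of the same predicate map to distinct node indices. A minimal self-contained equivalent of that logic (function and variable names here are my own):

```python
from collections import defaultdict

def next_index(counts, predicate_id):
    """Return the next free index for predicate_id and advance its counter."""
    index = counts[predicate_id]
    counts[predicate_id] += 1
    return index

counts = defaultdict(int)
print(next_index(counts, "P39"))  # 0: first occurrence of P39
print(next_index(counts, "P39"))  # 1: second occurrence of P39
print(next_index(counts, "P26"))  # 0: a different predicate starts at 0
```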

Tagme and Entity recognition

Hello, I would like to know what proportion of the errors in your work is caused by TagMe failing to recognize entities. I tried using TagMe to annotate the first question of many conversations in the test set and found that some entities were unrecognizable. For example, for the first question in the test set, "what is the name of the writer of the secret garden?", the expected entity is "the secret garden", but TagMe only recognizes "secret garden"; the former is the book, the latter is not.

In addition, in the dialogue with conv_id 8974 in the test set, the first question uses the referential pronoun "he", from which obviously no entity can be identified. Is this a dataset error?
