asknowqa / lc-quad2.0

A large dataset of natural language queries with corresponding SPARQL queries for Wikidata and DBpedia 2018

lc-quad2.0's Issues

Request for Templates and Reproduction Code for LC_QuAD 2.0 Dataset

I hope this message finds you well. My name is Breno W. Carvalho and I am an AI researcher currently working on a project related to large language models (LLMs). I've been deeply inspired by the work you've done with the LC_QuAD 2.0 dataset, and I believe it offers invaluable insights and potential for further research in this domain.

As part of my research, I am aiming to create a new dataset that builds on the work you've pioneered with LC_QuAD 2.0. To ensure the fidelity and rigor of my work, I was hoping to gain access to:

The templates used for generating the diverse set of SPARQL queries in LC_QuAD 2.0.
Any code or scripts that would help in reproducing the findings/results presented in your paper.

Having access to these resources would greatly aid in ensuring the accuracy and quality of my extended dataset. Of course, all due credits will be provided to your team and the LC_QuAD 2.0 dataset in any publications or presentations arising from my research.

I understand the concerns related to sharing proprietary or sensitive information. If there are any reservations or conditions, please let me know, and I'd be more than happy to accommodate or discuss further.

Thank you for considering my request. I truly appreciate the work you've put into the LC_QuAD 2.0 dataset and believe that together, we can further the boundaries of what's possible in the realm of AI research.

Warm regards

Could you also provide answers?

Questions about DBpedia adopted

Dear authors, in the paper entitled "LC-QuAD 2.0: A Large Dataset for Complex Question Answering over Wikidata and DBpedia", you mention that the knowledge graph adopted is DBpedia 2018. However, I could not find this version of DBpedia on the homepage (https://wiki.dbpedia.org/develop/datasets), and the latest dump (DBpedia Dataset 2019-08-30 (Pre-Release)) is also suitable for LC-QuAD 2.0. Could you please tell me where to download DBpedia 2018? Is it the link (http://downloads.dbpedia.org/repo/lts/wikidata/) provided in your paper? I tried to import the Turtle files into Virtuoso, but the import failed due to errors in the files (mappingbased-properties-reified.ttl). Thanks a lot!
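One way to narrow down an import failure like this is to scan the dump for lines that are not well-formed triples before handing it to the triple store. The sketch below assumes the file is written one triple per line (N-Triples style), which may not hold for every Turtle file. `find_bad_lines` is a hypothetical helper, and the regular expression is a rough approximation of the grammar rather than a full parser.

```python
import re

# Rough building blocks for a single N-Triples-style statement.
IRI = r"<[^<>\s]*>"
BNODE = r"_:[A-Za-z0-9_]+"
# Quoted string with optional datatype (^^<iri>) or language tag (@en).
LITERAL = r'"(?:[^"\\]|\\.)*"(?:\^\^' + IRI + r"|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?"
TRIPLE = re.compile(
    rf"^\s*(?:{IRI}|{BNODE})\s+{IRI}\s+(?:{IRI}|{BNODE}|{LITERAL})\s*\.\s*$"
)


def find_bad_lines(lines):
    """Yield (line_number, text) for lines that do not look like one triple.

    Blank lines and comment lines (starting with '#') are skipped.
    """
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if not TRIPLE.match(stripped):
            yield number, stripped
```

Passing an open file handle, for example `find_bad_lines(open(path, encoding="utf-8"))`, streams through the dump line by line, so the reported line numbers can be inspected or filtered out before retrying the import.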

Collection of some problems in the first 600 lines of the training dataset

These are some problems with the training dataset that I found by skimming the first 600 lines:

Is it true Jeff Bridges occupation Lane Chandler and photographer ? (question makes no sense)
Judi Densch (typo in question)
What is the boiling point of pressure copper as 4703.0? (question and paraphrase completely incorrect -> 'At which pressure does copper have a boiling point of 4703.0?')
Who Sleepwalking succeeded in playing Sleepwalking? (paraphrase same as question, makes no sense)
Could you summarize Korea's history of this topic? (question and Wikidata query make no sense)
Which is {landscape of} of {Virgin of the rocks}, which has {birth city} is {Tzippori} ? (question and paraphrase contain template strings, also additional quotes (\"))
How many dimensions have a Captain America? (question makes no sense)
What is the {neighborhood} for {shares border with} of {Los Angeles} (no question)
What sister city was born in of Zakhar Oskotsky? (question and paraphrase make no sense -> 'What are sister cities of the birth place of Zakhar Oskotsky?')
What is the musical score by Missa Solemnis that has mother Maria Magdalena van Beethoven? (question and paraphrase make no sense -> 'Which child of Maria Magdalena van Beethoven wrote the score Missa Solemnis?')
When did Robert De Nirolive in Marbletown? (typo in paraphrase)
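Some of the problems listed above are mechanical enough to flag automatically. The following is a minimal sketch, assuming each record is a dictionary with `question` and `paraphrased_question` fields (these key names are assumptions; adjust them to the actual JSON keys). The helper `find_issues` is hypothetical and only catches empty questions, leftover template placeholders, and paraphrases identical to the question. Typos and nonsensical wording still need manual review.

```python
import re

# Matches leftover template placeholders such as "{landscape of}".
PLACEHOLDER = re.compile(r"\{[^{}]*\}")


def find_issues(record, question_key="question",
                paraphrase_key="paraphrased_question"):
    """Return a list of issue labels for one record (a dict).

    The default key names are assumptions; pass the real ones if they differ.
    """
    issues = []
    # Treat missing or None values as empty strings.
    question = (record.get(question_key) or "").strip()
    paraphrase = (record.get(paraphrase_key) or "").strip()

    if not question:
        issues.append("empty question")
    if PLACEHOLDER.search(question) or PLACEHOLDER.search(paraphrase):
        issues.append("template placeholder")
    if question and question.lower() == paraphrase.lower():
        issues.append("paraphrase identical to question")
    return issues
```

Running `find_issues` over every record and keeping those with a non-empty result gives a shortlist for manual inspection, which is cheaper than skimming the file by hand.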

The link of Unique Templates is broken

Unique Templates

Hi, is there a paper that describes data ?

Hi, this is an interesting work.
I would like to know which paper or document describes the data.
Could you help me?

Thank you

yawei

Broken Data Characteristics pages from website

Hi there,
I highly appreciate the impressive work you did!
I wanted to get some information about the data characteristics (Unique Templates, Entities Covered, and Predicates Covered) from the corresponding section of the website, but unfortunately the links turned out to be broken. It would be really helpful to have such an overview of your dataset.

Also, is the code used to generate the dataset (especially the NNQT question) going to be released anytime soon?

Thanks
