lc-quad2.0's Issues

Request for Templates and Reproduction Code for LC_QuAD 2.0 Dataset

Hello @mohnish-rygbee,

I hope this message finds you well. My name is Breno W. Carvalho and I am an AI researcher currently working on a project related to large language models (LLMs). I've been deeply inspired by the work you've done with the LC_QuAD 2.0 dataset, and I believe it offers invaluable insights and potential for further research in this domain.

As part of my research, I am aiming to create a new dataset that builds on the work you've pioneered with LC_QuAD 2.0. To ensure the fidelity and rigor of my work, I was hoping to gain access to:

  • The templates used for generating the diverse set of SPARQL queries in LC_QuAD 2.0.
  • Any code or scripts that would help in reproducing the findings/results presented in your paper.

Having access to these resources would greatly aid in ensuring the accuracy and quality of my extended dataset. Of course, all due credits will be provided to your team and the LC_QuAD 2.0 dataset in any publications or presentations arising from my research.

I understand the concerns related to sharing proprietary or sensitive information. If there are any reservations or conditions, please let me know, and I'd be more than happy to accommodate or discuss further.

Thank you for considering my request. I truly appreciate the work you've put into the LC_QuAD 2.0 dataset and believe that together, we can further the boundaries of what's possible in the realm of AI research.

Warm regards

Questions about DBpedia adopted

Dear authors, in the paper entitled "LC-QuAD 2.0: A Large Dataset for Complex Question Answering over Wikidata and DBpedia", you mention that the knowledge graph adopted is DBpedia 2018. However, I could not find this version of DBpedia on the homepage (https://wiki.dbpedia.org/develop/datasets), although the latest dump there (DBpedia Dataset 2019-08-30 (Pre-Release)) also seems suitable for LC-QuAD 2.0. Could you please tell me where to download DBpedia 2018? Is it the link (http://downloads.dbpedia.org/repo/lts/wikidata/) provided in your paper? I tried to import the Turtle files into Virtuoso, but the import failed due to errors in the files (mappingbased-properties-reified.ttl). Thanks a lot!

Collection of some problems in the first 600 lines of the training dataset

These are some problems with the training dataset that I found by skimming the first 600 lines:

  • Is it true Jeff Bridges occupation Lane Chandler and photographer ? (question makes no sense)
  • Judi Densch (typo in question)
  • What is the boiling point of pressure copper as 4703.0? (question and paraphrase completely incorrect -> 'At which pressure does copper have a boiling point of 4703.0?')
  • Who Sleepwalking succeeded in playing Sleepwalking? (paraphrase same as question, makes no sense)
  • Could you summarize Korea's history of this topic? (question and Wikidata query make no sense)
  • Which is {landscape of} of {Virgin of the rocks}, which has {birth city} is {Tzippori} ? (question and paraphrase contain template strings, also additional quotes (\"))
  • How many dimensions have a Captain America? (question makes no sense)
  • What is the {neighborhood} for {shares border with} of {Los Angeles} (no question)
  • What sister city was born in of Zakhar Oskotsky? (question and paraphrase make no sense -> 'What are sister cities of the birth place of Zakhar Oskotsky?')
  • What is the musical score by Missa Solemnis that has mother Maria Magdalena van Beethoven? (question and paraphrase make no sense -> 'Which child of Maria Magdalena van Beethoven wrote the score Missa Solemnis?')
  • When did Robert De Nirolive in Marbletown? (typo in paraphrase)
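
Some of these problems (leftover template strings, a missing question, a paraphrase identical to the question) can be caught mechanically. Below is a minimal Python sketch, not a complete validator. It assumes each record is a dict with `question` and `paraphrased_question` keys; these field names and the helper `flag_record` are assumptions for illustration and may need adjusting to the actual JSON. Questions that are merely nonsensical or misspelled still need manual review.

```python
import re

# Matches leftover template slots such as "{landscape of}".
TEMPLATE_SLOT = re.compile(r"\{[^{}]*\}")


def _text(record, key):
    """Return the stripped string stored under key, or "" if it is missing or not a string."""
    value = record.get(key)
    return value.strip() if isinstance(value, str) else ""


def flag_record(record, question_key="question",
                paraphrase_key="paraphrased_question"):
    """Return heuristic problem labels for one record (key names are assumptions)."""
    question = _text(record, question_key)
    paraphrase = _text(record, paraphrase_key)
    problems = []
    if not question:
        problems.append("empty question")
    if TEMPLATE_SLOT.search(question) or TEMPLATE_SLOT.search(paraphrase):
        problems.append("template string left in text")
    if question and question.lower() == paraphrase.lower():
        problems.append("paraphrase identical to question")
    return problems


if __name__ == "__main__":
    sample = {
        "question": "What is the {neighborhood} for {shares border with} of {Los Angeles}",
        "paraphrased_question": "",
    }
    print(flag_record(sample))
```

Running `flag_record` over every record of the loaded training file and collecting the non-empty results would give a first list of candidates to inspect by hand.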

Broken Data Characteristics pages from website

Hi there,
I highly appreciate the impressive work you guys did!
I wanted to get some information about the data characteristics (Unique Templates, Entities Covered and Predicates Covered) from the corresponding section of the website, but unfortunately the links turned out to be broken. It would be really helpful to have such an overview of your dataset.

Also, is the code used to generate the dataset (especially the NNQT questions) going to be released anytime soon?

Thanks
