autokg's Introduction

AutoKG

License: MIT

Code and Data for the paper "LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities"

🌄Overview

The overview of our work. There are three main components: 1) Basic Evaluation: detailing our assessment of large models (text-davinci-003, ChatGPT, and GPT-4), in both zero-shot and one-shot settings, using performance data from fully supervised state-of-the-art models as benchmarks; 2) Virtual Knowledge Extraction: an examination of large models' virtual knowledge capabilities on the constructed VINE dataset; and 3) Automatic KG: the proposal of utilizing multiple agents to facilitate the construction and reasoning of KGs.

🌟 Evaluation

Data Preprocessing

The datasets that we used in our experiments are as follows:

  • KG Construction

    You can download the datasets from the addresses above. The data used in our experiments is also available directly in the corresponding "datas" folders, such as the one under DuIE2.0.

  • KG Reasoning

  • Question Answering

    • FreebaseQA
    • MetaQA

The expected structure of files is:

AutoKG
 |-- KG Construction
 |    |-- DuIE2.0
 |    |    |-- datas                    #dataset
 |    |    |-- prompts                  #0-shot/1-shot prompts
 |    |    |-- duie_processor.py        #preprocess data
 |    |    |-- duie_prompts.py          #generate prompts
 |    |-- MAVEN
 |    |    |-- datas                    #dataset
 |    |    |-- prompts                  #0-shot/1-shot prompts
 |    |    |-- maven_processor.py       #preprocess data
 |    |    |-- maven_prompts.py         #generate prompts
 |    |-- RE-TACRED
 |    |    |-- datas                    #dataset
 |    |    |-- prompts                  #0-shot/1-shot prompts
 |    |    |-- retacred_processor.py    #preprocess data
 |    |    |-- retacred_prompts.py      #generate prompts
 |    |-- SciERC
 |    |    |-- datas                    #dataset
 |    |    |-- prompts                  #0-shot/1-shot prompts
 |    |    |-- scierc_processor.py      #preprocess data
 |    |    |-- scierc_prompts.py        #generate prompts
 |-- KG Reasoning (Link Prediction)
 |    |-- FB15k-237
 |    |    |-- data                     #sample data
 |    |    |-- prompts                  #0-shot/1-shot prompts
 |    |-- ATOMIC2020
 |    |    |-- data                     #sample data
 |    |    |-- prompts                  #0-shot/1-shot prompts
 |    |    |-- system_eval              #eval for ATOMIC2020
 

How to Run

  • KG Construction (using DuIE2.0 as an example)

    cd "KG Construction/DuIE2.0"
    python duie_processor.py
    python duie_prompts.py

    This produces the 0-shot/1-shot prompts in the "prompts" folder.

  • KG Reasoning

  • Question Answering
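To make the difference between the 0-shot and 1-shot prompts in the KG Construction step concrete, here is a minimal sketch of how such prompts could be assembled. It is not the code in duie_prompts.py: the function name `build_prompt`, the instruction wording, and the sample sentences and relation names are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def build_prompt(
    sentence: str,
    relation_types: Sequence[str],
    demo: Optional[Tuple[str, str]] = None,
) -> str:
    """Build a 0-shot prompt, or a 1-shot prompt when a demo pair is given.

    Hypothetical helper: the wording is illustrative, not the repository's.
    """
    lines = [
        "Extract (subject, relation, object) triples from the sentence.",
        "Candidate relations: " + ", ".join(relation_types),
    ]
    if demo is not None:
        # The only difference in the 1-shot setting is one worked example.
        demo_sentence, demo_answer = demo
        lines.append("Example sentence: " + demo_sentence)
        lines.append("Example answer: " + demo_answer)
    lines.append("Sentence: " + sentence)
    lines.append("Answer:")
    return "\n".join(lines)


# Made-up inputs, used only to show the two settings side by side.
relations = ["founder", "capital", "author"]
zero_shot = build_prompt("Paris is the capital of France.", relations)
one_shot = build_prompt(
    "Paris is the capital of France.",
    relations,
    demo=("Rome is the capital of Italy.", "(Italy, capital, Rome)"),
)
print(zero_shot)
print("-----")
print(one_shot)
```

Keeping both settings in one function means the 0-shot and 1-shot prompts differ only by the demonstration lines, which makes the comparison between the two settings easier to reason about.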

🕵️Virtual Knowledge Extraction

The VINE dataset we built is available here.

Run the following commands to generate the prompts:

cd "Virtual Knowledge Extraction"
python VINE_processor.py
python VINE_prompts.py

🤖AutoKG

Our AutoKG code is based on CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society and a LangChain implementation of the paper. You can find more details through this link.

  • Set your OPENAI_API_KEY in Autokg.py
  • Set your SERPAPI_API_KEY in RE_CAMEL.py (see serpapi for more information)
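If you would rather not write the keys into the source files, one possible alternative is to read them from environment variables. The sketch below is an assumption, not how Autokg.py or RE_CAMEL.py are written: `load_api_keys` is a hypothetical helper, and the scripts would need to be edited to call it.

```python
import os
from typing import Dict, Mapping


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the two keys named in this README, or fail with a clear message.

    Hypothetical helper: not part of the repository.
    """
    required = ("OPENAI_API_KEY", "SERPAPI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}


# Demonstration with placeholder values instead of real credentials.
demo_keys = load_api_keys(
    {"OPENAI_API_KEY": "placeholder-openai", "SERPAPI_API_KEY": "placeholder-serpapi"}
)
print(sorted(demo_keys))
```

In real use you would export both variables in your shell and call `load_api_keys()` with no arguments, so the keys never end up in version control.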

Run the Autokg.py script.

cd AutoKG
python Autokg.py

Citation

If you use the code or data, please cite the following paper:

@article{zhu2023llms,
  title={LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities},
  author={Zhu, Yuqi and Wang, Xiaohan and Chen, Jing and Qiao, Shuofei and Ou, Yixin and Yao, Yunzhi and Deng, Shumin and Chen, Huajun and Zhang, Ningyu},
  journal={arXiv preprint arXiv:2305.13168},
  year={2023}
}

autokg's Issues

Evaluation Process for Event Extraction from MAVEN Dataset

Hello,

I have a question regarding your evaluation process for Event extraction from the MAVEN dataset. I was unable to locate any code detailing your approach.

For instance, in your one-shot prompt:

Given a sentence: "Unprepared for the attack, the Swedish attempted to save their ships by cutting their anchor ropes and to flee."
Event types: Removing, Rescuing, Escaping, Attack, Self-motion

You are seeking event types as listed above. My question pertains to how you calculate the F1 score for an LLM when some data samples contain one event per sentence while others contain multiple events within a sentence. Could you please explain how you calculate precision and recall for each sample and overall for a model?

Thank you.
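For context on the question above: one common convention for scoring predictions when a sentence can carry several labels is micro-averaged precision, recall, and F1, where true positives, false positives, and false negatives are summed over all sentences before the ratios are taken. The sketch below illustrates that convention only. Whether the paper's evaluation works this way is not stated here. The function name `micro_prf` and the second sample are made up for illustration.

```python
from typing import Iterable, Set, Tuple


def micro_prf(
    gold: Iterable[Set[str]], pred: Iterable[Set[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over per-sentence label sets."""
    tp = fp = fn = 0
    for gold_types, pred_types in zip(gold, pred):
        tp += len(gold_types & pred_types)  # predicted and correct
        fp += len(pred_types - gold_types)  # predicted but not in gold
        fn += len(gold_types - pred_types)  # in gold but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# First sample: the event types from the prompt quoted above.
# Second sample and all predictions: hypothetical, for illustration only.
gold = [
    {"Removing", "Rescuing", "Escaping", "Attack", "Self-motion"},
    {"Attack"},
]
pred = [
    {"Attack", "Escaping", "Rescuing"},
    {"Attack", "Killing"},
]
precision, recall, f1 = micro_prf(gold, pred)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

With these inputs the counts are 4 true positives, 1 false positive, and 2 false negatives, so sentences with one event and sentences with several events contribute to the same pooled counts rather than being averaged per sample.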

How to regenerate results for link prediction on FB15K-237?

Hi there,

Amazing work! I was wondering if you could provide some code files or Jupyter notebooks to reproduce the results for the link prediction task on FB15K-237. I fed your provided prompts into ChatGPT but couldn't reproduce your results from the output it provides.

Thank you,

Ryan

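Knowledge graph construction results

I ran the autokg code. If I have a document (containing many paragraphs of text) and want to build the corresponding knowledge graph (without external information), how can I do that? The output I get is several rounds of dialogue rather than triples.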
