Factcheck-GPT

Fact-checking the Output of Generative Large Language Models in both Annotation and Evaluation.

Code License: Apache 2.0

Pipeline

This work defines a framework for document-level fact-checking that covers decomposition, decontextualisation, check-worthiness identification, evidence retrieval, stance detection, and editing to fix hallucinated information.
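As a rough illustration of how these stages can be chained, here is a minimal sketch. It is not the repository's implementation: the `Claim` class, `decompose`, `run_pipeline`, and the three pluggable callables are hypothetical names chosen for this example, decomposition is reduced to naive sentence splitting, and the decontextualisation and correction stages are omitted.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Claim:
    """Hypothetical record carried through the stages."""
    text: str
    checkworthy: bool = False
    evidence: List[str] = field(default_factory=list)
    stance: str = "irrelevant"


def decompose(document: str) -> List[str]:
    """Toy decomposition: split the document into sentences."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def run_pipeline(
    document: str,
    is_checkworthy: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    detect_stance: Callable[[str, List[str]], str],
) -> List[Claim]:
    """Run check-worthiness, retrieval, and stance detection per sentence."""
    claims = []
    for sentence in decompose(document):
        claim = Claim(text=sentence, checkworthy=is_checkworthy(sentence))
        if claim.checkworthy:
            # Only check-worthy sentences trigger retrieval and stance detection.
            claim.evidence = retrieve(sentence)
            claim.stance = detect_stance(sentence, claim.evidence)
        claims.append(claim)
    return claims
```

In practice each callable would wrap a model or a search backend; passing them in as arguments keeps the stages independently replaceable.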

[Figure: Factcheck-GPT pipeline]

Get Started

Steps to check the factuality of a document. See reproduce_tutorial.ipynb for details on the subtasks.

import sys
sys.path.append("./src")  # make the modules under ./src importable
import pandas as pd
from pipeline import check_document, check_documents

doc = "MBZUAI ranks 19 globally in its areas of specialization – AI in 2023."  # document to check
# check_document returns a label and a log for the document
label, log = check_document(doc, model="gpt-3.5-turbo-0613")

Dataset

We construct a dataset, Factcheck-GPT, with human-annotated factual labels for 94 ChatGPT responses.

Statistics

Statistics over the three question sources are shown below:

[Figure: dataset statistics by question source]

Claim analysis:

  • Whether raters can determine the factuality of a claim based on the automatically collected evidence (Yes/No).

  • Whether the evidence supports the claim (CP: completely support, PS: partially support, RE: refute, IR: irrelevant).

  • Whether the claim needs to be corrected. NA (17) refers to 16 opinion claims + 1 not-a-claim.
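The four stance codes above can be kept consistent in analysis scripts with a small enumeration. The sketch below is illustrative only: the `Stance` enum and `stance_distribution` helper are hypothetical names, and nothing here assumes a particular column layout in the released dataset.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Iterable


class Stance(Enum):
    """Stance codes used in the claim analysis."""
    CP = "completely support"
    PS = "partially support"
    RE = "refute"
    IR = "irrelevant"


def stance_distribution(codes: Iterable[str]) -> Dict[str, int]:
    """Count annotations per stance code; unknown codes raise KeyError."""
    counts = Counter(Stance[code] for code in codes)
    return {stance.name: counts.get(stance, 0) for stance in Stance}
```

Looking codes up through the enum means a misspelled label fails immediately instead of silently forming a fifth category.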

[Figure: claim analysis]

Annotation Tool

We release the annotation tool in this repository.

FactBench

We further gather another three human-annotated datasets (FacTool-KB, FELM-WK, HaluEval) that are used to evaluate the effectiveness of automatic fact-checkers, resulting in FactBench with 4,835 examples.

Baselines

We define five subtasks and show results for Subtask 4 (verification) below. See the reproduce_tutorial notebook for details.

Subtasks

  • Subtask 1. Sentence Checkworthiness: Given a sentence, identify whether it contains a factual statement (Yes/No).

  • Subtask 2. Claim Checkworthiness: Given a claim, classify its checkworthiness as factual, opinion, not a claim, or other.

  • Subtask 3. Stance Detection: Given an (evidence passage, claim) pair, judge the stance of the evidence towards the claim: whether it supports, partially supports, refutes, or is irrelevant to the claim.

  • Subtask 4. Claim Verification: Given a claim without gold evidence, determine whether it is factually true or false; if false, revise it.

  • Subtask 5. Edit False Response: Given a list of true claims and the original response, edit the response to correct the factual errors while preserving the linguistic features and style of the original.
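For Subtask 4, a verifier's true/false predictions can be scored against the human labels per class. The sketch below computes precision, recall, and F1 for a chosen class. It is an assumption that these are suitable metrics here, since this README does not state how the verification results are computed, and `precision_recall_f1` is a hypothetical helper rather than a function from the repository.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    gold: Sequence[bool], pred: Sequence[bool], positive: bool = False
) -> Tuple[float, float, float]:
    """Score predictions for one class (default: the factually false class)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Scoring the false class separately is useful because false claims are typically the minority, so overall accuracy can hide a verifier that rarely flags them.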

Verification Results

[Figure: verification results]

Citation

Factcheck-GPT is described in the following arXiv paper:

@article{Wang2023FactcheckGPTEF,
  title={Factcheck-GPT: End-to-End Fine-Grained Document-Level 
         Fact-Checking and Correction of LLM Output},
  author={Yuxia Wang and 
          Revanth Gangi Reddy and 
          Zain Muhammad Mujahid and 
          Arnav Arora and 
          Aleksandr Rubashevskii and 
          Jiahui Geng and 
          Osama Mohammed Afzal and 
          Liangming Pan and 
          Nadav Borenstein and 
          Aditya Pillai and 
          Isabelle Augenstein and 
          Iryna Gurevych and 
          Preslav Nakov},
  journal={ArXiv},
  year={2023},
  volume={abs/2311.09000},
}
