LEVEN

Dataset and source code for the Findings of ACL 2022 paper "LEVEN: A Large-Scale Chinese Legal Event Detection Dataset".

Background

Events are the essence of the facts in legal cases. Therefore, Legal Event Detection (LED) is fundamentally important and naturally beneficial to case understanding and other Legal AI tasks.

bg
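To make the task concrete, the sketch below shows the kind of output an event detector produces: trigger words located in the text, each labeled with an event type. Everything here is an assumption for illustration, not the LEVEN file format or a LEVEN baseline: the field names (`trigger`, `offset`, `event_type`), the English example sentence, the type labels, and the toy lexicon lookup that stands in for a trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventMention:
    trigger: str     # the word that signals the event
    offset: int      # character offset of the trigger in the text
    event_type: str  # label from the event schema (hypothetical here)


def find_mentions(text, lexicon):
    """Toy stand-in for a detector: look up known trigger words in the text."""
    mentions = []
    for trigger, event_type in lexicon.items():
        start = text.find(trigger)
        while start != -1:
            mentions.append(EventMention(trigger, start, event_type))
            start = text.find(trigger, start + 1)
    # Return mentions in the order they appear in the text.
    return sorted(mentions, key=lambda m: m.offset)


# Hypothetical lexicon and sentence, for illustration only.
TOY_LEXICON = {"stole": "Theft", "sold": "Selling"}
SENTENCE = "The defendant stole a phone and sold it the next day."
MENTIONS = find_mentions(SENTENCE, TOY_LEXICON)

for mention in MENTIONS:
    print(mention)
```

A real detector replaces the lexicon lookup with a model that classifies each candidate trigger in context, since the same word can signal different event types or none at all.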

Overview

The dataset can be obtained from Tsinghua Cloud or Google Drive. The annotation guidelines are provided in Annotation Guidelines. You can also check out our poster at the ACL 2022 main conference.

Large Scale

LEVEN is the largest Legal Event Detection dataset and the largest Chinese Event Detection dataset. Here is a comparison between the scale of LEVEN and other datasets.

tab1

Datasets denoted with * are not publicly available, and – means the value is not available.

High Coverage

LEVEN contains 108 event types in total, including 64 charge-oriented events and 44 general events. Their distribution is shown below.

tab2

The LEVEN event schema has a sophisticated hierarchical structure, which is shown here.

Leaderboard

LEVEN is going to appear at CAIL 2022. To get the test results, you can submit your predictions to CAIL (the specific submission entry is coming soon).

Experiments

The source code for the experiments is included in the Baselines and Downstreams folders.

The Baselines folder includes DMCNN, BiLSTM, BiLSTM+CRF, BERT, BERT+CRF and DMBERT.

The Downstreams folder includes Legal Judgment Prediction and Similar Case Retrieval.
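Models with a CRF layer are commonly trained as sequence labelers, where each token receives a tag marking whether it begins, continues, or lies outside a trigger. The sketch below shows one common encoding (BIO) under that assumption. The function name `to_bio`, the token-span input format, and the `Theft` label are hypothetical and are not taken from the code in the Baselines folder.

```python
def to_bio(num_tokens, spans):
    """Encode trigger spans as BIO tags.

    spans: list of (start, end_exclusive, event_type) token-index triples.
    """
    tags = ["O"] * num_tokens
    for start, end, event_type in spans:
        if not 0 <= start < end <= num_tokens:
            raise ValueError(f"span ({start}, {end}) is out of range")
        tags[start] = f"B-{event_type}"      # first token of the trigger
        for i in range(start + 1, end):
            tags[i] = f"I-{event_type}"      # remaining tokens of the trigger
    return tags


# Five tokens, with a two-token trigger covering tokens 1 and 2.
print(to_bio(5, [(1, 3, "Theft")]))
```

Token-classification baselines without a CRF can consume the same tags; the CRF adds a learned constraint on which tag may follow which.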

Baselines

We implement six competitive baselines. Their performance is as follows.

tab3

Downstream Tasks

We also explore the use of LEVEN on two downstream tasks. We simply use events as side information to improve the performance of Legal Judgment Prediction and Similar Case Retrieval.
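One simple way to expose detected events to a downstream model is to mark them inline in the input text before encoding it. The sketch below illustrates that idea only. It is an assumption, not the method implemented in the Downstreams folder, and the function name `mark_events`, the angle-bracket markers, and the `Theft` label are hypothetical.

```python
def mark_events(text, mentions):
    """Wrap each detected trigger in markers naming its event type.

    mentions: list of (start, end_exclusive, event_type) character spans,
    assumed not to overlap.
    """
    pieces = []
    cursor = 0
    for start, end, event_type in sorted(mentions):
        if start < cursor or start >= end:
            raise ValueError("mentions must be non-empty and non-overlapping")
        pieces.append(text[cursor:start])  # untouched text before the trigger
        pieces.append(f"<{event_type}>{text[start:end]}</{event_type}>")
        cursor = end
    pieces.append(text[cursor:])           # untouched text after the last trigger
    return "".join(pieces)


print(mark_events("He stole a phone.", [(3, 8, "Theft")]))
```

Other designs feed the event type as an extra embedding per token rather than changing the text. Which variant works better is an empirical question.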

The experimental results for Legal Judgment Prediction are shown below.

tab4

The experimental results for Similar Case Retrieval are shown below.

tab5

Schema

The Chinese event schema is shown below. Please check our paper for the English version.

The detailed explanation and annotation guidelines are provided in Annotation Guidelines.

schema

Citation

If the data and code help you, please cite this paper.

@inproceedings{yao-etal-2022-leven,
    title = "{LEVEN}: A Large-Scale {C}hinese Legal Event Detection Dataset",
    author = "Yao, Feng and Xiao, Chaojun and Wang, Xiaozhi and Liu, Zhiyuan and Hou, Lei and Tu, Cunchao and Li, Juanzi and Liu, Yun and Shen, Weixing and Sun, Maosong",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2022",
    year = "2022",
    url = "https://aclanthology.org/2022.findings-acl.17",
    doi = "10.18653/v1/2022.findings-acl.17",
    pages = "183--201",
}
