A Chinese question answering dataset for legal advice.
- The corpus contains legal question-answer pairs from Chinese online forums. The questions are raised by netizens, and the answers are provided by licensed lawyers.
- It contains four data fields: question subject, question body, answer, and label.
- The positive question-answer pairs are the ground-truth pairs provided online. For each question, we select answers to other questions of the same category as negative answers.
- We manually annotated part of the dataset to ensure correctness. The manually annotated subsets are named "LegalQA-manual-train.csv", "LegalQA-manual-dev.csv", and "LegalQA-manual-test.csv".
- For more QA pairs, please refer to the full dataset in LegalQA-all.zip, which contains "LegalQA-all-train.csv", "LegalQA-all-dev.csv", and "LegalQA-all-test.csv".
- For any further questions, contact us by raising an issue.
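As a minimal sketch of how the CSVs might be consumed, the snippet below parses QA rows and splits them by label (1 = ground-truth answer, 0 = sampled negative). The column names (`subject`, `question`, `answer`, `label`) and the sample rows are hypothetical; check the actual CSV headers before use.

```python
import csv
import io

# Hypothetical sample mirroring the documented fields
# (question subject, question body, answer, label);
# the real column names in the released CSVs may differ.
sample = """subject,question,answer,label
Labor dispute,My employer withheld my last paycheck.,You may file a complaint with the local labor bureau.,1
Labor dispute,My employer withheld my last paycheck.,Register the trademark with the relevant office.,0
"""

def load_pairs(f):
    """Read QA rows and split them into positive and negative pairs by label."""
    reader = csv.DictReader(f)
    positives, negatives = [], []
    for row in reader:
        (positives if row["label"] == "1" else negatives).append(row)
    return positives, negatives

pos, neg = load_pairs(io.StringIO(sample))
print(len(pos), len(neg))  # 1 1
```

The same loader works on the on-disk files by passing an open file handle instead of the in-memory sample.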
- LegalQA-manual

| sets | Train | Dev | Test |
| --- | --- | --- | --- |
| Number of questions | 783 | 93 | 136 |
| Number of answers | 3121 | 602 | 865 |
| Average length of questions | 160 | 180 | 159 |
| Average length of answers | 41 | 45 | 43 |
- LegalQA-all

| sets | Train | Dev | Test |
| --- | --- | --- | --- |
| Number of questions | 10526 | 1593 | 3035 |
| Number of answers | 21237 | 2866 | 6091 |
| Average length of questions | 160 | 173 | 168 |
| Average length of answers | 41 | 40 | 42 |
- Details

| Dataset (train/dev/test) | #Question | #QA Pairs | %Correct |
| --- | --- | --- | --- |
| LegalQA(manual) | 783/93/136 | 7,258/816/1,169 | 21.8/23.3/23.9 |
| LegalQA(all) | 10,526/1,593/3,035 | 100,590/11,965/26,913 | 21.8/24.4/22.9 |
| subset | MAP | MRR | P@1 |
| --- | --- | --- | --- |
| LegalQA(manual) | 0.8230 | 0.8749 | 0.7868 |
| LegalQA(all) | 0.8287 | 0.8867 | 0.8171 |
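The metrics above (MAP, MRR, P@1) are standard ranking measures over each question's candidate answers. A minimal sketch of how they can be computed is below; the toy input (two questions with binary relevance lists in ranked order) is illustrative, not taken from the dataset.

```python
def average_precision(rels):
    # rels: 0/1 relevance of candidate answers, in ranked order
    hits, score = 0, 0.0
    for i, r in enumerate(rels, start=1):
        if r:
            hits += 1
            score += hits / i
    return score / hits if hits else 0.0

def reciprocal_rank(rels):
    # 1 / rank of the first relevant answer, or 0 if none is relevant
    for i, r in enumerate(rels, start=1):
        if r:
            return 1.0 / i
    return 0.0

def evaluate(ranked):
    # ranked: one relevance list per question
    n = len(ranked)
    map_score = sum(average_precision(r) for r in ranked) / n
    mrr = sum(reciprocal_rank(r) for r in ranked) / n
    p_at_1 = sum(r[0] for r in ranked) / n
    return map_score, mrr, p_at_1

# Two toy questions: the first ranks its correct answer 1st, the second 2nd.
print(evaluate([[1, 0, 0], [0, 1, 0]]))  # (0.75, 0.75, 0.5)
```

With one correct answer per question, MRR and MAP coincide per question; the reported dataset numbers can differ because some questions have several acceptable answers.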