
Summaformers

Code for the paper 'Summaformers @ LaySumm 20, LongSumm 20', presented at the Scholarly Document Processing Workshop at EMNLP 2020.

Abstract

Automatic text summarization has been widely studied as an important task in natural language processing. Traditionally, various feature engineering and machine learning based systems have been proposed for extractive as well as abstractive text summarization. Recently, deep learning based, specifically Transformer-based systems have been immensely popular. Summarization is a cognitively challenging task — extracting summary worthy sentences is laborious, and expressing semantics in brief when doing abstractive summarization is complicated. In this paper, we specifically look at the problem of summarizing scientific research papers from multiple domains. We differentiate between two types of summaries, namely, (a) LaySumm: A very short summary that captures the essence of the research paper in layman terms restricting overtly specific technical jargon and (b) LongSumm: A much longer detailed summary aimed at providing specific insights into various ideas touched upon in the paper. While leveraging latest Transformer-based models, our systems are simple, intuitive and based on how specific paper sections contribute to human summaries of the two types described above. Evaluations against gold standard summaries using ROUGE metrics prove the effectiveness of our approach. On blind test corpora, our system ranks first and third for the LongSumm and LaySumm tasks respectively.

BibTeX to cite our work

@inproceedings{ghosh-roy-etal-2020-summaformers,
    title = "Summaformers @ {L}ay{S}umm 20, {L}ong{S}umm 20",
    author = "Ghosh Roy, Sayar  and
      Pinnaparaju, Nikhil  and
      Jain, Risubh  and
      Gupta, Manish  and
      Varma, Vasudeva",
    booktitle = "Proceedings of the First Workshop on Scholarly Document Processing",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.sdp-1.39",
    doi = "10.18653/v1/2020.sdp-1.39",
    pages = "336--343"
}

Contributors

risubaba, sayarghoshroy

summaformers's Issues

A question about the model

Dear authors, thank you for this excellent work. I read the paper carefully and have a question about one detail.
In the LongSumm task, how does the budget module compute the weights for summary sentences drawn from different sections?
Here is the relevant passage from the paper:
"The best performing setting corresponds to selecting sections whose ROUGE-1 overlap with the long summary is greater than 20.0. Intuitively, this prunes out irrelevant sections such as ‘abbreviations’ and ‘acknowledgements’. The remaining sections were assigned weights based on the ROUGE-1 overlap with the provided long summary."

I don't understand how this works at test time. When no LongSumm reference summary is available as a target, how do you compute ROUGE-1 between each section and the target?
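
For reference, the selection step described in the quoted passage could look roughly like the sketch below when a reference long summary is available, as in training data. This is not the authors' code. The helper names `rouge1_f1` and `section_weights` are hypothetical, the score is a simplified unigram-overlap F1 on a 0-100 scale with no stemming, and normalizing the surviving scores so they sum to 1 is an assumption. The sketch does not address the test-time case raised in the question.

```python
import re
from collections import Counter
from typing import Dict


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 on a 0-100 scale (assumed variant)."""
    cand = Counter(re.findall(r"\w+", candidate.lower()))
    ref = Counter(re.findall(r"\w+", reference.lower()))
    # Clipped unigram matches: each token counts at most min(cand, ref) times.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


def section_weights(sections: Dict[str, str], summary: str,
                    threshold: float = 20.0) -> Dict[str, float]:
    """Keep sections scoring above the threshold and weight them by score.

    Normalizing the weights to sum to 1 is an assumption of this sketch.
    """
    scores = {name: rouge1_f1(text, summary) for name, text in sections.items()}
    kept = {name: s for name, s in scores.items() if s > threshold}
    total = sum(kept.values())
    return {name: s / total for name, s in kept.items()} if total else {}


# Toy data, invented for illustration only.
sections = {
    "introduction": "we study summarization of scientific papers with transformer models",
    "method": "transformer models select sentences from paper sections",
    "acknowledgements": "we thank our anonymous reviewers and funding agencies",
}
summary = "the paper studies transformer models for summarization of scientific papers"

weights = section_weights(sections, summary)
print(weights)
```

On this toy input the acknowledgements section shares no words with the summary, so it is pruned, and the introduction receives a larger weight than the method section.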
