Master's Thesis

Text Generation with BERT Embeddings, WordNet Senses, and Frame Embeddings Built on the FrameNet Graph.

The objective of this thesis is to blend state-of-the-art neural architectures with the still scarcely exploited potential of symbolic knowledge bases, in order to help tackle one of the biggest open problems in artificial intelligence within a specific subfield of natural language processing (NLP): commonsense reasoning in text generation.

Given a set of concepts (expressed by nouns and verbs), the goal is to generate a short sentence that describes a scene and is plausible according to human commonsense knowledge. This problem can be seen as a special case of constrained text generation, with two major challenges: generating a sentence from an unordered set of keywords that may undergo morphological changes, and understanding the commonsense relations among the concepts in order to compose them appropriately.
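To make the constraint concrete, here is a minimal sketch of a concept-coverage check. It is not taken from the thesis code: the function names and the example concept set are hypothetical, and the prefix match is a crude stand-in for real lemmatization, assumed here only to tolerate simple inflections such as "catch" / "catches".

```python
import re


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def covered_concepts(concepts, sentence):
    """Return the concepts that appear in the sentence.

    Assumption: a concept counts as covered if some token starts with it.
    This is a rough approximation of morphological matching, not a lemmatizer.
    """
    tokens = tokenize(sentence)
    return {c for c in concepts if any(t.startswith(c.lower()) for t in tokens)}


def coverage(concepts, sentence):
    """Fraction of the input concepts that the sentence covers."""
    concepts = set(concepts)
    if not concepts:
        return 1.0
    return len(covered_concepts(concepts, sentence)) / len(concepts)


# Hypothetical concept set and candidate sentence.
concepts = {"dog", "frisbee", "catch", "throw"}
print(coverage(concepts, "A dog catches the frisbee thrown by a boy."))  # 1.0

# Irregular forms defeat the prefix heuristic: "ran" does not start with "run".
print(coverage({"dog", "run"}, "The dog ran away."))  # 0.5
```

The second call shows why a simple string match is not enough: irregular inflections are missed, and a short concept such as "cat" would wrongly match "catches". A lemmatizer would be needed for a reliable check.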

Since most recent approaches to the problem make no use of symbolic knowledge resources, this work sits at the intersection of the neural and symbolic worlds and aims to take the best of both. The recently proven capabilities of transformer models are leveraged in combination with word sense disambiguation and frame embeddings extracted from FrameNet, an English-language knowledge base built upon the theory of Frame Semantics.

The evaluation has been conducted on CommonGen, a dataset built specifically for this task. After a long pre-processing phase, which included the disambiguation of nominal entities, the model was trained in three main scenarios: raw text (as a baseline); disambiguated text; and disambiguated text with frame embeddings.

The approach has been evaluated with BLEU and ROUGE metrics on a test set built manually from the post-processed initial dataset. It improves on the baseline, achieving promising results on both automatic metrics and suggesting further steps to refine the methodology.
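As a rough illustration of what these overlap metrics measure, the sketch below computes clipped unigram precision (the idea behind BLEU-1) and unigram recall (the idea behind ROUGE-1). It is a deliberate simplification: full BLEU also uses higher-order n-grams and a brevity penalty, and the evaluation tooling actually used in the thesis is not specified here. The function name and example sentences are hypothetical.

```python
from collections import Counter


def unigram_overlap(candidate, reference):
    """Return (precision, recall) of clipped unigram overlap.

    precision ~ BLEU-1 without brevity penalty
    recall    ~ ROUGE-1 recall
    Tokenization is a plain whitespace split, assumed for simplicity.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    return precision, recall


# Hypothetical generated sentence and reference.
p, r = unigram_overlap(
    "a dog catches a frisbee",
    "a dog catches the frisbee in the park",
)
print(p, r)  # 0.8 0.5
```

The second "a" in the candidate is not rewarded because the reference contains only one, which is the clipping step that keeps repeated words from inflating precision.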

Link to AMSLaurea
