tf-nlp-blocks

Python: 3.6 Tensorflow: 1.6 License: MIT

Author: Han Xiao https://hanxiao.github.io

A collection of frequently used deep learning blocks I have implemented in Tensorflow. It covers the core tasks in NLP such as embedding, encoding, matching and pooling. All implementations follow a modularized design pattern which I call the "block-design". More details can be found in my blog post.

Requirements

  • Python >= 3.6
  • Tensorflow >= 1.6

Contents

encode_blocks.py

A collection of sequence encoding blocks. Input is a sequence of shape [B, L, D]; output is another sequence of shape [B, L, D'], where B is the batch size, L is the sequence length, and D and D' are the input and output dimensions, respectively.

| Name | Dependencies | Description | Reference |
|---|---|---|---|
| `LSTM_encode` | | a fast multi-layer bidirectional LSTM implementation based on `CudnnLSTM`, expected to be 5~10x faster than the standard tf `LSTMCell`. However, it can only run on GPU. | Tensorflow doc on `CudnnLSTM` |
| `TCN_encode` | `Res_DualCNN_encode` | a temporal convolution network described in the reference: essentially a multi-layer dilated CNN with special padding to ensure causality. | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| `Res_DualCNN_encode` | `CNN_encode` | a sub-block used by `TCN_encode`: a two-layer CNN with spatial dropout in between, followed by a residual connection and a layer-norm. | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| `CNN_encode` | | a standard `conv1d` implementation along the `L` axis, with the option to set different paddings. | Convolutional Neural Networks for Sentence Classification |
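
To make the `[B, L, D] -> [B, L, D']` interface concrete, here is a minimal sketch of a `conv1d`-style encode block in the spirit of `CNN_encode`. This is not the library's actual code; `toy_cnn_encode` and its parameters are hypothetical names:

```python
import tensorflow as tf  # TF 1.x, matching the requirements above

def toy_cnn_encode(seq, num_filters, kernel_size=3, scope='toy_cnn_encode'):
    """Minimal conv1d encode block: [B, L, D] -> [B, L, D']."""
    with tf.variable_scope(scope):
        # convolve over the time axis L; 'same' padding preserves the length
        return tf.layers.conv1d(seq, filters=num_filters,
                                kernel_size=kernel_size,
                                padding='same', activation=tf.nn.relu)

# usage: a [32, 20, 100] batch becomes [32, 20, 64]
x = tf.placeholder(tf.float32, [32, 20, 100])
y = toy_cnn_encode(x, num_filters=64)
```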

match_blocks.py

A collection of sequence matching blocks, a.k.a. attention. Input is two sequences: a context of shape [B, L_c, D] and a query of shape [B, L_q, D]. The output is a sequence with the same length as the context, i.e. of shape [B, L_c, D]. Each position in the output encodes the relevance of that position of the context to the complete query.

| Name | Dependencies | Description | Reference |
|---|---|---|---|
| `Attentive_match` | | a basic attention mechanism with different scoring functions; also supports future blinding. | additive: Neural machine translation by jointly learning to align and translate; scaled: Attention is all you need |
| `Transformer_match` | | a multi-head attention block from "Attention is all you need". | Attention is all you need |
| `AttentiveCNN_match` | `Attentive_match` | the light version of attentive convolution, with optional future blinding to ensure causality. | Attentive Convolution |
| `BiDaf_match` | | the attention flow layer used in the BiDAF model. | Bidirectional Attention Flow for Machine Comprehension |
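
For illustration, a minimal sketch of scaled dot-product matching under this interface (a simplification of what `Attentive_match` offers; `toy_scaled_attention` is a hypothetical name, and future blinding is omitted):

```python
import tensorflow as tf

def toy_scaled_attention(context, query):
    """context: [B, Lc, D], query: [B, Lq, D] -> output: [B, Lc, D]."""
    d = context.get_shape().as_list()[-1]
    # similarity of every context position to every query position: [B, Lc, Lq]
    scores = tf.matmul(context, query, transpose_b=True) / (d ** 0.5)
    # softmax over the last (query) axis, then mix query positions
    # into one vector per context step
    weights = tf.nn.softmax(scores)
    return tf.matmul(weights, query)  # [B, Lc, D]
```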

pool_blocks.py

A collection of pooling blocks. They fuse/reduce along the time axis L: input is a sequence of shape [B, L, D]; output is of shape [B, D].

| Name | Dependencies | Description | Reference |
|---|---|---|---|
| `SWEM_pool` | | pooling over the input sequence; supports max/avg pooling and hierarchical avg-max pooling. | Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms |

There are also some convolution-based pooling blocks built on SWEM_pool, but they are experimental, so I do not list them here.
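
A minimal sketch of masked max/avg pooling in this spirit (`toy_swem_pool` is a hypothetical name, not the actual `SWEM_pool` implementation; it assumes a `seq_len` tensor of true sequence lengths is available):

```python
import tensorflow as tf

def toy_swem_pool(seq, seq_len, method='max'):
    """Masked pooling over the time axis: seq [B, L, D], seq_len [B] -> [B, D]."""
    mask = tf.sequence_mask(seq_len, tf.shape(seq)[1], dtype=tf.float32)
    mask = tf.expand_dims(mask, -1)  # [B, L, 1]
    if method == 'max':
        # push padded positions far below any real value before taking the max
        return tf.reduce_max(seq + (1.0 - mask) * -1e30, axis=1)
    # average over the true (unpadded) length only
    return tf.reduce_sum(seq * mask, axis=1) / tf.maximum(
        tf.reduce_sum(mask, axis=1), 1.0)
```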

embed_blocks.py

A collection of positional encoding blocks for sequences.

| Name | Dependencies | Description | Reference |
|---|---|---|---|
| `SinusPositional_embed` | | generates a sinusoid signal with the same length as the input sequence. | Attention is all you need |
| `Positional_embed` | | parameterizes the absolute positions of the tokens in the input sequence. | A Convolutional Encoder Model for Neural Machine Translation |
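
For reference, a minimal sketch of how such a sinusoid signal can be built, following the formula from "Attention is all you need" (`toy_sinus_embed` is a hypothetical name; `SinusPositional_embed` builds the signal to match the input sequence length):

```python
import numpy as np
import tensorflow as tf

def toy_sinus_embed(seq_len, dim):
    """Sinusoid position signal of shape [seq_len, dim]."""
    pos = np.arange(seq_len)[:, None].astype(np.float64)  # [L, 1]
    i = np.arange(dim)[None, :]                           # [1, D]
    # one frequency per pair of dimensions, geometrically spaced
    angle = pos / np.power(10000.0, (2 * (i // 2)) / dim)  # [L, D]
    angle[:, 0::2] = np.sin(angle[:, 0::2])  # even dimensions use sine
    angle[:, 1::2] = np.cos(angle[:, 1::2])  # odd dimensions use cosine
    return tf.constant(angle, dtype=tf.float32)           # [L, D]
```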

mulitask_blocks.py

A collection of multi-task learning blocks. So far only the "cross-stitch block" is available.

| Name | Dependencies | Description | Reference |
|---|---|---|---|
| `CrossStitch` | | a cross-stitch block, modeling the correlation & self-correlation of two tasks. | Cross-stitch Networks for Multi-task Learning |
| `Stack_CrossStitch` | `CrossStitch` | stacks multiple cross-stitch blocks together, with shared or separate inputs. | Cross-stitch Networks for Multi-task Learning |
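
A minimal sketch of the cross-stitch idea from Misra et al. (`toy_cross_stitch` and its near-identity initialization are illustrative assumptions, not necessarily the block's actual parameterization):

```python
import tensorflow as tf

def toy_cross_stitch(x1, x2, scope='toy_cross_stitch'):
    """Mix the activations of two tasks: each output is a learned
    linear combination of both inputs."""
    with tf.variable_scope(scope):
        # 2x2 mixing weights, initialized near identity so the two tasks
        # start almost independent and learn how much to share
        alpha = tf.get_variable(
            'alpha', initializer=tf.constant([[0.9, 0.1], [0.1, 0.9]]))
        y1 = alpha[0, 0] * x1 + alpha[0, 1] * x2
        y2 = alpha[1, 0] * x1 + alpha[1, 1] * x2
        return y1, y2
```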

nn.py

A collection of auxiliary functions, e.g. masking, normalizing, slicing.
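
For example, a masking helper in this spirit might look like the following sketch (`toy_mask_seq` is a hypothetical name, not necessarily what nn.py provides):

```python
import tensorflow as tf

def toy_mask_seq(seq, seq_len):
    """Zero out padded positions: seq [B, L, D], seq_len [B] -> [B, L, D]."""
    mask = tf.sequence_mask(seq_len, tf.shape(seq)[1], dtype=seq.dtype)
    return seq * tf.expand_dims(mask, -1)
```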

Run

Run app.py for a simple test on toy data.
