Adaptive LSTM (aLSTM)

PyTorch implementation of the adaptive LSTM (https://arxiv.org/abs/1805.08574), an extension of the standard LSTM that increases model flexibility through adaptive parameterization.

The aLSTM converges faster than the LSTM and generalizes better. It is also stable: gradient clipping is not needed, even for sequences of up to thousands of terms. For more info, see the paper or the informal write up.
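
As a rough illustration of what adaptive parameterization can look like, the sketch below makes a linear layer's effective weights depend on its input by rescaling the layer's input and output with vectors generated from a small latent variable. This is a simplified, hypothetical module (`AdaptiveLinear`, `policy`, `scale_in` and `scale_out` are illustrative names), not the code in this repo; see the paper for the actual aLSTM formulation.

```python
import torch
import torch.nn as nn


class AdaptiveLinear(nn.Module):
    """Hypothetical sketch: a linear map whose weights are rescaled per input."""

    def __init__(self, input_size, output_size, adapt_size):
        super().__init__()
        # Shared weight matrix W (no bias, to keep the sketch minimal).
        self.linear = nn.Linear(input_size, output_size, bias=False)
        # Assumed adaptation policy: a small latent summary of the input.
        self.policy = nn.Linear(input_size, adapt_size)
        # Projections from the latent to per-feature scaling vectors.
        self.scale_in = nn.Linear(adapt_size, input_size)
        self.scale_out = nn.Linear(adapt_size, output_size)

    def forward(self, x):
        z = torch.tanh(self.policy(x))
        d_in = self.scale_in(z)    # one scale per input feature
        d_out = self.scale_out(z)  # one scale per output feature
        # Same as applying the input-dependent weight diag(d_out) W diag(d_in).
        return torch.tanh(d_out * self.linear(d_in * x))


torch.manual_seed(0)
layer = AdaptiveLinear(input_size=8, output_size=10, adapt_size=3)
x = torch.rand(5, 8)
y = layer(x)
print(y.shape)  # torch.Size([5, 10])
```

The latent size (`adapt_size`) is kept small here so that the adaptation adds few parameters relative to the main weight matrix.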

If you use this code or our results in your research, please cite

@article{Flennerhag:2018alstm,
  title   = {{Breaking the Activation Function Bottleneck through Adaptive Parameterization}},
  author  = {Flennerhag, Sebastian and Yin, Hujun and Keane, John and Elliot, Mark},
  journal = {{arXiv preprint arXiv:1805.08574}},
  year    = {2018}
}

Requirements

This implementation should run on any PyTorch version. It has been tested for v2–v4. To install:

git clone https://github.com/flennerhag/alstm; cd alstm
python setup.py install

Usage

This implementation follows the LSTM implementation in the official (and constantly changing) PyTorch repo. It provides an alstm_cell function and its aLSTMCell module wrapper, both of which operate on a single time step. The aLSTM class provides an end-user API with variational dropout and our hybrid RHN-LSTM adaptation model for multi-layer aLSTMs.

import torch
from torch.autograd import Variable
from alstm import aLSTM

seq_len, batch_size, input_size, hidden_size, adapt_size, output_size = 20, 5, 8, 10, 3, 7

alstm = aLSTM(input_size, hidden_size, adapt_size, output_size, nlayers=2)

# Input is laid out as (seq_len, batch_size, input_size)
X = Variable(torch.rand(seq_len, batch_size, input_size))
out, hidden = alstm(X)
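
The split between a per-time-step cell and a sequence-level wrapper, and the idea behind variational dropout (one dropout mask sampled per sequence and reused at every time step), can be sketched with PyTorch's built-in `torch.nn.LSTMCell` as a stand-in. The helper `run_sequence` and its `dropout` and `training` arguments are hypothetical; they do not reflect the signatures of `aLSTMCell` or `aLSTM`, and the mask is applied only to the per-step outputs to keep the sketch short.

```python
import torch
import torch.nn as nn


def run_sequence(cell, X, hidden_size, dropout=0.0, training=True):
    """Unroll a single-step cell over X of shape (seq_len, batch, input_size)."""
    seq_len, batch_size, _ = X.shape
    h = X.new_zeros(batch_size, hidden_size)
    c = X.new_zeros(batch_size, hidden_size)

    mask = None
    if training and dropout > 0.0:
        keep = 1.0 - dropout
        # Sample one mask per sequence and reuse it at every time step.
        mask = torch.bernoulli(X.new_full((batch_size, hidden_size), keep)) / keep

    outputs = []
    for t in range(seq_len):
        h, c = cell(X[t], (h, c))  # the cell only ever sees one time step
        outputs.append(h if mask is None else h * mask)
    return torch.stack(outputs), (h, c)


seq_len, batch_size, input_size, hidden_size = 20, 5, 8, 10
cell = nn.LSTMCell(input_size, hidden_size)
X = torch.rand(seq_len, batch_size, input_size)
out, (h, c) = run_sequence(cell, X, hidden_size, dropout=0.5)
print(out.shape)  # torch.Size([20, 5, 10])
```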

Examples

To replicate the original experiments of the aLSTM paper, see examples.

Contributions

If you spot a bug, think the docs are useless or have an idea for an extension, don't hesitate to send a PR! If your contribution is substantial, please raise an issue first to check that it is in line with the scope of this repo. Quick wins that would be great to have are:

  • Support for bidirectional aLSTM
  • Support PyTorch's PackedSequence
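
For context on the second item, this is how PackedSequence is used with PyTorch's built-in `torch.nn.LSTM` for batches of variable-length sequences. The sketch only shows the standard PyTorch behaviour that a contribution would presumably need to mirror; it does not use the aLSTM, which does not support this yet.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

seq_len, batch_size, input_size, hidden_size = 6, 3, 8, 10
X = torch.rand(seq_len, batch_size, input_size)  # padded batch, sequence first
lengths = [6, 4, 2]  # true sequence lengths, sorted in decreasing order

# Pack so that the LSTM skips the padded positions.
packed = pack_padded_sequence(X, lengths)
lstm = nn.LSTM(input_size, hidden_size)
packed_out, (h, c) = lstm(packed)

# Unpack back to a padded tensor; positions past each length are zero.
out, out_lengths = pad_packed_sequence(packed_out)
print(out.shape)  # torch.Size([6, 3, 10])
```

With packed input, the final hidden state `h` holds each sequence's state at its own last valid step rather than at the last padded step.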
