tbennun / keras-bucketed-sequence
Keras dataset reader (Sequence) using buckets for RNNs
License: BSD 3-Clause "New" or "Revised" License
Hi,
I am wondering whether there is a way to use this across multiple LSTMs feeding into one model, in the style of a Hierarchical Attention Network, as implemented in this blog post: https://richliao.github.io/supervised/classification/2016/12/26/textclassifier-HATN/
This currently works for a single LSTM layer by passing shape=(None, word_vector_dimension), but how can it be made to work for both the word-level LSTM and the sentence-level LSTM? A Hierarchical Attention Network uses one LSTM at the word level to encode the words of each sentence, with attention to determine which words in the sentence are important, and then another attention layer at the document level to determine which sentences are important out of all the sentences in a document. I can't get your code to work for both levels: when I try shape=(None, None) for the input at the "review" level (combining multiple sentences in one document), I get
AsTensorError: ('Cannot convert (-1, None) to TensorType', <class 'tuple'>)
For reference, here is my current code:
from keras.layers import Input, Dense, GRU, Bidirectional, TimeDistributed
from keras.models import Model
# AttLayer is the custom attention layer from the blog post linked above.

# Word-level encoder: a sentence is a variable-length sequence of 300-dim word vectors.
sentence_input = Input(shape=(None, 300))
l_lstm = Bidirectional(GRU(100, return_sequences=True))(sentence_input)
l_dense = TimeDistributed(Dense(200))(l_lstm)
l_att = AttLayer()(l_dense)
sentEncoder = Model(sentence_input, l_att)

# Sentence-level encoder: a document is a sequence of encoded sentences.
# review_input = Input(shape=(MAX_SENTS, MAX_SENT_LENGTH), dtype='int32')
review_input = Input(shape=(7, None), dtype='int32')
review_encoder = TimeDistributed(sentEncoder)(review_input)
l_lstm_sent = Bidirectional(GRU(100, return_sequences=True))(review_encoder)
l_dense_sent = TimeDistributed(Dense(200))(l_lstm_sent)
l_att_sent = AttLayer()(l_dense_sent)
preds = Dense(2, activation='softmax')(l_att_sent)
model = Model(review_input, preds)
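For concreteness, one way to sidestep the (None, None) shape problem is to pad every document in a bucket to the same number of sentences and words, so both nested dimensions become fixed-size tensors. A minimal padding sketch with toy data (all shapes hypothetical, using pre-embedded 300-dim word vectors rather than int32 indices):

```python
import numpy as np

# Toy data: 2 documents, each a list of sentences,
# each sentence an array of (num_words, 300) word vectors.
docs = [
    [np.ones((5, 300)), np.ones((3, 300))],  # document with 2 sentences
    [np.ones((4, 300))],                     # document with 1 sentence
]

max_sents = max(len(d) for d in docs)
max_words = max(len(s) for d in docs for s in d)

# Zero-pad every document to (max_sents, max_words, 300) so the whole
# bucket becomes one dense 4-D tensor that TimeDistributed can consume.
batch = np.zeros((len(docs), max_sents, max_words, 300), dtype=np.float32)
for i, doc in enumerate(docs):
    for j, sent in enumerate(doc):
        batch[i, j, :len(sent)] = sent

print(batch.shape)  # (2, 2, 5, 300)
```

Padding per bucket (rather than globally) keeps the wasted computation small, which is the same idea the bucketed sequence reader applies at the single-LSTM level.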
With some slight modifications to your code, I got bucketed-sequence to run. However, when I call model.predict, the length of the prediction output equals the number of words in my input string. Does this mean the model is training on stateful input (one timestep at a time, IIRC), or that my input to predict is automatically being split into multiple samples? I need my RNN to predict on a sequence of words, not on each word by itself.
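One common cause of this symptom (a sketch of a possible fix, not a diagnosis of your exact setup): Keras always treats axis 0 as the batch axis, so passing a single sequence of shape (num_words, features) to predict makes each word a separate sample. Adding an explicit batch dimension makes the whole sequence one sample:

```python
import numpy as np

# Hypothetical single sequence: 12 words, 300 features each.
seq = np.zeros((12, 300), dtype=np.float32)

# Passed directly, Keras would interpret this as 12 independent samples.
# expand_dims makes it one sample with 12 timesteps.
batch = np.expand_dims(seq, axis=0)
print(batch.shape)  # (1, 12, 300)
```

If your predictions have one entry per word after this change, the model itself is returning per-timestep output (e.g. a return_sequences=True layer at the top), which is a separate issue.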