zackhy / textclassification
Text classification using different neural networks (CNN, LSTM, Bi-LSTM, C-LSTM).
License: MIT License
Thanks for sharing this code. Could you please add support for pre-trained word embeddings in the embedding layer of the network? Having both options, random initialization and pre-trained word embeddings, would be nice.
Regards
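A minimal sketch of what the request amounts to, assuming the vocabulary is an ordered list of words and the pre-trained vectors are available as a plain dict. The names `build_embedding_matrix`, `vocab`, and `pretrained` are hypothetical and not part of the repository; the uniform range of ±0.25 for unknown words is just a common convention.

```python
import random


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Return one vector per vocabulary word, in vocabulary-index order.

    Words present in `pretrained` copy their vector; all other words are
    initialized randomly, so both options from the request are covered.
    """
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        vec = pretrained.get(word)
        if vec is None or len(vec) != dim:
            # Random fallback for words without a usable pre-trained vector.
            vec = [rng.uniform(-0.25, 0.25) for _ in range(dim)]
        matrix.append(list(vec))
    return matrix


# Hypothetical toy data, for illustration only.
vocab = ["<pad>", "good", "movie", "zzz"]
pretrained = {"good": [0.1, 0.2, 0.3], "movie": [0.4, 0.5, 0.6]}
matrix = build_embedding_matrix(vocab, pretrained, dim=3)
```

The resulting matrix could then serve as the initial value of the embedding variable, while an empty `pretrained` dict falls back to purely random initialization.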
Congratulations on the code; it helped a lot. How do I print the prediction made for each text?
Regards
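@zackhy Hello,
This library is very easy to use. I had it running within a few minutes on my own data, and the accuracy so far is good. Nice work!
However, your test.py evaluates a whole set of test data and reports the accuracy.
How would I write a minimal Python prediction script?
For example: python predict.py 'sentence to classify', with the class label as output, for example: 2
I have only recently started using TensorFlow, so any guidance would be appreciated. Thanks!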
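A rough sketch of such a prediction helper, assuming TensorFlow 1.x, a checkpoint stored under `<run_dir>/model/`, and the tensor names `input_x:0`, `keep_prob:0`, and `softmax/predictions:0` that appear in another issue on this page. The functions `pad_ids` and `predict` are hypothetical. Turning raw text into token ids must reuse the vocabulary from training and is not shown; the recurrent classifiers may also need extra inputs (for example sequence lengths) that this sketch does not feed.

```python
import os


def pad_ids(token_ids, max_len, pad_id=0):
    """Truncate or right-pad a list of token ids to exactly max_len."""
    return (list(token_ids) + [pad_id] * max_len)[:max_len]


def predict(run_dir, checkpoint, batch_ids):
    """Restore a checkpoint and return the predicted label id per row.

    batch_ids is a list of equally long token-id lists (see pad_ids).
    """
    import tensorflow as tf  # imported lazily; assumes the TF 1.x API

    ckpt = os.path.join(run_dir, 'model', checkpoint)
    graph = tf.Graph()
    with graph.as_default(), tf.Session(graph=graph) as sess:
        # Rebuild the graph from the .meta file, then load the weights.
        saver = tf.train.import_meta_graph('{}.meta'.format(ckpt))
        saver.restore(sess, ckpt)
        input_x = graph.get_tensor_by_name('input_x:0')
        keep_prob = graph.get_tensor_by_name('keep_prob:0')
        predictions = graph.get_tensor_by_name('softmax/predictions:0')
        # keep_prob = 1.0 disables dropout at inference time.
        feed = {input_x: batch_ids, keep_prob: 1.0}
        return sess.run(predictions, feed).tolist()
```

Printing the returned list gives one label id per input text, which also covers the question of how to see individual predictions rather than only the accuracy.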
Training works fine. How do I evaluate the trained model?
Thank you for your effort. I want to use a Bi-LSTM in the clstm model, but when I do, the following error is raised:
`Traceback (most recent call last):
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1292, in _do_call
return fn(*args)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1277, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1367, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: seq_lens(24) > input.dims(1)
[[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 219, in <module>
run_step(train_input, is_training=True)
File "train.py", line 198, in run_step
vars = sess.run(fetches, feed_dict)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 887, in run
run_metadata_ptr)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1110, in _run
feed_dict_tensor, options, run_metadata)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1286, in _do_run
run_metadata)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1308, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: seq_lens(24) > input.dims(1)
[[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]
Caused by op 'bidirectional_rnn/bw/ReverseSequence', defined at:
File "train.py", line 138, in <module>
classifier = clstm_clf(FLAGS)
File "C:\Users\Saja\Desktop\TextClassification-master\TextClassification-master\clstm_classifier.py", line 133, in __init__
sequence_length=self.sequence_length)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 466, in bidirectional_dynamic_rnn
inputs_reverse = nest.map_structure(_map_reverse, inputs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\nest.py", line 347, in map_structure
structure[0], [func(*x) for x in entries])
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\nest.py", line 347, in <listcomp>
structure[0], [func(*x) for x in entries])
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 464, in _map_reverse
batch_axis=batch_axis)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 453, in _reverse
seq_axis=seq_axis, batch_axis=batch_axis)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2645, in reverse_sequence
name=name)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 7984, in reverse_sequence
seq_dim=seq_dim, batch_dim=batch_dim, name=name)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\ops.py", line 3272, in create_op
op_def=op_def)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\ops.py", line 1768, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): seq_lens(24) > input.dims(1)
[[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]`
I took the implementation of Bi-LSTM from your code in rnn_classifier model:
`fw_cell = tf.contrib.rnn.LSTMCell(self.hidden_size)
bw_cell = tf.contrib.rnn.LSTMCell(self.hidden_size)
# Add dropout to LSTM cell
fw_cell = tf.contrib.rnn.DropoutWrapper(fw_cell, output_keep_prob=self.keep_prob)
bw_cell = tf.contrib.rnn.DropoutWrapper(bw_cell, output_keep_prob=self.keep_prob)
# Stacked LSTMs
fw_cell = tf.contrib.rnn.MultiRNNCell([fw_cell]*self.num_layers, state_is_tuple=True)
bw_cell = tf.contrib.rnn.MultiRNNCell([bw_cell]*self.num_layers, state_is_tuple=True)
self._initial_state_fw = fw_cell.zero_state(self.batch_size, dtype=tf.float32)
self._initial_state_bw = bw_cell.zero_state(self.batch_size, dtype=tf.float32)
with tf.name_scope('dynamic_rnn'):
    outputs, state, _ = tf.nn.static_bidirectional_rnn(
        fw_cell,
        bw_cell,
        tf.unstack(tf.transpose(rnn_inputs, perm=[1, 0, 2])),
        initial_state_fw=self._initial_state_fw,
        initial_state_bw=self._initial_state_bw,
        sequence_length=self.sequence_length,
        #dtype=tf.float32,
        scope='BiLSTM'
    )
    #outputs = tf.reshape(outputs, [-1, self.hidden_size * 2])
    self.outputs = outputs
out, state = tf.nn.bidirectional_dynamic_rnn(fw_cell,
                                             bw_cell,
                                             inputs=rnn_inputs,
                                             initial_state_fw=self._initial_state_fw,
                                             initial_state_bw=self._initial_state_bw,
                                             sequence_length=self.sequence_length)
state_fw = state[0]
state_bw = state[1]
output = tf.concat([state_fw[self.num_layers - 1].h, state_bw[self.num_layers - 1].h], 1)
self.final_state = output
# Softmax output layer
with tf.name_scope('softmax'):
    softmax_w = tf.get_variable('softmax_w', shape=[2 * self.hidden_size, self.num_classes], dtype=tf.float32)
    softmax_b = tf.get_variable('softmax_b', shape=[self.num_classes], dtype=tf.float32)
    # L2 regularization for output layer
    self.l2_loss += tf.nn.l2_loss(softmax_w)
    self.l2_loss += tf.nn.l2_loss(softmax_b)
    # logits
    self.logits = tf.matmul(self.final_state, softmax_w) + softmax_b
    predictions = tf.nn.softmax(self.logits)
    self.predictions = tf.argmax(predictions, 1, name='predictions')`
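One plausible reading of `seq_lens(24) > input.dims(1)` is that the sequence lengths still describe the original sentences (up to 24 tokens), while the convolution in front of the LSTM has already shortened the time dimension. The sketch below shows the arithmetic only, assuming a single filter size, stride 1, and no padding ('VALID'); the function names are hypothetical and this is not taken from the repository.

```python
def conv_output_length(length, filter_size):
    """Time steps left after a stride-1 'VALID' convolution."""
    return max(length - filter_size + 1, 0)


def clip_sequence_lengths(seq_lens, max_length, filter_size):
    """Map original sentence lengths to lengths valid after the convolution.

    The RNN input has conv_output_length(max_length, filter_size) time
    steps, so no sequence length may exceed that value.
    """
    reduced_max = conv_output_length(max_length, filter_size)
    return [min(conv_output_length(n, filter_size), reduced_max)
            for n in seq_lens]


# A batch padded to 24 tokens and convolved with a filter spanning 3 tokens
# leaves 22 time steps, so a length of 24 can no longer be passed through.
adjusted = clip_sequence_lengths([24, 10, 2], max_length=24, filter_size=3)
print(adjusted)  # [22, 8, 0]
```

If this is indeed the cause, feeding the adjusted lengths to `bidirectional_dynamic_rnn` instead of the original ones should satisfy the shape check.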
Hi,
I tried to implement SavedModel export in test.py, but couldn't fix the issue below.
sess = tf.Session()
# folder to export SavedModel
SavedModel_folder = "SavedModel"
# remove all explicit device specifications
clear_devices = True
# builds the SavedModel protocol buffer and saves variables and assets
builder = tf.saved_model.builder.SavedModelBuilder(SavedModel_folder)
# Restore metagraph
saver = tf.train.import_meta_graph('{}.meta'.format(os.path.join(FLAGS.run_dir, 'model', FLAGS.checkpoint)))
# Restore weights
saver.restore(sess, os.path.join(FLAGS.run_dir, 'model', FLAGS.checkpoint))
# Get tensors
input_x = graph.get_tensor_by_name('input_x:0')
input_y = graph.get_tensor_by_name('input_y:0')
keep_prob = graph.get_tensor_by_name('keep_prob:0')
predictions = graph.get_tensor_by_name('softmax/predictions:0')
accuracy = graph.get_tensor_by_name('accuracy/accuracy:0')
legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
prediction_signature = tf.saved_model.signature_def_utils.build_signature_def(
inputs={'input_x:0': input_x, "input_y:0": input_y})
signature_def_map ={"serving_default": prediction_signature}
builder.add_meta_graph_and_variables(sess=sess,
tags=[tf.saved_model.tag_constants.SERVING],
signature_def_map=signature_def_map, legacy_init_op=legacy_init_op)
# writes a SavedModel protocol buffer to disk
builder.save()
This fails with: "TypeError: Parameter to CopyFrom() must be instance of same class: expected tensorflow.TensorInfo got Tensor."
Could you fix this, or add a file that exports to SavedModel?
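The TypeError suggests that `build_signature_def` was given raw tensors where it expects `TensorInfo` protos. A hedged sketch of an export that wraps each tensor with `tf.saved_model.utils.build_tensor_info` first, assuming TensorFlow 1.x and the tensor names quoted in this issue. The signature keys and the `export_saved_model` function are hypothetical; the label placeholder is left out because serving does not need it.

```python
import os

# Hypothetical signature keys mapped to the tensor names quoted in the issue.
SIGNATURE_INPUTS = {'input_x': 'input_x:0', 'keep_prob': 'keep_prob:0'}
SIGNATURE_OUTPUTS = {'predictions': 'softmax/predictions:0'}


def export_saved_model(run_dir, checkpoint, export_dir):
    """Restore a checkpoint and write it out as a SavedModel."""
    import tensorflow as tf  # imported lazily; assumes the TF 1.x API

    ckpt = os.path.join(run_dir, 'model', checkpoint)
    graph = tf.Graph()
    with graph.as_default(), tf.Session(graph=graph) as sess:
        saver = tf.train.import_meta_graph('{}.meta'.format(ckpt),
                                           clear_devices=True)
        saver.restore(sess, ckpt)

        def tensor_infos(names):
            # build_signature_def expects TensorInfo protos, not Tensors.
            return {key: tf.saved_model.utils.build_tensor_info(
                        graph.get_tensor_by_name(name))
                    for key, name in names.items()}

        signature = tf.saved_model.signature_def_utils.build_signature_def(
            inputs=tensor_infos(SIGNATURE_INPUTS),
            outputs=tensor_infos(SIGNATURE_OUTPUTS),
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

        # export_dir must not exist yet; the builder creates it.
        builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
        key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
        builder.add_meta_graph_and_variables(
            sess,
            tags=[tf.saved_model.tag_constants.SERVING],
            signature_def_map={key: signature})
        builder.save()
```

Creating the session on an explicit `graph` object also avoids relying on a `graph` variable that the snippet in the issue never defines.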
TypeError: flag value must be string, found "<class 'int'>"
Hello @zackhy, when I run test.py it gives me this error.
Hello @zackhy
When I run the following command:
python test.py --test_data_file=./data/data.csv --run_dir=./runs/1111111111 --clf=clf-10000
I get this error: UnrecognizedFlagError: Unknown command line flag 'clf'
So I added the following line to test.py:
tf.flags.DEFINE_string('clf', 'cnn', "Type of classifiers. Default: cnn. You have four choices: [cnn, lstm, blstm, clstm]")
Do I need to change anything else? Thank you.
I am interested in using this code but would like to use a list of coordinates (or a list of lists) as an input for the LSTM and the Bi-LSTM.
What is the best way to do so?
How should the flags (such as vocabulary size and minimum frequency) be changed?
Thanks
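One way to approach the question above, sketched under the assumption that each sample is a variable-length list of (x, y) pairs: pad every sequence to a fixed length and keep the true lengths for the recurrent layer. With numeric inputs like these, the embedding lookup and the vocabulary-related flags would presumably no longer apply. `pad_coordinate_sequences` is a hypothetical helper, not part of the repository.

```python
def pad_coordinate_sequences(sequences, max_len, pad_value=(0.0, 0.0)):
    """Pad or truncate each coordinate sequence to max_len points.

    Returns (padded, lengths), where lengths holds the number of real
    points per sequence so the RNN can ignore the padding.
    """
    padded, lengths = [], []
    for seq in sequences:
        points = [tuple(p) for p in seq[:max_len]]
        lengths.append(len(points))
        padded.append(points + [tuple(pad_value)] * (max_len - len(points)))
    return padded, lengths


# Hypothetical toy batch: two tracks with two and one point respectively.
batch = [[(1.0, 2.0), (3.0, 4.0)], [(5.0, 6.0)]]
padded, lengths = pad_coordinate_sequences(batch, max_len=3)
```

The padded batch has shape [batch, max_len, 2] and could be fed to the LSTM directly in place of the embedded word vectors.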
Both the training and testing command examples in README.md use the same ./data/data.csv. This seems misleading to me, because the training data should be different from the test data.
How about using two different CSV files:
python train.py --data_file=./data/data.train.csv
and
python test.py --test_data_file=./data/data.test.csv
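For readers who only have a single CSV file, a small sketch of producing the two files suggested above with the standard library. The function name, the 80/20 ratio, and the `has_header` switch are assumptions; whether data.csv has a header row is not stated here.

```python
import csv
import random


def split_csv(src, train_path, test_path, test_ratio=0.2, seed=42,
              has_header=True):
    """Shuffle the rows of src and write disjoint train and test files.

    Returns (number of training rows, number of test rows).
    """
    with open(src, newline='', encoding='utf-8') as f:
        rows = list(csv.reader(f))
    header, body = (rows[:1], rows[1:]) if has_header else ([], rows)
    # A fixed seed keeps the split reproducible across runs.
    random.Random(seed).shuffle(body)
    n_test = int(len(body) * test_ratio)
    parts = ((test_path, body[:n_test]), (train_path, body[n_test:]))
    for path, part in parts:
        with open(path, 'w', newline='', encoding='utf-8') as f:
            csv.writer(f).writerows(header + part)
    return len(body) - n_test, n_test
```

A call such as `split_csv('./data/data.csv', './data/data.train.csv', './data/data.test.csv')` would produce the two files used in the commands above.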
Thank you for your effort. How can I modify this code to support multi-label classification?
Thanks,
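A common way to go from single-label to multi-label classification is to replace the softmax and argmax over classes with an independent sigmoid per class and a threshold. The sketch below shows that idea in plain Python under this assumption; the function names and the 0.5 threshold are hypothetical, and nothing here is taken from the repository.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def predict_multilabel(logits, threshold=0.5):
    """Return every class index whose sigmoid probability passes threshold."""
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]


def multilabel_loss(logits, targets):
    """Mean sigmoid cross-entropy over classes; targets are 0/1 per class."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Stable form of -y*log(s) - (1-y)*log(1-s) with s = sigmoid(z).
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


labels = predict_multilabel([2.0, -1.0, 0.3])
print(labels)  # [0, 2]
```

In TensorFlow the corresponding loss is available as `tf.nn.sigmoid_cross_entropy_with_logits`, and the labels would become a 0/1 vector per sample rather than a single class id.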
Line 90 in d893212
params = FLAGS.flag_values_dict()
Hi zackhy,
May I ask if there are accuracy benchmarks of this project on some datasets like SST2, MR?
I'm asking because I tried running this project on SST2 and the performance was not very good. Specifically, training with the following config gives a test-set accuracy of around 0.7, whereas I believe the CNN model in the original paper reports above 0.8.
python3 train.py \
--data_file=./data/sst2/stsa.binary.train.csv \
--clf=cnn \
--embedding_size=300 \
--num_filters=100 \
--learning_rate 0.0005 \
--batch_size=50 \
--num_epochs=500 \
--evaluate_every_steps=10 \
--save_every_steps=200 \
--num_checkpoint=99999
Hi,
Is there an installation guide, something like the TF version and OS info?
thanks
May I ask what the accuracy of your implementation is, please?