
textclassification's People

Contributors

zackhy

textclassification's Issues

Pre-trained Embedding

Thanks for sharing this code. Could you please add support for pre-trained word embeddings in the embedding layer of the network? Having both options, random initialization and pre-trained word embeddings, would be nice.
Regards
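
One way the request could be approached is to build the initial embedding matrix outside the model: rows for words found in the pre-trained vectors are copied, and the rest are randomly initialized. The sketch below is only an illustration in plain Python. The function name `build_embedding_matrix`, the `vocab` list (index to word), the `pretrained` dict (word to vector), and the uniform range of 0.25 are all assumptions, not part of this repository.

```python
import random


def build_embedding_matrix(vocab, pretrained, embedding_size, seed=0):
    """Return one embedding row per vocabulary index.

    vocab:      list where position i holds the word with id i (assumed layout)
    pretrained: dict mapping word -> list of floats (e.g. loaded from a GloVe file)
    """
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        vector = pretrained.get(word)
        if vector is None:
            # Word has no pre-trained vector: fall back to random initialization.
            vector = [rng.uniform(-0.25, 0.25) for _ in range(embedding_size)]
        elif len(vector) != embedding_size:
            raise ValueError('vector for %r has size %d, expected %d'
                             % (word, len(vector), embedding_size))
        matrix.append(list(vector))
    return matrix
```

In TensorFlow 1.x the resulting matrix could then be handed to the embedding variable through an initializer such as `tf.constant_initializer`, while the randomly initialized path stays available when no pre-trained file is given.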

Print predictions

Congratulations on the code, it helped a lot. How do I print the prediction made for each text?

Regards

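Hello, how do I predict a single sentence?

@zackhy Hello,
This is a very easy-to-use library. I got it running within a few minutes on my own data, and the accuracy so far is good. Nice work!
However, your test.py evaluates a whole set of data and reports the accuracy.
How would I write a minimal Python prediction script?
For example: python predict.py 'sentence to classify', with the class label as output, for example: 2
I have only recently started using TensorFlow and would appreciate your guidance. Thank you!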
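
The model-independent part of such a predict.py can be sketched as follows: turn one sentence into a padded id sequence, then map the model's class scores back to a label. This is a hedged illustration, not the repository's preprocessing. The whitespace tokenizer, the `word_to_id` dict, the use of id 0 for padding and unknown words, and both function names are assumptions.

```python
def sentence_to_ids(sentence, word_to_id, max_length, unk_id=0, pad_id=0):
    """Convert one sentence to a fixed-length list of word ids plus its true length."""
    tokens = sentence.lower().split()          # assumed tokenizer: lowercase + whitespace
    ids = [word_to_id.get(token, unk_id) for token in tokens][:max_length]
    length = len(ids)                          # length before padding
    ids = ids + [pad_id] * (max_length - length)
    return ids, length


def predicted_label(scores, labels):
    """Return the label whose score is highest (the argmax of the class scores)."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]
```

In a full script, the id list would be fed as a batch of one to the restored graph (the SavedModel issue on this page uses the tensor names `input_x:0`, `keep_prob:0` and `softmax/predictions:0`), with the keep probability set to 1.0, and the returned class index printed.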

evaluation

Training works fine. How do I evaluate the trained model?

Using Bi-LSTM with clstm model

Thank you for your effort. I want to use a Bi-LSTM with the clstm model, but when I do, the following error is raised:
`Traceback (most recent call last):
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1292, in _do_call
return fn(*args)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1277, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1367, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: seq_lens(24) > input.dims(1)
[[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 219, in <module>
run_step(train_input, is_training=True)
File "train.py", line 198, in run_step
vars = sess.run(fetches, feed_dict)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 887, in run
run_metadata_ptr)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1110, in _run
feed_dict_tensor, options, run_metadata)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1286, in _do_run
run_metadata)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1308, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: seq_lens(24) > input.dims(1)
[[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]

Caused by op 'bidirectional_rnn/bw/ReverseSequence', defined at:
File "train.py", line 138, in <module>
classifier = clstm_clf(FLAGS)
File "C:\Users\Saja\Desktop\TextClassification-master\TextClassification-master\clstm_classifier.py", line 133, in __init__
sequence_length=self.sequence_length)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 466, in bidirectional_dynamic_rnn
inputs_reverse = nest.map_structure(_map_reverse, inputs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\nest.py", line 347, in map_structure
structure[0], [func(*x) for x in entries])
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\nest.py", line 347, in <listcomp>
structure[0], [func(*x) for x in entries])
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 464, in _map_reverse
batch_axis=batch_axis)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 453, in _reverse
seq_axis=seq_axis, batch_axis=batch_axis)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2645, in reverse_sequence
name=name)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 7984, in reverse_sequence
seq_dim=seq_dim, batch_dim=batch_dim, name=name)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\ops.py", line 3272, in create_op
op_def=op_def)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\ops.py", line 1768, in __init__
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): seq_lens(24) > input.dims(1)
[[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]`

I took the implementation of Bi-LSTM from your code in rnn_classifier model:

`fw_cell = tf.contrib.rnn.LSTMCell(self.hidden_size)
bw_cell = tf.contrib.rnn.LSTMCell(self.hidden_size)
# Add dropout to LSTM cell
fw_cell = tf.contrib.rnn.DropoutWrapper(fw_cell, output_keep_prob=self.keep_prob)
bw_cell = tf.contrib.rnn.DropoutWrapper(bw_cell, output_keep_prob=self.keep_prob)
# Stacked LSTMs
fw_cell = tf.contrib.rnn.MultiRNNCell([fw_cell]*self.num_layers, state_is_tuple=True)
bw_cell = tf.contrib.rnn.MultiRNNCell([bw_cell]*self.num_layers, state_is_tuple=True)

self._initial_state_fw = fw_cell.zero_state(self.batch_size, dtype=tf.float32)
self._initial_state_bw = bw_cell.zero_state(self.batch_size, dtype=tf.float32)
with tf.name_scope('dynamic_rnn'):

    outputs, state, _ = tf.nn.static_bidirectional_rnn(
        fw_cell,
        bw_cell,
        tf.unstack(tf.transpose(rnn_inputs, perm=[1, 0, 2])),
        initial_state_fw=self._initial_state_fw,
        initial_state_bw=self._initial_state_bw,
        sequence_length=self.sequence_length,
        #dtype=tf.float32,
        scope='BiLSTM'
        )
    #outputs = tf.reshape(outputs, [-1, self.hidden_size * 2])
    self.outputs = outputs

out, state = tf.nn.bidirectional_dynamic_rnn(fw_cell,
                                             bw_cell,
                                             inputs=rnn_inputs,
                                             initial_state_fw=self._initial_state_fw,
                                             initial_state_bw=self._initial_state_bw,
                                             sequence_length=self.sequence_length)

state_fw = state[0]
state_bw = state[1]
output = tf.concat([state_fw[self.num_layers - 1].h, state_bw[self.num_layers - 1].h], 1)

self.final_state = output

# Softmax output layer
with tf.name_scope('softmax'):

    softmax_w = tf.get_variable('softmax_w', shape=[2 * self.hidden_size, self.num_classes], dtype=tf.float32)
    softmax_b = tf.get_variable('softmax_b', shape=[self.num_classes], dtype=tf.float32)

    # L2 regularization for output layer
    self.l2_loss += tf.nn.l2_loss(softmax_w)
    self.l2_loss += tf.nn.l2_loss(softmax_b)

    # logits
    self.logits = tf.matmul(self.final_state, softmax_w) + softmax_b
    predictions = tf.nn.softmax(self.logits)
    self.predictions = tf.argmax(predictions, 1, name='predictions')`
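
The message `seq_lens(24) > input.dims(1)` says that a sequence length of 24 is larger than the time dimension of the tensor handed to the backward RNN. One possible cause, assuming the convolution in the clstm model is unpadded (VALID) so that its output has fewer time steps than the padded input, is that the raw sentence lengths are passed on unchanged. The sketch below only illustrates that arithmetic. The helper names and the idea of clipping the lengths are assumptions, not the author's fix.

```python
def conv_output_length(input_length, filter_size, stride=1):
    """Number of time steps left after an unpadded (VALID) 1-D convolution."""
    if input_length < filter_size:
        return 0
    return (input_length - filter_size) // stride + 1


def rnn_sequence_lengths(raw_lengths, max_length, filter_sizes):
    """Clip per-example lengths so none exceeds the convolution output length."""
    # With several filter sizes, the shortest feature map comes from the largest filter.
    time_steps = conv_output_length(max_length, max(filter_sizes))
    return [min(length, time_steps) for length in raw_lengths]
```

For example, with inputs padded to 24 tokens and filter sizes 3, 4 and 5, the RNN would see at most 20 time steps, so a length of 24 has to be reduced before it is used as `sequence_length`.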

SavedModel implementation

Hi,

I tried to implement SavedModel export in test.py, but couldn't fix the following issue.

sess = tf.Session()

# folder to export SavedModel
SavedModel_folder = "SavedModel"

# remove all explicit device specifications
clear_devices = True

# builds the SavedModel protocol buffer and saves variables and assets
builder = tf.saved_model.builder.SavedModelBuilder(SavedModel_folder)

# Restore metagraph
saver = tf.train.import_meta_graph('{}.meta'.format(os.path.join(FLAGS.run_dir, 'model', FLAGS.checkpoint)))
# Restore weights
saver.restore(sess, os.path.join(FLAGS.run_dir, 'model', FLAGS.checkpoint))

# Get tensors
input_x = graph.get_tensor_by_name('input_x:0')
input_y = graph.get_tensor_by_name('input_y:0')
keep_prob = graph.get_tensor_by_name('keep_prob:0')
predictions = graph.get_tensor_by_name('softmax/predictions:0')
accuracy = graph.get_tensor_by_name('accuracy/accuracy:0')

legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')

prediction_signature = tf.saved_model.signature_def_utils.build_signature_def(
        inputs={'input_x:0': input_x, "input_y:0": input_y})

signature_def_map ={"serving_default": prediction_signature}

builder.add_meta_graph_and_variables(sess=sess,
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map=signature_def_map, legacy_init_op=legacy_init_op)

# writes a SavedModel protocol buffer to disk
builder.save()

"TypeError: Parameter to CopyFrom() must be instance of same class: expected tensorflow.TensorInfo got Tensor."

Can you fix this, or add a file that exports to SavedModel?
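
The TypeError indicates that `build_signature_def` received raw `Tensor` objects where it expects `TensorInfo` protos. A minimal sketch of an export that wraps each tensor with `tf.saved_model.utils.build_tensor_info` is shown below, assuming TensorFlow 1.x and the tensor names used in the snippet above. The function names `checkpoint_prefix` and `export_saved_model` are hypothetical, and the choice of signature inputs and outputs is an assumption.

```python
import os


def checkpoint_prefix(run_dir, checkpoint):
    """Path prefix of a checkpoint stored under <run_dir>/model/."""
    return os.path.join(run_dir, 'model', checkpoint)


def export_saved_model(run_dir, checkpoint, export_dir):
    """Restore a checkpoint and write it out as a SavedModel (TensorFlow 1.x assumed)."""
    import tensorflow as tf  # imported here so the path helper works without TensorFlow

    prefix = checkpoint_prefix(run_dir, checkpoint)
    graph = tf.Graph()
    with graph.as_default(), tf.Session(graph=graph) as sess:
        # Restore the graph structure and the trained weights.
        saver = tf.train.import_meta_graph(prefix + '.meta')
        saver.restore(sess, prefix)

        # Tensor names taken from the issue text above.
        input_x = graph.get_tensor_by_name('input_x:0')
        keep_prob = graph.get_tensor_by_name('keep_prob:0')
        predictions = graph.get_tensor_by_name('softmax/predictions:0')

        # Signatures take TensorInfo protos, not Tensor objects.
        info = tf.saved_model.utils.build_tensor_info
        signature = tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'input_x': info(input_x), 'keep_prob': info(keep_prob)},
            outputs={'predictions': info(predictions)},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

        # export_dir must not exist yet; the builder creates it.
        builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
        builder.add_meta_graph_and_variables(
            sess,
            tags=[tf.saved_model.tag_constants.SERVING],
            signature_def_map={
                tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature})
        builder.save()
```

The RNN-based classifiers may need further placeholders in `inputs` (for instance batch size or sequence length), depending on what the restored graph requires at prediction time.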

Type Error

TypeError: flag value must be string, found "<class 'int'>"
Hello @zackhy, when I run test.py it gives me this error.

invalid file:<absl.flags._flag.Flag object at 0x0000000011E2E898>

Hello @zackhy,
when I run the following command
python test.py --test_data_file=./data/data.csv --run_dir=./runs/1111111111 --clf=clf-10000
I get this error: UnrecognizedFlagError: Unknown command line flag 'clf'
So I just added the following line to test.py:
tf.flags.DEFINE_string('clf', 'cnn', "Type of classifiers. Default: cnn. You have four choices: [cnn, lstm, blstm, clstm]")

Change input into a list of coordinates

I am interested in using this code but would like to use a list of coordinates (or a list of lists) as the input for the LSTM and the Bi-LSTM.
What is the best way to do so?
How should the flags (such as vocabulary size and minimum frequency) be changed?

Thanks

Train/test examples in README.md

Both the training and testing command examples in README.md use the same ./data/data.csv. This seems misleading to me, because the training data should be different from the test data.
How about using two different CSV files:
python train.py --data_file=./data/data.train.csv
and
python test.py --test_data_file=./data/data.test.csv
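
The two files suggested above could be produced from a single data.csv with a small script like the following. This is only a sketch using the standard library. The function name `split_csv`, the 80/20 ratio, the fixed seed and the `has_header` switch are assumptions, and the file names are the ones proposed in this issue.

```python
import csv
import random


def split_csv(source, train_path, test_path, test_fraction=0.2, seed=42, has_header=False):
    """Shuffle the rows of `source` and write disjoint train and test CSV files."""
    with open(source, newline='', encoding='utf-8') as handle:
        rows = list(csv.reader(handle))

    header = rows[:1] if has_header else []
    body = rows[1:] if has_header else rows

    random.Random(seed).shuffle(body)          # reproducible shuffle
    n_test = int(len(body) * test_fraction)
    test_rows, train_rows = body[:n_test], body[n_test:]

    for path, part in ((train_path, train_rows), (test_path, test_rows)):
        with open(path, 'w', newline='', encoding='utf-8') as handle:
            csv.writer(handle).writerows(header + part)
    return len(train_rows), len(test_rows)
```

Calling `split_csv('./data/data.csv', './data/data.train.csv', './data/data.test.csv')` once before training would give the two commands above non-overlapping inputs.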

Multi-label classification

Thank you for your effort. How can I modify this code to support multi-label classification?

Thanks,
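
The output layer quoted earlier on this page applies a softmax followed by an argmax, which picks exactly one class per example. A common alternative for multi-label problems, offered here as a general technique rather than something this repository provides, is an independent sigmoid per class with a binary cross-entropy loss and a decision threshold. The plain-Python sketch below shows the idea. The function names and the 0.5 threshold are assumptions.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_predict(logits, threshold=0.5):
    """Return the indices of every class whose probability reaches the threshold."""
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]


def sigmoid_cross_entropy(logits, targets):
    """Mean binary cross-entropy over classes, in a numerically stable form."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Equivalent to -[y*log(p) + (1-y)*log(1-p)] with p = sigmoid(z).
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)
```

In TensorFlow 1.x the same loss is available as `tf.nn.sigmoid_cross_entropy_with_logits`, and the labels would be multi-hot vectors instead of a single class index.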

performance benchmarks

Hi zackhy,

May I ask if there are accuracy benchmarks for this project on datasets such as SST2 or MR?

I'm asking because I ran this project on SST2 and the performance was not very good. Specifically, training with the following config, I get an accuracy of around 0.7 on the test set, whereas I believe the original paper reports above 0.8 for the CNN model.

python3 train.py \
--data_file=./data/sst2/stsa.binary.train.csv \
--clf=cnn \
--embedding_size=300 \
--num_filters=100 \
--learning_rate 0.0005 \
--batch_size=50 \
--num_epochs=500 \
--evaluate_every_steps=10 \
--save_every_steps=200 \
--num_checkpoint=99999

ACCURACY

May I ask what the accuracy of your implementation is, please?
