
automated-essay-grading's Introduction

Automated Essay Grading

Source code for the paper "A Memory-Augmented Neural Model for Automated Grading", published at L@S 2017. Note that a recent check-in updates the code from Python 2.5 to Python 3.7.

The dataset comes from the Kaggle ASAP competition. You can download the data from the link below.

https://www.kaggle.com/c/asap-aes/data

GloVe embeddings are used in this work. Specifically, the 42B 300d vectors gave the best results. You can download the embeddings from the link below.

https://nlp.stanford.edu/projects/glove/

Get Started

git clone https://github.com/siyuanzhao/automated-essay-grading.git
  • Download training data file 'training_set_rel3.tsv' from Kaggle and put it under the root folder of this repo.

  • Download 'glove.42B.300d.zip' from https://nlp.stanford.edu/projects/glove/ and unzip all files into 'glove/' folder.
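Once the files are in place, the embeddings can be sanity-checked before training. The following is a minimal sketch, not the loader used by this repo: the function name load_glove and the path 'glove/glove.42B.300d.txt' are assumptions for illustration. It reads the text file into a plain dictionary and passes the encoding explicitly.

```python
import io


def load_glove(path, dim=300, limit=None):
    """Read GloVe text vectors into a dict of word -> list of floats.

    Hypothetical helper for illustration; not part of this repo.
    """
    vectors = {}
    # Pass the encoding explicitly so Python does not fall back to a
    # platform default such as cp1252 on Windows.
    with io.open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle):
            if limit is not None and line_number >= limit:
                break
            # Split from the right so a token that contains spaces stays intact.
            parts = line.rstrip("\n").rsplit(" ", dim)
            if len(parts) != dim + 1:
                continue  # skip malformed lines
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


# Example (assumed path): load only the first 1000 vectors as a quick check.
# vectors = load_glove("glove/glove.42B.300d.txt", dim=300, limit=1000)
```

The limit argument keeps the check fast, since the full 42B file is several gigabytes.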

Requirements

  • Tensorflow 1.10
  • scikit-learn 0.19
  • six 1.10.0
  • python 3.7

Usage

# Train the model on an essay set <essay_set_id>
python cv_train.py --essay_set_id <essay_set_id>

There are several flags within cv_train.py. Below is an example of training the model on essay set 1 with a specific learning rate and number of epochs.

python cv_train.py --essay_set_id 1 --learning_rate 0.005 --epochs 200

Check all available flags with the following command.

python cv_train.py -h

Note: The model is trained on the training data with 5-fold cross validation. By default, the output layer of the model is a classification layer. There is another model whose output layer is a regression layer in memn2n_kv_regression.py. To train the model with the regression output layer, set flag is_regression to True. For example,

python cv_train.py --essay_set_id 1 --learning_rate 0.005 --epochs 200 --is_regression True
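For readers unfamiliar with 5-fold cross validation, here is a minimal sketch in plain Python. It is an illustration only: cv_train.py may well use scikit-learn for this, and the function name k_fold_indices and the seed are assumptions. Each essay index lands in the held-out fold exactly once, and the model is trained on the remaining four folds.

```python
import random


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation.

    Hypothetical helper for illustration; not part of this repo.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds roughly equal groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test_idx = sorted(folds[held_out])
        train_idx = sorted(
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        )
        yield train_idx, test_idx


for fold_number, (train_idx, test_idx) in enumerate(k_fold_indices(10)):
    print(fold_number, len(train_idx), len(test_idx))
```

The score reported for an essay set would then typically be averaged over the five held-out folds.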

automated-essay-grading's People

Contributors

siyuanzhao


automated-essay-grading's Issues

UnicodeDecode Error

Hi,

I got the error below while reading the training data:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 80: invalid start byte

I tried everything I could think of but could not read the file.
Finally I converted the file from tsv to txt and could read it (but I feel this is a workaround rather than a solution).

I got a similar error while reading the GloVe file:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2438: character maps to <undefined>

Please provide help on this.

Regards,
Pooja
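A possible approach, offered as a sketch rather than a confirmed fix: byte 0x92 is a right single quotation mark in Windows-1252, which suggests the TSV file is not UTF-8. The 'charmap' error on the GloVe file suggests Python fell back to a Windows code page there, so opening that file with encoding="utf-8" may help. The helper name read_tsv_rows below is hypothetical, and the choice of encoding is an assumption about the data file.

```python
import csv
import io


def read_tsv_rows(path, encoding="latin-1"):
    """Read a tab-separated file into a list of dicts keyed by header name.

    Hypothetical helper for illustration; not part of this repo.
    latin-1 maps every byte to a character, so decoding never fails.
    Pass encoding="cp1252" to turn byte 0x92 into a typographic apostrophe.
    """
    with io.open(path, encoding=encoding, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return list(reader)


# Example (file name taken from the README):
# rows = read_tsv_rows("training_set_rel3.tsv")
# print(len(rows), sorted(rows[0].keys()))
```

Note that cp1252 leaves a few byte values undefined, so latin-1 is the safer default if the file contains stray bytes.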

Running error

There is no rate parameter in tf.nn.dropout for TensorFlow 1.10.
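In TensorFlow 1.10, tf.nn.dropout takes keep_prob (the probability of keeping a unit) instead of rate (the probability of dropping it), so the two are related by keep_prob = 1 - rate. A minimal sketch of the conversion follows; the helper name keep_prob_from_rate and the variable hidden in the comment are assumptions for illustration.

```python
def keep_prob_from_rate(rate):
    """Convert a dropout rate into the keep probability used by older TensorFlow.

    Hypothetical helper for illustration; not part of this repo.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1), got %r" % (rate,))
    return 1.0 - rate


# With TensorFlow 1.10 installed, the call site would then look like:
#   hidden = tf.nn.dropout(hidden, keep_prob=keep_prob_from_rate(0.1))
print(keep_prob_from_rate(0.1))
```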

AttributeError: _parse_flags in __getattr__

After running python cv_train.py --essay_set_id 1 in terminal, I get this error:

Traceback (most recent call last):
  File "cv_train.py", line 35, in <module>
    FLAGS._parse_flags()
  File "/home/aruhi/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/platform/flags.py", line 85, in __getattr__
    return wrapped.__getattr__(name)
  File "/home/aruhi/tensorflow/venv/lib/python3.6/site-packages/absl/flags/_flagvalues.py", line 470, in __getattr__
    raise AttributeError(name)
AttributeError: _parse_flags

Is logits_bias trainable?

Hello, I am confused about the variable logits_bias. It seems that you make logits_bias not trainable in your code. (At line 170 in memn2n_kv.py)

logits_bias = tf.get_variable('logits_bias', [score_range])

Normally, shouldn't bias be trainable?

Other possible mistakes I found:

1. sent_size = sent_size_list.pop(score_idx)

Here sent_size_list is the list for all data (train + test), but score_idx is an index into the training data.

2. cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=tf.cast(self._labels, tf.float32), name='cross_entropy')

Why are the logits used here to calculate the cross entropy, rather than the softmax of the logits?
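On the cross entropy question: tf.nn.softmax_cross_entropy_with_logits_v2 expects unscaled logits and applies the softmax internally, so passing softmax outputs would apply the softmax twice. The plain-Python sketch below illustrates the arithmetic only. It is not TensorFlow's implementation, and the function names are made up for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def cross_entropy_from_logits(logits, labels):
    """Cross entropy computed directly from logits via log-sum-exp."""
    shift = max(logits)
    log_norm = shift + math.log(sum(math.exp(value - shift) for value in logits))
    return -sum(label * (value - log_norm) for value, label in zip(logits, labels))


def cross_entropy_from_probs(probs, labels):
    """Cross entropy computed from probabilities that already sum to one."""
    return -sum(label * math.log(p) for p, label in zip(probs, labels) if label > 0)


logits = [2.0, 1.0, 0.1]
labels = [1.0, 0.0, 0.0]
direct = cross_entropy_from_logits(logits, labels)
two_step = cross_entropy_from_probs(softmax(logits), labels)
doubled = cross_entropy_from_logits(softmax(logits), labels)
print(direct, two_step, doubled)
```

The first two values agree, while the third (softmax applied twice) differs, which is why the raw logits are the right input.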

Thanks!

Array element with a sequence

Followed the instructions. Using:
python2.7
tensorflow 1.3
Ubuntu 16.04
sklearn 0.19.1

I can show the versions of the rest of the packages if that would help.

Got this error:

Traceback (most recent call last):
  File "cv_train.py", line 245, in <module>
    sess.run(tf.global_variables_initializer(), feed_dict={model.w_placeholder: word2vec})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 937, in _run
    np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/numeric.py", line 531, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.

any suggestions would be helpful.
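One common cause of "setting an array element with a sequence" is a nested list whose rows have different lengths, which NumPy cannot turn into a rectangular array. Whether that is what happened to word2vec here is an assumption, not something the traceback confirms. A minimal check in plain Python, using a hypothetical helper name:

```python
def find_ragged_rows(rows, expected_len=None):
    """Return (index, length) for every row whose length differs from expected_len.

    Hypothetical helper for illustration; not part of this repo.
    If expected_len is None, the length of the first row is used.
    """
    if not rows:
        return []
    if expected_len is None:
        expected_len = len(rows[0])
    return [
        (index, len(row))
        for index, row in enumerate(rows)
        if len(row) != expected_len
    ]


# Toy embedding matrix with one short row.
toy_embeddings = [[0.1, 0.2], [0.3], [0.4, 0.5]]
print(find_ragged_rows(toy_embeddings))
```

Running the same check on the embedding list with expected_len=300 before calling sess.run would show whether any word vector has the wrong size.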

syntax error

Got the following error while trying to run cv_train.py
E = data_utils.vectorize_data(essay_list, word_idx, max_sent_size)
^
SyntaxError: invalid syntax

kappa scores of 0

I've come back to this repo after some time and started testing again.

Some epoch folds are resulting in kappa scores of 0.

Is this known or expected?
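One possible explanation, assuming the reported metric is quadratic weighted kappa (the ASAP competition metric; the README does not say): kappa is exactly 0 whenever the model predicts the same score for every essay in a fold, because the observed agreement then equals the chance agreement. The sketch below is a plain-Python implementation for illustration, not the scoring code used by this repo.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Quadratic weighted kappa between two equal-length lists of integer ratings."""
    n = max_rating - min_rating + 1
    if n == 1:
        return 1.0  # only one possible rating, so agreement is trivial
    # Confusion matrix of rating pairs.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1
    total = float(len(rater_a))
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = ((i - j) ** 2) / float((n - 1) ** 2)
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected
    if denominator == 0.0:
        return 1.0  # both raters gave the same single rating everywhere
    return 1.0 - numerator / denominator


human = [1, 2, 3, 4]
constant_prediction = [2, 2, 2, 2]
print(quadratic_weighted_kappa(human, constant_prediction, 1, 4))
print(quadratic_weighted_kappa(human, human, 1, 4))
```

If a fold reports kappa 0, checking whether its predictions collapse to a single score would confirm or rule out this case.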

Saving Model

Hi Authors,

Is there a procedure for saving the weights of the model? I tried playing around but was not able to.

Please let me know at your earliest convenience.
