lettergram / sentence-classification
Sentence Classifications with Neural Networks
Home Page: https://austingwalters.com/neural-networks-to-production-from-an-engineer/
License: Other
Hi,
I'm following your guide "Neural Networks to Production, From an Engineer", but I cannot access the site. It returns a database error ("Error establishing a database connection").
Hello,
Thank you for this awesome repo. It gave me a better understanding of the different approaches to text classification.
Would it be possible to get access to the code used for the hyperparameter tuning?
Thank you.
Hi,
I was trying to test the pre-trained model (cnn). I successfully loaded your model with the following commands:
# Load the JSON architecture description and create the model
from keras.models import model_from_json

with open(model_name + '.json', 'r') as json_file:
    loaded_model_json = json_file.read()
model = model_from_json(loaded_model_json)
Then I tried to evaluate it with the following code:
test_comments, test_comments_category = get_custom_test_comments()
x_test, _, y_test, _ = encode_data(test_comments, test_comments_category,
                                   data_split=1.0,
                                   embedding_name=embedding_name,
                                   add_pos_tags_flag=pos_tags_flag)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
y_test = keras.utils.to_categorical(y_test, num_classes)
score = model.evaluate(x_test, y_test,
                       batch_size=batch_size, verbose=1)
The final model.evaluate call raised this error:
InvalidArgumentError: indices[13,490] = 22271 is not in [0, 15000)
[[Node: embedding_1_9/embedding_lookup = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@dropout_1_9/cond/Switch_1"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](embedding_1_9/embeddings/read, embedding_1_9/Cast, embedding_1_9/embedding_lookup/axis)]]
I think the cause is the word IDs contained in x_test: max_words is 15000, but the maximum value in x_test is far greater than 15000, so the embedding layer cannot look up any word whose ID is 15000 or above. As a workaround, I divided all the values of x_test by 100 and converted them to integers, after which the evaluation ran successfully.
Can you please tell me whether I am doing something wrong, or whether a different word encoding needs to be loaded?
Thanks for the help.
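For reference, dividing the IDs by 100 avoids the error but also points each word at a different embedding row, so the model sees unrelated words. One alternative is to map every out-of-range ID to a single "unknown" index. This is only a minimal sketch: the helper name clip_out_of_vocab is hypothetical, and using 0 as the unknown index is an assumption, not something confirmed by the repo.

```python
import numpy as np

MAX_WORDS = 15000  # vocabulary size reported in the error message: [0, 15000)
UNKNOWN_ID = 0     # assumption: index 0 stands for unknown words

def clip_out_of_vocab(x, max_words=MAX_WORDS, unknown_id=UNKNOWN_ID):
    """Replace every ID outside [0, max_words) with unknown_id (hypothetical helper)."""
    x = np.asarray(x)
    in_range = (x >= 0) & (x < max_words)
    return np.where(in_range, x, unknown_id)

# Example with the offending ID from the error message.
sample = [[1, 22271, 14999]]
cleaned = clip_out_of_vocab(sample)
```

Applied to x_test after padding and before model.evaluate, this keeps in-vocabulary words unchanged. Whether it matches how the model was trained depends on the encoding file used at training time.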
Hey there, is it possible to get the final dataset with splits? I intend to train a transformer model for classifying questions vs. statements. I can later create a pull request so that the model can be integrated into this wonderful repo of classifiers that you already have. Thanks!
Hi, I want to use the pre-trained model to classify my sentences, but I am not that familiar with deep learning.
I have a question:
When using the default word_encoding, a word that is not present among the dictionary keys is given the encoding 0. I looked at your default_word_encoding.json and found that multiple words share the value 0, and the same holds for other values. If this is intended, how does the model differentiate between new words and seen words that have the same encoded value?
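The situation described above can be reproduced with a small sketch. The dictionary contents and the helper encode_words are hypothetical and only illustrate the lookup behavior described in this issue, not the repo's actual code.

```python
def encode_words(words, word_encoding, unknown_id=0):
    """Look up each word; words missing from the dictionary get unknown_id."""
    return [word_encoding.get(word, unknown_id) for word in words]

# Hypothetical encoding in which a known word ("rare") already maps to 0.
word_encoding = {"the": 1, "is": 2, "rare": 0}

encoded = encode_words(["the", "rare", "zzz"], word_encoding)
```

Here "rare" (seen) and "zzz" (unseen) both become 0. An embedding layer receives only the integer, so it looks up the same row for both and cannot tell them apart.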
The values printed by gen_test_comments are:
-------------------------
command 1672
statement 80993
question 131219
-------------------------
However, the actual values are 1111, 80167, and 131001 for commands, statements, and questions respectively. The values stored in variables like command_count include duplicate sentences. While the tagged_comments dict takes care of duplicate sentences, the counts still include them.
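A minimal sketch of the difference, assuming tagged_comments maps each sentence to its category (the function name and sample data are hypothetical): counting while iterating over the raw sentences includes duplicates, whereas counting the values of the deduplicated dict does not.

```python
from collections import Counter

def count_categories(pairs):
    """Return (raw_counts, unique_counts) for (sentence, category) pairs."""
    # Raw counts: every occurrence is counted, duplicates included.
    raw_counts = Counter(category for _, category in pairs)

    # Assumption: the dict is keyed by sentence, so duplicates collapse.
    tagged_comments = {}
    for sentence, category in pairs:
        tagged_comments[sentence] = category
    unique_counts = Counter(tagged_comments.values())
    return raw_counts, unique_counts

pairs = [
    ("Close the door.", "command"),
    ("Close the door.", "command"),  # duplicate sentence
    ("Is it raining?", "question"),
]
raw_counts, unique_counts = count_categories(pairs)
```

In this example the raw count for "command" is 2 while the unique count is 1. Printing the counts derived from the dict would report the deduplicated totals.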