machine_learning's Introduction

Machine Learning

Python 3.5 & TensorFlow

A place to store my toy models for ML. Most of the code is built on the TensorFlow framework. Some additional packages (such as pygame) are used for visualization.

Notice : Some of this code is not well organized and is still under development.

Currently, this repository includes:

a. MNIST handwritten digit recognition project ver2. (9/13/2018)

keywords : Convolutional Neural Network (CNN), Fully Connected Layer, writing panel

Description :

The basic neural network framework is based on the work of Aymeric Damien

Train your own neural network, and save and restore the tf.data for later use or retraining. Both a fully connected layer and a convolution layer are implemented in this project. A writing panel is built with pygame so that users can interact with their own neural network directly.

results & future work:

Both neural networks perform very well on the MNIST data with a good cost-performance ratio (accuracy > 95% is easily reached within a few minutes of training). However, both recognize the digits drawn on the writing panel poorly. My hypotheses for improving this are:

  • A larger neural network is needed, which requires more and more computational cost.
  • The digits drawn on the writing panel may be quite different from the real handwritten digits in the MNIST database. My handwritten digits are nothing but a scrawl, as my elementary school teacher always told me.
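For the second point, one inexpensive thing to check is the layout of the input. The MNIST digits were size-normalized to fit a 20x20 pixel box and then centered by center of mass in a 28x28 image, so a digit drawn anywhere on a large panel can look very different to the network. The sketch below is not the repository's code: the function name `to_mnist_layout` is hypothetical, and it assumes the panel is available as a 2D list with background pixels equal to 0 and ink pixels greater than 0.

```python
def to_mnist_layout(image, box=20, size=28):
    """Rescale a 2D grid of pixel intensities into an MNIST-like layout.

    Assumes `image` is a non-empty list of equal-length rows where
    background pixels are 0 and ink pixels are greater than 0.
    """
    canvas = [[0.0] * size for _ in range(size)]
    rows = [r for r, row in enumerate(image) if any(v > 0 for v in row)]
    cols = [c for c in range(len(image[0])) if any(row[c] > 0 for row in image)]
    if not rows:
        return canvas  # blank panel

    # Crop to the bounding box of the ink.
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    crop = [row[left:right + 1] for row in image[top:bottom + 1]]
    h, w = len(crop), len(crop[0])

    # Scale the longer side to `box` pixels, keeping the aspect ratio
    # (nearest-neighbour resampling).
    scale = box / max(h, w)
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    small = [[float(crop[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))])
              for c in range(new_w)] for r in range(new_h)]

    total = sum(sum(row) for row in small)
    if total == 0:
        return canvas  # resampling skipped every ink pixel (very thin strokes)

    # Shift the digit so that its centre of mass lands on the canvas centre.
    cy = sum(r * v for r, row in enumerate(small) for v in row) / total
    cx = sum(c * v for row in small for c, v in enumerate(row)) / total
    off_r = round((size - 1) / 2 - cy)
    off_c = round((size - 1) / 2 - cx)
    for r, row in enumerate(small):
        for c, v in enumerate(row):
            rr, cc = r + off_r, c + off_c
            if 0 <= rr < size and 0 <= cc < size:
                canvas[rr][cc] = v
    return canvas
```

Nearest-neighbour resampling is used only to keep the sketch dependency-free; a smoother resize would be closer to the anti-aliased MNIST images.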

b. seq2seq model (Madoka Magica) (9/23/2018)

keywords : Recurrent Neural Network (RNN), bidirectional RNN (bi-RNN), long short-term memory (LSTM)

Description :

This work is inspired by Ematvey's awesome tutorials for seq2seq!

Full script of Madoka Magica (魔法少女まどか☆マギカ台本, the Puella Magi Madoka Magica script).

The basic framework is a biRNN (LSTM). In this model, the word vectors are also trainable (this can be viewed as another fully connected layer). All of the conversations in Madoka Magica are fed into the seq2seq model. Details are in preparation ... Training time : ~ 12 hours
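The remark that trainable word vectors can be viewed as another fully connected layer can be checked on a toy example: multiplying a one-hot word vector by the embedding matrix (a dense layer with no bias and no activation) returns the same row that a direct lookup does. The matrix values and helper names below are made up for illustration and are not taken from the repository.

```python
def one_hot(index, vocab_size):
    """Return a one-hot list with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(vocab_size)]

def dense_no_bias(x, weights):
    """Fully connected layer without bias or activation: y = x . W"""
    return [sum(x_i * row[j] for x_i, row in zip(x, weights))
            for j in range(len(weights[0]))]

# Toy embedding matrix: 4 words, 3 dimensions (values are made up).
embedding = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]

word_id = 2
via_dense = dense_no_bias(one_hot(word_id, len(embedding)), embedding)
via_lookup = embedding[word_id]
print(via_dense == via_lookup)
```

In practice a lookup is preferred over the explicit matrix product, since the one-hot vector is as long as the vocabulary.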

Caution : Python dict and set are hash-based, which speeds up lookups. However, string hashes are randomized for each interpreter process, so in Python 3.5 the order of the items in a set or dict can be different every time the console is restarted. If word indices are assigned by iterating over such a container, the word-to-index mapping changes between runs. In the encoder-decoder process, one must make sure that the same dictionary is used every time (it is better to save the dictionary as .json or in some other format).
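A minimal sketch of that advice, using only the standard library: indices are assigned in sorted order so they do not depend on hash order, and the mapping is written to JSON so that training and inference load the same file. The function names, the whitespace tokenization, and the placement of `#GO` and `#EOS` at indices 0 and 1 are assumptions made for illustration.

```python
import json
import os
import tempfile

def build_vocab(sentences, specials=("#GO", "#EOS")):
    """Assign indices in a reproducible order: specials first, then sorted words."""
    words = set()
    for sentence in sentences:
        words.update(sentence.split())  # assumes text is already tokenized by spaces
    ordered = list(specials) + sorted(words - set(specials))
    return {word: index for index, word in enumerate(ordered)}

def save_vocab(vocab, path):
    """Write the word-to-index mapping as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(vocab, f, ensure_ascii=False)

def load_vocab(path):
    """Read the word-to-index mapping back from JSON."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Round trip through a temporary file.
vocab = build_vocab(["hello madoka", "hello homura"])
path = os.path.join(tempfile.gettempdir(), "vocab_demo.json")
save_vocab(vocab, path)
restored = load_vocab(path)
print(restored == vocab)
```

Sorting makes the mapping reproducible even without the saved file, but loading the saved JSON remains the safer option, because any change to the corpus would shift the sorted indices.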

results & future work:

The ability to recognize words with similar meanings is very weak, which is quite reasonable since I only fed in a single .txt file of about 1 MB for training.

Use Word2Vec and a larger dataset.

I am ready to delve into analyzing larger (>200 MB) .txt data ...

For longer conversations : an attention model is needed.

Personalized word vector project : After

c. seq2seq model (64 MB novel_conversation) (9/26/2018 - )

keywords : bi-RNN, LSTM, gensim, Word2Vec, Attention mechanism, Scheduled sampling

Description :

Vocab_separator : statistical method (MeCab is more efficient and elegant, but it is not friendly to Windows users...)

Training Word2Vec : 256 MB .txt data

Conversation data : 64 MB .txt data

Vocabulary size : 40849 + 2 (#EOS, #GO)

Word2Vec vector length : 200 (so the embedding matrix is 40851 x 200)

The embedding matrix is set as a non-trainable tensor.
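As a rough illustration of how such a fixed matrix could be assembled, the sketch below stacks pretrained vectors into rows and adds one row for each special token. It is not the repository's code: the function name, the position of `#EOS` and `#GO` at the front, and the small random values used for their rows are all assumptions, and a plain dict stands in for the trained Word2Vec model.

```python
import random

def build_embedding_matrix(word_vectors, dim, specials=("#EOS", "#GO"), seed=0):
    """Return (vocab, matrix) with one row per special token plus one per word."""
    rng = random.Random(seed)  # fixed seed so the special rows are reproducible
    words = sorted(w for w in word_vectors if w not in specials)
    ordered = list(specials) + words
    vocab = {word: index for index, word in enumerate(ordered)}
    matrix = []
    for word in ordered:
        if word in specials:
            # Special tokens have no pretrained vector here; small random
            # values are used so the two rows differ (an assumption).
            vector = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        else:
            vector = list(word_vectors[word])
            if len(vector) != dim:
                raise ValueError("unexpected vector length for %r" % word)
        matrix.append(vector)
    return vocab, matrix

# Toy stand-in for the pretrained vectors (3 words, 4 dimensions, made-up values).
toy_vectors = {"a": [0.1] * 4, "b": [0.2] * 4, "c": [0.3] * 4}
vocab, matrix = build_embedding_matrix(toy_vectors, dim=4)

# Size check for the figures quoted above.
rows, dim = 40849 + 2, 200
n_values = rows * dim
print(rows, n_values)
```

With the figures quoted above, the matrix holds 40851 x 200 = 8,170,200 values, which would be about 32.7 MB if stored as 32-bit floats (the dtype is not stated in the source).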

Structure : Single biRNN layer.

number of hidden units in biRNN : 400

number of hidden units in attention layer : 400
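The source does not say which attention variant is used. Assuming an additive (Bahdanau-style) score, the "hidden units in attention layer" would be the size of the intermediate tanh layer, as in the dependency-free sketch below. All names and the tiny dimensions are illustrative only; the weights would be learned in a real model.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def additive_attention(encoder_states, decoder_state, w_enc, w_dec, v):
    """score_i = v . tanh(W_enc h_i + W_dec s); returns (weights, context)."""
    projected_dec = matvec(w_dec, decoder_state)
    scores = []
    for h in encoder_states:
        hidden = [math.tanh(a + b) for a, b in zip(matvec(w_enc, h), projected_dec)]
        scores.append(sum(v_k * h_k for v_k, h_k in zip(v, hidden)))
    # Softmax over source positions (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the encoder states.
    context = [sum(w * h[d] for w, h in zip(weights, encoder_states))
               for d in range(len(encoder_states[0]))]
    return weights, context

# Toy sizes: 2-dimensional states and 3 attention units (400 in the setup above).
w_enc = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
w_dec = [[0.2, 0.1], [-0.1, 0.3], [0.4, -0.4]]
v = [0.5, -0.3, 0.8]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, context = additive_attention(encoder_states, [0.5, -0.5], w_enc, w_dec, v)
print(weights, context)
```

The weights sum to 1 over the source positions, so the context vector is a weighted average of the encoder states that the decoder can consult at each step.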

Training time : ~

machine_learning's People

Contributors

hchungdelta
