nlesc / mcfly
A deep learning tool for time series classification and regression
License: Apache License 2.0
Brainstorm about architecture (the engine of the tool), e.g.: did we implement all essential functionalities?
In the CNN model, each convolution should be followed by a batch normalization layer before the activation layer. Also look up how batch normalization should be applied in, for instance, the dense layers.
see #34
Implement DTW and integrate it into the optimal model finding.
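As a starting point for the DTW issue above, here is a minimal sketch of the classic dynamic time warping distance between two univariate series. It assumes plain Python lists and an absolute-difference local cost. The function name `dtw_distance` is hypothetical and not part of the existing code base, and no warping-window constraint is applied.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    Hypothetical helper: uses |x - y| as the local cost and the
    standard O(len(a) * len(b)) dynamic programming recursion.
    """
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A distance like this could be plugged into a nearest-neighbour baseline and compared with the best deep learning model, although the quadratic cost may be too slow for long series without a warping window.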
Build first classifier in Keras, locally (no DAS5): programming + training + evaluating, using as a reference the published UCR results, or a shallow classifier if possible
Implement per channel CNN in normal CNN architecture, without LSTM. Add this to architecture generation functions.
I am listing some research ideas we could look into after there is a first prototype. These may need to be split into separate issues but for now it may be easier to group them:
Sometimes learning stops with a completely stable accuracy and loss on the validation and test sets. Find out why the process is stuck.
Wrap up code as a Python module and github documentation as a starting point for generic functionality
We want to be able to use and import our code for model generation, so it needs to be in a module rather than in the notebook where it is now.
Gull dataset
What experiments are we going to do to show the value of our tool?
For now, this is random search.
Investigate whether the code works and how long the runs take for one model.
Right now there is one uniform parameter search for e.g. learning rate.
It would be nice to have an automatic search strategy that goes from coarse to fine-grained.
possible parameter ranges to be set etc.
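One possible shape for such a coarse-to-fine strategy is sketched below: random search over the log10 of the learning rate, where each round narrows the range around the best value found so far. Everything here is an assumption for illustration: the function name `coarse_to_fine_search`, the default range of 1e-5 to 1e-1, and the `evaluate` callback, which stands in for training a model and returning a validation score (higher is better).

```python
import math
import random


def coarse_to_fine_search(evaluate, low_exp=-5.0, high_exp=-1.0,
                          rounds=3, samples_per_round=8, shrink=0.5, seed=0):
    """Random search over log10(learning rate) with a shrinking range.

    `evaluate` is a hypothetical callback: learning rate -> score.
    Returns (best_learning_rate, best_score).
    """
    rng = random.Random(seed)
    best_exp, best_score = None, -math.inf
    for _ in range(rounds):
        for _ in range(samples_per_round):
            # Sample uniformly in log space so each decade is equally likely.
            exp = rng.uniform(low_exp, high_exp)
            score = evaluate(10.0 ** exp)
            if score > best_score:
                best_exp, best_score = exp, score
        # Narrow the range around the current best, staying inside the old bounds.
        half_width = (high_exp - low_exp) * shrink / 2.0
        low_exp = max(low_exp, best_exp - half_width)
        high_exp = min(high_exp, best_exp + half_width)
    return 10.0 ** best_exp, best_score
```

With the defaults this evaluates 24 candidates in total. The same idea would extend to other hyperparameters by keeping one range per parameter.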
Prepare small dataset UCR time series classification archive:
http://www.cs.ucr.edu/~eamonn/time_series_data/
The password for the data (link to 350 Mb zip) is: attempttoclassify
http://www.cs.ucr.edu/~eamonn/time_series_data/UCR_TS_Archive_2015.zip
Extend the functionality from issue #19 to incorporate architectures that are like those presented in the paper of Ordonez et al, 2016.
Build simple classifier for London data
Prepare dataset EEG Utrecht
Use a (cookiecutter) template to set up the repository structure according to the eStep guidelines.
Input:
Output:
Add regularization as parameter to optimize in modelgen.py.
Create an architecture specific to one dataset (UCR) and store the model objects
Input:
Output:
We should not assume that incoming data is normalized. The user should not be responsible for that as it requires a bit of expert knowledge to know normalization is important and how it should be done. An easy way to handle and automate this is to start each model with a batch normalization layer.
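For comparison, the explicit preprocessing step that a leading batch normalization layer would make unnecessary can be sketched as follows. This is not the batch normalization layer itself but a plain per-channel standardization. It assumes the data is a nested list indexed as `[sample][timestep][channel]`. The name `standardize_channels` is hypothetical.

```python
import math


def standardize_channels(X, eps=1e-8):
    """Return a copy of X with every channel scaled to zero mean, unit variance.

    X is assumed to be a nested list indexed as [sample][timestep][channel].
    Statistics are computed per channel over all samples and timesteps.
    """
    n_channels = len(X[0][0])
    means, stds = [], []
    for c in range(n_channels):
        values = [step[c] for sample in X for step in sample]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        # eps guards against division by zero for constant channels.
        stds.append(math.sqrt(var) + eps)
    return [
        [[(step[c] - means[c]) / stds[c] for c in range(n_channels)]
         for step in sample]
        for sample in X
    ]
```

In practice the means and standard deviations would have to be computed on the training set only and reused for validation and test data, which is part of the expert knowledge the issue wants to spare the user.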
Done after:
setup gh-pages without content
Implement kNN
and add a function to the optimal model search that applies kNN, so it can be compared with the best deep learning model.
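A minimal sketch of such a kNN baseline is given below. It assumes equal-length univariate series stored as flat lists and Euclidean distance. The names `knn_predict` and `knn_accuracy` are hypothetical, and a real implementation would more likely rely on an existing library.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=1):
    """Predict the label of `query` by majority vote among its k nearest series."""
    # Rank training series by Euclidean distance to the query.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_X, train_y, test_X, test_y, k=1):
    """Fraction of test series that kNN classifies correctly."""
    correct = sum(knn_predict(train_X, train_y, x, k) == y
                  for x, y in zip(test_X, test_y))
    return correct / len(test_y)
```

The accuracy returned by `knn_accuracy` gives a single number to set against the validation accuracy of the best deep learning model.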
I think the current number is way too low.
setup:
travis-ci
some code coverage tool
some code analysis tool
insert badges for all 3 into readme
SVG or similar that shows all learning curves together based on json output of training process. Models can be filtered with checkboxes for example.
Let's try to at least get the same result as kNN/RF.
Prepare dataset London: data format + take subset of two easy to recognize classes + standardize window length, e.g. 10 minutes
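Standardizing the window length could look roughly like this: a long recording is cut into non-overlapping windows of a fixed number of samples, and an incomplete trailing window is dropped. The function name `split_into_windows` is hypothetical, and the number of samples per 10 minutes depends on the sampling rate of the London data, which is not stated here.

```python
def split_into_windows(series, window_length):
    """Cut `series` into consecutive, non-overlapping windows.

    `window_length` is the number of samples per window. Any samples
    left over at the end that do not fill a whole window are dropped.
    """
    if window_length <= 0:
        raise ValueError("window_length must be positive")
    n_windows = len(series) // window_length
    return [series[i * window_length:(i + 1) * window_length]
            for i in range(n_windows)]
```

For a hypothetical sampling rate of `hz` samples per second, a 10-minute window would correspond to `window_length = 10 * 60 * hz`.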
Rebuild the model architectures in modelgen.py for hyperas, and run them on a small dataset.
For now this will be some kind of NumPy array. Document the exact format somewhere.
includes:
data
labels
labels to class name mapping
comment in this item the location where this is documented
also file format
Don't return as json or similar the histories of all training processes. Write them to disk during or between training processes.
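One way to write histories to disk between training processes is to append one JSON record per trained model to a file, as sketched below. The function names, the record layout, and the JSON Lines file format are assumptions for illustration. `history` is assumed to be a dictionary of metric name to list of per-epoch values.

```python
import json


def append_history(path, model_name, history):
    """Append the training history of one model as a single JSON line."""
    record = {"model": model_name, "history": history}
    # Append mode keeps earlier records intact if a later run crashes.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_histories(path):
    """Read back all records written by append_history."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling `append_history` after each model finishes training means nothing has to be held in memory or returned at the end, and a visualization could read the file while the search is still running.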
Y. Zheng proposed in 2014 that splitting multivariate time series into univariate signals and processing them separately, as distinct branches of the DL architecture, is better than processing them with a single multivariate CNN. I am not sure whether this makes sense, but it should be something we can easily test in Keras.
The article is on the onedrive, link: https://nlesc-my.sharepoint.com/personal/v_vanhees_esciencecenter_nl/_layouts/15/guestaccess.aspx?guestaccesstoken=cKHpfUmasCukMxT9YMnoLKvwtQiFlFYdJclcl%2buhcYM%3d&docid=17139ecaca7d5428ea3d184e04a4e59f5
Find out how to transfer the final hidden state for some sample on to the first hidden state for the next sample. This was done in Plotz (see literature on onedrive).
Reflect on progress and review the feature requirement list
Investigate and decide whether to implement DTW
Document what choices we made.
Model types, architectures, default parameters etc.
Investigate what ideal shape of data input should be for Keras in the context of time series
For example, classify a fixed length numeric time series into classes. No visualisation at this stage, but clear report on loss, accuracy and hyper-parameter optimization.
End product: Document with written description of the deliverable, where possible clarified with diagrams/tables and add to gitbook
Read up on Keras and continue literature review on time series