njfritter / characteristic-based-time-series-clustering
Time Series Clustering Based on Characteristic Based Feature Extraction inspired by the paper mentioned in the README
Once the initial pass of feature extraction, clustering, and analysis is done (using the code reproduced from the 2012 R code), start using tsfresh and compare the quality of the features it extracts.
Couple of final items for the exploratory analysis:
Will update as needed.
I decided to divide up the directories by category of code (i.e. code I attempted to write from the 2006 paper, the actual code from 2012, tsfresh, etc.), but this is confusing. It would also end up with a lot of duplicate scripts and notebooks, since there is also exploratory data analysis for each data set.
I will either:
Will update this issue once started.
After exploratory analysis is finished, begin extracting features and initial stages of clustering data.
Already begun here: #4
After exploratory analysis of the data, extract features and cluster the data. Some interesting use cases:
Found a nifty Python package to get the data here: https://github.com/swar/nba_api
Can easily get the data and perform analysis. Will also likely store this data in a database somewhere so I can easily access it and integrate it into an application viewable by URL.
This project started out with me trying to reproduce the calculations from the Characteristic Based Time Series Clustering paper in code. Currently it is kind of a mess, so it needs time invested in cleaning it up so that it works with any time series data.
The last part of the whole time series clustering exercise is to validate the quality of the clusterings. To do this properly, one must cluster two separate subsets of the data (drawn from the same distribution) and compare the results.
However, since clustering is unsupervised, the labels it produces have no meaning in themselves (i.e. clustering 1 could produce labels A/B/C and clustering 2 could produce labels D/E/F), so we need a way to match the labels up before comparing them.
The assignment problem (specifically minimum-weight bipartite matching), which the Hungarian algorithm solves, is how these cluster labels can be compared. This issue is to figure out how to properly code it.
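As a starting point, here is a minimal sketch of the label matching step. It is not the Hungarian algorithm itself: it solves the same minimum-weight bipartite matching problem by brute force over all permutations, which is only practical for a handful of clusters. It also assumes that clusters from the two runs are compared by the Euclidean distance between their centroids in feature space, which is one possible cost and not something decided in this issue. The function names and the example centroids are hypothetical.

```python
from itertools import permutations
from math import dist


def cost_matrix(centroids_a, centroids_b):
    """Pairwise Euclidean distances between two sets of cluster centroids.

    Assumption: centroid distance is the matching cost. Any other
    dissimilarity between clusters could be substituted here.
    """
    return [[dist(a, b) for b in centroids_b] for a in centroids_a]


def match_labels(cost):
    """Return (mapping, total_cost) for the minimum-cost one-to-one matching.

    mapping[i] = j means cluster i of the first run is paired with
    cluster j of the second run. Brute force over permutations is O(k!),
    so this is only suitable for a small number of clusters.
    """
    k = len(cost)
    if any(len(row) != k for row in cost):
        raise ValueError("cost matrix must be square")
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(k)):
        total = sum(cost[i][perm[i]] for i in range(k))
        if total < best_total:
            best_perm, best_total = perm, total
    return dict(enumerate(best_perm)), best_total


# Hypothetical centroids from clustering two subsets of the same data.
centroids_a = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
centroids_b = [(9.5, 0.5), (0.2, -0.1), (5.1, 4.8)]

mapping, total = match_labels(cost_matrix(centroids_a, centroids_b))
print(mapping)  # {0: 1, 1: 2, 2: 0}
```

For larger numbers of clusters, `scipy.optimize.linear_sum_assignment` solves the same problem in polynomial time on a cost matrix and could replace `match_labels` directly.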