Comments (6)
I completed the code for converting the African data (Utrecht group) to np arrays; these arrays are now on the shared drive. The code is in the notebook preproces_Guinea-Biseau.ipynb.
Data explanation:
- I created a train, a validation and a test dataset for every experimental condition (eyes closed or eyes open) and for both 4-second and 10-second time series (3 x 2 x 2 = 12 datasets). Per dataset there is an X and a y file.
- The test and validation datasets each contain 20 individuals, with proportionally the same number of individuals with epilepsy as in the total dataset, and never more than one time series (epoch) per individual.
- The training dataset contains all remaining individuals and all of their available epochs, so for some individuals the training dataset holds multiple time series.
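The subject-level split described above can be sketched as follows (a minimal illustration with hypothetical subject IDs, a hypothetical `split_subjects` helper and made-up cohort sizes, not the actual notebook code):

```python
import random

def split_subjects(subjects_by_class, n_holdout_per_class, seed=0):
    """Hold out whole individuals per class so that the epilepsy
    proportion is preserved; everyone else goes to training."""
    rng = random.Random(seed)
    holdout, train = [], []
    for cls, subjects in subjects_by_class.items():
        pool = sorted(subjects)
        rng.shuffle(pool)
        n = n_holdout_per_class[cls]
        holdout += pool[:n]
        train += pool[n:]
    return holdout, train

# Hypothetical cohort: 9 controls and 11 epilepsy patients go into the
# held-out set, mirroring the class proportions of the full dataset.
subjects_by_class = {"control": [f"c{i}" for i in range(40)],
                     "epilepsy": [f"e{i}" for i in range(50)]}
test, rest = split_subjects(subjects_by_class,
                            {"control": 9, "epilepsy": 11})
```

Splitting on individuals rather than on epochs is what guarantees that no epoch of a test or validation subject leaks into training; the held-out sets then keep only one epoch per subject, while training keeps them all.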
The log.csv file is for my own reference: in it I keep track of which pre-processed csv files I used for every experimental condition, so that I can make sure to use the same data for the shallow learning in R.
from mcfly.
The way I selected the data means that the proportion of individuals with epilepsy is the same in the training, test and validation sets, but the proportion of epochs differs slightly because for some individuals we include multiple epochs in the training dataset.
The proportion of control (no epilepsy) epochs out of the total number of epochs ranges between 37% and 44% in the training sets; in the test and validation sets it is always 45%. However, the advantage of including all epochs is that we have between 34 and 81 epochs per group (control or epilepsy) per experimental condition in the training dataset (compare this against the 9 controls and 11 epilepsy patients in the test and validation sets).
Possibly relevant for mcfly:
The preliminary performance of my shallow learning approach on the test set:
Cohen-Kappa coefficient: 0.27
Area under curve: 0.808
(The Cohen-Kappa coefficient in the model training phase was 0.47.)
However, I am not using the validation set at the moment. The code as I have it defines its own validation set as a subsample of the training dataset. This is obviously something I will have to address. Nonetheless, I hope that these performance estimates will only improve after further enhancements of the code.
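As a reminder of what these two numbers measure: the AUC equals the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one. A toy pure-Python sketch with made-up values (a hypothetical `auc` helper, unrelated to the real pipeline):

```python
def auc(y_true, y_score):
    """Area under the ROC curve via the Mann-Whitney rank statistic:
    P(random positive outscores random negative), ties counting half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and scores: 1 = epilepsy, 0 = control
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
print(auc(y_true, y_score))  # 8 of the 9 positive/negative pairs are ranked correctly
```

Unlike accuracy or kappa, this is threshold-free, which is why it is worth reporting alongside the kappa coefficient.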
Just discovered that there is a bug in how I generated the data. I am now fixing this and will put new data on the shared drive soon.
Just updated my analyses following the bug fix earlier today.
For protocol = eyes open:
New shallow learning results: Kappa = 0.596 and AUC = 0.778 in the test set.
For protocol = eyes closed:
New shallow learning results: Kappa = 0.490 and AUC = 0.833 in the test set.
Seems like a nice benchmark for Keras to compete with.
More elaborate overview of the shallow results in the Guinea-Bissau dataset, now with set.seed held constant (forgot to do that in the previous run). All results are based on random forest classification; AUC = area under the ROC curve.
Protocol: eyes closed
Timewindow: 4 seconds
Wavelet: la10
AUC in test set: 0.78
Kappa coefficient in test set: 0.39
Accuracy in test set: 0.70
Confusion matrix (rows = prediction, columns = truth):
             control  epilepsy
control         6         3
epilepsy        3         8
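For reference, the reported kappa can be recomputed straight from this confusion matrix (a minimal pure-Python sketch with a hypothetical `kappa_from_confusion` helper, not the R code that produced the results):

```python
def kappa_from_confusion(cm):
    """Cohen's kappa from a square confusion matrix
    (rows = prediction, columns = truth)."""
    n = sum(sum(row) for row in cm)
    p_obs = sum(cm[i][i] for i in range(len(cm))) / n             # observed agreement
    row_tot = [sum(row) for row in cm]
    col_tot = [sum(row[j] for row in cm) for j in range(len(cm))]
    p_exp = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2  # chance agreement
    return (p_obs - p_exp) / (1 - p_exp)

print(round(kappa_from_confusion([[6, 3], [3, 8]]), 2))  # → 0.39, as reported
```

The same function reproduces the kappa values of the other three confusion matrices below, so the reported numbers are internally consistent.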
Protocol: eyes open
Timewindow: 4 seconds
Wavelet: la16
AUC in test set: 0.85
Kappa coefficient in test set: 0.38
Accuracy in test set: 0.70
Confusion matrix (rows = prediction, columns = truth):
             control  epilepsy
control         5         4
epilepsy        2         9
Protocol: eyes closed
Timewindow: 10 seconds
Wavelet: d2
AUC in test set: 0.90
Kappa coefficient in test set: 0.69
Accuracy in test set: 0.85
Confusion matrix (rows = prediction, columns = truth):
             control  epilepsy
control         7         2
epilepsy        1        10
Protocol: eyes open
Timewindow: 10 seconds
Wavelet: d10
AUC in test set: 0.83
Kappa coefficient in test set: 0.29
Accuracy in test set: 0.65
Confusion matrix (rows = prediction, columns = truth):
             control  epilepsy
control         5         4
epilepsy        3         8
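The set.seed remark above matters because random forests are themselves stochastic (bootstrap samples, random feature subsets). In scikit-learn terms the equivalent is fixing `random_state`; a hedged sketch on synthetic stand-in data (not the actual R analysis):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))         # synthetic stand-in for wavelet features
y = rng.integers(0, 2, size=60)      # synthetic labels: 0 = control, 1 = epilepsy

# Fixing random_state pins down the bootstrap samples and feature splits,
# analogous to calling set.seed() before randomForest() in R.
clf_a = RandomForestClassifier(n_estimators=50, random_state=42).fit(X, y)
clf_b = RandomForestClassifier(n_estimators=50, random_state=42).fit(X, y)
assert (clf_a.predict(X) == clf_b.predict(X)).all()
```

Without a fixed seed, two runs on identical data can give slightly different kappa/AUC values, which is why the earlier and later results are not directly comparable.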
Related Issues (20)
- add cron builds
- fix test_find_best_architecture_with_class_weights
- Update build workflow
- Prepare some regression dataset
- Write a tutorial on regression functionality
- Create a plan for regression implementation
- Not all training data in memory at the same time
- advertise that we can do regression in all our docs
- create new release (christmas edition)
- explain in the docs how we automatically detect regression/classification data
- implement regression functionality in code
- Adapt defaults for evaluation metrics
- find_best_architecture fails with tf.keras.metrics objects
- create sphinx build action for documentation build testing
- update supported python versions in tutorial docs and CI workflow
- throw sensible error when using a datagenerator and subset!=None
- Add Transformer architecture
- Early stopping a bit aggressive for low number of epochs
- We should shuffle the data by default
- Consider this architecture TCNForecaster (2018)