Comments (6)

vincentvanhees commented on May 28, 2024

I completed the code for converting the African data (Utrecht group) to np arrays; these arrays are now on the shared drive. The code is in the notebook preproces_Guinea-Biseau.ipynb.

Data explanation:

  • I created a train, validation and test dataset for every experimental condition (eyes closed or eyes open) and for both 4-second and 10-second time series (3 x 2 x 2 = 12 datasets). Each dataset has an X and a y file.
  • The test and validation datasets each have 20 individuals, with proportionally the same number of individuals with epilepsy as in the total dataset, and never more than one time series (epoch) per individual.
  • The training dataset consists of all remaining individuals and all their available epochs, so it contains multiple time series for some individuals.
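The split described above could be sketched roughly as follows. This is not the code from the notebook: `split_by_individual`, the `(individual_id, label, epoch)` record layout and the `seed` argument are all assumptions made for illustration, and the sketch assumes each individual carries a single label.

```python
import random
from collections import defaultdict


def split_by_individual(epochs, n_holdout=20, seed=0):
    """Split (individual_id, label, epoch) records into train/validation/test.

    Hypothetical helper: assumes every individual has exactly one label.
    """
    rng = random.Random(seed)

    # Collect all epochs per individual and remember each individual's label.
    epochs_of = defaultdict(list)
    label_of = {}
    for person, label, epoch in epochs:
        epochs_of[person].append(epoch)
        label_of[person] = label

    # Shuffle the individuals within each label group.
    people_by_label = defaultdict(list)
    for person in sorted(epochs_of):
        people_by_label[label_of[person]].append(person)
    for people in people_by_label.values():
        rng.shuffle(people)

    # Held-out individuals per label, proportional to the full dataset.
    n_total = len(epochs_of)
    labels = sorted(people_by_label)
    quota = {}
    for label in labels[:-1]:
        quota[label] = round(n_holdout * len(people_by_label[label]) / n_total)
    quota[labels[-1]] = n_holdout - sum(quota.values())

    splits = {"train": [], "validation": [], "test": []}
    for name in ("test", "validation"):
        for label in labels:
            for _ in range(quota[label]):
                person = people_by_label[label].pop()
                # Held-out sets keep one randomly chosen epoch per individual.
                chosen = rng.choice(epochs_of[person])
                splits[name].append((person, label, chosen))

    # Training set: every remaining individual with all of their epochs.
    for label in labels:
        for person in people_by_label[label]:
            for epoch in epochs_of[person]:
                splits["train"].append((person, label, epoch))
    return splits
```

Splitting on individuals rather than on epochs keeps every person in exactly one of the three sets, so no individual contributes to both training and evaluation.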

The log.csv file is for my own reference. In this file I am keeping track of which pre-processed csv-files I used for every experimental condition. In this way I can make sure that I will use the same data for the shallow learning in R.

from mcfly.

vincentvanhees commented on May 28, 2024

The way I selected the data means that the proportion of individuals with epilepsy will be the same in the training, test and validation set, but the proportion of epochs is slightly different because for some individuals we will include multiple epochs in the training dataset.

The proportion of epochs for controls (no epilepsy) out of the total number of epochs ranges between 37% and 44% in the training sets. In the test and validation datasets this is always 45%. However, the advantage of including all the epochs is that we have between 34 and 81 epochs per group (control or epilepsy) per experimental condition in the training dataset (compare this against the 9 controls and 11 epilepsy patients in the test and validation sets).


vincentvanhees commented on May 28, 2024

Possibly relevant for mcfly:
The preliminary performance of my shallow learning approach on the test set:
Cohen-Kappa coefficient: 0.27
Area under curve: 0.808
(Cohen-Kappa coefficient in model training phase was 0.47)

However, I am not using the validation set at the moment. The code as I have it defines its own validation set as a subsample of the training dataset. This is obviously something I will have to address. Nonetheless, I hope that these performance estimates will only improve after further enhancements of the code.


vincentvanhees commented on May 28, 2024

Just discovered that there is a bug in how I generated the data. I am now fixing this and will put new data on the shared drive soon.


vincentvanhees commented on May 28, 2024

Just updated my analyses following the bug fix earlier today.
For protocol = eyes open:
New shallow learning results in the test set: Kappa = 0.596 and AUC = 0.778
For protocol = eyes closed:
New shallow learning results in the test set: Kappa = 0.490 and AUC = 0.833

Seems like a nice benchmark for Keras to compete with.


vincentvanhees commented on May 28, 2024

More elaborate overview of the shallow learning results in the Guinea-Bissau dataset, now with set.seed constant (I forgot to do that in the previous run). All results are based on random forest classification; AUC = area under the ROC curve:

Protocol: eyes closed
Timewindow: 4 seconds
Wavelet: la10
AUC in test set: 0.78
Kappa coefficient in test set: 0.39
Accuracy in test set: 0.70
Confusion matrix (prediction in row, truth in columns):
control 6 3
epilepsy 3 8

Protocol: eyes open
Timewindow: 4 seconds
Wavelet: la16
AUC in test set: 0.85
Kappa coefficient in test set: 0.38
Accuracy in test set: 0.70
Confusion matrix (prediction in row, truth in columns):
Control 5 4
Epilepsy 2 9

Protocol: eyes closed
Timewindow: 10 seconds
Wavelet: d2
AUC in test set: 0.90
Kappa coefficient in test set: 0.69
Accuracy in test set: 0.85
Confusion matrix (prediction in row, truth in columns):
control 7 2
epilepsy 1 10

Protocol: eyes open
Timewindow: 10 seconds
Wavelet: d10
AUC in test set: 0.83
Kappa coefficient in test set: 0.29
Accuracy in test set: 0.65
Confusion matrix (prediction in row, truth in columns):
Control 5 4
Epilepsy 3 8
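For anyone who wants to check the numbers, accuracy and Cohen's kappa can be recomputed directly from the confusion matrices above. The function below is a minimal sketch (the name `accuracy_and_kappa` is made up here, and this is not the R code used for the analysis). Both metrics are unchanged if the matrix is transposed, so the row/column orientation does not affect them.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the count for row class i and column class j.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of cases on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: product of the row and column marginals per class.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Eyes closed, 10 seconds (matrix copied from the results above).
acc, kappa = accuracy_and_kappa([[7, 2], [1, 10]])
print(round(acc, 2), round(kappa, 2))  # 0.85 0.69
```

Applied to the four matrices, this gives the same accuracy and kappa values as listed, after rounding to two decimals.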

