nlesc / mcfly-tutorial

tutorial for mcfly repository

License: Apache License 2.0

Jupyter Notebook 97.60% Python 2.40%

mcfly-tutorial's Introduction

This repository contains notebooks that show how to use the mcfly software. mcfly is a deep learning tool for time series classification.

Tutorials

Currently we offer two tutorials here. Our main tutorial can be found in the notebook notebooks/tutorial/tutorial.ipynb. This tutorial will let you train deep learning models with mcfly on the PAMAP2 dataset for activity recognition.

A comparable, slightly quicker tutorial can be found in the notebook notebooks/tutorial/tutorial_quick.ipynb. This tutorial will let you train deep learning models with mcfly on the RacketSports dataset for activity recognition.

Prerequisites:

Python 3.7 and above
The following Python packages must be installed (they are also listed in requirements.txt):
- mcfly
- jupyter
- pandas
- matplotlib
- scipy
- numpy

Installation

python3 -m venv env
. env/bin/activate
pip install --upgrade pip setuptools
pip install -r requirements.txt
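
After the installation steps, one way to confirm that the environment is complete is to check that each required package can be located by the interpreter. This is a minimal sketch and not part of the repository: the helper name `missing_packages` is hypothetical, and the list simply mirrors the prerequisites named above (it assumes each package is importable under the same name it is installed with).

```python
import importlib.util

# Package names copied from the prerequisites list; assumed to match
# their import names.
REQUIRED = ["mcfly", "jupyter", "pandas", "matplotlib", "scipy", "numpy"]


def missing_packages(names):
    """Return the names that cannot be located as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only looks the packages up and does not import them, so it stays fast even when a heavy dependency such as TensorFlow is installed.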

Running the notebooks

The tutorials can be run using Jupyter notebook. From the tutorial root folder run:

jupyter notebook

There are two versions of the tutorial. The standard tutorial is for self-learning. There is also a version for workshops which is only expected to be used with the aid of an instructor.

mcfly-tutorial's Issues

Add questions/formative assessment

@dafnevk commented on Thu Mar 02 2017

Add questions in the tutorial notebook to assess the understanding of new concepts, after:

generate models
find_best_model
visualization

Add explanation to visualization

@dafnevk commented on Thu Mar 02 2017

Add an information box that explains what file should be uploaded and what the default location might be

continuous-multioutput error

First of all, I must thank you for this wonderful library, it really helps for beginners like me.

However, I got the above error when using find_best_architecture. My y is an array of shape ( , 8), and its elements are continuous. I am trying to predict the next price by outputting the percentage of the range that falls in each of 8 price slots. For example, the 8 price slots could be close-40, close-30, close-20, close-10, close+10, close+20, close+30, close+40. I break the price range into these 8 slots and calculate the share of the range that falls within each slot. So (0,0,0.2,0.3,0.5,0.3,0,0) means 20% of the range in the close-20 slot, 30% in the close-10 slot, 50% in the close+10 slot, and 30% in the close+20 slot. Is this output supported? If not, how should I re-model it? Thanks.
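
Since mcfly is described above as a tool for time series classification, one possible way to re-model this target is to reduce each 8-slot vector to a single class, for example the slot with the largest share. The sketch below is only an illustration of that idea under this assumption; the function name `dominant_slot_one_hot` is hypothetical, and the conversion discards the information in the other slots.

```python
def dominant_slot_one_hot(shares):
    """Return a one-hot vector marking the slot with the largest share.

    Ties are resolved in favour of the first slot with the maximum value.
    """
    best = max(range(len(shares)), key=lambda i: shares[i])
    return [1 if i == best else 0 for i in range(len(shares))]


# The example row from the question above.
y_row = [0, 0, 0.2, 0.3, 0.5, 0.3, 0, 0]
print(dominant_slot_one_hot(y_row))
```

Applied to every row of y, this gives a binary class matrix with one active column per sample, which is the kind of label layout a classifier expects.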

Random validation set in preprocessing of pamap2

Right now, the preprocessing of PAMAP2 for the tutorial sets aside some subjects for validation and test, but this does not give good performance.
We want a random subset for validation, with a parameter for the validation set size.
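
A random validation subset with a size parameter could look roughly like the sketch below. It is an assumption about the shape of the solution, not the code used in the tutorial; the function name `random_train_val_split` and the parameters `val_fraction` and `seed` are hypothetical.

```python
import random


def random_train_val_split(n_samples, val_fraction=0.2, seed=None):
    """Return (train_indices, val_indices) for a random split of range(n_samples)."""
    if not 0 < val_fraction < 1:
        raise ValueError("val_fraction must be between 0 and 1")
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    n_val = int(round(n_samples * val_fraction))
    return sorted(indices[n_val:]), sorted(indices[:n_val])


train_idx, val_idx = random_train_val_split(10, val_fraction=0.3, seed=0)
print(len(train_idx), len(val_idx))
```

Note that a random split places segments from the same subject in both the training and the validation set, which is the trade-off against the subject-wise split.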

Create new tutorial based on mcfly 1.1.0

Once mcfly v1.1.0 is released, update the tutorial notebook to:

make use of the visualization function
import functions directly from mcfly (instead of from underlying modules)

documentation modelgen.generate_models

The function documentation for modelgen.generate_models() currently states that a list of models is generated, but it does not clarify that each 'model' in the generated list is a tuple containing the Keras model object, the parameters, and the model type.
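
To illustrate the structure described in this issue, the sketch below unpacks such a list. The entries are placeholders written by hand, not real mcfly output, and the helper name `summarize` is hypothetical; only the tuple layout (model object, parameters, model type) is taken from the issue text.

```python
# Stand-in for the generated list: each entry is a tuple of
# (model object, hyperparameters, model type). All values are placeholders.
models = [
    ("<keras model 0>", {"learning_rate": 0.01}, "CNN"),
    ("<keras model 1>", {"learning_rate": 0.001}, "DeepConvLSTM"),
]


def summarize(models):
    """Return one (model_type, params) pair per entry in the list."""
    return [(model_type, params) for _model, params, model_type in models]


for model_type, params in summarize(models):
    print(model_type, params)
```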

Create README in tutorial folder

@dafnevk commented on Thu Mar 02 2017

The README should contain instructions on how to open the tutorial/workshop notebook and how to check whether installation and data download was successful.

folder name tutorial -> utils

change folder name tutorial -> utils

Add pandas as dependency

The instructions on installation don't mention pandas as a prerequisite, but it is needed by the tutorial.

Provide original label names and show cross table

@dafnevk commented on Wed Feb 08 2017

To explain what we're actually doing with the PAMAP2 dataset, it helps to have the original labels so we can show the cross table for the activities.

Migrate tutorial to new repository

@dafnevk commented on Thu Mar 02 2017

make new repository with name: mcfly-tutorial
migrate notebooks and tutorial to new repository
restructure folder structure:
- folder names
- requirements.txt

Suggestions after doing tutorial

Text under cell 5 is a bit long-winded. Maybe split it up into bullet points.
What is the idea behind Question 1? It's just 9 right? It says right in the text... bit too childish imho.
In general: a lot of reading required.
Put "modelgen.generate_models?" in a cell!
Question 3: ... I can, no idea about beginners.
Q4: maybe replace with "Look at how the accuracy and loss evolve in your training run."

Tutorial does not explain goal and what we are going to do

The tutorial does not tell us up front what we are going to learn and to what end. Maybe add a section to the start explaining that?

Use GPU for the tutorial code

Hi there!

Is there a possibility to use a GPU for the tutorial? If so, could someone help me with changing the code such that it accesses a GPU?

Best regards

Replace travis CI by GitHub actions

Same as done for mcfly itself....

All sequences in the data set should be of equal length.

In the documentation, it is specified that "All sequences in the data set should be of equal length.". What is the recommended approach for dealing with Multichannel time series event classification using samples with variable length? Zero padding perhaps?
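
The zero-padding idea raised in the question can be sketched as follows. Whether padding works well with mcfly models is not confirmed here; this is only an illustration of the technique, and the function name `zero_pad` is hypothetical. Cutting the recordings into fixed-length windows is another common option.

```python
def zero_pad(sequences, n_channels):
    """Pad every sequence with all-zero timesteps up to the longest length.

    Each sequence is a list of timesteps; each timestep is a list of
    n_channels values. Padding is appended at the end.
    """
    target = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        filler = [[0.0] * n_channels for _ in range(target - len(seq))]
        padded.append([list(step) for step in seq] + filler)
    return padded


short = [[1.0, 2.0]]
long = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print([len(seq) for seq in zero_pad([short, long], n_channels=2)])
```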

Prepare flow / complete notebook for workshop

@dafnevk commented on Wed Feb 08 2017

This notebook contains the code that (ideally) is produced by the end of the workshop. It serves as speaker's notes for the teacher. It should also contain places to ask questions or give explanations.
It is probably a good idea to base it on the tutorial-notebook.

Create almost-empty workshop notebook

@dafnevk commented on Wed Feb 08 2017

The workshop notebook should just contain the necessary imports and a test that shows whether installation was successful.

Look through tutorial notebook

@dafnevk commented on Thu Mar 02 2017

The complete tutorial notebook should be up to date and should be checked after all other tutorial issues are done.

hosting pre-processed data PAMAP2

Host pre-processed PAMAP2 data somewhere

Make cheatsheet

@dafnevk commented on Thu Mar 02 2017

Make cheatsheet (.md file) in the tutorial folder with all important concepts and functions and other jargon words. Sync this with the docstrings.

Error while trying to install mcfly

Hello,

I want to start using mcfly but I have problems installing the package. I followed the instructions on the installation page but I get the following error:

C:\Users\KnutzenS>pip install --trusted-host pypi.python.org mcfly
Collecting mcfly
Downloading mcfly-1.0.1.tar.gz
Requirement already satisfied: matplotlib in c:\users\knutzens\appdata\local\continuum\anaconda3\lib\site-packages (from mcfly)
Requirement already satisfied: numpy in c:\users\knutzens\appdata\local\continuum\anaconda3\lib\site-packages (from mcfly)
Requirement already satisfied: scikit-learn>=0.15.0 in c:\users\knutzens\appdata\local\continuum\anaconda3\lib\site-packages (from mcfly)
Requirement already satisfied: scipy>=0.11 in c:\users\knutzens\appdata\local\continuum\anaconda3\lib\site-packages (from mcfly)
Exception:
Traceback (most recent call last):
File "C:\Users\KnutzenS\AppData\Local\Continuum\Anaconda3\lib\site-packages\pip\basecommand.py", line 215, in main
status = self.run(options, args)
File "C:\Users\KnutzenS\AppData\Local\Continuum\Anaconda3\lib\site-packages\pip\commands\install.py", line 335, in run
wb.build(autobuilding=True)
File "C:\Users\KnutzenS\AppData\Local\Continuum\Anaconda3\lib\site-packages\pip\wheel.py", line 749, in build
self.requirement_set.prepare_files(self.finder)
File "C:\Users\KnutzenS\AppData\Local\Continuum\Anaconda3\lib\site-packages\pip\req\req_set.py", line 380, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "C:\Users\KnutzenS\AppData\Local\Continuum\Anaconda3\lib\site-packages\pip\req\req_set.py", line 666, in _prepare_file
check_dist_requires_python(dist)
File "C:\Users\KnutzenS\AppData\Local\Continuum\Anaconda3\lib\site-packages\pip\utils\packaging.py", line 48, in check_dist_requires_python
feed_parser.feed(metadata)
File "C:\Users\KnutzenS\AppData\Local\Continuum\Anaconda3\lib\email\feedparser.py", line 175, in feed
self._input.push(data)
File "C:\Users\KnutzenS\AppData\Local\Continuum\Anaconda3\lib\email\feedparser.py", line 103, in push
self._partial.write(data)
TypeError: string argument expected, got 'NoneType'

I am able to install other packages with pip so I do not think that it's a general problem with pip.
I am using:

pip 9.0.1
Python 3.6.3 :: Anaconda, Inc.

Training full model takes too long

Training the full model takes ~2 hours on my laptop. Perhaps use a pretrained model, or less data?

Fix linter issues and turn on linter checking in CI

We have a linter profile in place and a GitHub Action that checks the style, but the linter returns many style messages.

fix these issues
turn on linter in GA (remove the --zero-exit flag in the workflow)

Example of the list with linter messages at the time of writing this issue:

Messages

scripts/Actitracker_train.py
Line: 6
pylint: unused-import / Unused import sys
Line: 9
pylint: import-error / Unable to import 'pandas'
Line: 12
pylint: unused-import / Unused storage imported from mcfly
Line: 55
pep8: E117 / over-indented (col 9)
pylint: bad-indentation / Bad indentation. Found 8 spaces, expected 4
Line: 59
pylint: unexpected-keyword-arg / Unexpected keyword argument 'outputpath' in function call (col 40)
Line: 98
pylint: trailing-newlines / Trailing newlines

scripts/EEG_alcoholic_train.py
Line: 6
pylint: unused-import / Unused import sys
Line: 9
pylint: import-error / Unable to import 'pandas'
pylint: unused-import / Unused pandas imported as pd
Line: 11
pylint: unused-import / Unused storage imported from mcfly
Line: 59
pep8: E117 / over-indented (col 9)
pylint: bad-indentation / Bad indentation. Found 8 spaces, expected 4
Line: 65
pylint: unexpected-keyword-arg / Unexpected keyword argument 'early_stopping' in function call (col 40)
Line: 79
pylint: trailing-newlines / Trailing newlines

scripts/experiment_PAMAP.py
Line: 13
pylint: import-error / Unable to import 'pandas'
Line: 15
pylint: unused-import / Unused storage imported from mcfly
Line: 16
pylint: import-error / Unable to import 'keras.models'
pylint: unused-import / Unused load_model imported from keras.models
Line: 26
pep8: E501 / line too long (233 > 159 characters) (col 160)
Line: 59
pylint: pointless-statement / Statement seems to have no effect
Line: 100
pep8: E117 / over-indented (col 9)
pylint: bad-indentation / Bad indentation. Found 8 spaces, expected 4
Line: 149
pylint: trailing-newlines / Trailing newlines

scripts/experiment_PAMAP2_9fold.py
Line: 12
pylint: unused-import / Unused import sys
Line: 15
pylint: import-error / Unable to import 'pandas'
pylint: unused-import / Unused pandas imported as pd
Line: 37
pep8: E117 / over-indented (col 9)
pylint: bad-indentation / Bad indentation. Found 8 spaces, expected 4
Line: 58
pep8: E501 / line too long (527 > 159 characters) (col 160)
Line: 81
pep8: E501 / line too long (290 > 159 characters) (col 160)
Line: 105
pylint: import-error / Unable to import 'keras.optimizers'
Line: 106
pylint: import-error / Unable to import 'keras.models'
Line: 134
pylint: unexpected-keyword-arg / Unexpected keyword argument 'early_stopping' in function call (col 44)
Line: 154
pylint: not-an-iterable / Non-iterable value len(Xs) is used in an iterating context (col 9)
Line: 191
pep8: E501 / line too long (343 > 159 characters) (col 160)
Line: 236
pylint: trailing-newlines / Trailing newlines

scripts/pamap2.py
Line: 10
pylint: import-error / Unable to import 'pandas'
Line: 65
pylint: pointless-statement / Statement seems to have no effect

tests/test_tutorial_pamap2.py
Line: 3
pylint: import-error / Unable to import 'pandas'
Line: 4
pylint: import-error / Unable to import 'nose.tools'
pylint: unused-import / Unused assert_equal imported from nose.tools
Line: 5
pylint: unused-import / Unused listdir imported from os
Line: 20
pylint: unused-variable / Unused variable 'splitted_y' (col 19)
Line: 32
pylint: unused-variable / Unused variable 'y_train' (col 17)
Line: 69
pep8: E712 / comparison to True should be 'if cond is True:' or 'if cond:' (col 17)
pylint: singleton-comparison / Comparison to True should be just 'expr' (col 11)
Line: 86
pylint: import-outside-toplevel / Import outside toplevel (tensorflow.keras.models.load_model) (col 8)
Line: 90
pep8: E305 / expected 2 blank lines after class or function definition, found 1 (col 1)

utils/__init__.py
Line: 1
pyflakes: F401 / '.tutorial_pamap2.*' imported but unused (col 1)

utils/tutorial_pamap2.py
Line: 8
pylint: import-error / Unable to import 'pandas'
Line: 79
pylint: consider-using-enumerate / Consider using enumerate instead of iterating with range and len (col 4)
Line: 134
pylint: consider-using-enumerate / Consider using enumerate instead of iterating with range and len (col 4)
Line: 204
pylint: unused-variable / Unused variable 'local_fn' (col 12)
Line: 250
pylint: consider-using-in / Consider merging these comparisons with "in" to 'tty in ("<class 'slice'>", "<type 'slice'>")' (col 7)
Line: 254
pylint: unnecessary-comprehension / Unnecessary use of a comprehension
Line: 255
pylint: unnecessary-comprehension / Unnecessary use of a comprehension
Line: 272
pylint: useless-return / Useless return at end of function or method
Line: 452
pylint: unused-variable / Unused variable 'local_fn' (col 12)

utils/tutorial_racketsports.py
Line: 33
pylint: unused-variable / Unused variable 'local_fn' (col 12)
Line: 77
pylint: useless-return / Useless return at end of function or method
Line: 143
pylint: unused-variable / Unused variable 'local_fn' (col 12)
Line: 169
pep8: E741 / ambiguous variable name 'l' (col 21)

utils/tutorial_vu.py
Line: 7
pylint: import-error / Unable to import 'xlrd'
Line: 47
pylint: logging-format-interpolation / Use lazy % formatting in logging functions (col 20)
Line: 49
pylint: logging-format-interpolation / Use lazy % formatting in logging functions (col 20)
Line: 51
pylint: logging-format-interpolation / Use lazy % formatting in logging functions (col 20)
Line: 53
pylint: logging-format-interpolation / Use lazy % formatting in logging functions (col 20)
Line: 119
pylint: logging-format-interpolation / Use lazy % formatting in logging functions (col 20)

Unable to load Data in second cell

In the example notebook 'Tutorial PAMAP2 with mcfly', the second cell does not work:

sys.path.insert(0, os.path.abspath('../..'))
from utils import tutorial_pamap2
raises the error:
ImportError Traceback (most recent call last)
in ()
1 sys.path.insert(0, os.path.abspath('../..'))
----> 2 from utils import tutorial_pamap2

ImportError: cannot import name 'tutorial_pamap2'

Work on powerpoint with introduction

@dafnevk commented on Thu Mar 02 2017

Oral explanation in talk or during workshop:
Explain what deep learning tries to solve
why should researchers be interested in learning / using deep learning
they need classification
what kind / group of research questions fit
why not use classical statistics (classical vs DL approach)
To avoid creating customized features, if desired & To create novel feature you may not have thought about & To classify complex pattern constructed from simpler patterns (and give cat example)
DL is also partly unexplored terrain and therefore may add novelty to their field
Explain what mcfly tries to solve
Ease the hyperparameter and architecture choice
Why would I use it instead of Caffe or TensorFlow
Specific for time series
It uses tensorflow and keras, and mcfly provides an easy access way to start learning Keras (it has a less steep learning curve)
Up to date with latest trends in 2016 informed by literature
use expertise to make the choices so that the users don’t have to
Explain Logistic regression and/or single node in a neural network?

2 models is not enough for visualization

The visualization is currently using only two models, but this does not demonstrate the utility of the visualization. Maybe have some pretrained models, such that we can demonstrate the visualization with ~15 models?

fix tests in tutorials repo

make everything green

fix test_find_best_architecture_with_class_weights in ci

tests/test_find_architecture.py::FindArchitectureBasicSuite::test_find_best_architecture_with_class_weights

https://github.com/NLeSC/mcfly/actions/runs/3693703246/jobs/6254006520

Run coverage run --source=mcfly -m pytest
============================= test session starts ==============================
platform linux -- Python 3.8.15, pytest-7.2.0, pluggy-1.0.0
rootdir: /home/runner/work/mcfly/mcfly
plugins: cov-4.0.0
collected 52 items

tests/test_base_hyperparameter_generator.py ...                          [  5%]
tests/test_cnn.py .......                                                [ 19%]
tests/test_deep_conv_lstm.py ....                                        [ 26%]
tests/test_find_architecture.py .F.................                      [ 63%]
tests/test_inception_time.py ......                                      [ 75%]
tests/test_integration.py .                                              [ 76%]
tests/test_modelgen.py ...                                               [ 82%]
tests/test_resnet.py ......                                              [ 94%]
tests/test_storage.py ...                                                [100%]

=================================== FAILURES ===================================
__ FindArchitectureBasicSuite.test_find_best_architecture_with_class_weights ___

self = <test_find_architecture.FindArchitectureBasicSuite testMethod=test_find_best_architecture_with_class_weights>

    def test_find_best_architecture_with_class_weights(self):
        """Model should not ignore tiny class with huge class weight. Note that this test is non-deterministic,
        even though a seed was set. Note2 that this test is very slow, taking up 40% of all mcfly test time."""
        tf.random.set_seed(1234)  # Needs tensorflow API v2
    
        X_train, y_train = _create_2_class_labeled_dataset(1, 999)  # very unbalanced
        X_val, y_val = _create_2_class_labeled_dataset(1, 99)
        X_test, y_test = _create_2_class_labeled_dataset(10, 10)
        class_weight = {0: 2, 1: 0.002}
    
        best_model, _, _, _ = find_architecture.find_best_architecture(
            X_train, y_train, X_val, y_val, verbose=False, subset_size=1000,
            number_of_models=5, nr_epochs=1, model_type='CNN', class_weight=class_weight)
    
        probabilities = best_model.predict(X_test)
        predicted = probabilities.argmax(axis=1)
>       np.testing.assert_array_equal(predicted, y_test.argmax(axis=1))
E       AssertionError: 
E       Arrays are not equal
E       
E       Mismatched elements: 10 / 20 (50%)
E       Max absolute difference: 1
E       Max relative difference: 0.
E        x: array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
E        y: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

tests/test_find_architecture.py:224: AssertionError
----------------------------- Captured stdout call -----------------------------
Set maximum kernel size for InceptionTime models to number of timesteps.
Set maximum kernel size for InceptionTime models to number of timesteps.
Generated models will be trained on subset of the data (subset size: 1000).

1/1 [==============================] - ETA: 0s
1/1 [==============================] - 1s 726ms/step
------------------------------ Captured log call -------------------------------
WARNING  absl:optimizer.py:106 `lr` is deprecated, please use `learning_rate` instead, or use the legacy optimizer, e.g.,tf.keras.optimizers.legacy.Adam.
WARNING  absl:optimizer.py:106 `lr` is deprecated, please use `learning_rate` instead, or use the legacy optimizer, e.g.,tf.keras.optimizers.legacy.Adam.
WARNING  absl:optimizer.py:106 `lr` is deprecated, please use `learning_rate` instead, or use the legacy optimizer, e.g.,tf.keras.optimizers.legacy.Adam.
WARNING  absl:optimizer.py:106 `lr` is deprecated, please use `learning_rate` instead, or use the legacy optimizer, e.g.,tf.keras.optimizers.legacy.Adam.
WARNING  absl:optimizer.py:106 `lr` is deprecated, please use `learning_rate` instead, or use the legacy optimizer, e.g.,tf.keras.optimizers.legacy.Adam.
=============================== warnings summary ===============================
tests/test_base_hyperparameter_generator.py::test_regularization_is_float
  /home/runner/work/mcfly/mcfly/tests/test_base_hyperparameter_generator.py:8: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    assert isinstance(reg, np.float), "Expected different type."

tests/test_find_architecture.py::FindArchitectureBasicSuite::test_find_best_architecture
tests/test_integration.py::IntegrationSuite::test_integration
tests/test_storage.py::StorageSuite::test_loadmodel
tests/test_storage.py::StorageSuite::test_savemodel
tests/test_storage.py::StorageSuite::test_savemodel_keras
  /home/runner/work/mcfly/mcfly/mcfly/modelgen.py:86: UserWarning: Specified number_of_models is smaller than the given number of model types.
    warnings.warn("Specified number_of_models is smaller than the given number of model types.")

tests/test_find_architecture.py::FindArchitectureBasicSuite::test_find_best_architecture
  /home/runner/work/mcfly/mcfly/mcfly/find_architecture.py:382: UserWarning: Best model not better than kNN: [] vs  0.6666666666666666
    warnings.warn('Best model not better than kNN: ' +

tests/test_find_architecture.py::FindArchitectureBasicSuite::test_find_best_architecture_with_class_weights
  /home/runner/work/mcfly/mcfly/mcfly/find_architecture.py:382: UserWarning: Best model not better than kNN: [] vs  1.0
    warnings.warn('Best model not better than kNN: ' +

tests/test_find_architecture.py::HistoryStoringSuite::test_store_train_history_as_json_contains_expected_attributes
tests/test_find_architecture.py::HistoryStoringSuite::test_store_train_history_as_json_contains_expected_attributes
tests/test_find_architecture.py::HistoryStoringSuite::test_store_train_history_as_json_metrics_is_dict
tests/test_find_architecture.py::HistoryStoringSuite::test_store_train_history_as_json_metrics_is_dict
  /home/runner/work/mcfly/mcfly/tests/test_find_architecture.py:342: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    'val_loss': [np.float(1), np.float(1)], 'val_accuracy': [np.float64(0), np.float64(0)],

tests/test_storage.py::StorageSuite::test_loadmodel
tests/test_storage.py::StorageSuite::test_savemodel
  /opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/numpy/lib/npyio.py:521: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
    arr = np.asanyarray(arr)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_find_architecture.py::FindArchitectureBasicSuite::test_find_best_architecture_with_class_weights - AssertionError: 
Arrays are not equal

Mismatched elements: 10 / 20 (50%)
Max absolute difference: 1
Max relative difference: 0.
 x: array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
 y: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
============ 1 failed, 51 passed, 14 warnings in 237.59s (0:03:57) =============
Error: Process completed with exit code 1.

"eecology" dataset undefined

I'm working on the deeplearning_eecology.ipynb

The notebook refers to a data resource that appears to be local to your test site -

datapath = '/media/sf_vmshared/timeseries/eecology/'

Can you share what dataset this is, and where I can get it, so I can pursue this exercise?

Thanks!

Add info about visualization in notebook

@dafnevk commented on Thu Mar 02 2017

Clearly explain which file should be uploaded to the visualization

Make example output models.json

@dafnevk commented on Wed Feb 08 2017

Let the tutorial notebook run for 10 models and provide the resulting json file somewhere.

Loading model does not work

In the tutorial, loading the pre-trained model does not work.

model = load_model('./model/model.h5')

Gives:

---------------------------------------------------------------------------
SystemError                               Traceback (most recent call last)
<ipython-input-26-5d87631adf41> in <module>()
----> 1 model = load_model('./model/model.h5')

~/sw/miniconda3/envs/mcfly-tutorial/lib/python3.6/site-packages/keras/models.py in load_model(filepath, custom_objects, compile)
    237             raise ValueError('No model found in config file.')
    238         model_config = json.loads(model_config.decode('utf-8'))
--> 239         model = model_from_config(model_config, custom_objects=custom_objects)
    240 
    241         # set weights

~/sw/miniconda3/envs/mcfly-tutorial/lib/python3.6/site-packages/keras/models.py in model_from_config(config, custom_objects)
    311                         'Maybe you meant to use '
    312                         '`Sequential.from_config(config)`?')
--> 313     return layer_module.deserialize(config, custom_objects=custom_objects)
    314 
    315 

~/sw/miniconda3/envs/mcfly-tutorial/lib/python3.6/site-packages/keras/layers/__init__.py in deserialize(config, custom_objects)
     52                                     module_objects=globs,
     53                                     custom_objects=custom_objects,
---> 54                                     printable_module_name='layer')

~/sw/miniconda3/envs/mcfly-tutorial/lib/python3.6/site-packages/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
    137                 return cls.from_config(config['config'],
    138                                        custom_objects=dict(list(_GLOBAL_CUSTOM_OBJECTS.items()) +
--> 139                                                            list(custom_objects.items())))
    140             with CustomObjectScope(custom_objects):
    141                 return cls.from_config(config['config'])

~/sw/miniconda3/envs/mcfly-tutorial/lib/python3.6/site-packages/keras/models.py in from_config(cls, config, custom_objects)
   1212         for conf in config:
   1213             layer = layer_module.deserialize(conf, custom_objects=custom_objects)
-> 1214             model.add(layer)
   1215         return model
   1216 

~/sw/miniconda3/envs/mcfly-tutorial/lib/python3.6/site-packages/keras/models.py in add(self, layer)
    473                           output_shapes=[self.outputs[0]._keras_shape])
    474         else:
--> 475             output_tensor = layer(self.outputs[0])
    476             if isinstance(output_tensor, list):
    477                 raise TypeError('All layers in a Sequential model '

~/sw/miniconda3/envs/mcfly-tutorial/lib/python3.6/site-packages/keras/engine/topology.py in __call__(self, inputs, **kwargs)
    600 
    601             # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 602             output = self.call(inputs, **kwargs)
    603             output_mask = self.compute_mask(inputs, previous_mask)
    604 

~/sw/miniconda3/envs/mcfly-tutorial/lib/python3.6/site-packages/keras/layers/core.py in call(self, inputs, mask)
    648         if has_arg(self.function, 'mask'):
    649             arguments['mask'] = mask
--> 650         return self.function(inputs, **arguments)
    651 
    652     def compute_mask(self, inputs, mask=None):

~/sw/miniconda3/envs/mcfly-tutorial/lib/python3.6/site-packages/keras/layers/core.py in <lambda>(x)
    176 
    177     # Input shape
--> 178         4D tensor with shape:
    179         `(samples, channels, rows, cols)` if data_format='channels_first'
    180         or 4D tensor with shape:

SystemError: unknown opcode

This is on macOS, with conda Python 3.6.2. pip freeze gives the following list:

appnope==0.1.0
bleach==1.5.0
certifi==2016.2.28
cycler==0.10.0
decorator==4.1.2
entrypoints==0.2.3
h5py==2.7.0
html5lib==0.9999999
ipykernel==4.6.1
ipython==6.1.0
ipython-genutils==0.2.0
ipywidgets==6.0.0
jedi==0.10.2
Jinja2==2.9.6
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.1.0
jupyter-console==5.2.0
jupyter-core==4.3.0
Keras==2.0.8
Markdown==2.6.9
MarkupSafe==1.0
matplotlib==2.0.2
mcfly==1.0.1
mistune==0.7.4
nbconvert==5.2.1
nbformat==4.4.0
notebook==5.0.0
numpy==1.13.1
pandas==0.20.3
pandocfilters==1.4.2
pexpect==4.2.1
pickleshare==0.7.4
prompt-toolkit==1.0.15
protobuf==3.4.0
ptyprocess==0.5.2
Pygments==2.2.0
pyparsing==2.2.0
python-dateutil==2.6.1
pytz==2017.2
PyYAML==3.12
pyzmq==16.0.2
qtconsole==4.3.1
scikit-learn==0.19.0
scipy==0.19.1
simplegeneric==0.8.1
six==1.10.0
tensorflow==1.3.0
tensorflow-tensorboard==0.1.6
terminado==0.6
testpath==0.3
tornado==4.5.2
traitlets==4.3.2
wcwidth==0.1.7
Werkzeug==0.12.2
widgetsnbextension==3.0.2

ValueError: Invalid metric: "accuracy" is not among the metrics the models was compiled with ([]).

Hi, I am getting an error when executing this tutorial:

outputfile = os.path.join(resultpath, 'modelcomparison.json')
histories, val_accuracies, val_losses = mcfly.find_architecture.train_models_on_samples(X_train, y_train_binary,
                                                                           X_val, y_val_binary,
                                                                           models,nr_epochs=5,
                                                                           subset_size=300,
                                                                           verbose=True,
                                                                           outputfile=outputfile,
                                                                            metric=metric)
print('Details of the training process were stored in ',outputfile)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-24-25f24a279522> in <module>
      6                                                                            verbose=True,
      7                                                                            outputfile=outputfile,
----> 8                                                                             metric=metric)
      9 print('Details of the training process were stored in ',outputfile)

D:\Anaconda3\lib\site-packages\mcfly\find_architecture.py in train_models_on_samples(X_train, y_train, X_val, y_val, models, nr_epochs, subset_size, verbose, outputfile, model_path, early_stopping_patience, batch_size, metric, class_weight)
    115         if metric_name not in model_metrics:
    116             raise ValueError('Invalid metric: "{}" is not among the metrics the models was compiled with ({}).'
--> 117                              .format(metric_name, model_metrics))
    118         if early_stopping_patience is not None:
    119             if early_stopping_patience == 'auto':

ValueError: Invalid metric: "accuracy" is not among the metrics the models was compiled with ([]).

UnicodeDecodeError: 'rawunicodeescape' codec can't decode bytes

I encountered the above error. I have done some research and found the following:

"Typical error on Windows because the default user directory is C:\user<your_user>, so when you want to use this path as an string parameter into a Python function, you get a Unicode error, just because the \u is a Unicode escape. Any character not numeric after this produces an error.To solve it, just double the backslashes: C:\\user\<\your_user>..."

I checked the code: the cause is the lambda function in the generated model. The system tries to do a func_dump of modelgen.py in my c:\users\anaconda3\lib\site-packages\mcfly directory.

Any suggestion how to get around it?

Here is the detailed trace:
File "C:\Users\pactera\Anaconda3\lib\site-packages\mcfly\storage.py", line 33, in savemodel
json_string = model.to_json() # save architecture to json string
File "C:\Users\pactera\Anaconda3\lib\site-packages\keras\engine\topology.py", line 2541, in to_json
model_config = self._updated_config()
File "C:\Users\pactera\Anaconda3\lib\site-packages\keras\engine\topology.py", line 2508, in _updated_config
config = self.get_config()
File "C:\Users\pactera\Anaconda3\lib\site-packages\keras\models.py", line 1179, in get_config
'config': layer.get_config()})
File "C:\Users\pactera\Anaconda3\lib\site-packages\keras\layers\core.py", line 668, in get_config
function = func_dump(self.function)
File "C:\Users\pactera\Anaconda3\lib\site-packages\keras\utils\generic_utils.py", line 177, in func_dump
code = marshal.dumps(func.__code__).decode('raw_unicode_escape')
UnicodeDecodeError: 'rawunicodeescape' codec can't decode bytes in position 82-83: truncated \UXXXXXXXX escape
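
The last frame of the trace decodes bytes with the raw_unicode_escape codec. The sketch below reproduces the same class of error without Keras, assuming the serialized bytes contain a Windows path like the ones in the trace; the helper name `decodes_cleanly` is hypothetical.

```python
def decodes_cleanly(data):
    """Return True if the bytes can be decoded with raw_unicode_escape."""
    try:
        data.decode("raw_unicode_escape")
    except UnicodeDecodeError:
        return False
    return True


# A backslash followed by "U" is read as the start of a \UXXXXXXXX escape,
# and "sers..." does not supply the eight hex digits it requires.
print(decodes_cleanly(b"C:\\Users\\pactera"))

# The same path written with forward slashes contains no escape sequence.
print(decodes_cleanly(b"C:/Users/pactera"))
```

The first call is expected to report a failed decode and the second a successful one, which is consistent with the problem being tied to the install location under C:\Users rather than to the model itself.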

Always same val_acc for LSTM

Hi,
I am training a model for time series classification with 3 labels.
But when an LSTM model is trained, the val_acc after each epoch is always 66.14% in my case, and it won't change with different data or batch size/epochs.

I train on 1000 samples and validate on 7442 samples.

What could be the problem? Thank you
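
A validation accuracy that never changes can mean that the model predicts the same class for every sample; whether that is the case here cannot be told from the report. One quick check is to compare the reported value with the accuracy of always predicting the most frequent validation label. This is a minimal sketch with a hypothetical helper name, `majority_baseline_accuracy`, and made-up labels.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


# Made-up integer labels for illustration; use the real validation labels
# (converted from one-hot to class indices) in practice.
y_val_labels = [0, 0, 1, 2]
print(majority_baseline_accuracy(y_val_labels))
```

If the baseline matches the constant val_acc, the model is most likely collapsing to a single class, and class balance or the learning rate would be the first things to look at.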

Compare weights instead of probabilities

In the section 'Saving, loading and comparing reloaded model with original model', to show that two models are actually the same, it might be nice to compare weights instead of probabilities.

Arrange usb-sticks

@dafnevk commented on Thu Mar 02 2017

Does the eScience Center have usb-sticks to use during a workshop? Can we have usb sticks that have the tutorial data permanently?

@mkuzak commented on Thu Mar 02 2017

We can buy a few USB sticks to be used for training activities. If there is enough space, I don't see why we could not store data for various workshops there.

rope_jumping
ascending_stairs
descending_stairs
running
