
scikeras's Introduction

Scikit-Learn Wrapper for Keras


Scikit-Learn compatible wrappers for Keras Models.

Why SciKeras

SciKeras is derived from, and API compatible with, the now deprecated and removed tf.keras.wrappers.scikit_learn.

An overview of the differences as compared to the TF wrappers can be found in our migration guide.

Installation

This package is available on PyPI:

# Tensorflow
pip install scikeras[tensorflow]

Note that pip install scikeras[tensorflow] is essentially equivalent to pip install scikeras tensorflow and is offered only for convenience. You can also install SciKeras alone with pip install scikeras, but you will need a version of tensorflow installed at runtime or SciKeras will raise an error when you try to import it.

The current version of SciKeras depends on scikit-learn>=1.4.1post1 and Keras>=3.2.0.
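A quick usage sketch (a hypothetical model function; SciKeras passes the meta dict of inferred dataset properties when the model function requests it):

from keras import layers, models
from scikeras.wrappers import KerasClassifier

def get_model(meta):
    # meta carries inferred dataset info such as n_features_in_ and n_classes_
    return models.Sequential([
        layers.Input(shape=(meta["n_features_in_"],)),
        layers.Dense(32, activation="relu"),
        layers.Dense(meta["n_classes_"], activation="softmax"),
    ])

clf = KerasClassifier(model=get_model, loss="sparse_categorical_crossentropy", epochs=5)
# clf now exposes fit/predict/score and drops into sklearn pipelines and searches.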

Migrating from keras.wrappers.scikit_learn

Please see the migration section of our documentation.

Documentation

Documentation is available at https://www.adriangb.com/scikeras/.

Contributing

See CONTRIBUTING.md


scikeras's Issues

ENH: move data validation to a modular interface

The main use case for this is to allow non-array inputs (perhaps dicts, datasets, etc.). The default implementation would remain the same, but users would have the ability to customize it for more advanced use cases.

Once #88 is merged (and after the next release), we could move the internal data validation interface (BaseWrapper._validate_data) to a public transformer-based interface, similar to the X/y processing. I would however keep them separate for modularity. The main complication would be: do we have two transformers (one for X, one for y)? How can they coordinate (e.g. to validate the same number of data points)?

An alternative would be to just take the interface we have and make it public.

In either case, the goal would be to have no private/non-customizable roadblocks to using arbitrary data inputs.
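To make the trade-off concrete, here is a minimal sketch of a public, swappable validator (all names hypothetical):

from sklearn.base import BaseEstimator, TransformerMixin

class NoOpValidator(BaseEstimator, TransformerMixin):
    """Default validator: record the sample count and pass data through unchanged."""

    def fit(self, data):
        self.n_samples_ = len(data)
        return self

    def transform(self, data):
        return data

def validate(X_validator, y_validator, X, y):
    # One possible coordination point: fit both validators, then cross-check counts.
    X = X_validator.fit(X).transform(X)
    y = y_validator.fit(y).transform(y)
    if X_validator.n_samples_ != y_validator.n_samples_:
        raise ValueError("X and y contain inconsistent numbers of samples")
    return X, y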

Clarification: difference between _keras_build_fn arguments and instance attributes

Consider the example below, taken from the documentation:

from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

from scikeras.wrappers import KerasRegressor


class MLPRegressor(KerasRegressor):

    def __init__(self, hidden_layer_sizes=None):
        self.hidden_layer_sizes = hidden_layer_sizes

    def _keras_build_fn(self, meta, hidden_layer_sizes):
        """Dynamically build regressor."""
        if hidden_layer_sizes is None:
            hidden_layer_sizes = (100, )
        model = Sequential()
        model.add(Dense(meta["X_shape_"][1], activation="relu", input_shape=meta["X_shape_"][1:]))
        for size in hidden_layer_sizes:
            model.add(Dense(size, activation="relu"))
        model.add(Dense(meta["n_outputs_"]))
        model.compile("adam", loss=KerasRegressor.r_squared)
        return model

Why do I need to specify the hidden layer sizes as an argument to _keras_build_fn?
Is there any difference between the above and the following implementation?
I removed the hidden_layer_sizes parameter from _keras_build_fn since it is already present as an instance attribute.
Is there anything I did not consider that makes the first implementation better?
In general, if I choose the subclassing pattern, why do I need to specify any arguments to _keras_build_fn other than meta?

from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

from scikeras.wrappers import KerasRegressor


class MLPRegressor(KerasRegressor):

    def __init__(self, hidden_layer_sizes=(100, )):
        self.hidden_layer_sizes = hidden_layer_sizes

    def _keras_build_fn(self, meta):
        """Dynamically build regressor."""
 
        hidden_layer_sizes = self.hidden_layer_sizes

        model = Sequential()
        model.add(Dense(meta["X_shape_"][1], activation="relu", input_shape=meta["X_shape_"][1:]))
        for size in hidden_layer_sizes:
            model.add(Dense(size, activation="relu"))
        model.add(Dense(meta["n_outputs_"]))
        model.compile("adam", loss=KerasRegressor.r_squared)
        return model

Many thanks

[feature request] add keras.fit parameters as attributes to BaseWrapper

The Keras Model.fit signature only has 18 parameters. Why not include them as parameters to BaseWrapper.__init__? They can already be passed as keyword args to model.fit.

from sklearn.base import BaseEstimator

class BaseWrapper(BaseEstimator):
    def __init__(self, build_fn=None, verbose=1, callbacks=None, ...):
        self.verbose = verbose
        self.callbacks = callbacks
        ...

This implementation would (eventually) help enhance the documentation.
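For what it's worth, the payoff is that sklearn tooling discovers __init__ parameters via get_params, so these fit-time settings would become inspectable and tunable (a sketch of the assumed behavior, building on the snippet above):

est = BaseWrapper(build_fn=build_model, verbose=0, callbacks=[])
est.get_params()           # would include 'verbose' and 'callbacks'
est.set_params(verbose=1)  # and GridSearchCV could then search over them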

Minimal requirements do not work

Hi Adrian,

I just found out that I need scikit-learn>=0.22 to run scikeras.
I think the function _check_sample_weight was only introduced in version 0.22.

Here is my test:

Python 3.7.4 (tags/v3.7.4:e09359112e, Jul  8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
>>> import sklearn
>>> sklearn.__version__
'0.21.0'
>>> from scikeras.wrappers import KerasRegressor

Here is the stack trace:

File "C:\...\scikeras\wrappers.py", line 13, in <module>
    from sklearn.utils.validation import (
ImportError: cannot import name '_check_sample_weight' from 'sklearn.utils.validation' (C:\...\sklearn\utils\validation.py)

It works for me with scikit-learn>=0.22

Problem with KerasClassifier

Hi @adriangb,
congratulations on the work.

I am having a problem with a dataset that I am using in my work.
KerasRegressor works perfectly for me, but KerasClassifier shows a problem that did not occur with keras.wrappers.scikit_learn.

I took a simple example with the mnist dataset and the problem persisted.

If I change the import from scikeras to keras.wrappers.scikit_learn, the code works perfectly.

import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Flatten, Dropout

#from keras.wrappers.scikit_learn import KerasClassifier
from scikeras.wrappers import KerasClassifier

from sklearn.model_selection import GridSearchCV

#load dataset
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

X = np.vstack((x_train, x_test)) 
y = np.hstack((y_train, y_test))

def build_model(optimizer='adam'):
    model = Sequential()
    model.add(Flatten(input_shape=(28, 28)))
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(10, activation='softmax'))

    model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

    return model
    
batch_size = [32, 256]
optimizer = ['adam', 'rmsprop']

param_grids = dict(batch_size = batch_size, optimizer = optimizer)
model = KerasClassifier(build_fn=build_model, verbose=1, batch_size=None, optimizer=None)

grid = GridSearchCV(estimator=model, param_grid=param_grids, n_jobs = 10)
result = grid.fit(X, y)

print("Best: {} using {}".format(result.best_score_, result.best_params_))
1750/1750 [==============================] - 3s 2ms/step - loss: 0.2233 - accuracy: 0.9342
438/438 [==============================] - 0s 562us/step
Traceback (most recent call last):
  File "mnist.py", line 39, in <module>
    result = grid.fit(X, y)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/utils/validation.py", line72, in inner_f
    return f(**kwargs)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 736, in fit
    self._run_search(evaluate_candidates)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 1188, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 708, in evaluate_candidates
    out = parallel(delayed(_fit_and_score)(clone(base_estimator),
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/joblib/parallel.py", line 1048, in__call__
    if self.dispatch_one_batch(iterator):
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/joblib/parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 560, in _fit_and_score
    test_scores = _score(estimator, X_test, y_test, scorer)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 607, in _score
    scores = scorer(estimator, X_test, y_test)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/metrics/_scorer.py", line 90, in __call__
    score = scorer(estimator, *args, **kwargs)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/metrics/_scorer.py", line 372, in _passthrough_scorer
    return estimator.score(*args, **kwargs)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/scikeras/wrappers.py", line 653, in score
    y_pred = self.predict(X, **kwargs)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/scikeras/wrappers.py", line 617, in predict
    y, _ = self._post_process_y(y_pred)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/scikeras/wrappers.py", line 887, in _post_process_y
    self.encoders_[i].inverse_transform(y_)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/preprocessing/_label.py", line 293, in inverse_transform
    y = column_or_1d(y, warn=True)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/utils/validation.py", line72, in inner_f
    return f(**kwargs)
  File "/home/thiago.cavalcante/anaconda3/envs/my_env_conda/lib/python3.8/site-packages/sklearn/utils/validation.py", line845, in column_or_1d
    raise ValueError(
ValueError: y should be a 1d array, got an array of shape (14000, 10) instead.

Shapes of X and y:
X (70000, 28, 28)
y (70000,)

Thanks a lot for attention.

ENH: Simplified API for Data Conversions

SciKeras manipulates the input data (X and/or y) to make it agree with what Keras expects and to allow users to implement multi-input/output models.

I'm opening this issue to centralize discussion surrounding the data transformation API. As discussed in #78 and #79, this API is convoluted and could use improvement. In principle, this is because we do not know anything about the loss function until the Model is compiled, but determining what loss to use may require information about the input data. We need to break this cyclical dependency if we want a clear API. I can think of several ways to do this:

  • interpret loss=None as "please select an appropriate loss for me given this input data". Data reshaping to match the loss function (e.g. one-hot encoding for categorical_crossentropy) would only happen when SciKeras is allowed to determine the loss function.
  • KerasClassifier/KerasRegressor (but not BaseWrapper) only work with the new model compiling API (#66). This lets the user select the loss function while still giving us that information before model_build_fn is called, so we can use it to manipulate the input data.

Both of these initiatives could help address #66 (review).

Neither of these options is likely to work for advanced cases (like custom losses or multi-input/output models), but in those cases the user should be advanced enough to customize the data pre/post processing themselves.

Separately, I feel that we should combine {pre,post}process_{X,y}, _check_output_model_compatibility and _utils.LabelDimensionTransformer into a single transformer. We could make these transformers __init__ params for the wrappers (which would allow on-the-fly overriding) or class attributes (which would be similar to the current subclassing interface).

from abc import ABC
from typing import Any, Dict, Iterable, Union

import numpy as np
from numpy import ndarray
from sklearn.base import BaseEstimator, TransformerMixin
from tensorflow.keras.losses import Loss

class DataTransformerBase(ABC, TransformerMixin, BaseEstimator):

    def get_meta_params(self) -> Dict[str, Any]:  # allows retrieval of `n_classes_` and such
        return {...}

    def fit(self, x: ndarray) -> "DataTransformerBase":
        return self

    def transform(self, x: ndarray) -> Union[ndarray, Iterable[ndarray], Dict[str, ndarray]]:
        return x

    def inverse_transform(self, x: Union[ndarray, Iterable[ndarray], Dict[str, ndarray]]) -> ndarray:
        return np.column_stack(x)

class KerasClassifierFeatureTransformer(DataTransformerBase):

    def __init__(self, loss: Union[Loss, Iterable[Loss], Dict[str, Loss]]):
        self.loss = loss

    def fit(self, x: ndarray) -> "KerasClassifierFeatureTransformer":
        ...

    def transform(self, x: ndarray) -> Union[ndarray, Iterable[ndarray], Dict[str, ndarray]]:
        ...

    def inverse_transform(self, x: Union[ndarray, Iterable[ndarray], Dict[str, ndarray]]) -> ndarray:
        ...

KerasClassifier(
    model=get_model,
    loss=None,
    feature_transformer=KerasClassifierFeatureTransformer,
)

I'm open to any input on these initiatives.

Requirements for SciKeras v0.2.0

This issue tracks what's required for SciKeras v0.2.0. Here are some of the more significant changes (which are possibly in progress):

Required:

Nice to haves:

  • class_weights parameter: #52
  • classes param for partial_fit: #69
  • #51 provide an epochs parameter
  • Add more tests using sklearn classifiers/regressors that check output shapes and dtypes for different target types and target dtypes. Similar to the current test_multilabel_classification test.
  • #66: globbing / param groups for routed params

Completed as part of this release:

  • Add Keras parameters to BaseWrapper.__init__ (loss, optimizer, etc.) (#47, #55).
  • Remove needless checks/array creation (#63, #59)
  • Make pre/post processing functions public (#42).
  • Some stability around BaseWrapper.__call__ (#35).
  • Cleanup around loss names (#38, #35).
  • Parameter routing (#67)
  • Rename build_fn to model (with deprecation cycle).
  • Check to make sure model compiled correctly. (#86, #100 pending #88)
  • Compile if uncompiled model is returned (#66).

Model arguments aren't translated into actual classes

Let's say I use this code:

est = KerasRegressor(
    model=model,
    model__foo=Foo,  # user defined class
    model__foo__bar=Bar,  # user defined class
    model__foo__bar__x=5,
)

I would expect model(foo=Foo(Bar(x=5))) to be called to build the model. Instead, I get a ValueError because model doesn't accept an argument foo__bar.

Complete example:

import tensorflow as tf
from scikeras.wrappers import KerasRegressor

class Foo:
    def __init__(self, bar):
        self.bar = bar

class Bar:
    def __init__(self, x=3):
        self.x = x

def model(foo: Foo):
    inputs = tf.keras.Input(shape=(100,))
    x = tf.keras.layers.Dense(4, activation=tf.nn.relu)(inputs)
    outputs = tf.keras.layers.Dense(1, activation=tf.nn.softmax)(x)
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    return model

est = KerasRegressor(
    model=model,
    model__foo=Foo,
    model__foo__bar=Bar,
    model__foo__bar__x=5,
)
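For illustration, the expected resolution could be sketched as a recursive routine over the double-underscore namespace (this is not SciKeras's actual routing code):

def resolve_params(params):
    """Turn {'foo': Foo, 'foo__bar': Bar, 'foo__bar__x': 5} into
    {'foo': Foo(bar=Bar(x=5))} by instantiating classes with their routed kwargs."""
    resolved = {k: v for k, v in params.items() if "__" not in k}
    for name, value in resolved.items():
        if isinstance(value, type):  # a class: gather and resolve its sub-params
            prefix = name + "__"
            sub = {k[len(prefix):]: v for k, v in params.items() if k.startswith(prefix)}
            resolved[name] = value(**resolve_params(sub))
    return resolved

# resolve_params({"foo": Foo, "foo__bar": Bar, "foo__bar__x": 5})["foo"]  # -> Foo(bar=Bar(x=5))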

Handling of input transformation

The previous wrappers that lived in the tensorflow repo automatically converted object or ordinal targets to one-hot encoded targets if the loss function was categorical crossentropy. We are doing the same here, but given that we now have access to OneHotEncoder and OrdinalEncoder, I think it would be good to move away from the numpy indexing stuff to a cleaner implementation, as sketched below.
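A rough sketch of what the cleaner implementation could look like (illustrative only):

import numpy as np
from sklearn.preprocessing import OneHotEncoder

y = np.array(["cat", "dog", "cat"]).reshape(-1, 1)
encoder = OneHotEncoder(sparse_output=False)  # `sparse=False` on scikit-learn < 1.2
y_onehot = encoder.fit_transform(y)           # (3, 2) float array, ready for categorical_crossentropy
labels = encoder.inverse_transform(y_onehot).ravel()  # round-trips one-hot rows back to labels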

Implement `classes` param for partial fit

As per MLPClassifier's docs for partial_fit (and similarly for other sklearn estimators):

Classes across all calls to partial_fit. Can be obtained via np.unique(y_all), where y_all is the target vector of the entire dataset. This argument is required for the first call to partial_fit and can be omitted in the subsequent calls. Note that y doesn't need to contain all labels in classes.

So the task for us is going to be:

  • Accept a classes argument.
  • Switch from LabelEncoder to OrdinalEncoder (whose categories parameter can be seeded with the classes).
  • Patch that parameter into any other encoders (e.g. OneHotEncoder).

I think Keras models support subsequent fit calls with new classes, but I'm not sure.
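An illustrative sketch of seeding an OrdinalEncoder with the full class list, so the first partial_fit batch need not contain every class:

import numpy as np
from sklearn.preprocessing import OrdinalEncoder

classes = np.array([0, 1, 2])
encoder = OrdinalEncoder(categories=[classes])  # `categories` is the parameter to seed
y_batch = np.array([2, 0, 2]).reshape(-1, 1)
y_enc = encoder.fit_transform(y_batch)          # valid even though class 1 is absent from this batch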

DataConversionWarning: column-vector passed when 1d array expected

I'm having difficulty figuring this error out. Here's the setup:

  • I have a script that uses keras.datasets.mnist. It reshapes into a 2D array and a 1D vector of labels:
import tensorflow as tf
from tensorflow.keras.layers import Dense, Activation, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.datasets import mnist
from scikeras.wrappers import KerasClassifier

def get_keras():
    (X_train, y_train), _ = mnist.load_data()
    X_train, y_train = X_train[:100], y_train[:100]
    X_train = X_train.reshape(X_train.shape[0], 784).astype("float32") / 255
    return X_train, y_train

def _keras_build_fn(opt="sgd"):
    model = Sequential([Dense(512, input_shape=(784,)), Activation("relu"),
                        Dense(10), Activation("softmax")])
    model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy"])
    return model

def test_keras():
    X, y = get_keras()
    assert X.ndim == 2 and X.shape[-1] == 784 and y.ndim == 1
    model = KerasClassifier(build_fn=_keras_build_fn)
    model.fit(X, y)

if __name__ == "__main__":
    test_keras()
  • I run this script with python _scikeras_bug.py. Everything works.
  • I run this script with pytest _scikeras_bug.py, and I get this error:

sklearn.exceptions.DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().

Here's the traceback:

_scikeras_bug.py:36:
    model.fit(X, y)
../scikeras/scikeras/wrappers.py:503: in fit
    y, extra_args = self._pre_process_y(y)
../scikeras/scikeras/wrappers.py:742: in _pre_process_y
    y = encoder.fit_transform(y)
../../../anaconda3/envs/dask-ml-docs/lib/python3.6/site-packages/sklearn/preprocessing/_label.py:255: in fit_transform
    y = column_or_1d(y, warn=True)
../../../anaconda3/envs/dask-ml-docs/lib/python3.6/site-packages/sklearn/utils/validation.py:73: in inner_f
    return f(**kwargs)
../../../anaconda3/envs/dask-ml-docs/lib/python3.6/site-packages/sklearn/utils/validation.py:843: DataConversionWarning

I've tried this out with two environments, and got the same error both times:

  • pytest 5.4.3, scikeras 0.1.7, Tensorflow 2.2.0, Python 3.6.10
  • pytest 6.0.0rc1, scikeras 0.1.7, Tensorflow 2.2.0, Python 3.8.3.

Model building function arguments not clearly documented

The arguments for build_fn are not clearly documented. For example, it's not clear where the required arguments below come from:

def dynamic_classifier(
    n_features_in_,
    cls_type_,
    n_classes_,

I don't see anywhere that the user specifies the number of features or the number of outputs. This is especially confusing because the user typically knows these values ahead of time, since they have a dataset and problem in mind.

The docstring for BaseWrapper._fit_build_keras_model says the following:

This method will process all arguments and call the model building
function with appropriate arguments.

How the "appropriate arguments" determined?

ENH: implement handling of `epochs` param

We need to discuss:

  • Naming of the parameter. Keras uses epochs, sklearn uses max_iter.
  • Default value of the parameter. Keras uses epochs=1, sklearn classifiers vary.
  • Use of initial_epoch. Keras has no centralized tracking of epochs. It seems that initial_epoch can be used for stateless optimizers that depend on the epoch to adjust internal parameters like the learning rate (SO explanation). We could expose the parameter, or hide it from the user, since we already track the epoch count as the length of the values in BaseWrapper.history_ (i.e. self.model_.fit(..., initial_epoch=len(self.history_.loss))). A sketch follows this list.
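A minimal sketch of the initial_epoch bookkeeping (attribute names like history_ and model_ are assumed; this is not a drop-in implementation):

def continue_fit(wrapper, X, y, n_new_epochs=1):
    """Resume a fitted wrapper for n_new_epochs more epochs, so epoch-based
    schedules (e.g. learning-rate decay) pick up where they left off."""
    done = len(wrapper.history_["loss"])  # epochs completed so far
    wrapper.model_.fit(
        X, y,
        initial_epoch=done,
        epochs=done + n_new_epochs,  # Keras treats `epochs` as the final epoch index
    )
    return wrapper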

KerasRegressor.root_mean_squared_error is misnamed

The documentation of KerasRegressor.root_mean_squared_error:

scikeras/scikeras/wrappers.py

Lines 997 to 1002 in d9f5833

def root_mean_squared_error(y_true, y_pred):
    """A simple Keras implementation of R^2 that can be used as a Keras
    loss function.
    Since ScikitLearn's `score` uses R^2 by default, it is
    advisable to use the same loss/metric when optimizing the model.

"R^2" is the coefficient of determination, not the (root) mean squared error (MSE).

The implementation even mirrors the Wikipedia definition of R^2 down to the names and subscripts:

scikeras/scikeras/wrappers.py

Lines 1004 to 1010 in d9f5833

    ss_res = k_backend.sum(k_backend.square(y_true - y_pred), axis=0)
    ss_tot = k_backend.sum(
        k_backend.square(y_true - k_backend.mean(y_true, axis=0)), axis=0
    )
    return k_backend.mean(
        1 - ss_res / (ss_tot + k_backend.epsilon()), axis=-1
    )

Compatibility with Dask

Originally reported by @stsievert in #19:

I can run a Scikit-learn model selection search, but am having a difficult time running a distributed Dask-ML model selection search.

from dask.distributed import Client
from dask_ml.model_selection import HyperbandSearchCV
if __name__ == "__main__":
    # X, y, model are as above
    client = Client()
    search = IncrementalSearchCV(model2, params, max_iter=5)
    with pytest.warns(DataConversionWarning):
        search.fit(X, y)  # fails on this line

When I run this code, I get several errors:

  1. "TypeError: Expected float32 passed to parameter 'b' of op 'MatMul', got <tf.Variable 'dense_1_2/kernel:0' shape=(512, 10) dtype=float32> of type 'ResourceVariable' instead. Error: 'NoneType' object is not iterable" at tensorflow/python/framework/op_def_library.py", line 475, in _apply_op_helper
  2. TypeError: 'NoneType' object is not iterable at tensorflow/python/framework/func_graph.py", line 418, in inner_cm
  3. DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). at sklearn/utils/validation.py:73:

Here's the full output:

$  python _scikeras_bug.py
...
/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/sklearn/utils/validation.py:73: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  return f(**kwargs)
/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/sklearn/utils/validation.py:73: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  return f(**kwargs)
/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/sklearn/utils/validation.py:73: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  return f(**kwargs)
/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/sklearn/utils/validation.py:73: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  return f(**kwargs)
2020-07-15 15:07:29.807015: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-15 15:07:29.813243: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-15 15:07:29.818382: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-15 15:07:29.819346: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f8a7c1f1fe0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-15 15:07:29.819367: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-15 15:07:29.826941: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f8b0b201b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-15 15:07:29.826962: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-15 15:07:29.837069: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fdc87b75290 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-15 15:07:29.837096: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/sklearn/utils/validation.py:73: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  return f(**kwargs)
2020-07-15 15:07:29.916084: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
distributed.utils - ERROR - 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/distributed/utils.py", line 656, in log_errors
    yield
  File "/Users/scott/Developer/stsievert/dask-ml/dask_ml/model_selection/_incremental.py", line 103, in _partial_fit
    model.partial_fit(X, y, **(fit_params or {}))
  File "/Users/scott/Developer/stsievert/scikeras/scikeras/wrappers.py", line 547, in partial_fit
    return self.fit(
  File "/Users/scott/Developer/stsievert/scikeras/scikeras/wrappers.py", line 511, in fit
    self.model_ = self._build_keras_model(
  File "/Users/scott/Developer/stsievert/scikeras/scikeras/wrappers.py", line 304, in _build_keras_model
    model = final_build_fn(**build_args)
  File "_scikeras_bug.py", line 36, in _keras_build_fn
    model = Sequential(layers)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 456, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 129, in __init__
    self.add(layer)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 456, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 213, in add
    output_tensor = layer(self.outputs[0])
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 960, in __call__
    self._set_inputs(cast_inputs, outputs)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 418, in inner_cm
    for fn in self._scope_exit_callbacks:
TypeError: 'NoneType' object is not iterable
2020-07-15 15:07:29.932341: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fb0b125cc00 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-15 15:07:29.932392: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
distributed.worker - WARNING -  Compute Failed
Function:  execute_task
args:      ((<function _partial_fit at 0x14b23d550>, (KerasClassifier(
        build_fn=<function _keras_build_fn at 0x14b465160>
        lr=0.0018805049416974935
), {'model_id': 2, 'params': {'lr': 0.0018805049416974935}, 'partial_fit_calls': 0}), array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32), array([4, 0, 2, 1, 3, 8, 5, 0, 3, 1, 2, 7, 9, 1, 3, 4, 7, 7, 1, 8, 6, 3,
       0, 9, 9, 4, 4, 5, 9, 0, 7, 6, 7, 6, 0, 0, 6, 8, 6, 3, 1, 9, 8, 1,
       7, 5, 3, 8, 6, 4, 4, 9, 4, 2, 9, 9, 2, 0, 4, 0, 7, 1, 3, 7, 1, 1,
       6, 3, 1, 2, 8, 6, 5, 2, 4, 7, 9, 9, 0, 8], dtype=uint8), (<class 'dict'>, [])))
kwargs:    {}
Exception: TypeError("'NoneType' object is not iterable")

distributed.utils - ERROR - Expected float32 passed to parameter 'b' of op 'MatMul', got <tf.Variable 'dense_1_2/kernel:0' shape=(512, 10) dtype=float32> of type 'ResourceVariable' instead. Error: 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 465, in _apply_op_helper
    values = ops.convert_to_tensor(
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1341, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1825, in _dense_var_to_tensor
    return var._dense_var_to_tensor(dtype=dtype, name=name, as_ref=as_ref)  # pylint: disable=protected-access
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1242, in _dense_var_to_tensor
    return self.value()
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 550, in value
    return self._read_variable_op()
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 645, in _read_variable_op
    result = read_and_set_handle()
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 635, in read_and_set_handle
    result = gen_resource_variable_ops.read_variable_op(self._handle,
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 482, in read_variable_op
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in _apply_op_helper
    return output_structure, op_def.is_stateful, op, outputs
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 418, in inner_cm
    for fn in self._scope_exit_callbacks:
TypeError: 'NoneType' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/distributed/utils.py", line 656, in log_errors
    yield
  File "/Users/scott/Developer/stsievert/dask-ml/dask_ml/model_selection/_incremental.py", line 103, in _partial_fit
    model.partial_fit(X, y, **(fit_params or {}))
  File "/Users/scott/Developer/stsievert/scikeras/scikeras/wrappers.py", line 547, in partial_fit
    return self.fit(
  File "/Users/scott/Developer/stsievert/scikeras/scikeras/wrappers.py", line 511, in fit
    self.model_ = self._build_keras_model(
  File "/Users/scott/Developer/stsievert/scikeras/scikeras/wrappers.py", line 304, in _build_keras_model
    model = final_build_fn(**build_args)
  File "_scikeras_bug.py", line 36, in _keras_build_fn
    model = Sequential(layers)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 456, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 129, in __init__
    self.add(layer)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 456, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 213, in add
    output_tensor = layer(self.outputs[0])
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 922, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/layers/core.py", line 1194, in call
    outputs = gen_math_ops.mat_mul(inputs, self.kernel)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5585, in mat_mul
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 475, in _apply_op_helper
    raise TypeError(
TypeError: Expected float32 passed to parameter 'b' of op 'MatMul', got <tf.Variable 'dense_1_2/kernel:0' shape=(512, 10) dtype=float32> of type 'ResourceVariable' instead. Error: 'NoneType' object is not iterable

Traceback (most recent call last):
  File "_scikeras_bug.py", line 65, in <module>
    search.fit(X, y)  # fails on this line
  File "/Users/scott/Developer/stsievert/dask-ml/dask_ml/model_selection/_incremental.py", line 1006, in fit
    return super(IncrementalSearchCV, self).fit(X, y=y, **fit_params)
  File "/Users/scott/Developer/stsievert/dask-ml/dask_ml/model_selection/_incremental.py", line 695, in fit
    return client.sync(self._fit, X, y, **fit_params)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/distributed/client.py", line 831, in sync
    return sync(
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/distributed/utils.py", line 339, in sync
    raise exc.with_traceback(tb)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/distributed/utils.py", line 323, in f
    result[0] = yield future
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/Users/scott/Developer/stsievert/dask-ml/dask_ml/model_selection/_incremental.py", line 641, in _fit
    results = await fit(
  File "/Users/scott/Developer/stsievert/dask-ml/dask_ml/model_selection/_incremental.py", line 457, in fit
    return await _fit(
  File "/Users/scott/Developer/stsievert/dask-ml/dask_ml/model_selection/_incremental.py", line 253, in _fit
    metas = await client.gather(new_scores)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/distributed/client.py", line 1841, in _gather
    raise exception.with_traceback(traceback)
  File "/Users/scott/Developer/stsievert/dask-ml/dask_ml/model_selection/_incremental.py", line 103, in _partial_fit
    model.partial_fit(X, y, **(fit_params or {}))
  File "/Users/scott/Developer/stsievert/scikeras/scikeras/wrappers.py", line 547, in partial_fit
    return self.fit(
  File "/Users/scott/Developer/stsievert/scikeras/scikeras/wrappers.py", line 511, in fit
    self.model_ = self._build_keras_model(
  File "/Users/scott/Developer/stsievert/scikeras/scikeras/wrappers.py", line 304, in _build_keras_model
    model = final_build_fn(**build_args)
  File "_scikeras_bug.py", line 36, in _keras_build_fn
    model = Sequential(layers)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 456, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 129, in __init__
    self.add(layer)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 456, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 213, in add
    output_tensor = layer(self.outputs[0])
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 960, in __call__
    self._set_inputs(cast_inputs, outputs)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/Users/scott/anaconda3/envs/dask-ml-test/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 418, in inner_cm
    for fn in self._scope_exit_callbacks:
TypeError: 'NoneType' object is not iterable

Here's a reproducible example:

import numpy as np
import pytest
import pickle
from typing import Tuple
from sklearn.exceptions import DataConversionWarning
from dask.distributed import Client

import tensorflow as tf
from scikeras.wrappers import KerasClassifier
from tensorflow.keras.datasets import mnist as keras_mnist
from tensorflow.keras.layers import Activation, Dense, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

from sklearn.model_selection import RandomizedSearchCV
from dask_ml.model_selection import IncrementalSearchCV
from scipy.stats import loguniform


def mnist() -> Tuple[np.ndarray, np.ndarray]:
    (X_train, y_train), _ = keras_mnist.load_data()
    X_train = X_train[:100]
    y_train = y_train[:100]
    X_train = X_train.reshape(X_train.shape[0], 784)
    X_train = X_train.astype("float32")
    X_train /= 255
    Y_train = to_categorical(y_train, 10)
    return X_train, y_train


def _keras_build_fn(lr=0.01):
    layers = [
        Dense(512, input_shape=(784,), activation="relu"),
        Dense(10, input_shape=(512,), activation="softmax"),
    ]
    model = Sequential(layers)

    opt = tf.keras.optimizers.SGD(learning_rate=lr)
    model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
    return model


if __name__ == "__main__":
    X, y = mnist()
    assert X.ndim == 2 and X.shape[-1] == 784
    assert y.ndim == 1 and len(X) == len(y)
    assert isinstance(X, np.ndarray) and isinstance(y, np.ndarray)

    model = KerasClassifier(build_fn=_keras_build_fn, lr=0.1)
    params = {"lr": loguniform(1e-3, 1e-1)}
    model2 = pickle.loads(pickle.dumps(model))

    with pytest.warns(DataConversionWarning):
        m = model.partial_fit(X, y)
    assert m is model

    search = RandomizedSearchCV(model, params, n_iter=3)
    with pytest.warns(DataConversionWarning):
        search.fit(X, y, epochs=2)
    assert search.best_score_ >= 0

    client = Client()
    search = IncrementalSearchCV(model2, params, max_iter=5)
    with pytest.warns(DataConversionWarning):
        search.fit(X, y)  # fails on this line
    assert search.best_score_ >= 0

BUG: can't grid search optimizers

It turns out that Keras optimizers are not picklable because they use lambdas inside functions 🤦

This can be fixed by using tf.keras.optimizers.serialize and tf.keras.optimizers.deserialize.

I think that maybe we need to list out all of the Keras object types (Model, optimizers, callbacks, etc.) and use copyreg.pickle to register how to pickle each one via its custom Keras serialize/deserialize methods, as sketched below.
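A minimal sketch of the copyreg approach for one optimizer class (assuming tf.keras.optimizers.serialize/deserialize round-trip as expected):

import copyreg
import tensorflow as tf

def _unpickle_optimizer(config):
    return tf.keras.optimizers.deserialize(config)

def _pickle_optimizer(opt):
    # Reduce the optimizer to its Keras config; pickle rebuilds it via deserialize.
    return _unpickle_optimizer, (tf.keras.optimizers.serialize(opt),)

# Register a reducer per concrete optimizer class (SGD shown; repeat for Adam, etc.).
copyreg.pickle(tf.keras.optimizers.SGD, _pickle_optimizer)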

Dask HyperbandSearchCV and Keras Metrics crashes

Hi Adrian,
There is also a problem with Keras metrics and HyperbandSearchCV from dask-ml. I think this happens because there are no metrics in the history on the first warm_start.

Code:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten
from scikeras.wrappers import KerasRegressor

import numpy as np
from dask.distributed import Client, LocalCluster

import dask_ml.model_selection as dcv


def model_building_function(X, n_outputs_, hidden_layer_sizes, activation_func):
    """Dynamically build regressor."""

    model = Sequential()
    model.add(Flatten(input_shape=X.shape[1:]))
    for size in hidden_layer_sizes:
        model.add(Dense(size, activation=activation_func))
    model.add(Dense(n_outputs_))
    model.compile("adam", loss="mean_squared_error",
                  metrics=['mse', 'mae', 'logcosh']
                  )
    return model


class Est(KerasRegressor):
    def __init__(self, hidden_layer_sizes=None, activation_func=None):
        self.hidden_layer_sizes = hidden_layer_sizes
        self.activation_func = activation_func
        super().__init__()

    """
    def __call__(self, X, n_outputs_, hidden_layer_sizes, activation_func):
        model = model_building_function(
            X, n_outputs_, hidden_layer_sizes, activation_func
        )
        return model

    """
    def __call__(self, X, n_outputs_, hidden_layer_sizes, activation_func):
        print('Est args:', X, n_outputs_, hidden_layer_sizes, activation_func)

        model = model_building_function(
            X, n_outputs_, hidden_layer_sizes, activation_func
        )
        return model



if __name__ == "__main__":
    cluster = LocalCluster(processes=True, n_workers=5, threads_per_worker=1)
    # cluster = LocalCluster(processes=False, n_workers=1, threads_per_worker=1)

    dask_client = Client(cluster)
    #dask_client = Client('localhost:8786', timeout=10)



    model = KerasRegressor( build_fn=model_building_function,
                            hidden_layer_sizes=[32, 32], activation_func="relu")

    #model = Est(hidden_layer_sizes=[32, 32], activation_func="relu")
    #model = KerasRegressor(build_fn=Est(), hidden_layer_sizes=[32, 32], activation_func="relu")

    # generate data
    X = np.arange(1.0, 5001.0).reshape((100, 50)).astype(np.float32)
    y = np.arange(1.0, 101.0).reshape(100, 1).astype(np.float32)

    # specify parameters and distributions to sample from
    param_dist = {
        "hidden_layer_sizes": [[16, 16], [32, 32], [64, 64], [128, 128], [256, 256]],
        "activation_func": ["relu", "elu"],
    }

    hyperband = dcv.HyperbandSearchCV(model, param_dist, max_iter=12, aggressiveness=3,verbose=True)

    print("START HyperbandSearchCV")
    hyperband.fit(X, y)
    print("best score =", hyperband.best_score_)
    assert hasattr(hyperband, "best_score_")
Traceback

Traceback (most recent call last):
  File "D:\...\site-packages\distributed\utils.py", line 665, in log_errors
    yield
  File "D:\...\lib\site-packages\dask_ml\model_selection\_incremental.py", line 102, in _partial_fit
    model.partial_fit(X, y, **(fit_params or {}))
  File "D:\...\lib\site-packages\scikeras\wrappers.py", line 577, in partial_fit
    X, y, sample_weight=sample_weight, warm_start=True, **kwargs
  File "D:\...\lib\site-packages\scikeras\wrappers.py", line 548, in fit
    X, y, sample_weight=sample_weight, warm_start=warm_start, **kwargs
  File "D:\...\lib\site-packages\scikeras\wrappers.py", line 330, in _fit_keras_model
    k: self.history_[k] + hist.history[k] for k in keys
  File "D:\...\site-packages\scikeras\wrappers.py", line 330, in <dictcomp>
    k: self.history_[k] + hist.history[k] for k in keys
KeyError: 'log_cosh'
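A more defensive history merge that tolerates keys appearing for the first time might look like this (a sketch, not the current wrappers.py code):

def merge_history(old: dict, new: dict) -> dict:
    """Union the metric keys so a metric that first appears on a later
    warm_start (e.g. 'log_cosh') does not raise a KeyError."""
    keys = set(old) | set(new)
    return {k: old.get(k, []) + new.get(k, []) for k in keys}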

ENH: support setting of random states seeds

There are two reasons to do this:

  • As a feature: the ability to set a random state for wrapped models. The catch here is that this would only really be a feature if it can be done on a per-model basis, as is the case for ScikitLearn estimators. I'm not sure if conflating that with setting a global random state is worth it.
  • For testing: many of ScikitLearn's estimator tests rely on reproducible, deterministic results. Using a global random state for this may be appropriate since we only use one model at a time and are okay with the side effect of a global seed. A global-seed sketch follows this list.
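A sketch of the global-seed approach (reproducible, but a process-wide side effect):

import os
import random

import numpy as np
import tensorflow as tf

def set_global_seed(seed: int) -> None:
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)  # seeds TF's global generator; per-op seeds compose with it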

Relevant discussions:

MAINT: minimize warnings

  • Review all warnings emerging from tests and try to minimize them.
  • Block "known" warnings (i.e. ConversionWarning or warnings from tests that are supposed to fail).
  • Avoid reshaping y when unneeded or add reshaping when the Keras output does not match the input to minimize DataConversionWarning s.

Make partial_fit force a single iteration

Currently, partial_fit and fit with warm_start have the same behavior. I think it would be more useful if we followed SGD's example and trained for exactly one epoch when partial_fit is called, regardless of the value of epochs/n_iter passed to __init__.
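For illustration, a subclass could pin this behavior (a sketch assuming an epochs attribute, per #51):

from scikeras.wrappers import KerasClassifier

class SingleEpochClassifier(KerasClassifier):
    """Hypothetical subclass: partial_fit always trains for exactly one epoch."""

    def partial_fit(self, X, y, sample_weight=None):
        original_epochs = getattr(self, "epochs", 1)
        try:
            self.epochs = 1  # mirror sklearn's SGD-style partial_fit semantics
            return super().partial_fit(X, y, sample_weight=sample_weight)
        finally:
            self.epochs = original_epochs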

Estimator definition for Dask HyperbandSearchCV

Hi Adrian,
thanks for developing and helping with scikeras.

I made this example code (#24 (comment)) run on my PC.
But I still have a question, or rather a problem.

There are three ways of creating a dynamically built model:

1. Passing a callable function
model = KerasRegressor(build_fn=model_building_function, hidden_layer_sizes=[32, 32], activation_func="relu")
This works with the given example.

Code

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten
from scikeras.wrappers import KerasRegressor

import numpy as np
from dask.distributed import Client, LocalCluster

import dask_ml.model_selection as dcv


def model_building_function(X, n_outputs_, hidden_layer_sizes, activation_func):
    """Dynamically build regressor."""

    model = Sequential()
    model.add(Flatten(input_shape=X.shape[1:]))
    for size in hidden_layer_sizes:
        model.add(Dense(size, activation=activation_func))
    model.add(Dense(n_outputs_))
    model.compile("adam", loss="mean_squared_error")
    return model


class Est(KerasRegressor):
    def __init__(self, hidden_layer_sizes=None, activation_func=None):
        self.hidden_layer_sizes = hidden_layer_sizes
        self.activation_func = activation_func
        super().__init__()

    def __call__(self, X, n_outputs_, hidden_layer_sizes, activation_func):
        model = model_building_function(
            X, n_outputs_, hidden_layer_sizes, activation_func
        )
        return model


if __name__ == "__main__":
    cluster = LocalCluster(processes=True, n_workers=5, threads_per_worker=1)
    #  cluster = LocalCluster(processes=False, n_workers=1, threads_per_worker=1)

    dask_client = Client(cluster)

    model = KerasRegressor(build_fn=model_building_function, hidden_layer_sizes=[32, 32], activation_func="relu")
    #  model = KerasRegressor(build_fn=Est(), hidden_layer_sizes=[32, 32], activation_func="relu")
    #  model = Est(hidden_layer_sizes=[32, 32], activation_func="relu")

    # generate data
    X = np.arange(1.0, 5001.0).reshape((100, 50)).astype(np.float32)
    y = np.arange(1.0, 101.0).reshape(100, 1).astype(np.float32)

    # specify parameters and distributions to sample from
    param_dist = {
        "hidden_layer_sizes": [[16, 16], [32, 32], [64, 64], [128, 128], [256, 256]],
        "activation_func": ["relu", "elu"],
    }

    hyperband = dcv.HyperbandSearchCV(model, param_dist, max_iter=12)

    print("START HyperbandSearchCV")
    hyperband.fit(X, y)
    print("best score =", hyperband.best_score_)
    assert hasattr(hyperband, "best_score_")

2. Passing an instance of a class implementing __call__ as the build_fn parameter
model = KerasRegressor(build_fn=Est(), hidden_layer_sizes=[32, 32], activation_func="relu")
This works with the given example.

Code

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten
from scikeras.wrappers import KerasRegressor

import numpy as np
from dask.distributed import Client, LocalCluster

import dask_ml.model_selection as dcv


def model_building_function(X, n_outputs_, hidden_layer_sizes, activation_func):
    """Dynamically build regressor."""

    model = Sequential()
    model.add(Flatten(input_shape=X.shape[1:]))
    for size in hidden_layer_sizes:
        model.add(Dense(size, activation=activation_func))
    model.add(Dense(n_outputs_))
    model.compile("adam", loss="mean_squared_error")
    return model


class Est(KerasRegressor):
    def __init__(self, hidden_layer_sizes=None, activation_func=None):
        self.hidden_layer_sizes = hidden_layer_sizes
        self.activation_func = activation_func
        super().__init__()

    def __call__(self, X, n_outputs_, hidden_layer_sizes, activation_func):
        model = model_building_function(
            X, n_outputs_, hidden_layer_sizes, activation_func
        )
        return model


if __name__ == "__main__":
    cluster = LocalCluster(processes=True, n_workers=5, threads_per_worker=1)
    #  cluster = LocalCluster(processes=False, n_workers=1, threads_per_worker=1)

    dask_client = Client(cluster)

    #  model = KerasRegressor(build_fn=model_building_function, hidden_layer_sizes=[32, 32], activation_func="relu")
    model = KerasRegressor(build_fn=Est(), hidden_layer_sizes=[32, 32], activation_func="relu")
    #  model = Est(hidden_layer_sizes=[32, 32], activation_func="relu")

    # generate data
    X = np.arange(1.0, 5001.0).reshape((100, 50)).astype(np.float32)
    y = np.arange(1.0, 101.0).reshape(100, 1).astype(np.float32)

    # specify parameters and distributions to sample from
    param_dist = {
        "hidden_layer_sizes": [[16, 16], [32, 32], [64, 64], [128, 128], [256, 256]],
        "activation_func": ["relu", "elu"],
    }

    hyperband = dcv.HyperbandSearchCV(model, param_dist, max_iter=12)

    print("START HyperbandSearchCV")
    hyperband.fit(X, y)
    print("best score =", hyperband.best_score_)
    assert hasattr(hyperband, "best_score_")

3. Subclassing the wrapper and implementing __call__ in your class.
model = Est(hidden_layer_sizes=[32, 32], activation_func="relu")
This does not work with the given example.

Code

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten
from scikeras.wrappers import KerasRegressor

import numpy as np
from dask.distributed import Client, LocalCluster

import dask_ml.model_selection as dcv


def model_building_function(X, n_outputs_, hidden_layer_sizes, activation_func):
    """Dynamically build regressor."""

    model = Sequential()
    model.add(Flatten(input_shape=X.shape[1:]))
    for size in hidden_layer_sizes:
        model.add(Dense(size, activation=activation_func))
    model.add(Dense(n_outputs_))
    model.compile("adam", loss="mean_squared_error")
    return model


class Est(KerasRegressor):
    def __init__(self, hidden_layer_sizes=None, activation_func=None):
        self.hidden_layer_sizes = hidden_layer_sizes
        self.activation_func = activation_func
        super().__init__()

    def __call__(self, X, n_outputs_, hidden_layer_sizes, activation_func):
        model = model_building_function(
            X, n_outputs_, hidden_layer_sizes, activation_func
        )
        return model


if __name__ == "__main__":
    cluster = LocalCluster(processes=True, n_workers=5, threads_per_worker=1)
    #  cluster = LocalCluster(processes=False, n_workers=1, threads_per_worker=1)

    dask_client = Client(cluster)


    #  model = KerasRegressor(build_fn=model_building_function, hidden_layer_sizes=[32, 32], activation_func="relu")
    #  model = KerasRegressor(build_fn=Est(), hidden_layer_sizes=[32, 32], activation_func="relu")
    model = Est(hidden_layer_sizes=[32, 32], activation_func="relu")

    # generate data
    X = np.arange(1.0, 5001.0).reshape((100, 50)).astype(np.float32)
    y = np.arange(1.0, 101.0).reshape(100, 1).astype(np.float32)

    # specify parameters and distributions to sample from
    param_dist = {
        "hidden_layer_sizes": [[16, 16], [32, 32], [64, 64], [128, 128], [256, 256]],
        "activation_func": ["relu", "elu"],
    }

    hyperband = dcv.HyperbandSearchCV(model, param_dist, max_iter=12)

    print("START HyperbandSearchCV")
    hyperband.fit(X, y)
    print("best score =", hyperband.best_score_)
    assert hasattr(hyperband, "best_score_")

I got the following error:
TypeError: __call__() missing 3 required positional arguments: 'n_outputs_', 'hidden_layer_sizes', and 'activation_func'

Traceback

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm 2020.1.1\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2020.1.1\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/PycharmProjects/optimization_library/tests/testing.py", line 82, in <module>
    hyperband.fit(X, y)
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\dask_ml\model_selection\_incremental.py", line 702, in fit
    return client.sync(self._fit, X, y, **fit_params)
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\distributed\client.py", line 780, in sync
    self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\distributed\utils.py", line 348, in sync
    raise exc.with_traceback(tb)
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\distributed\utils.py", line 332, in f
    result[0] = yield future
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\dask_ml\model_selection\_hyperband.py", line 402, in _fit
    *[SHAs[b]._fit(X, y, **fit_params) for b in _brackets_ids]
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\dask_ml\model_selection\_incremental.py", line 660, in _fit
    prefix=self.prefix,
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\dask_ml\model_selection\_incremental.py", line 478, in fit
    prefix=prefix,
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\dask_ml\model_selection\_incremental.py", line 262, in _fit
    metas = await client.gather(new_scores)
  File "D:\PycharmProjects\optimization_library\venv_dask_hyperband\lib\site-packages\distributed\client.py", line 1752, in _gather
    raise exception.with_traceback(traceback)
TypeError: __call__() missing 3 required positional arguments: 'n_outputs_', 'hidden_layer_sizes', and 'activation_func'

dask error

distributed.worker - WARNING -  Compute Failed
Function:  execute_task
args:      ((<function _partial_fit at 0x00000233830B81F8>, (Est(
	activation_func=elu
	hidden_layer_sizes=[128, 128]
), {'model_id': 1, 'params': {'hidden_layer_sizes': [128, 128], 'activation_func': 'elu'}, 'partial_fit_calls': 0}), array([[3301., 3302., 3303., ..., 3348., 3349., 3350.],
       [4451., 4452., 4453., ..., 4498., 4499., 4500.],
       [ 851.,  852.,  853., ...,  898.,  899.,  900.],
       ...,
       [2051., 2052., 2053., ..., 2098., 2099., 2100.],
       [  51.,   52.,   53., ...,   98.,   99.,  100.],
       [4851., 4852., 4853., ..., 4898., 4899., 4900.]], dtype=float32), array([[ 67.],
       [ 90.],
       [ 18.],
       [ 95.],
       [  5.],
       [ 23.],
       [100.],
       [ 82.],
       [ 48.],
       [ 76.],
       [ 57.],
       [ 36.],
       [ 77.],
       [ 47.],
       [ 94.],
       [ 19.],
       [ 27.],
       [ 80.],
       [  9.],
       [ 33.],
       [ 88.],
       [ 28.],
       [ 66.],
       [ 40.],
       [ 50.],
       [ 55.],
       [ 25.],
      
kwargs:    {}
Exception: TypeError("__call__() missing 3 required positional arguments: 'n_outputs_', 'hidden_layer_sizes', and 'activation_func'")

Thanks for the help.

Clarification on loss for multiclass classifiers.

I noticed that when using sparse_categorical_crossentropy loss in a model with KerasClassifier, fitting fails when numeric labels are provided. When transforming the output y, LabelEncoder expects numeric values rather than the one-hot encoding created in _post_process_y(). Plain Keras fitting and predicting works fine in this case.

On the other hand, when the loss is categorical_crossentropy, providing numeric labels works because _check_output_model_compatibility() checks for categorical_crossentropy and makes the appropriate transformation to one-hot encoding. However, plain Keras fitting and predicting will not work in this case, because it requires the labels to be one-hot encoded.

Example:

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from scikeras.wrappers import KerasClassifier


def build_model():
    model = Sequential()
    model.add(Dense(5, activation='softmax', input_shape=(20,)))
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    return model


X = np.random.default_rng().random((100, 20))
y = np.random.default_rng().integers(5, size=(100,))

clf = KerasClassifier(build_model)
clf.fit(X, y)
y_pred = clf.predict(X)  # This will fail when using 'sparse_categorical_crossentropy'
print(y_pred.shape)

model = build_model()
model.fit(X, y)  # This will fail when using 'categorical_crossentropy'
y_pred = model.predict(X)
print(y_pred.shape)

This isn't a bug so much as something that needs clarification: either the docs should state that categorical_crossentropy is expected, or an additional check should be added to _check_output_model_compatibility().
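
For illustration, the kind of check being suggested could look something like this (a minimal sketch with a hypothetical helper name, not SciKeras's actual internals):

import numpy as np
from tensorflow.keras.utils import to_categorical

def maybe_one_hot(y: np.ndarray, loss: str, n_classes: int) -> np.ndarray:
    # Hypothetical sketch: only one-hot encode when the compiled loss
    # expects one-hot targets; sparse losses take integer labels as-is.
    if loss == "categorical_crossentropy" and y.ndim == 1:
        return to_categorical(y, num_classes=n_classes)
    return y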

MAINT/DOC: Documentation overhaul

I think we'll be making some pretty big changes in the near future. This is going to require a good amount of documentation rewriting, which I don't expect to directly accompany every PR, so I'm opening this issue as a reminder that it will need to be done before the next release.

This might also be a good time to consider moving to Sphinx or something else more flexible than README.md.

RFC: time to simplify APIs?

Background

Currently, this package has many ways to:

  • Pass arguments to build_fn/_keras_build_fn.
  • Pass arguments to Keras Models' fit or predict.

This comes from a combination of supporting the original tf.keras.wrappers.scikit_learn interface and the introduction of new ways to do things to improve Scikit-Learn compatibility.

Important principles

Since this aims to be a small, pure-Python package, I think it is important to keep in mind some of Python's guiding principles (cherry-picked from PEP 20):

  • Explicit is better than implicit.
  • Simple is better than complex.
  • There should be one-- and preferably only one --obvious way to do it.

For the most part, the APIs mentioned above rely on dynamic parsing of function signatures and filtering of kwargs or attributes by name. This is somewhat "complex" and "implicit" in my opinion.

Next steps

I feel (and here is where I would appreciate some feedback from others) that it would be good to fully document the requirements that these wrappers have and then narrow down the API to be as simple as possible while still meeting all of those requirements. Off the top of my head:

  • Full compatibility with the Scikit-Learn API (this includes hyperparameter optimization type stuff).
  • Compatible with Sequential, Functional and Model subclass Keras APIs.
  • Ability to use pre-built Keras Models.
  • Ability to access wrapper parameters and attributes during building of a Keras Model.
  • Ability to pass arguments to the Keras Model's fit and predict methods.
    • Is there a use case to pass parameters during a call to fit, or can we require that they be set from __init__? As far as I can tell, only the latter makes sense as far as Scikit-Learn is concerned.

Based on the above reqs (which admittedly could be shortsighted), it seems reasonable to me to take the following action:

  • Remove **kwargs from fit and predict and require that these parameters be set via __init__.
  • Remove **kwargs from __init__ and instead hardcode Keras Model fit and predict parameters (maybe also compile parameters?) as proposed in #30.
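
As a rough sketch of the call pattern these two changes would produce (assuming batch_size and epochs become ordinary constructor arguments; this is a proposal, not current behavior):

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from scikeras.wrappers import KerasClassifier

def build_model():
    model = Sequential([Dense(2, activation="softmax", input_shape=(4,))])
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    return model

X = np.random.rand(16, 4).astype(np.float32)
y = np.random.randint(2, size=16)

# Everything Keras needs for fit/predict is fixed at construction time
clf = KerasClassifier(build_fn=build_model, batch_size=8, epochs=1)
clf.fit(X, y)  # under the proposal, fit would accept no **kwargs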

A more extreme step would be to remove the build_fn argument and force users to always use the subclassing interface since it is technically even possible to return a pre-built model from _keras_build_fn via a closure or other methods. This would greatly simplify the API but I worry that it would be an inconvenience for users (even if it is just a couple more lines of code).
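
For reference, here is a sketch of how a pre-built model could still be wrapped under a subclassing-only API (this assumes _keras_build_fn may be declared with no arguments besides self; the helper name is made up):

from scikeras.wrappers import KerasRegressor

def wrap_prebuilt(model) -> KerasRegressor:
    # Hypothetical helper: wrap an already-compiled Keras model
    class PrebuiltRegressor(KerasRegressor):
        def _keras_build_fn(self):
            return model  # closed over, never rebuilt
    return PrebuiltRegressor()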

All in all I hope to reduce codebase complexity and simplify documentation. Any comments are welcome.

Parametrize tests

A lot of the tests have a for loop over the CONFIG dictionary. It would be nice to convert this to parametrization via pytest.
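
A sketch of what that conversion might look like (the CONFIG contents below are stand-ins for the real test configuration):

import pytest

# Stand-in values; the real CONFIG lives in the test suite
CONFIG = {
    "MLPRegressor": {"loss": "mse"},
    "MLPClassifier": {"loss": "categorical_crossentropy"},
}

@pytest.mark.parametrize("name,config", CONFIG.items(), ids=list(CONFIG))
def test_estimator_config(name, config):
    # Each entry becomes its own test case with a readable id
    assert "loss" in config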

CuPy arrays

Right now this library is tied to NumPy arrays pretty heavily. Will this library work with CuPy arrays? CuPy arrays are NumPy arrays for CUDA GPUs and are nearly a drop-in replacement for NumPy arrays. That'd provide a method to use GPU models + GPU data easily.

Keras's model.fit function only claims to work with the following:

  • NumPy arrays
  • Tensorflow Tensors
  • tf.data datasets
  • A generator or keras.utils.Sequence returning (inputs, targets)

I think it'd be worth exploring what's required to use CuPy arrays.
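
As a stopgap, converting at the boundary should already work; a minimal sketch, assuming cupy is installed and est is any SciKeras estimator:

import cupy as cp

X_gpu = cp.random.rand(100, 20, dtype=cp.float32)
X_cpu = cp.asnumpy(X_gpu)  # explicit device-to-host copy
# est.fit(X_cpu, y) would then go through the normal NumPy code path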

MAINT: rename or remove _model_params

BaseWrapper._model_params is currently not used by SciKeras and is not public. We should either:

  1. Rename it and make it public if it is useful to users.
  2. Remove it.

TST: Inconsistent failing for TestSampleWeights

TestSampleWeights keeps failing randomly, both on master and on other branches. Overall, the test is not very well designed, so I think it needs some rework. I'm going to disable it for now as part of #39, but I'm opening this issue so that it can be reworked or deleted in the future.

conda environment - pip install - scipy 1.4.1 dependency

First off, thanks for your efforts with the development of this package!

I wanted to point out something that I noticed on install. I want to pip install within a conda environment, and I was hoping the only dependencies I needed to worry about were those documented in the [tool.poetry.dependencies] section of pyproject.toml, so that pip wouldn't overwrite any of the conda-managed packages.

It appears there is some other dependency that is causing a forced load of scipy 1.4.1. I had scipy 1.5.2 installed in my conda environment but pip forced an uninstall of that version.

Any ideas what package may be causing this and/or how to get around this?
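
One possible workaround in the meantime, assuming your conda environment already satisfies all of SciKeras's runtime requirements, is to skip pip's dependency resolution entirely:

pip install --no-deps scikeras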

Default parameters of build_fn are not returned by get_params

I have a simple example:

X, y = keras_mnist()
model = KerasClassifier(build_fn=build_fn)
params = {"optimizer": ["rmsprop", "sgd", "adam"], "lr": loguniform(1e-3, 1e0)}
search = RandomizedSearchCV(model, params)

This throws an error:

ValueError: Invalid parameter lr for estimator KerasClassifier(
        build_fn=<function _keras_build_fn at 0x14c523620>
). Check the list of available parameters with `estimator.get_params().keys()`.

This is inconvenient. I know there's documentation surrounding this:

When using scikit-learn's `grid_search` API, legal tunable parameters are
those you could pass to `sk_params`, including fitting parameters.
In other words, you could use `grid_search` to search for the best
`batch_size` or `epochs` as well as the model parameters.

Is there a way to make it so the defaults of build_fn are also accepted parameters?
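
One possible approach (just a sketch of the idea, not current SciKeras behavior) would be to inspect build_fn and merge its keyword defaults into get_params():

import inspect

def build_fn_defaults(build_fn) -> dict:
    # Hypothetical helper: harvest keyword defaults from a build function
    params = inspect.signature(build_fn).parameters
    return {
        name: p.default
        for name, p in params.items()
        if p.default is not inspect.Parameter.empty
    }

# build_fn_defaults(build_fn)
# -> {'optimizer': 'rmsprop', 'lr': 0.01, 'kernel_initializer': 'glorot_uniform'}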

Here's a full example to produce the traceback:

from typing import Tuple

import numpy as np
from scipy.stats import loguniform
from sklearn.model_selection import RandomizedSearchCV
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Activation, Dense
from tensorflow.keras.models import Sequential
from scikeras.wrappers import KerasClassifier

def keras_mnist() -> Tuple[np.ndarray, np.ndarray]:
    (X_train, y_train), _ = mnist.load_data()
    X_train = X_train.reshape(X_train.shape[0], 784)
    X_train = X_train.astype("float32")
    X_train /= 255
    return X_train, y_train

def build_fn(optimizer="rmsprop", lr=0.01, kernel_initializer="glorot_uniform"):
    model = Sequential()
    model.add(Dense(512, input_shape=(784,)))
    model.add(Activation("relu"))
    model.add(Dense(512, kernel_initializer=kernel_initializer))
    model.add(Activation("relu"))
    model.add(Dense(10, kernel_initializer=kernel_initializer))
    model.add(Activation("softmax"))
    model.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["accuracy"])
    return model


X, y = keras_mnist()
model = KerasClassifier(build_fn=build_fn)
params = {"optimizer": ["rmsprop", "sgd", "adam"], "lr": loguniform(1e-3, 1e0)}
search = RandomizedSearchCV(model, params)

A problem about generating BaggingClassifier with wrapped classifier

Hi Adrian,
I really appreciate your work, and I have read some of your comments in other issues.
However, I have a question. When building a BaggingClassifier from a Keras model wrapped with KerasClassifier:

from sklearn.ensemble import BaggingClassifier
import sklearn_keras_wrap.wrappers

classifier = sklearn_keras_wrap.wrappers.KerasClassifier(build_fn=model, batch_size=64, epochs=1)
classifier.fit(train_x, train_y, batch_size=64, epochs=1)
ensemble_clf = BaggingClassifier(classifier, n_estimators=n_estimators)
ensemble_clf.fit(train_x, train_y)  # error

An error will occur:

rv = reductor(4)
TypeError: can't pickle _thread.RLock objects

This error has come up many times elsewhere. Do you have any targeted suggestions for modifying bagging.py?
Thank you very much; I look forward to your reply.

Followups to #66

Possible followups from discussion in #66

  • Adding regex and/or globbing based parameter routing using a __params_group suffix.
  • Catching user errors when users compile their own models (#66 (comment), done in #86)
  • More user-friendly warnings when parameters are passed to an item that is not a class (#66 (comment))
  • A bidirectional mapping between shorthand names for losses/metrics/optimizers and their functions/classes (#66 (comment)); a unidirectional mapping was implemented in #88.
  • Consider making utilities public and refactoring the tests to remove tests for private functions.

Parameter routing

As per #37 (comment), it might be nice to have some parameter routing and/or to rename build_fn. I'd support this interface:

def build_keras_model(hidden_dim=10, activation="sigmoid"):
    ...
    return model

est = KerasClassifier(
    model=build_keras_model,
    model__hidden_dim=20,
    model__activation="relu",
    batch_size=256,
    validation_frac=0.2,
)

This mirrors the interface that Skorch has. Skorch supports overriding some keywords, e.g. optimizer__lr with lr, or iterator_valid__batch_size and iterator_train__batch_size with batch_size: https://skorch.readthedocs.io/en/stable/user/neuralnet.html#batch-size
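
For context, the core mechanics of double-underscore routing are simple; a minimal sketch, not an actual SciKeras implementation:

def route_params(params: dict, destination: str) -> dict:
    # Hypothetical sketch: pull out keys prefixed with '<destination>__'
    prefix = destination + "__"
    return {
        key[len(prefix):]: value
        for key, value in params.items()
        if key.startswith(prefix)
    }

params = {"model__hidden_dim": 20, "model__activation": "relu", "batch_size": 256}
print(route_params(params, "model"))  # {'hidden_dim': 20, 'activation': 'relu'}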

User-specified loss function may not be reflected in compiled model

As discussed in #88 (comment), if the user specifies a loss function, it might get overridden by build_fn. That is, the following is possible:

>>> wrapper = KerasRegressor(..., loss="mean_abs_error")
>>> wrapper._initialize()
# UserWarning: loss='mean_abs_error' but model compiled with 'mse'
>>> wrapper.model_.loss
"mse"

This is specifically the case when the default loss is changed; for KerasRegressor, the default is loss=None. When the user requests loss="mean_abs_error" but gets loss="mse", I think an error should be raised.
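
A minimal sketch of the check being proposed here (assuming loss names have already been normalized to comparable strings):

def check_compiled_loss(wrapper_loss, compiled_loss):
    # Hypothetical check: raise instead of warn on a mismatch
    if wrapper_loss is not None and wrapper_loss != compiled_loss:
        raise ValueError(
            f"loss={wrapper_loss!r} was passed to the wrapper, but the "
            f"model was compiled with loss={compiled_loss!r}"
        )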
