
gpflux's Introduction

GPflux


Documentation | Tutorials | API reference | Slack

What does GPflux do?

GPflux is a toolbox dedicated to Deep Gaussian processes (DGP), the hierarchical extension of Gaussian processes (GP).

GPflux uses the mathematical building blocks from GPflow and marries these with the powerful layered deep learning API provided by Keras. This combination leads to a framework that can be used for:

  • researching new (deep) Gaussian process models, and
  • building, training, evaluating and deploying (deep) Gaussian processes in a modern way, making use of the tools developed by the deep learning community.

Getting started

In the Documentation, we have multiple Tutorials showing the basic functionality of the toolbox, a benchmark implementation and a comprehensive API reference.
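
For a quick feel of the API, here is a minimal usage sketch based on the tutorials (the toy data and hyperparameters below are illustrative only):

import numpy as np
import tensorflow as tf
from gpflux.architectures import Config, build_constant_input_dim_deep_gp

tf.keras.backend.set_floatx("float64")  # GPflux/GPflow work in float64

# Toy 1D regression data (illustrative only)
X = np.linspace(0, 1, 100).reshape(-1, 1)
Y = np.sin(10 * X) + 0.1 * np.random.randn(*X.shape)

config = Config(
    num_inducing=20, inner_layer_qsqrt_factor=1e-3,
    likelihood_noise_variance=1e-2, whiten=True,
)
deep_gp = build_constant_input_dim_deep_gp(X, num_layers=2, config=config)

# Train with the standard Keras workflow
training_model = deep_gp.as_training_model()
training_model.compile(tf.optimizers.Adam(0.01))
training_model.fit({"inputs": X, "targets": Y}, epochs=200, verbose=0)

# Predict: the output carries both the latent mean and variance
out = deep_gp.as_prediction_model()(X)
f_mean, f_var = out.f_mean, out.f_var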

Install GPflux

This project assumes you are using Python 3.

For users

To install the latest (stable) release of the toolbox from PyPI, use pip:

$ pip install gpflux

For contributors

To install this project in editable mode, run the commands below from the root directory of the GPflux repository.

make install

Check that the installation was successful by running the tests:

make test

You can have a peek at the Makefile for the commands.

The Secondmind Labs Community

Getting help

Bugs, feature requests, pain points, annoying design quirks, etc: Please use GitHub issues to flag up bugs/issues/pain points, suggest new features, and discuss anything else related to the use of GPflux that in some sense involves changing the GPflux code itself. We positively welcome comments or concerns about usability, and suggestions for changes at any level of design. We aim to respond to issues promptly, but if you believe we may have forgotten about an issue, please feel free to add another comment to remind us.

Slack workspace

We have a public Secondmind Labs slack workspace. Please use this invite link and join the #gpflux channel, whether you'd just like to ask short informal questions or want to be involved in the discussion and future development of GPflux.

Contributing

All constructive input is very much welcome. For detailed information, see the guidelines for contributors.

Maintainers

GPflux was originally created at Secondmind Labs and is now actively maintained by (in alphabetical order) Vincent Dutordoir and ST John. We are grateful to all contributors who have helped shape GPflux.

GPflux is an open source project. If you have relevant skills and are interested in contributing then please do contact us (see "The Secondmind Labs Community" section above).

We are very grateful to our Secondmind Labs colleagues, maintainers of GPflow, Trieste and Bellman, for their help with creating contributing guidelines, instructions for users and open-sourcing in general.

Citing GPflux

To cite GPflux, please reference our arXiv paper where we review the framework and describe the design. Sample BibTeX is given below:

@article{dutordoir2021gpflux,
    author = {Dutordoir, Vincent and Salimbeni, Hugh and Hambro, Eric and McLeod, John and
        Leibfried, Felix and Artemev, Artem and van der Wilk, Mark and Deisenroth, Marc P.
        and Hensman, James and John, ST},
    title = {GPflux: A library for Deep Gaussian Processes},
    year = {2021},
    journal = {arXiv:2104.05674},
    url = {https://arxiv.org/abs/2104.05674}
}

License

Apache License 2.0

gpflux's People

Contributors

hstojic, jesnie, johnamcleod, khurram-ghani, ltiao, nicolasdurrande, sc336, sebastianober, st--, tensorlicious, uri-granta, vdutor


gpflux's Issues

GPLayer doesn't seem to support multiple input units

Using the "Hybrid Deep GP models: ..." tutorial, when I changed
tf.keras.layers.Dense(1, activation="linear"),
to
tf.keras.layers.Dense(2, activation="linear"),
I got an error, same as reported in #27.

I then set a Zero mean function, as suggested in #27:
gp_layer = gpflux.layers.GPLayer(
    kernel, inducing_variable, num_data=num_data, num_latent_gps=output_dim,
    mean_function=gpflow.mean_functions.Zero(),
)
I got a different error.

dtype errors when using the model if the Keras backend is not set to float64

The default dtype for Keras is float32, while GPflow uses float64, so if one tries to use the model without setting the Keras backend to float64, a dtype clash occurs and an exception is raised.

The problem with setting the backend occurs when multiple models using both float32 and float64 live in the same library, as it is a global setting. The fix seems easy: fix lines 99 and 100 by setting the dtype to tf.float64.
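
For reference, the global workaround looks like this (a sketch of the backend setting; the in-source fix to lines 99 and 100 is a separate change):

import tensorflow as tf

# Make Keras create float64 variables by default, matching GPflow's dtype.
tf.keras.backend.set_floatx("float64")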

DistributionLambda for LikelihoodLayer

The likelihood layer could be a DistributionLambda that returns the appropriate tfp distribution - in general, this would be a MixtureSameFamily of the tfp distribution describing the likelihood. This is closely related to GPflow/GPflow#1345 and might require some work/coordination with GPflow upstream. We'd like to keep the analytic variational expectations (part of GPflow, not part of tfp distributions). At a minimum, we could create an explicit mapping between gpflow likelihoods and tfp distributions.
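
A minimal sketch of the idea, assuming a Gaussian likelihood with a fixed noise variance (noise_var and the stacked mean/variance input are illustrative choices, not existing GPflux API):

import tensorflow as tf
import tensorflow_probability as tfp

noise_var = 1e-2  # assumed fixed Gaussian noise variance

# Input is assumed to be f_mean and f_var stacked along the last axis.
likelihood_lambda = tfp.layers.DistributionLambda(
    lambda t: tfp.distributions.Normal(
        loc=t[..., :1],                         # f_mean
        scale=tf.sqrt(t[..., 1:] + noise_var),  # sqrt(f_var + noise variance)
    )
)

f_mean = tf.zeros((5, 1), dtype=tf.float64)
f_var = tf.ones((5, 1), dtype=tf.float64)
y_dist = likelihood_lambda(tf.concat([f_mean, f_var], axis=-1))  # a tfp Normal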

DistributionLambda for LatentVariableLayer

By turning LatentVariableLayer into a DistributionLambda, it could return the prior or posterior as appropriate (depending on training mode indicator).

(Note: this would require moving the composition with inputs (through addition or concatenation) out of the LatentVariableLayer again.)

Add version upper bound before adjusting to gpflow>=2.6

Describe the bug
GPflow>=2.6 seems to change the way the likelihood p(Y|F,X) is calculated, from using only F and Y to using X, F and Y.
I guess this change is incompatible with the current gpflux develop branch.

To reproduce
Steps to reproduce the behaviour:

  1. git clone https://github.com/secondmind-labs/GPflux.git
  2. pip install -e .
  3. python ./gpflux/docs/notebooks/intro.py

An error will occur at Line 99.

System information

  • OS: Ubuntu20.04
  • Python version: 3.9.13
  • GPflux version:
    develop branch f95e1cb
  • TensorFlow version: 2.8.3
  • GPflow version: 2.6.1

Additional context
Everything works fine when switching to gpflow==2.5.2.

ModelCheckpoint with save_weights_only=False crashes

Describe the bug
Literally the title

import numpy as np
import tensorflow as tf
from gpflux.helpers import (
    construct_basic_inducing_variables,
    construct_basic_kernel,
    construct_mean_function,
    construct_gp_layer
)
from gpflux.models import DeepGP
from gpflux.layers import LikelihoodLayer
from gpflow.kernels import SquaredExponential
from gpflow.likelihoods import Gaussian
from gpflow import set_trainable

def xiong_1d(XX: np.ndarray):
    return -0.5*(np.sin(40*(XX-0.85)**4) * np.cos(2.5*(XX-0.95)) + 0.5*(XX-0.9) + 1)

X_valid = np.linspace(0, 1, 1000).reshape(-1, 1)
Y_valid = xiong_1d(X_valid).reshape(-1, 1)
X = np.linspace(0, 1, 30).reshape(-1, 1)
Y = xiong_1d(X).reshape(-1, 1)

# # -------------- DGP MODEL -------------- #

n_inducing = int(len(X))

layer1 = construct_gp_layer(num_data=X.shape[0],
                            num_inducing=X.shape[0],
                            input_dim=X.shape[1],
                            output_dim=X.shape[1],
                            kernel_class=SquaredExponential,
                            )
layer2 = construct_gp_layer(num_data=X.shape[0],
                            num_inducing=X.shape[0],
                            input_dim=X.shape[1],
                            output_dim=Y.shape[1],
                            kernel_class=SquaredExponential,
                            )
gp_layers = [layer1, layer2]
likelihood = Gaussian(variance=1e-5)
set_trainable(likelihood.variance, False)

dgp_model = DeepGP(f_layers=gp_layers, likelihood=LikelihoodLayer(likelihood))
train_mode = dgp_model.as_training_model()
train_mode.compile(tf.optimizers.Adam(0.01))

# File path of script "E:/23620029-Faiz/.PROJECTS/AdaptiveDGP/demo"
checkpoint_filepath = "E:/23620029-Faiz/.PROJECTS/AdaptiveDGP/checkpoint"
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=False,
    monitor='loss',
    mode='max',
    save_best_only=True)

train_mode.fit({"inputs": X, "targets": Y}, epochs=5000, verbose=1, callbacks=[model_checkpoint_callback])

The Error/Console Output

Epoch 1/5000
1/1 [==============================] - 4s 4s/step - loss: 59876.1801 - gp_layer_prior_kl: 0.0000e+00 - gp_layer_1_prior_kl: 0.0000e+00
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "E:\23620029-Faiz\PyCharm\PyCharm Community Edition 2021.1.1\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "E:\23620029-Faiz\PyCharm\PyCharm Community Edition 2021.1.1\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "E:/23620029-Faiz/.PROJECTS/AdaptiveDGP/demo/test_gpflux.py", line 181, in <module>
    train_mode.fit({"inputs": X, "targets": Y}, epochs=5000, verbose=1,  callbacks=[model_checkpoint_callback])
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1145, in fit
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\callbacks.py", line 428, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\callbacks.py", line 1344, in on_epoch_end
    self._save_model(epoch=epoch, logs=logs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\callbacks.py", line 1396, in _save_model
    self.model.save(filepath, overwrite=True, options=self._options)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 2001, in save
    save.save_model(self, filepath, overwrite, include_optimizer, save_format,
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\save.py", line 156, in save_model
    saved_model_save.save(model, filepath, overwrite, include_optimizer,
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save.py", line 89, in save
    save_lib.save(model, filepath, signatures, options)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\saved_model\save.py", line 1032, in save
    _, exported_graph, object_saver, asset_info = _build_meta_graph(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\saved_model\save.py", line 1198, in _build_meta_graph
    return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\saved_model\save.py", line 1132, in _build_meta_graph_impl
    signatures = signature_serialization.find_function_to_export(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\saved_model\signature_serialization.py", line 75, in find_function_to_export
    functions = saveable_view.list_functions(saveable_view.root)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\saved_model\save.py", line 150, in list_functions
    obj_functions = obj._list_functions_for_serialization(  # pylint: disable=protected-access
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 2612, in _list_functions_for_serialization
    functions = super(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 3086, in _list_functions_for_serialization
    return (self._trackable_saved_model_saver
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\base_serialization.py", line 94, in list_functions_for_serialization
    fns = self.functions_to_serialize(serialization_cache)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\layer_serialization.py", line 78, in functions_to_serialize
    return (self._get_serialized_attributes(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\layer_serialization.py", line 94, in _get_serialized_attributes
    object_dict, function_dict = self._get_serialized_attributes_internal(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\model_serialization.py", line 56, in _get_serialized_attributes_internal
    super(ModelSavedModelSaver, self)._get_serialized_attributes_internal(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\layer_serialization.py", line 104, in _get_serialized_attributes_internal
    functions = save_impl.wrap_layer_functions(self.obj, serialization_cache)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 163, in wrap_layer_functions
    call_fn_with_losses = call_collection.add_function(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 505, in add_function
    self.add_trace(*self._input_signature)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 420, in add_trace
    trace_with_training(True)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 418, in trace_with_training
    fn.get_concrete_function(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 550, in get_concrete_function
    return super(LayerCall, self).get_concrete_function(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 1299, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 1205, in _get_concrete_function_garbage_collected
    self._initialize(args, kwargs, add_initializers_to=initializers)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 725, in _initialize
    self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\function.py", line 2969, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\function.py", line 3361, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\function.py", line 3196, in _create_graph_function
    func_graph_module.func_graph_from_py_func(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\framework\func_graph.py", line 990, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 634, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 527, in wrapper
    ret = method(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 169, in wrap_with_training_arg
    return control_flow_util.smart_cond(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\utils\control_flow_util.py", line 114, in smart_cond
    return smart_module.smart_cond(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\framework\smart_cond.py", line 54, in smart_cond
    return true_fn()
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 170, in <lambda>
    training, lambda: replace_training_and_call(True),
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 167, in replace_training_and_call
    return wrapped_call(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 570, in call_and_return_conditional_losses
    call_output = layer_call(inputs, *args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 424, in call
    return self._run_internal_graph(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 560, in _run_internal_graph
    outputs = node.layer(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 1012, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 73, in return_outputs_and_add_losses
    outputs, losses = fn(inputs, *args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 169, in wrap_with_training_arg
    return control_flow_util.smart_cond(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\utils\control_flow_util.py", line 114, in smart_cond
    return smart_module.smart_cond(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\framework\smart_cond.py", line 54, in smart_cond
    return true_fn()
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 170, in <lambda>
    training, lambda: replace_training_and_call(True),
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 167, in replace_training_and_call
    return wrapped_call(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 544, in __call__
    self.call_collection.add_trace(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 420, in add_trace
    trace_with_training(True)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 418, in trace_with_training
    fn.get_concrete_function(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 550, in get_concrete_function
    return super(LayerCall, self).get_concrete_function(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 1299, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 1205, in _get_concrete_function_garbage_collected
    self._initialize(args, kwargs, add_initializers_to=initializers)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 725, in _initialize
    self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\function.py", line 2969, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\function.py", line 3361, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\function.py", line 3196, in _create_graph_function
    func_graph_module.func_graph_from_py_func(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\framework\func_graph.py", line 990, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 634, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 527, in wrapper
    ret = method(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 169, in wrap_with_training_arg
    return control_flow_util.smart_cond(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\utils\control_flow_util.py", line 114, in smart_cond
    return smart_module.smart_cond(
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\framework\smart_cond.py", line 54, in smart_cond
    return true_fn()
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 170, in <lambda>
    training, lambda: replace_training_and_call(True),
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py", line 167, in replace_training_and_call
    return wrapped_call(*args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 570, in call_and_return_conditional_losses
    call_output = layer_call(inputs, *args, **kwargs)
  File "E:\23620029-Faiz\.PROJECTS\AdaptiveDGP\venv\lib\site-packages\gpflux\layers\likelihood_layer.py", line 78, in call
    assert isinstance(inputs, tfp.distributions.MultivariateNormalDiag)
AssertionError

System information

  • OS: Windows 10 x64
  • Python version: 3.8
  • GPflux version: 0.1.0
  • TensorFlow version: 2.4.0
  • GPflow version: 2.2.1
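
A possible workaround sketch while full-model saving is broken (an assumption, not a confirmed fix): save only the weights, which sidesteps the SavedModel serialization path that hits the assertion in likelihood_layer.py. Note that monitor="loss" would normally be paired with mode="min":

import tensorflow as tf

model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,   # as defined in the script above
    save_weights_only=True,         # avoid full-model (SavedModel) serialization
    monitor="loss",
    mode="min",                     # the loss should be minimised, not maximised
    save_best_only=True,
)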

More DistributionLambda in layers

  • LikelihoodLayer could return the correctly parametrized y-distribution (perhaps a mixture if using multiple samples?)
  • LatentVariableLayer could be a tfp.DistributionLambda and return prior/posterior as appropriate (we'd then need to take out again the composition with inputs through addition or concatenation).
  • Encoder could also return the distribution directly instead of keeping track of the class of the prior.

Deep GP fit on Step Data

Describe the bug

I want to fit a Deep GP on step data, so I am using the method shown in the GPflux tutorial on the motorcycle dataset. But the fit is not as expected, as shown in Prof. Neil Lawrence's blog. I can fit it using PyDeepGP. I have attached the code I used for both GPflux and PyDeepGP.

To reproduce
Steps to reproduce the behaviour:

GPflux implementation:
try:
    import gpflux
except ModuleNotFoundError:
    %pip install gpflux
    import gpflux

from gpflux.architectures import Config, build_constant_input_dim_deep_gp
from gpflux.models import DeepGP

try:
    import tensorflow as tf
except ModuleNotFoundError:
    %pip install tensorflow
    import tensorflow as tf
    
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import gpflow
import gpflux

from gpflux.architectures import Config, build_constant_input_dim_deep_gp
from gpflux.models import DeepGP

tf.keras.backend.set_floatx("float64")
tf.get_logger().setLevel("INFO")


## Data

num_low = 25
num_high = 25
gap = -.1
noise = 0.0001
x = np.vstack((np.linspace(-1, -gap/2.0, num_low)[:, np.newaxis],
                np.linspace(gap/2.0, 1, num_high)[:, np.newaxis])).reshape(-1,)
y = np.vstack((np.zeros((num_low, 1)), np.ones((num_high, 1))))
scale = np.sqrt(y.var())
offset = y.mean()
yhat = ((y-offset)/scale).reshape(-1,)

## Model
config = Config(
    num_inducing=x.shape[0], inner_layer_qsqrt_factor=1e-3, likelihood_noise_variance=1e-3, whiten=True
)
deep_gp: DeepGP = build_constant_input_dim_deep_gp(
    np.array(x.reshape(-1, 1)), num_layers=4, config=config)

training_model: tf.keras.Model =deep_gp.as_training_model()

training_model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.01))

callbacks = [ tf.keras.callbacks.ReduceLROnPlateau("loss", factor=0.95, patience=3, min_lr=1e-6, verbose=0),
                      gpflux.callbacks.TensorBoard(),
                      tf.keras.callbacks.ModelCheckpoint(filepath="ckpts/", save_weights_only=True, verbose=0),]


history = training_model.fit(
    {"inputs": x.reshape(-1, 1),
     "targets": y.reshape(-1, 1)},
    batch_size=6,
    epochs=1000,
    callbacks=callbacks,
    verbose=0,
)

## Predict
def plot(model, X, Y, ax=None):
    if ax is None:
        fig, ax = plt.subplots()
    x = X
    x_margin = 1.0
    N = 50
    X = np.linspace(X.min() - x_margin, X.max() + x_margin, N).reshape(-1, 1)
    out = model(X)
    mu = out.f_mean.numpy().squeeze()
    var = out.f_var.numpy().squeeze()
    X = X.squeeze()
    lower = mu - 2 * np.sqrt(var)
    upper = mu + 2 * np.sqrt(var)
    ax.set_ylim(Y.min() - 0.5, Y.max() + 0.5)
    ax.plot(x, Y, "kx", alpha=0.5)
    ax.plot(X, mu, "C1")
    ax.set_xlim(-2, 2)
    ax.fill_between(X, lower, upper, color="C1", alpha=0.3)


prediction_model = deep_gp.as_prediction_model()
plot(prediction_model, x.reshape(-1, 1), y.reshape(-1, 1))

[Figure: plot obtained from the above GPflux code (deep_gp_step_tut2)]

Expected behaviour

PyDeepGP implementation:
try:
    import deepgp
except ModuleNotFoundError:
    %pip install git+https://github.com/SheffieldML/PyDeepGP.git
    import deepgp

try:
    import GPy
except ModuleNotFoundError:
    %pip install -qq GPy
    import GPy

try:
    import tinygp
except ModuleNotFoundError:
    %pip install -q tinygp
    import tinygp

import seaborn as sns
import jax
import jax.numpy as jnp
import matplotlib.pyplot as plt
from tinygp import kernels, GaussianProcess
from jax.config import config

import numpy as np

try:
    import jaxopt
except ModuleNotFoundError:
    %pip install jaxopt
    import jaxopt
config.update("jax_enable_x64", True)


num_low = 25
num_high = 25
gap = -0.1
noise = 0.0001
x = jnp.vstack(
    (jnp.linspace(-1, -gap / 2.0, num_low)[:, jnp.newaxis], jnp.linspace(gap / 2.0, 1, num_high)[:, jnp.newaxis])
).reshape(
    -1,
)
y = jnp.vstack((jnp.zeros((num_low, 1)), jnp.ones((num_high, 1))))
scale = jnp.sqrt(y.var())
offset = y.mean()
yhat = ((y - offset) / scale).reshape(
    -1,
)
xnew = jnp.vstack(
    (jnp.linspace(-2, -gap / 2.0, 25)[:, jnp.newaxis], jnp.linspace(gap / 2.0, 2, 25)[:, jnp.newaxis])
).reshape(
    -1,
)


num_hidden = 3
latent_dim = 1

kernels = [*[GPy.kern.RBF(latent_dim, ARD=True)] * num_hidden]  # hidden kernels
kernels.append(GPy.kern.RBF(np.array(x.reshape(-1, 1)).shape[1]))  # we append a kernel for the input layer

m = deepgp.DeepGP(

    [y.reshape(-1, 1).shape[1], *[latent_dim] * num_hidden, x.reshape(-1, 1).shape[1]],
    X=np.array(x.reshape(-1, 1)),  # training input
    Y=np.array(y.reshape(-1, 1)),  # training outout
    inits=[*["PCA"] * num_hidden, "PCA"],  # initialise layers
    kernels=kernels,
    num_inducing=x.shape[0],
    back_constraint=False,
)
m.initialize_parameter()



def optimise_dgp(model, messages=True):
    """Utility function for optimising deep GP by first
    reinitiailising the Gaussian noise at each layer
    (for reasons pertaining to stability)
    """
    model.initialize_parameter()
    for layer in model.layers:
        layer.likelihood.variance.constrain_positive(warning=False)
        layer.likelihood.variance = 1.0  # small variance may cause collapse
    model.optimize(messages=messages, max_iters=10000)


optimise_dgp(m, messages=True)


mu_dgp, var_dgp = m.predict(xnew.reshape(-1, 1))


plt.figure()

latexify(width_scale_factor=2, fig_height=1.75)
plt.plot(xnew, mu_dgp, "blue")
plt.scatter(x, y, c="r", s=marksize)
plt.fill_between(
    xnew.flatten(),
    mu_dgp.flatten() - 1.96 * jnp.sqrt(var_dgp.flatten()),
    mu_dgp.flatten() + 1.96 * jnp.sqrt(var_dgp.flatten()),
    alpha=0.3,
    color="C1",
)
sns.despine()
legendsize = 4.5 if is_latexify_enabled() else 9
plt.legend(labels=["Mean", "Data", "Confidence"], loc=2, prop={"size": legendsize}, frameon=False)
plt.xlabel("$x$")
plt.ylabel("$y$")
sns.despine()
plt.show()

[Figure: plot obtained from the above PyDeepGP code (deep_gp_step_pydeepgp)]

System information

  • OS: Ubuntu 20.04.2 LTS
  • Python version: 3.10.4
  • GPflux version: 0.3.0
  • TensorFlow version: 2.8.2
  • GPflow version: 2.5.2

Bug when computing the mean of marginal variational q(f)

Describe the bug
I think there is a bug when computing the mean of q(f) = \int p(f|u)q(u), because the mean function evaluated at the inducing point locations is not added.

To reproduce

When computing q(f), the code is given by this line here:

mean_cond, cov = conditional(

Note that for the mean, the function conditional computes mean_cond = K_xz K_zz_inv m. Afterwards, mu(x) is added in the return statement:

return mean_cond + mean_function, cov

However, I am quite sure that the variational mean should be computed as:

K_xz K_zz_inv m + mu(x) - K_xz K_zz_inv mu(Z),

which means that for mean functions other than the zero mean function, a term -K_xz K_zz_inv mu(Z) has to be added to the variational mean.

This could easily be solved (I think) by calling the conditional as follows:

mean_cond, cov = conditional(
            inputs,
            self.inducing_variable,
            self.kernel,
            self.q_mu - self.mean_function(self.inducing_variables),  ## HERE IS THE MODIFICATION
            q_sqrt=self.q_sqrt,
            full_cov=full_cov,
            full_output_cov=full_output_cov,
            white=self.whiten,
        )

I have been checking the source code of GPflow and it does not look like the term K_xz K_zz_inv mu(Z) is being considered.

In case this bug is confirmed, I will open the issue in GPflow as well, because the SVGP model, for example, has the same problem.

Customize NatGrad Model to turn off variational parameters in hidden layers

It seems that the NatGradModel class's requirement of having a NaturalGradient optimizer for each layer reduces flexibility for the user, as this forces the variational parameters of every layer to be optimized. Even when setting the parameters to non-trainable manually outside the model through set_trainable, TensorFlow throws a ValueError: None values not supported.
The goal is to turn off the variational parameters of all layers except the last hidden layer.
Is there any way to get around this issue?

  # Set all var params in inner layers off
  var_params = [(layer.q_mu, layer.q_sqrt) for layer in dgp_model.f_layers[:-1]]
  for vv in var_params:
      set_trainable(vv[0], False)
      set_trainable(vv[1], False)

  # Train Last Layer with NatGrad: (NOTE: this uses the given class from gpflux and not customized NatGradModel_)
  train_mode = NatGradWrapper(dgp_model.as_training_model())
  train_mode.compile([NaturalGradient(gamma=0.01), NaturalGradient(gamma=1.0), tf.optimizers.Adam(0.001)])
  history = train_mode.fit({"inputs": Xsc, "targets": Y}, epochs=int(5000), verbose=1)

I only got it to work by changing the _split_natgrad_params_and_other_vars and optimizer.setter functions. Although it works, I'm not sure whether it is correct.

from typing import Any, List, Mapping, Tuple

import tensorflow as tf
import gpflow
from gpflow import Parameter
from gpflow.optimizers import NaturalGradient


class NatGradModel_(tf.keras.Model):

    @property
    def natgrad_optimizers(self) -> List[gpflow.optimizers.NaturalGradient]:
        if not hasattr(self, "_all_optimizers"):
            raise AttributeError(
                "natgrad_optimizers accessed before optimizer being set"
            )  # pragma: no cover
        if self._all_optimizers is None:
            return None  # type: ignore
        return self._all_optimizers

    @property
    def optimizer(self) -> tf.optimizers.Optimizer:

        if not hasattr(self, "_all_optimizers"):
            raise AttributeError("optimizer accessed before being set")
        if self._all_optimizers is None:
            return None
        return self._all_optimizers

    @optimizer.setter
    def optimizer(self, optimizers: List[NaturalGradient]) -> None:
        # # Remove AdamOptimizer Requirement
        if optimizers is None:
            # tf.keras.Model.__init__() sets self.optimizer = None
            self._all_optimizers = None
            return

        if optimizers is self.optimizer:
            # Keras re-sets optimizer with itself; this should not have any effect on the state
            return

        self._all_optimizers = optimizers

    def _split_natgrad_params_and_other_vars(
        self,
    ) -> List[Tuple[Parameter, Parameter]]:

        # self.layers[-1] is Likelihood Layer, self.layers[-2] is Input Layer,
        # Last hidden layer is self.layers[-3]
        variational_params = [(self.layers[-3].q_mu, self.layers[-3].q_sqrt)]

        return variational_params

    def _apply_backwards_pass(self, loss: tf.Tensor, tape: tf.GradientTape) -> None:
 
        variational_params = self._split_natgrad_params_and_other_vars()
        variational_params_vars = [
            (q_mu.unconstrained_variable, q_sqrt.unconstrained_variable)
            for (q_mu, q_sqrt) in variational_params
        ]

        variational_params_grads = tape.gradient(loss, (variational_params_vars))


        num_natgrad_opt = len(self.natgrad_optimizers)
        num_variational = len(variational_params)
        if len(self.natgrad_optimizers) != len(variational_params):
            raise ValueError(
                f"Model has {num_natgrad_opt} NaturalGradient optimizers, "
                f"but {num_variational} variational distributions"
            )  # pragma: no cover

        for (natgrad_optimizer, (q_mu_grad, q_sqrt_grad), (q_mu, q_sqrt)) in zip(
            self.natgrad_optimizers, variational_params_grads, variational_params
        ):
            natgrad_optimizer._natgrad_apply_gradients(q_mu_grad, q_sqrt_grad, q_mu, q_sqrt)


    def train_step(self, data: Any) -> Mapping[str, Any]:
        """
        The logic for one training step. For more details of the
        implementation, see TensorFlow's documentation of how to
        `customize what happens in Model.fit
        <https://www.tensorflow.org/guide/keras/customizing_what_happens_in_fit>`_.
        """
        from tensorflow.python.keras.engine import data_adapter

        data = data_adapter.expand_1d(data)
        x, y, sample_weight = data_adapter.unpack_x_y_sample_weight(data)

        with tf.GradientTape() as tape:
            y_pred = self.__call__(x, training=True)
            loss = self.compiled_loss(y, y_pred, sample_weight, regularization_losses=self.losses)

        self._apply_backwards_pass(loss, tape=tape)

        self.compiled_metrics.update_state(y, y_pred, sample_weight)
        return {m.name: m.result() for m in self.metrics}

The problem I'm trying to reproduce is from https://github.com/ICL-SML/Doubly-Stochastic-DGP/blob/master/demos/using_natural_gradients.ipynb

However, even with the same settings, I am still unable to reproduce the results.

GPLayer's prediction seems to be too confident

Looking at the "Hybrid Deep GP models: ..." tutorial, the GPLayer's prediction seems to be too confident, i.e. its uncertainty estimate (95% confidence level) does not cover the training data spread. Its prediction accuracy (mean), however, is very good, nearly identical to that of the neural network model obtained by removing the GPLayer.

When I replace the GPLayer with a TFP DenseVariational layer using Gaussian priors, the prediction accuracy is not as good. However, importantly, the uncertainty estimate is very good, covering the training data spread well.

Without a good uncertainty estimate, the GPLayer seems to add little value over the neural network model, which already provides good prediction accuracy.

GPflux for text classification?

Hey, many thanks for this project!
I am currently investigating GPs for binary (and one-class-) classification tasks and did some first experiments using pre-trained sentence embeddings for feature representation, PCA for dimension reduction and GPs (GPFlow) for classification.
It sounds promising to use a text embedding, some dense layers and a GP in an end-to-end fashion.
At a first glance, GPflux seems to offer this. After checking the gpflux tutorials (Hybrid Deep GP models), I am actually not sure how to define the inducing variables. Seems like they have to cover the expected data ranges in each latent space dimension, right? Furthermore, I am not sure if GPflux offers variational inference for binary classification. Any comments, suggestions, links that could help to build hybrid models are appreciated. Many thanks!
Kind regards
Jens
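
For reference, a minimal sketch of the kind of hybrid binary-classification model asked about above, following the "Hybrid Deep GP models" tutorial pattern (the embeddings, dimensions and data below are placeholders, not a confirmed recommendation):

import numpy as np
import tensorflow as tf
import gpflow
from gpflux.helpers import construct_gp_layer
from gpflux.losses import LikelihoodLoss

tf.keras.backend.set_floatx("float64")

# Stand-ins for pre-computed sentence embeddings and binary labels.
num_data, embedding_dim, latent_dim, num_inducing = 200, 16, 2, 20
X = np.random.randn(num_data, embedding_dim)
Y = np.random.randint(0, 2, size=(num_data, 1)).astype(np.float64)

# Dense layers compress the embedding; the GP layer acts on the latent space,
# so its inducing variables only need to cover that low-dimensional space.
gp_layer = construct_gp_layer(
    num_data=num_data, num_inducing=num_inducing,
    input_dim=latent_dim, output_dim=1,
)

inputs = tf.keras.Input(shape=(embedding_dim,), dtype="float64")
h = tf.keras.layers.Dense(8, activation="relu", dtype="float64")(inputs)
h = tf.keras.layers.Dense(latent_dim, activation="linear", dtype="float64")(h)
f = gp_layer(h)

model = tf.keras.Model(inputs=inputs, outputs=f)
model.compile(
    optimizer=tf.optimizers.Adam(0.01),
    loss=LikelihoodLoss(gpflow.likelihoods.Bernoulli()),
)
model.fit(X, Y, epochs=10, verbose=0)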

Optimisation problem in gpflux

Hi,

I'm using a two-layer deep GP with a Poisson likelihood. With the default zero mean function, there is no problem with this step:
history = model.fit({"inputs": x_train, "targets": y_train}, batch_size=batch_size1, epochs=100, callbacks=callbacks, verbose=0) where we have 100 epochs.

However, when I add a constant mean function such as -10 and run for more than 10 epochs, it always results in a matrix inversion problem.

Traceback (most recent call last):
File "xx.py", line 464, in
history = model.fit({"inputs": x_train, "targets": y_train}, batch_size=batch_size1, epochs= args.nIter, callbacks=callbacks, verbose=0)
File "xx/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1100, in fit
tmp_logs = self.train_function(iterator)
File "xx/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 828, in call
result = self._call(*args, **kwds)
File "xx/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 855, in _call
return self._stateless_fn(*args, **kwds) # pylint: disable=not-callable
File "xxxx/python3.7/site-packages/tensorflow/python/eager/function.py", line 2943, in call
filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access
File "xx/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1919, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "xxx/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 560, in call
ctx=ctx)
File "xxx/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input matrix is not invertible.
[[node gradient_tape/model/gp_layer_1/triangular_solve/MatrixTriangularSolve (defined at xx.py:464) ]] [Op:__inference_train_function_6603]

Errors may have originated from an input operation.
Input Source operations connected to node gradient_tape/model/gp_layer_1/triangular_solve/MatrixTriangularSolve:
model/gp_layer_1/Cholesky (defined at /gpfs/ts0/home/xx249/anaconda3/lib/python3.7/site-packages/gpflow/conditionals/util.py:56)

Function call stack:
train_function

However, optimisers that can compute the objective at the start should not fail and crash when they try parameters for which the objective fails; they should just keep the current objective and sample somewhere else, as the SciPy optimiser does.

Just wondered if there is any solution for this?

Many thanks.
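
A common mitigation sketch (an assumption, not a confirmed fix for this particular model): increase the jitter GPflow adds to covariance matrices before the Cholesky factorisation, which often avoids "Input matrix is not invertible" errors caused by extreme mean/likelihood settings:

import gpflow

gpflow.config.set_default_jitter(1e-4)  # GPflow's default is 1e-6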

Attempting to learn models with multidimensional inputs leads to an error.

Thanks a lot for making this exciting project public! I'm not 100% sure if what I'm reporting is a bug or if this isn't supposed to work in GPflux, but here we go:

Describe the bug
Attempting to learn models with multidimensional inputs leads to an error.

To reproduce
First of all, the setup of a toy example and a GPflow SVGP-based version which works as expected:

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import gpflow
import gpflux
from gpflow.utilities import print_summary, set_trainable

tf.keras.backend.set_floatx("float64")
tf.get_logger().setLevel("INFO")

grid = np.meshgrid(np.linspace(0, np.pi*2, 20),
                   np.linspace(0, np.pi*2, 20))
X = np.column_stack(tuple(map(np.ravel, grid)))
Y = (np.sin(X[:, 0]) * np.sin(X[:, 1]))[:, None]

plt.contourf(grid[0], grid[1], Y.reshape(grid[0].shape))
plt.title("DATA")
plt.show()

num_data = len(X)
num_inducing = 10
output_dim = Y.shape[1]

kernel = (gpflow.kernels.SquaredExponential(active_dims=[0]) *
          gpflow.kernels.SquaredExponential(active_dims=[1]))
inducing_variable = gpflow.inducing_variables.InducingPoints(
    X[np.random.choice(X.shape[0], size=num_inducing, replace=False),:].copy()
)

#---------- SVGP
svgp = gpflow.models.SVGP(kernel, gpflow.likelihoods.Gaussian(), inducing_variable,
                          num_latent_gps=output_dim, num_data=num_data)
set_trainable(svgp.q_mu, False)
set_trainable(svgp.q_sqrt, False)
variational_params = [(svgp.q_mu, svgp.q_sqrt)]
natgrad_opt = gpflow.optimizers.NaturalGradient(gamma=0.1)
adam_opt = tf.optimizers.Adam(0.01)
minibatch_size = 10
train_dataset = tf.data.Dataset.from_tensor_slices(
    (X, Y)).repeat().shuffle(num_data)
iter_train = iter(train_dataset.batch(minibatch_size))
objective = svgp.training_loss_closure(iter_train, compile=True)

@tf.function
def optim_step():
    natgrad_opt.minimize(objective, var_list=variational_params)
    adam_opt.minimize(objective, svgp.trainable_variables)

for i in range(100):
    optim_step()
elbo = -objective().numpy()
print(f"it: {i} of dual-optimizer... elbo: {elbo}")


atgrid = np.meshgrid(np.linspace(0, np.pi*2, 40),
                     np.linspace(0, np.pi*2, 40))
atX = np.column_stack(tuple(map(np.ravel, atgrid)))

mean, var = svgp.predict_f(atX)
plt.contourf(atgrid[0], atgrid[1], mean.numpy().reshape(atgrid[0].shape))
plt.title("SVGP")
plt.show()

And here a single-layer DGP with GPflux:

#---------- DEEPGP
gp_layer = gpflux.layers.GPLayer(
    kernel, inducing_variable, num_data=num_data, num_latent_gps=output_dim
)

likelihood_layer = gpflux.layers.LikelihoodLayer(gpflow.likelihoods.Gaussian(0.1))

single_layer_dgp = gpflux.models.DeepGP([gp_layer], likelihood_layer)
model = single_layer_dgp.as_training_model()
model.compile(tf.optimizers.Adam(0.01))

log = model.fit({"inputs": X, "targets": Y}, epochs=int(100), verbose=1)

which throws the following error when reaching the last line of the example:

ValueError: in user code:

    venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:805 train_function  *
        return step_function(self, iterator)
    venv/lib/python3.7/site-packages/gpflux/layers/gp_layer.py:277 call  *
        outputs = super().call(inputs, *args, **kwargs)
    venv/lib/python3.7/site-packages/tensorflow_probability/python/layers/distribution_layer.py:252 call  **
        inputs, *args, **kwargs)
    venv/lib/python3.7/site-packages/tensorflow/python/keras/layers/core.py:917 call
        result = self.function(inputs, **kwargs)
    venv/lib/python3.7/site-packages/tensorflow_probability/python/layers/distribution_layer.py:172 _fn
        d = make_distribution_fn(*fargs, **fkwargs)
    venv/lib/python3.7/site-packages/gpflux/layers/gp_layer.py:328 _make_distribution_fn
        return tfp.distributions.MultivariateNormalDiag(loc=mean, scale_diag=tf.sqrt(cov))
    <decorator-gen-394>:2 __init__
        
    venv/lib/python3.7/site-packages/tensorflow_probability/python/distributions/distribution.py:298 wrapped_init
        default_init(self_, *args, **kwargs)
    venv/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py:538 new_func
        return func(*args, **kwargs)
    venv/lib/python3.7/site-packages/tensorflow_probability/python/distributions/mvn_diag.py:252 __init__
        name=name)
    <decorator-gen-322>:2 __init__
        
    venv/lib/python3.7/site-packages/tensorflow_probability/python/distributions/distribution.py:298 wrapped_init
        default_init(self_, *args, **kwargs)
    venv/lib/python3.7/site-packages/tensorflow_probability/python/distributions/mvn_linear_operator.py:190 __init__
        loc, scale)
    venv/lib/python3.7/site-packages/tensorflow_probability/python/internal/distribution_util.py:136 shapes_from_loc_and_scale
        'of `loc` ({}).'.format(event_size_, loc_event_size_))

    ValueError: Event size of `scale` (1) could not be broadcast up to that of `loc` (2).

Expected behaviour
I expected this to not throw an error, and produce a (at least qualitatively) similar result to the SVGP implementation, but again, I'm not sure if this expectation is justified.

System information

  • OS: Linux, kernel 5.4.112-1
  • Python version: 3.7.5
  • GPflux version: 0.1.0 from pip
  • TensorFlow version: 2.4.1
  • GPflow version: 2.1.5
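
A possible direction to try (an assumption based on the broadcasting error, not a confirmed fix): build the layer with gpflux.helpers.construct_gp_layer, which wires up the multioutput kernel, inducing variables and mean function consistently for the given input and output dimensions (note it uses a default SquaredExponential kernel rather than the product kernel above):

from gpflux.helpers import construct_gp_layer

gp_layer = construct_gp_layer(
    num_data=num_data,
    num_inducing=num_inducing,
    input_dim=X.shape[1],   # 2-dimensional inputs
    output_dim=output_dim,
)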

Enquiry about model.predict() with batch_size and sampling from each layer after model fitting

Hi,
I'm writing to enquire about two use cases of deep GP.
1.
When I use the two-layer deep GP to make predictions and pass batch_size to model.predict(), it seems that it only returns the predicted mean (f_mean) without f_var. However, as in your example below (https://secondmind-labs.github.io/GPflux/notebooks/gpflux_features.html):

def plot(model, X, Y, ax=None):
    if ax is None:
        fig, ax = plt.subplots()

    x_margin = 1.0
    N_test = 100
    X_test = np.linspace(X.min() - x_margin, X.max() + x_margin, N_test).reshape(-1, 1)
    out = model(X_test)

    mu = out.f_mean.numpy().squeeze()
    var = out.f_var.numpy().squeeze()
    X_test = X_test.squeeze()
    lower = mu - 2 * np.sqrt(var)
    upper = mu + 2 * np.sqrt(var)

    ax.set_ylim(Y.min() - 0.5, Y.max() + 0.5)
    ax.plot(X, Y, "kx", alpha=0.5)
    ax.plot(X_test, mu, "C1")

    ax.fill_between(X_test, lower, upper, color="C1", alpha=0.3)
prediction_model = deep_gp.as_prediction_model()

When no batch_size is passed, deep_gp.as_prediction_model() returns both f_mean and f_var. Could you please let me know how I can get both f_mean and f_var when passing batch_size to prediction_model.predict()?

  2. On this webpage, https://secondmind-labs.github.io/GPflux/notebooks/deep_gp_samples.html, you have given an example of drawing deep GP samples. However, I cannot see the model fitting part on the webpage. If I want to draw samples from each layer after model fitting, i.e. after model.compile(tf.optimizers.Adam(0.01)) and history = model.fit({"inputs": X, "targets": Y}, epochs=int(1e3), verbose=0), could you please let me know how I can achieve this?

Many thanks.
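
For the first question, a manual-batching sketch (an assumption about intent, not an official API): call the prediction model directly on slices of the input, which keeps both f_mean and f_var, instead of going through Keras' .predict():

import numpy as np

def predict_in_batches(prediction_model, X, batch_size=128):
    means, variances = [], []
    for i in range(0, len(X), batch_size):
        out = prediction_model(X[i : i + batch_size])
        means.append(out.f_mean.numpy())
        variances.append(out.f_var.numpy())
    return np.concatenate(means, axis=0), np.concatenate(variances, axis=0)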

DistributionLambda for latent variable amortization/recognition networks/encoders

We could make the LatentVariableLayer interface more explicit by changing encoders from the status-quo "return parameters for the approximate posterior distribution" to a DistributionLambda that returns the posterior-approximating distribution itself.
This would make the behavior more explicit, and remove the need to keep track of what class the prior was (instead simply relying on tfp's kl_divergence implementations).

Refactor Tests

The unit and integration tests are growing, and we are reusing a lot of the same code, in particular to:

  • generate data
  • run training
  • compile models

We should have some common functions and fixtures, which will increase code readability, reduce duplication, and make the most of pytest caching.

As part of this, could we introduce an assert_shapes() utility function instead of the kinda confusing assert np.all(tf.shape(tensor) == tensor_shape)?
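
A minimal sketch of such a utility, leaning on tf.debugging.assert_shapes (the name and signature below are illustrative, not an agreed API):

import tensorflow as tf

def assert_shape(tensor, expected_shape):
    # Delegates to TensorFlow's shape assertion, which also accepts symbolic dims.
    tf.debugging.assert_shapes([(tensor, expected_shape)])

x = tf.zeros((3, 5, 2))
assert_shape(x, (3, 5, 2))       # passes
assert_shape(x, ("N", "D", 2))   # symbolic dimension names are also supported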

Implement IWVI

We'd like GPflux to be able to implement importance-weighting as in https://github.com/hughsalimbeni/DGPs_with_IWVI/.

The challenge is that we need to keep the sampled per-LV-layer losses (local KLs), not summing them up as is the default. This probably means we'll have to write a custom Model subclass that has a train_step() that handles the custom IWVI objective...
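
A bare skeleton of that direction (an assumption about the shape of the solution, not the IWVI objective itself): a Model subclass whose train_step can combine per-sample, per-layer losses before reduction, which is what importance weighting needs:

import tensorflow as tf

class IWVIModel(tf.keras.Model):
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            # Placeholder: combine the per-LV-layer, per-sample losses here
            # (e.g. via a log-mean-exp over importance samples) instead of
            # letting Keras sum them up.
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}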

How to save a DeepGP model

Dear all,

i'm sorry if this is a basic question, but:

Is it possible to save an optimised DeepGp model?

if so, how do you save it?

Thanks in advance
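
One possible approach, sketched under assumptions (not an official answer from the maintainers): save and restore the weights of the Keras model returned by as_training_model(), rebuilding the GPflux model with the same code before restoring; rebuild_deep_gp below is a hypothetical helper that re-creates the architecture.

import tensorflow as tf

training_model = deep_gp.as_training_model()   # deep_gp built and trained elsewhere
training_model.compile(tf.optimizers.Adam(0.01))
training_model.fit({"inputs": X, "targets": Y}, epochs=100, verbose=0)
training_model.save_weights("dgp_checkpoint")

# Later: reconstruct an identical model before restoring the weights.
restored = rebuild_deep_gp().as_training_model()   # hypothetical re-creation helper
restored.compile(tf.optimizers.Adam(0.01))
restored.load_weights("dgp_checkpoint")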

Fix for new GPflow heteroskedastic likelihood breaks for quadrature dependent likelihoods

In #84 there were several changes made to accommodate the new framework in GPflow for heteroskedastic likelihoods. More precisely, X is now passed as None in gpflux/layers/likelihood_layer.py.

This works well with Gaussian or Student-t likelihoods; however, it breaks when using Softmax, which uses quadrature for variational_expectations or predict_mean_and_var. Both methods require access to the shape of X, so because we are currently passing None, this results in an error.

Tutorial notebooks

What do we want to demonstrate in notebooks? (Please edit/update this issue description as appropriate.)

  • General introduction notebook (Vincent)
  • Efficient sampling (Vincent)
  • GPflux features: monitoring, tensorboard, saving (Vincent)
  • GPflux with neural net layers (Vincent)
  • Convolutional DGP (Artem?)
  • Latent Variable model (CDE paper) with extension to Importance Weighting? (Hugh?)

Clean up BayesianDenseLayer

It's currently a bit confusing. What is its relationship to tfp.layers.DenseVariational, and what are its advantages? If we want to keep it, it needs an overhaul of its API documentation (and a notebook demonstrating its use). We might then also want to subclass DenseVariational or turn it into a DistributionLambda of our own.
