Comments (16)
Implemented here?
https://github.com/goastler/sktime-kernels/blob/master/sktime/transformers/kernels.py
Looks nice, @goastler !
Ok, based on discussion on Slack, here's a generic design, and how it should interact with pipeline composition and tuning.
slightly modified from #5 since we decided to not use xpandas
Part I: Pairwise transformer, aka degree-2 transformer.
Should be a class with parameters, attributes, and methods.
Does not use an abstract base class. Inherits from sklearn's BaseEstimator, or from a transformer base class.
Parameters - private variables. Correspond to "hyper-parameters" to be set or tuned. Like in supervised estimator.
Attributes - private variables. Correspond to "model parameters" set by the fit method. Not to be set by the user or via an interface.
Constructor __init__
arguments: all parameters, explicitly, with sensible default setting.
behavior: Sets self.parameters to the values provided.
public method fit
arguments:
X - a pandas data frame. no default.
optional argument:
sample_weight - a vector of weights, equal length to X.
behavior:
fits model and stores it in attribute variables. May access but not modify parameters.
public method transform
arguments:
X - a pandas data frame. no default. Column headers and types should be the same as those of fit's X argument
Xnew - a pandas data frame. no default. Column headers and types should be the same as those of fit's X argument
optional argument:
sample_weight - a vector of weights, equal length to X.
behavior:
Returns a named 3D array K. First dimension is indexed by rows of X. Second dimension is indexed by rows of Xnew. Third dimension is indexed by transformed columns, with headers defined by the transformer.
May access attributes and hyper-parameters. May not modify attributes and hyper-parameters.
public method get_params
arguments:
deep - a boolean indicating whether parameters of nested estimators should be returned
behaviour:
returns a string -> value dictionary of parameters, following the sklearn naming convention (nested estimators' parameters as <estimatorname>__<parametername>, returned if and only if deep=True)
public method set_params
arguments:
a string -> value dictionary of parameters, following the sklearn naming convention (nested estimators' parameters as <estimatorname>__<parametername>)
behaviour:
sets parameters, and nested parameters if provided, to the values as defined by the dictionary mapping
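A minimal sketch of the interface above, assuming an RBF kernel on tabular rows as the concrete example (the class name and the choice of kernel are mine, not part of the design; for simplicity the returned K is 2D, i.e., the third dimension is collapsed to one transformed column). get_params/set_params come for free from BaseEstimator.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator


class RBFPairwiseTransformer(BaseEstimator):
    """Hypothetical degree-2 transformer: an RBF kernel on data frame rows."""

    def __init__(self, scale=1.0):
        self.scale = scale  # hyper-parameter; tunable via get_params/set_params

    def fit(self, X, sample_weight=None):
        # a vanilla kernel has no model parameters to fit
        return self

    def transform(self, X, Xnew=None):
        # pairwise kernel values: rows of X vs rows of Xnew (default: X itself)
        if Xnew is None:
            Xnew = X
        A, B = X.to_numpy(dtype=float), Xnew.to_numpy(dtype=float)
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        K = np.exp(-sq / (2 * self.scale ** 2))
        # frame indexed by rows of X (rows) and rows of Xnew (columns)
        return pd.DataFrame(K, index=X.index, columns=Xnew.index)
```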
The counterpart of this in composition is a "distance method" or "kernel method".
Part II: kernel/distance methods
params, get_params/set_params etc. are all the same as in the estimator design (as above)
public method fit_deg2
arguments:
K - a matrix or xarray of size (N x N), e.g., a kernel or distance matrix. no default.
if supervised, also:
y - a pandas data frame with N rows. No default.
optional argument:
sample_weight - a vector of weights, length N.
behavior:
fits model and stores it in attribute variables. May access but not modify parameters.
public method pred_deg2
arguments:
kappa - a matrix or xarray of size (N x Nnew), e.g., a cross-kernel or cross-distance matrix. no default.
optional argument:
sample_weight - a vector of weights, length Nnew.
behavior:
returns model predictions, a data frame or vector of length Nnew
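To illustrate the fit_deg2/pred_deg2 interface, here is a sketch with kernel ridge regression as the concrete learner (my choice, purely illustrative; the class name is hypothetical). The point is that the learner consumes precomputed (cross-)kernel matrices rather than raw features.

```python
import numpy as np
from sklearn.base import BaseEstimator


class KernelRidgeDeg2(BaseEstimator):
    """Hypothetical kernel learner following the fit_deg2/pred_deg2 interface."""

    def __init__(self, lam=1.0):
        self.lam = lam  # ridge regularization strength

    def fit_deg2(self, K, y, sample_weight=None):
        # solve (K + lam * I) alpha = y; alpha_ is the fitted model attribute
        K = np.asarray(K, dtype=float)
        self.alpha_ = np.linalg.solve(
            K + self.lam * np.eye(len(K)), np.asarray(y, dtype=float)
        )
        return self

    def pred_deg2(self, kappa, sample_weight=None):
        # kappa has shape (N x Nnew): cross-kernel of training vs new data,
        # so predictions are a vector of length Nnew
        return np.asarray(kappa, dtype=float).T @ self.alpha_
```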
Composition pattern 1: pipeline
this would be realized by a class that inherits from estimator and behaves like whatever is at its end, usually a supervised kernel learner. I explain this below for the supervised learning case; the other cases are analogous (by dispatch)
Class pipeline_deg2
private variable trafo
private variable estim
Constructor __init__
arguments: an instance of a descendant of transformer_deg2, and an instance of an appropriate descendant of estimator (plus mixin). estim needs to be a kernel/distance learner
behavior: stores the transformer as self.trafo and estimator as self.estim
public method fit
arguments:
X - a pandas data frame. no default.
y - a pandas data frame, of equal length. no default.
optional argument:
sample_weight - a vector of weights, equal length to X.
behavior:
trafo.fit(X)
K = trafo.transform(X)
estim.fit_deg2(K,y)
public method predict
arguments:
Xnew - a pandas data frame.
optional argument:
sample_weight - a vector of weights, equal length to Xnew.
behavior:
returns estim.pred_deg2(trafo.transform(Xnew))
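The fit/predict steps above can be sketched as follows (hypothetical class name; one small deviation from the sketch: since my transform takes the training data explicitly, the pipeline stores X in fit so predict can form the (N x Nnew) cross-matrix).

```python
from sklearn.base import BaseEstimator


class PipelineDeg2(BaseEstimator):
    """Hypothetical pipeline_deg2: couples a degree-2 transformer
    (fit/transform) with a kernel learner (fit_deg2/pred_deg2), and
    behaves like a supervised estimator."""

    def __init__(self, trafo, estim):
        self.trafo = trafo
        self.estim = estim

    def fit(self, X, y, sample_weight=None):
        self.trafo.fit(X)
        K = self.trafo.transform(X)  # (N x N) kernel/distance matrix
        self.estim.fit_deg2(K, y)
        self.X_train_ = X            # kept so predict can form cross-matrices
        return self

    def predict(self, Xnew, sample_weight=None):
        # (N x Nnew) cross-matrix between training data and new data
        kappa = self.trafo.transform(self.X_train_, Xnew)
        return self.estim.pred_deg2(kappa)
```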
Composition pattern 2: tuning
the composition
pipeline_deg2(mykernel(params),mysupkernellearner(moreparams))
is a supervised learner, with get_params/set_params accessing a joint parameter set - some coming from the kernel, some coming from the kernel learner.
This can now be passed to vanilla GridSearchCV, that is
GridSearchCV(pipeline_deg2(mykernel(params),mysupkernellearner(moreparams)), tuneparams)
being a tuned kernel learner that behaves like a supervised learner, e.g., classifier or regressor.
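A minimal sketch of why this works, with stand-in classes for mykernel and mysupkernellearner (all names hypothetical; fit/predict omitted since only the parameter plumbing matters here): a composite built from BaseEstimator components exposes the joint parameter set under sklearn's nested naming, which is exactly what GridSearchCV's parameter grid addresses.

```python
from sklearn.base import BaseEstimator


class MyKernel(BaseEstimator):            # stand-in kernel, one hyper-parameter
    def __init__(self, scale=1.0):
        self.scale = scale


class MySupKernelLearner(BaseEstimator):  # stand-in supervised kernel learner
    def __init__(self, lam=1.0):
        self.lam = lam


class PipelineDeg2(BaseEstimator):        # minimal composite; fit/predict omitted
    def __init__(self, trafo, estim):
        self.trafo = trafo
        self.estim = estim


pipe = PipelineDeg2(MyKernel(scale=2.0), MySupKernelLearner(lam=0.5))

# joint parameter set, in sklearn's <componentname>__<parametername> naming;
# contains 'trafo__scale' and 'estim__lam' - exactly the keys a grid uses:
# GridSearchCV(pipe, {"trafo__scale": [0.5, 1, 2], "estim__lam": [0.1, 1.0]})
params = pipe.get_params(deep=True)
pipe.set_params(trafo__scale=4.0)
```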
Composition pattern 3: kernel/distance learner composition
Factoring out the kernel as an object in its own right enables composition of such objects in their own right, such as:
- multiple kernel learning
- self-tuning kernels/distances, e.g., median distance heuristic, variogram fitting, etc
- kernel reduction, i.e., building a kernel for sequences/series from a kernel for primitives, such as done in the string kernels, time warping kernels, etc
The generic pattern for such reduction is the compositional one:
myMKL([kernel1trafo(),kernel2trafo(), ..., kernelNtrafo()])
could be the degree 2 transformer that is the kernel mixture of the N component kernels and automatically fits the mixture parameters on the training data
Or, myDynamicTimeWarpingKernel(PrimitiveKernel(params),moreparams), and so on.
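A sketch of the myMKL([...]) pattern above (hypothetical class name; for brevity the weights are uniform placeholders set in fit, where an actual MKL would learn them from the training data): the mixture is itself a degree-2 transformer built from component degree-2 transformers.

```python
import numpy as np
from sklearn.base import BaseEstimator


class KernelMixture(BaseEstimator):
    """Hypothetical MKL-style degree-2 transformer: a convex mixture of
    component kernels. A real MKL would fit the weights in fit."""

    def __init__(self, kernels, weights=None):
        self.kernels = kernels
        self.weights = weights

    def fit(self, X, sample_weight=None):
        for k in self.kernels:
            k.fit(X)
        n = len(self.kernels)
        # placeholder: uniform weights; MKL would learn these from the data
        self.weights_ = (
            np.full(n, 1.0 / n) if self.weights is None
            else np.asarray(self.weights, dtype=float)
        )
        return self

    def transform(self, X, Xnew=None):
        # weighted sum of the component kernel matrices
        Ks = [k.transform(X, Xnew) for k in self.kernels]
        return sum(w * np.asarray(K, dtype=float)
                   for w, K in zip(self.weights_, Ks))
```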
Minor comment: composition expressions become unwieldy as the operations get more complex, especially when nested multiple times. Any opinions about operator overloading for pipeline/composite construction?
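One possible shape for such overloading, as a purely hypothetical sketch (all names mine): a mixin that overloads `*` so that `kernel * learner` builds a pipeline object.

```python
class Pipe:
    """Minimal stand-in for a degree-2 pipeline composite."""

    def __init__(self, trafo, estim):
        self.trafo, self.estim = trafo, estim


class ComposableTrafo:
    """Hypothetical mixin: overload * so `kernel * learner` builds a Pipe."""

    def __mul__(self, estim):
        return Pipe(self, estim)


class MyKernel(ComposableTrafo):
    """Hypothetical kernel inheriting the * operator."""
```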
looking at @goastler's code, I think it's actually smart to combine the heavier 2nd degree transformers with kernel functions and the current construction (or even factory) pattern that makes transformers out of kernel functions (which already exists within @goastler's code).
I.e., have a class Kernel (aka "KernelFromKernelFunction") which is a 2nd degree transformer and which you construct with a kernel function. Of course not all 2nd degree transformers are of this type, but vanilla kernels will be.
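A minimal sketch of that construction, assuming a kernel function taking two row vectors (the loop-based transform is illustrative, not efficient):

```python
import numpy as np
from sklearn.base import BaseEstimator


class KernelFromKernelFunction(BaseEstimator):
    """Hypothetical factory-style degree-2 transformer wrapping a plain
    kernel function k(x, z) -> float into the transformer interface."""

    def __init__(self, kernel_func):
        self.kernel_func = kernel_func

    def fit(self, X, sample_weight=None):
        # stateless: a vanilla kernel has nothing to fit
        return self

    def transform(self, X, Xnew=None):
        # evaluate the kernel function on all pairs of rows of X and Xnew
        Xnew = X if Xnew is None else Xnew
        return np.array([[self.kernel_func(x, z) for z in Xnew] for x in X])
```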
Interesting. Tell me more about it
(continuing from #388 here since there it's off-topic)
@moradisten, many time series (classification) methods rely on a distance or kernel, in the sense that the method is a composite of the choice of kernel/distance and the procedure that computes distance/kernel matrices.
Therefore kernels and distances are natural encapsulation and abstraction points in the sense of a template pattern.
A concrete distance method would be a composite, following the sklearn estimator composition formalism.
Having a way to construct a method like Knn(MyFavouriteDistance(param1=2), k=4), with all components being estimators, would be great.
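A sketch of that construction, with hypothetical stand-ins for MyFavouriteDistance and Knn (a Minkowski-style distance and a brute-force k-NN, both illustrative only): the distance is a component estimator, so its parameters would also appear in the composite's parameter set as distance__power.

```python
import numpy as np
from sklearn.base import BaseEstimator


class MinkowskiDistance(BaseEstimator):
    """Hypothetical distance component standing in for MyFavouriteDistance."""

    def __init__(self, power=2):
        self.power = power

    def pairwise(self, X, Xnew):
        # (N x Nnew) matrix of Minkowski distances between rows
        d = np.abs(X[:, None, :] - Xnew[None, :, :]) ** self.power
        return d.sum(axis=-1) ** (1.0 / self.power)


class Knn(BaseEstimator):
    """Hypothetical k-NN classifier taking the distance as a component,
    in the spirit of Knn(MyFavouriteDistance(param1=2), k=4)."""

    def __init__(self, distance, k=1):
        self.distance = distance
        self.k = k

    def fit(self, X, y):
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        return self

    def predict(self, Xnew):
        D = self.distance.pairwise(self.X_, np.asarray(Xnew, dtype=float))
        nearest = np.argsort(D, axis=0)[: self.k]  # k training indices per query
        votes = self.y_[nearest]
        # majority vote over the k neighbours, per query point
        return np.array([np.bincount(col).argmax() for col in votes.T])
```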
But, I'd have to study sklearn a little bit more so I can understand the composition formalisms
@moradisten, for learning, I recommend:
- implement some "simple" estimators, e.g., your favourite proximity forest, and make a pull request
- study how composition works in sklearn, in particular BaseEstimator, and why you can see parameters of components of a pipeline via get_params
- understand the gaussian_process module in sklearn, particularly how the Kernel class and its children work
@moradisten, great, let us know if/when/once you are interested to work on this.
this is concluded by the design now in the dists_kernels module.
Right, @chrisholder ?