Code Monkey home page Code Monkey logo

Comments (6)

grovduck avatar grovduck commented on June 26, 2024 1

Should we stick with transform_ for the attribute, or do you think we should go with transformer_?

Ooo, maybe transformer_ is better to pair with the class names (CCATransformer, MahalanobisTransformer, etc.). Maybe we've gone back and forth on this one 😉

from sknnr.

grovduck avatar grovduck commented on June 26, 2024

@aazuspan, I like it! I especially like the prevention of instantiation by code rather than by convention and, as you say, it gets rid of some duplicated code as well. A couple of quick questions/comments just to make sure I get it:

  • GNN and MSN would be, for example, GNNRegressor(YFitMixin, TransformedKNeighborsRegressor), correct? Just wanted to verify that YFitMixin.fit gets called before TransformedKNeighborsRegressor.fit ...
  • I don't know if this was intentional, but _set_fit_transform could be confused with the more generic fit_transform method defined in each transformer, which fits/transforms in a single step. But I think the intention of _set_fit_transform is really just to set the transform? I actually wonder if it's even necessary to have that method - perhaps this instead? Is there an advantage to this that I'm overlooking?
class TransformedKNeighborsRegressor(RawKNNRegressor, ABC):
    ...

    @abstractmethod
    def _get_transform(self) -> TransformerMixin:
        """Return the transformer to use for fitting. Must be implemented by subclasses."""
        ...

    def fit(self, X, y):
        """Fit using transformed feature data."""
        self._validate_data(X, y, force_all_finite=True, multi_output=True)        
        self._set_dataframe_index_in(X)
        self.transform_ = self._get_transform().fit(X, y)

        X_transformed = self.transform_.transform(X)
        return super().fit(X_transformed, y)

from sknnr.

aazuspan avatar aazuspan commented on June 26, 2024

GNN and MSN would be, for example, GNNRegressor(YFitMixin, TransformedKNeighborsRegressor), correct?

Bingo!

I don't know if this was intentional, but _set_fit_transform could be confused with the more generic fit_transform method defined in each transformer, which fits/transforms in a single step. But I think the intention of _set_fit_transform is really just to set the transform?

Yes, great point! That was unintentional - we should definitely choose a better name. Is _set_fitted_transform still too similar? Or _set_fitted_transformer (probably combined with _get_transformer)?

I actually wonder if it's even necessary to have that method - perhaps this instead? Is there an advantage to this that I'm overlooking?

Separating out the transformer fitting is a workaround to an inheritance problem that arises when YFitMixin.fit calls super().fit, which calls TransformedKNeighborsRegressor.fit and refits the transformer with the wrong y data. I also played around with 1) adding a reset=True argument to TransformedKNeighborsRegressor.fit to avoid resetting the transformer if called intentionally, and 2) having TransformedKNeighborsRegressor.fit accept **kwargs and fit the transformer with y_fit if present. Those generally worked, but I wanted to keep as much of the y_fit logic out of TransformedKNeighborsRegressor as possible, and having a separate fitting method allowed that.

It's 100% possible there's a better workaround that I didn't think of though!

from sknnr.

grovduck avatar grovduck commented on June 26, 2024

Yes, great point! That was unintentional - we should definitely choose a better name. Is _set_fitted_transform still too similar? Or _set_fitted_transformer (probably combined with _get_transformer)?

I guess I have a slight preference for the latter, but either is fine by me.

Separating out the transformer fitting is a workaround to an inheritance problem that arises when YFitMixin.fit calls super().fit, which calls TransformedKNeighborsRegressor.fit and refits the transformer with the wrong y data. I also played around with 1) adding a reset=True argument to TransformedKNeighborsRegressor.fit to avoid resetting the transformer if called intentionally, and 2) having TransformedKNeighborsRegressor.fit accept **kwargs and fit the transformer with y_fit if present. Those generally worked, but I wanted to keep as much of the y_fit logic out of TransformedKNeighborsRegressor as possible, and having a separate fitting method allowed that.

Oof, let's avoid that complexity! Having a separate _set_fitted_transform or _set_fitted_transformer method sounds like a preferable solution.

from sknnr.

aazuspan avatar aazuspan commented on June 26, 2024

I guess I have a slight preference for the latter, but either is fine by me.

Agreed! Should we stick with transform_ for the attribute, or do you think we should go with transformer_?

Oof, let's avoid that complexity!

Sounds like a plan! I already worked most of this up to make sure it was possible, so I'll make a PR shortly.

from sknnr.

aazuspan avatar aazuspan commented on June 26, 2024

Resolved by #52

from sknnr.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.