Comments (12)

MichielStock commented on May 29, 2024

This seems to work. I will make a page to implement your own kernel. Then we might see how to contribute more kernels for discrete structures.

from stheno.jl.

MichielStock commented on May 29, 2024

> Ish -- in Stheno, all inputs are subtypes of AbstractVector. The example in the docs only considers AbstractVector{<:Real}, but purely because it's the most straightforward case.

OK

> For high-dimensional inputs stored in a p x N design matrix we have the ColsAreObs struct. This ColsAreObs(X) is an AbstractVector{Vector{<:Real}} where each column of X corresponds to an input. You'll see that most of the kernels are specialised to this type.

Ah, I thought so. This is indeed a good approach because, ideally, you would like to make predictions about arbitrary abstract objects such as graphs, strings, or whatever.

> The reason for doing this is to make it completely unambiguous how Stheno will interpret user data. It's also to avoid people complaining that the package adopts the "wrong" convention regarding whether a design matrix should be N x p or p x N -- ColsAreObs says it's p x N and we could add a RowsAreObs type with the opposite convention.

Well, RowsAreObs is the correct one of course ;-)

> What do you think would be the most helpful way to explain this? Perhaps we should just add a section to the docs entitled Input Types.

Best add an example with high-dimensional prediction.

> p.s. ColsAreObs possibly isn't the best name -- if you can think of an improved name I would greatly appreciate it. Perhaps ColVectors(X) and RowVectors(X) would be better?

ColsAreObs is good in the sense it is unambiguous. Maybe the latter is a bit more correct in nomenclature?

willtebbutt commented on May 29, 2024

Hi @MichielStock thanks for raising the issue. Currently working towards the AISTATS deadline, will get back to you about this tomorrow :)

willtebbutt commented on May 29, 2024

Just tried to run your example. I firstly realised that you've not imported ew or pw in your script, so you'll need to do that. You can first test that the JaccardKernel is working as expected with something like

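using Stheno       # brings GP into scope
using Stheno: GPC  # imported explicitly in case GPC is not exported
# JaccardKernel is assumed to be defined as in your script
f = GP(JaccardKernel(), GPC())
x = [[1,1], [2,1]]
rand(f(x)) # noise-free
rand(f(x, 0.1)) # noisy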

I couldn't immediately get your code to work with the EQ kernel as it doesn't know how to handle inputs of type typeof(x), but we could definitely make something to make this work if that's something you're interested in doing?

edit: if you're interested in contributing the Jaccard kernel, I would be very much open to a PR btw. And this might be a good candidate for an implement-your-own kernel tutorial in the docs...
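
For readers without the original script, here is a rough sketch in plain Julia of what a Jaccard kernel might compute. It does not depend on Stheno, and the names jaccard, jaccard_ew and jaccard_pw are hypothetical. It assumes the set-based definition |A ∩ B| / |A ∪ B| (the script in question may use a different variant) and that ew and pw stand for the elementwise and pairwise evaluations a kernel has to provide.

```julia
# Set-based Jaccard similarity between two collections: |A ∩ B| / |A ∪ B|.
# Note: two empty collections give 0/0 = NaN; handle that case as needed.
jaccard(a, b) = length(intersect(a, b)) / length(union(a, b))

# Elementwise evaluation: k(x[n], y[n]) for each n (x and y of equal length).
jaccard_ew(x::AbstractVector, y::AbstractVector) =
    [jaccard(a, b) for (a, b) in zip(x, y)]

# Pairwise evaluation: the matrix of k(x[i], y[j]) for all i, j.
jaccard_pw(x::AbstractVector, y::AbstractVector) =
    [jaccard(a, b) for a in x, b in y]

x = [[1, 2, 3], [2, 3, 4], [5]]
K = jaccard_pw(x, x)   # 3 x 3 symmetric matrix with ones on the diagonal
d = jaccard_ew(x, x)   # the diagonal of K
```

A real kernel type would wrap these two evaluations in whatever methods the package expects.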

willtebbutt commented on May 29, 2024

Glad this works for you. I've got a draft of a how-to-implement-your-own-kernel page in the works at the minute and will open a PR involving it shortly for your feedback :)

MichielStock commented on May 29, 2024

I think the example in the docs is an improvement.

Am I correct that the base case is always a univariate input, i.e. pw(k::EQ, xl::AbstractVector{<:Real}, xr::AbstractVector{<:Real}), so that x is a vector and is not high-dimensional by default? So if you want to regress y on X, where X is n x p, this is not supported?

(you can do it by having x a vector of feature vectors, e.g., [[1,2,3], [1,3,3], [3, 4, 2]], but this does not seem to be very elegant)
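
As a small illustration of that workaround, the following plain-Julia sketch converts an n x p design matrix into a vector of feature vectors and back. It uses only Base functions; the variable names are arbitrary.

```julia
# n = 3 observations (rows), p = 3 features (columns).
X = [1 2 3;
     1 3 3;
     3 4 2]

# One feature vector per observation, i.e. per row of X.
x = [X[i, :] for i in axes(X, 1)]

# Stack the feature vectors as columns (p x n), then transpose back to n x p.
X_back = permutedims(reduce(hcat, x))
```

Here x equals [[1, 2, 3], [1, 3, 3], [3, 4, 2]] and X_back equals X.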

willtebbutt commented on May 29, 2024

Ish -- in Stheno, all inputs are subtypes of AbstractVector. The example in the docs only considers AbstractVector{<:Real}, but purely because it's the most straightforward case.

For high-dimensional inputs stored in a p x N design matrix we have the ColsAreObs struct. This ColsAreObs(X) is an AbstractVector{Vector{<:Real}} where each column of X corresponds to an input. You'll see that most of the kernels are specialised to this type.

The reason for doing this is to make it completely unambiguous how Stheno will interpret user data. It's also to avoid people complaining that the package adopts the "wrong" convention regarding whether a design matrix should be N x p or p x N -- ColsAreObs says it's p x N and we could add a RowsAreObs type with the opposite convention.
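
To make the idea concrete, here is a minimal sketch in plain Julia of a wrapper that presents the columns of a p x N matrix as an AbstractVector of inputs. The type name ColumnsAsInputs is hypothetical, and this is not Stheno's actual ColsAreObs definition, only an illustration of the convention.

```julia
# Hypothetical wrapper: each column of X is treated as one input.
struct ColumnsAsInputs{T<:Real, TX<:AbstractMatrix{T}} <: AbstractVector{Vector{T}}
    X::TX
end

# Minimal AbstractVector interface: one element per column.
Base.size(x::ColumnsAsInputs) = (size(x.X, 2),)
Base.getindex(x::ColumnsAsInputs, n::Int) = x.X[:, n]

X = [1.0 2.0 3.0;
     4.0 5.0 6.0]          # p = 2 features, N = 3 inputs
x = ColumnsAsInputs(X)

length(x)   # 3, the number of inputs
x[2]        # [2.0, 5.0], the second column of X
```

A wrapper for the opposite N x p convention would index rows instead of columns.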

What do you think would be the most helpful way to explain this? Perhaps we should just add a section to the docs entitled Input Types.

p.s. ColsAreObs possibly isn't the best name -- if you can think of an improved name I would greatly appreciate it. Perhaps ColVectors(X) and RowVectors(X) would be better?

willtebbutt commented on May 29, 2024

> Best add an example with high-dimensional prediction.

Sounds good. Will give this a go either tonight or tomorrow.

> ColsAreObs is good in the sense it is unambiguous. Maybe the latter is a bit more correct in nomenclature?

Glad you agree that it's unambiguous. I think I'm going to refactor and go with ColVecs and RowVecs since they a) are a little bit more concise and b) stand alone from the GP / ML context a bit better, i.e. ColVecs(X) just says that X is a matrix of column vectors, which doesn't have to have anything to do with statistics or ML.

willtebbutt commented on May 29, 2024

@MichielStock when you get a minute, could you take a look at the new dev docs and let me know if they've covered this issue sufficiently? More than happy to add extra stuff / for you to raise a PR and add extra stuff.

willtebbutt commented on May 29, 2024

@MichielStock is there anything left to address in this issue?

MichielStock commented on May 29, 2024

I think the documentation has greatly improved! Can be closed.

The only thing left (but it might be a different issue) is using precomputed kernels. I have some example code of how I did this, but there might also be some framework for it.
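
To illustrate what "precomputed kernels" could mean here, the sketch below computes a GP posterior mean directly from precomputed Gram matrices, using only Julia's LinearAlgebra standard library. It is not the example code mentioned above and not a Stheno API; all names and numbers are made up for illustration.

```julia
using LinearAlgebra

# Hypothetical precomputed kernel values (e.g. from a graph or string kernel).
K_train = [1.0 0.5;
           0.5 1.0]        # k(x_i, x_j) between the 2 training inputs
K_cross = [0.8 0.2]        # k(x_test, x_j): 1 test input x 2 training inputs
y = [1.0, -1.0]            # training targets
noise_var = 0.1            # observation noise variance

# Posterior mean: K_cross * (K_train + noise_var * I)^(-1) * y,
# computed with a Cholesky factorisation instead of an explicit inverse.
C = cholesky(Symmetric(K_train + noise_var * I))
posterior_mean = K_cross * (C \ y)   # approximately [1.0]
```

The kernel function itself never appears; only its values on the training and test inputs are needed.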

willtebbutt commented on May 29, 2024

Cool. Would you mind raising a separate issue about that? It's not something I've really thought about, so it would be interesting to figure out whether or not it's something that fits naturally into Stheno.
