Comments (12)
This seems to work. I will make a page on how to implement your own kernel. Then we might see how to contribute more kernels for discrete structures.
from stheno.jl.
> Ish -- in Stheno, all inputs are subtypes of `AbstractVector`. The example in the docs only considers `AbstractVector{<:Real}`, but purely because it's the most straightforward case.

OK

> For high-dimensional inputs stored in a `p` x `N` design matrix we have the `ColsAreObs` struct. This `ColsAreObs(X)` is an `AbstractVector{Vector{<:Real}}` where each column of `X` corresponds to an input. You'll see that most of the kernels are specialised to this type.

Ah, I thought so. This is indeed a good approach because, ideally, you would like to make predictions about arbitrary abstract objects such as graphs, strings, or whatever.

> The reason for doing this is to make it completely unambiguous how Stheno will interpret user data. It's also to avoid people complaining that the package adopts the "wrong" convention regarding whether a design matrix should be `N x p` or `p x N` -- `ColsAreObs` says it's `p` x `N` and we could add a `RowsAreObs` type with the opposite convention.

Well, `RowsAreObs` is the correct one of course ;-)

> What do you think would be the most helpful way to explain this? Perhaps we should just add a section to the docs entitled Input Types.

Best add an example with high-dimensional prediction.

> p.s. `ColsAreObs` possibly isn't the best name -- if you can think of an improved name I would greatly appreciate it. Perhaps `ColVectors(X)` and `RowVectors(X)` would be better?

`ColsAreObs` is good in the sense it is unambiguous. Maybe the latter is a bit more correct in nomenclature?
Hi @MichielStock thanks for raising the issue. Currently working towards the AISTATS deadline, will get back to you about this tomorrow :)
Just tried to run your example. I first realised that you've not imported `ew` or `pw` in your script, so you'll need to do that. You can first test that the `JaccardKernel` is working as expected with something like

```julia
using Stheno: GPC
f = GP(JaccardKernel(), GPC())
x = [[1, 1], [2, 1]]
rand(f(x))       # noise-free
rand(f(x, 0.1))  # noisy
```

I couldn't immediately get your code to work with the `EQ` kernel, as it doesn't know how to handle inputs of type `typeof(x)`, but we could definitely make something to make this work if that's something you're interested in doing?

edit: if you're interested in contributing the Jaccard kernel, I would be very much open to a PR btw. And this might be a good candidate for an implement-your-own-kernel tutorial in the docs...
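For reference, the kernel itself can be tested standalone before wiring it into Stheno. Below is a minimal sketch, assuming the min/max (weighted Jaccard) form on non-negative feature vectors; `jaccard` and `jaccard_pw` are illustrative names, not Stheno's API:

```julia
# Illustrative sketch of a Jaccard kernel on non-negative feature vectors;
# `jaccard` and `jaccard_pw` are made-up names, not part of Stheno.
function jaccard(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
    num = sum(min.(x, y))   # intersection size for 0/1 vectors
    den = sum(max.(x, y))   # union size for 0/1 vectors
    return den == 0 ? 1.0 : num / den
end

# Pairwise kernel matrix over two collections of inputs.
jaccard_pw(xl, xr) = [jaccard(xi, xj) for xi in xl, xj in xr]

x = [[1, 1], [2, 1]]
K = jaccard_pw(x, x)   # 2x2 matrix with ones on the diagonal
```

The diagonal is 1 because `jaccard(x, x)` compares an input with itself.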
Glad this works for you. I've got a draft of a how-to-implement-your-own-kernel page in the works at the minute and will open a PR involving it shortly for your feedback :)
I think the example in the docs is an improvement.
Am I correct that the base case is always a univariate input, i.e. `pw(k::EQ, xl::AbstractVector{<:Real}, xr::AbstractVector{<:Real})`, so `x` is a vector and is not high-dimensional by default? So if you want to regress X on y, where X is n x p, this is not supported?
(You can do it by having `x` be a vector of feature vectors, e.g. `[[1,2,3], [1,3,3], [3, 4, 2]]`, but this does not seem to be very elegant.)
from stheno.jl.
Ish -- in Stheno, all inputs are subtypes of `AbstractVector`. The example in the docs only considers `AbstractVector{<:Real}`, but purely because it's the most straightforward case.
For high-dimensional inputs stored in a `p` x `N` design matrix we have the `ColsAreObs` struct. This `ColsAreObs(X)` is an `AbstractVector{Vector{<:Real}}` where each column of `X` corresponds to an input. You'll see that most of the kernels are specialised to this type.
The reason for doing this is to make it completely unambiguous how Stheno will interpret user data. It's also to avoid people complaining that the package adopts the "wrong" convention regarding whether a design matrix should be `N x p` or `p x N` -- `ColsAreObs` says it's `p` x `N` and we could add a `RowsAreObs` type with the opposite convention.
What do you think would be the most helpful way to explain this? Perhaps we should just add a section to the docs entitled Input Types.
p.s. `ColsAreObs` possibly isn't the best name -- if you can think of an improved name I would greatly appreciate it. Perhaps `ColVectors(X)` and `RowVectors(X)` would be better?
> Best add an example with high-dimensional prediction.

Sounds good. Will give this a go either tonight or tomorrow.

> ColsAreObs is good in the sense it is unambiguous. Maybe the latter is a bit more correct in nomenclature?

Glad you agree that it's unambiguous. I think I'm going to refactor and go with `ColVecs` and `RowVecs` since they're a) a little bit more concise, and b) stand alone from the GP / ML context a bit better, i.e. `ColVecs(X)` just says that `X` is a matrix of column vectors, which doesn't have to have anything to do with statistics or ML.
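As a sketch of that idea (not Stheno's actual implementation), a `ColVecs`-style wrapper is only a few lines; `MyColVecs` is an illustrative name:

```julia
# Minimal sketch (not Stheno's real code) of the ColVecs idea: a p x N
# matrix viewed as an AbstractVector whose elements are its N columns.
struct MyColVecs{T<:Real} <: AbstractVector{Vector{T}}
    X::Matrix{T}
end
Base.size(v::MyColVecs) = (size(v.X, 2),)
Base.getindex(v::MyColVecs, n::Int) = v.X[:, n]

X = [1.0 3.0 5.0;
     2.0 4.0 6.0]    # p = 2 features, N = 3 observations
v = MyColVecs(X)
length(v)   # 3 inputs
v[1]        # the first input, [1.0, 2.0]
```

Because the wrapper is itself an `AbstractVector`, generic pairwise code that iterates over inputs works on it unchanged.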
@MichielStock when you get a minute, could you take a look at the new dev docs and let me know if they've covered this issue sufficiently? More than happy to add extra stuff / for you to raise a PR and add extra stuff.
@MichielStock is there anything left to address in this issue?
I think the documentation has greatly improved! This can be closed.
The only thing (though it might be a different issue) is using precomputed kernels. I have some example code showing how I did this, but there might also be room for some framework support.
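For what it's worth, one minimal way to sketch the precomputed-kernel idea (purely illustrative, not anything Stheno provides) is to treat inputs as integer indices into a precomputed Gram matrix:

```julia
# Hypothetical sketch: with a precomputed Gram matrix K over n items,
# an "input" is just an item's index, and the kernel is a table lookup.
struct PrecomputedKernel
    K::Matrix{Float64}
end
(k::PrecomputedKernel)(i::Integer, j::Integer) = k.K[i, j]

# Pairwise evaluation is then just a sub-matrix lookup.
pw_lookup(k::PrecomputedKernel, il, ir) = k.K[il, ir]

K = [1.0 0.3; 0.3 1.0]
k = PrecomputedKernel(K)
k(1, 2)                       # 0.3
pw_lookup(k, [1, 2], [1, 2])  # the full 2x2 Gram matrix back
```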
Cool. Would you mind raising a separate issue about that? It's not something I've really thought about, so it would be interesting to figure out whether or not it's something that fits naturally into Stheno.