gpstuff-dev / gpstuff Goto Github PK
View Code? Open in Web Editor NEWGPstuff - Gaussian process models for Bayesian analysis
Home Page: http://research.cs.aalto.fi/pml/software/gpstuff/
License: GNU General Public License v3.0
GPstuff - Gaussian process models for Bayesian analysis
Home Page: http://research.cs.aalto.fi/pml/software/gpstuff/
License: GNU General Public License v3.0
Is there an elegant way to access the covariance structure / function after having called gp_optim? Such as a function on the form cov(x1,x2)=fun(gp,x1,x2)?
matlab_install('SuiteSparseOn')
Warnings produced such as:
mc/linuxCsource/bbmean.c:39:10: warning: incompatible pointer types assigning to 'const int *' from 'const mwSize *' (aka 'const unsigned long *') [-Wincompatible-pointer-types]
dims = mxGetDimensions(prhs[0]);
^ ~~~~~~~~~~~~~~~~~~~~~~~~
bbmean.m
contains only:
error('No mex-file for this architecture. See Matlab help and convert.m in ./linuxCsource or ./winCsource for help.')
Is it not meant to be used in Mac OS?
I'm fitting a VAR GP on some Mauna Loa CO2 time series data, using a rather complicated kernel. Here is a snippet of the code I'm using:
m=160; %number of inducing points
signal_var=0.1;l1=0.10;sf1=0.1;cs2=1.00;l2=1.00;sf2=0.1;lper=1.00;sfper=0.1;l3=1.00;sf3=0.1;lrq=1.00;sfrq=0.1; per=1/x_std; alpha=2; %initial hyperparameter values
lik = lik_gaussian('sigma2', signal_var);
gpcf_se1 = gpcf_sexp('lengthScale', l1, 'magnSigma2',sf1);
gpcf_lin=gpcf_linear('coeffSigma2',cs2);
gpcf1=gpcf_prod('cf',{gpcf_se1,gpcf_lin});
gpcf_se2 = gpcf_sexp('lengthScale', l2, 'magnSigma2', sf2);
gpcf_per = gpcf_periodic('lengthScale',lper,'period',per,'magnSigma2',sfper);
gpcf2=gpcf_prod('cf',{gpcf_se2,gpcf_per});
gpcf_se3 = gpcf_sexp('lengthScale', l3, 'magnSigma2', sf3);
gpcf_rat = gpcf_rq('lengthScale',lrq,'magnSigma2',sfrq,'alpha',2);
gpcf3=gpcf_prod('cf',{gpcf_se3,gpcf_rat});
X_u=datasample(x,m,1,'Replace',false);
gp_var = gp_set('type', 'VAR', 'lik', lik, 'cf',{gpcf1,gpcf2,gpcf3},'X_u', X_u);
gp_var = gp_set(gp_var, 'infer_params', 'covariance+likelihood+inducing');
opt=optimset('TolFun',1e-3,'TolX',1e-4,'Display','iter','MaxIter',1000);
gp_var=gp_optim(gp_var,x,y,'opt',opt);
For some values of X_u, the optimiser gets stuck at the following section of 'fminscg.m' (lines184-189):
while ((any(isinf(gplus)) || any(isnan(gplus)))) && ~isnan(fold)
sigma=2*sigma;
kappa=0.5.^2*kappa;
xplus=x+sigma*d;
[tmp,gplus] = fun(xplus);
end
Here fun is @(ww)gp_eg(ww,gp,x,y,'z',z), so tmp and gplus are the energy and its gradients with respect to the hyperparameters and the inducing points. Both are constantly computed to be NaN, which gets the while loop stuck. Have you ever experienced this problem before? Do you have any suggestions?
The following toy example doesnt work, what I'm doing wrong?
%% code
clear; close all; clc; rng(42);
% - define ground truth
xlb = 0;
xub = 10;
objfcn = @(x) 2*x+1;
xgrid = linspace(xlb, xub, 1000)';
fgrid = objfcn(xgrid);
% - create data
ndata = 99;
s2nfcn = @(x) 0.01*x.^2+0.01;
x = sort(xlb+(xub-xlb)*rand(ndata, 1));
y = objfcn(x)+sqrt(s2nfcn(x)).*randn(ndata, 1);
% - set up gaussian process structure
gp = gp_set('lik', lik_inputdependentnoise, 'cf', {gpcf_sexp gpcf_exp}, 'jitterSigma2', 1e-8, 'comp_cf', {[1] [2]});
gp = gp_optim(gp, x, y);
[~, ~, ~, ymean, yvar] = gp_pred(gp, x, y, xgrid);
% - visualize
figure(1);
hold on; grid on;
plot(xgrid, fgrid, 'g');
plot(x, y, 'k+');
plot(xgrid, ymean, 'b');
plot(xgrid, ymean+3*sqrt(yvar), 'b--');
plot(xgrid, ymean-3*sqrt(yvar), 'b--');
The gp_jpred function says to provide an optional input tstind = [] for the FIC approximation if none of the test points are also in the training set. However, when I try to use this function with an FIC model I get the error:
Error using gp_jpred (line 425)
tstind not provided.
This error is occurring with GPStuff4.7.
This is minor, but in expectedimprovement_eg.m, lines 126-127 (derivative of Ef and Varf wrt. x), indexing should go from 1 to size(x_new,2), not size(x_new,1) as it is now [size(x_new,1) is asserted to be 1 for the gradient to be computed].
I am testing the new svigp function, for small datasets of the order 7000 points the function works well, but for high amount of data 70,000 points I am constantly error like the one below.
More over the error can be thrown after a few iterations (basically its pretty random), can you tell me how to solve this issue?
iter=1/1000, e=-534639.756, de=Inf
iter=11/1000, e=65878.426, de=18512
Bad parameter values, decreasing mu2.
Error using GPSVI_G
The value of 'gpsvi_e_param' is invalid. It must satisfy the function: isstruct.
Error in gpsvi_g (line 40)
ip.parse(w, gp, x, y, varargin{:});
Error in svigp (line 296)
g = gpsvi_g(w,gp,x(inds{i},:),y(inds{i},:), 'z', zi, ...
I encountered an error when I ran the following example code. The code basically computes two GPs, one uses maximum likelihood estimation for the underlying hyperparameters and the other one uses samples of the hyperparameter posterior. It also visualizes the predictive distributions for both GPs.
%% code
clear; close all; clc; rng(42);
% - define ground truth
xlb = 0.5;
xub = 2.5;
truefcn = @(x) sin(10*pi*x)./(2*x)+(x-1).^4; % gramacy and lee testfunction
xgrid = linspace(xlb, xub, 1000)';
fgrid = truefcn(xgrid);
% - create data
N = 444;
s2n = 0.01;
x = sort(xlb+(xub-xlb)*rand(N, 1));
y = truefcn(x)+sqrt(s2n)*randn(N, 1);
% - use maximum likelihood
gp1 = gp_set('lik', lik_gaussian('sigma2', 0.1), 'cf', gpcf_sexp('lengthScale', 1, 'magnSigma2', 0.1));
gp1 = gp_optim(gp1, x, y);
[~, ~, ~, ymean1, yvar1] = gp_pred(gp1, x, y, xgrid);
% - use posterior sampling
lik = lik_gaussian('sigma2', 1, 'sigma2_prior', prior_gaussian('mu', 0.01, 's2', 1));
gpcf = gpcf_sexp('lengthScale', 1, 'magnSigma2', 1, ...
'lengthScale_prior', prior_gaussian('mu', 1, 's2', 1), 'magnSigma2_prior', prior_gaussian('mu', 0.01, 's2', 1));
gp2 = gp_set('lik', lik, 'cf', gpcf);
gp2 = thin(gp_mc(gp2, x, y, 'nsamples', 25), 5);
[~, ~, ~, ymean2, yvar2] = gp_pred(gp2, x, y, xgrid); % <--- triggers the error
% - visualize
figure(1);
hold on; grid on;
plot(xgrid, fgrid, 'g');
plot(x, y, 'k+');
plot(xgrid, ymean1, 'b');
plot(xgrid, ymean2, 'r');
plot(xgrid, ymean1+3*sqrt(yvar1), 'b--');
plot(xgrid, ymean1-3*sqrt(yvar1), 'b--');
plot(xgrid, ymean2+3*sqrt(yvar2), 'r--');
plot(xgrid, ymean2-3*sqrt(yvar2), 'r--');
After debugging I found the corresponding error-message in gpmc_preds.m:
if nargout > 2 && isempty(yt)
error('mc_pred -> If lpy is wanted you must provide the vector yt as an optional input.')
end
My workaround to make things work is to uncomment this if-condition and adjust the code around line 238 like this:
if ~isempty(yt)
[Ef(:,i1), Varf(:,:,i1), lpy(:,i1), Ey(:,i1), Vary(:,:,i1)] = gp_pred(Gp, x, y, xt, options);
else
[Ef(:,i1), Varf(:,:,i1), ~, Ey(:,i1), Vary(:,:,i1)] = gp_pred(Gp, x, y, xt, options);
lpy=[];
end
I have a method which uses the fminlbfgs.m file frequently. Occasionally, I have found that infinite loops occur. I believe these are coming from several lines such as 452, 463, 624, 636, 745, 758 where there are no set maximum iterations. I believe the infinite loops occur because there are no finite numbers when the loop is started (e.g. all the values are infinite or NaN), and I'm guessing that the functions which take these values are unable to get finite values from these bad starting values.
I've modified the code for my purposes to break the loop after a certain number of iterations and provide a specific error code from which I can make adjustments for my own purposes, but a more general solution to make the code more infinite-loop proof may be useful.
Looking at the regression demos, there is a clear trend that the means of all priors are kept at the default (0), while the initial guesses are different (see, for instance, demo_regression_additive1.m). Is this supposed to indicate that the actual prior used in gp_optim is prior+initial guess? Otherwise it seems somewhat odd to use a student's t distribution centered around 0 as a prior for a length scale, which necessarily must be positive.
if I have deduced the behavior correctly, I think it should be described a bit more prominently. Possibly appended to the help-texts for the priors functions.
Hi,
First of all, great work. This is not really an issue (maybe), just a question: I was having a look at the gradient equations for SVIGP, and it seems that the derivative matrices (DKuu, etc) are receiving the covariance matrices themselves (K_uu, etc) after the call to the covariance function. Maybe I am missing the point as to why, could you maybe point me to some reference showing this gradient derivation?
Hi
I am running a simple regression and I have this error
Error using GPEP_E
The value of 'w' is invalid. It must satisfy the function:
@(x)isempty(x)||(ischar(x)&&ismember(x,{'init','clearcache'}))||(isvector(x)&&isreal(x)&&all(isfinite(x)))||all(isnan(x)).
I would like to understand in which condition of the data this happen.
Hi, I am not sure whether the software is still mantained, but anyway.
I'm having troubles using DTC, SOR and VAR sparse approxiamtions with the EP latent model for classificaiton. I believe that the problem comes from a missing sign in line 767 of gpep_pred.m
Currently it says:
Varft = sum(B2'.(B(repmat(La,1,m).\B')B2)',2) + sum((K_nu(K_uu(K_fu'*L))).^2, 2);
But this gives negative variance values. I think the correct formula might be:
Varft = sum(B2'.(B(repmat(La,1,m).\B')B2)',2) - sum((K_nu(K_uu(K_fu'*L))).^2, 2);
But I don't have a great knowledge of sparse GP, so I might be wrong and the problem might be something else.
Also the Laplace method doesn't work with any sparse methods, and I get some matrix dimensionality problem, but I haven't investigated much.
Thank you!
From Lu Cheng:
"It was said in GPStuff manual page 42 that periodic kernel was coming
from this paper
http://jmlr.org/proceedings/papers/v33/solin14.pdf
In page 907, equation (23) and GPStuff appendix, there is the canonical
periodic covariance function. And it is not obvious to find the explicit
form of quasi-periodic covariance function in section 3.5.
In the demo_periodic.m, there is alway the decay term, i.e. another SE
term, which is very confusing. I suggest add an additional example which
set decay to 0 and not including the extra length scale."
Hi there,
It seems the development has ceased for quite a while. Could anyone provide more information about the development? Like a rough roadmap? Current known problems/issues/bugs? It will be greatly helpful if someone wants to pick up from the last commit.
By the way, is there any fortune that someone would rewrite everything in the MATLAB OOP paradigm?
Best regards.
Does not work (at least for me). Problem with nested handles reported (in the lik_studentt definition file) if you follow the script on the GPstuff page. There are other minor problems before that but they are easily fixed. I assume that this demo has not been run against octave yet.
Should (!) easily be fixable (just introduce sub functions) - shall likely do it myself!
Hello!
Thanks for making your code public. Is there any part of this code suppose to include the Inverse-Distance Kernel as it is suggested in this publication?
Minimum Mode Saddle Point Searches Using Gaussian Process
Regression with Inverse-Distance Covariance Function
https://pubs.acs.org/doi/10.1021/acs.jctc.9b01038
And if so, may I have access to it?
Thanks!
I encountered most curious behaviour when trying to do a MCMC prediction at just one point:
[gp_rec,g,opt] = gp_mc(gp, etc) % gp: lik_gaussian, gpcf_sexp
xt = 0.5; % instead of xt = linspace(something)
[Eft_mc, Varft_mc] = gp_pred(gp_rec, x, y, xt, etc)
Eft_mc
is 1x1 as expected, but Varft_mc is a vector of size Nx1 (N being the number of samples), which I don't quite think is the intended behaviour. (Eft and Varft always have correctly a same dimension as xt
if it has more than one element.)
Also, I think I found where the bug is: in gpmc_pred
we call gpmc_preds
[Efs, Varfs, lpys, Eys, Varys] = gpmc_preds(gp, x, y, varargin{:});
and at the end of gpmc_preds.m
singleton-dimensions of Vary
and Varf
are squeezed out:
Varf=squeeze(Varf);
Vary=squeeze(Vary);
In the simple lik_gaussian
case they were populated in the loop
if isfield(gp, 'latentValues') && ~isempty(gp.latentValues)
% [...skip]
else
% Gaussian likelihood or Laplace/EP for latent values
if nargout <= 2
[Ef(:,i1), Varf(:,:,i1)] = gp_pred(Gp, x, y, xt, options);
elseif nargout <=3
[Ef(:,i1), Varf(:,:,i1), lpy(:,i1)] = gp_pred(Gp, x, y, xt, options);
else
[Ef(:,i1), Varf(:,:,i1), lpy(:,i1), Ey(:,i1), Vary(:,:,i1)] = gp_pred(Gp, x, y, xt, options);
end
end
Now if xt = 1
, after the loop, Ef (and Ey) are sized 1 x k and variances Varf (and Vary) 1 x 1 x k. Because of peculiar quirk of squeeze.m, the squeezed Varf
(and Vary
) get turned into k x 1 vectors while Ef
(and Ey
) remain 1 x k row vectors (which are then returned).
Thus when later in gpmc_pred
their means are taken along the 2nd dimension,
Ef=mean(Efs,2);
Varf=mean(Varfs,2) + var(Efs,0,2);
Ey=mean(Eys,2);
Vary=mean(squeeze(Varys),2) + var(Eys,0,2);
Ef, Ey
end up being 1x1, but mean(Varys,2)
remain sized Nx1.
Hi,
I got the following error while running the test code from LonGP:
>> lonGP('./test/output',1)
processing target 1: y.
in conStep1
Selecting 0th continuous variable.
toSelVarInds: 1
Resuming from rep=1 mcmc_round=4.
Undefined function or variable 'neff'.
Error in psrf (line 108)
neff=min(neff,N*M);
Error in runMCMC (line 97)
R = psrf(thetaArr{:});
Error in runMcmcInfer (line 26)
[R, rfull, flag] = runMCMC(gp,xmn,ymn,nRep,tmpresfile);
Error in conStep1 (line 44)
runMcmcInfer(currVarFlagArr, 1);
Error in lonGP (line 173)
currNextFun(currNextArg{:});
I think this was caused by the last line neff=min(neff,N*M);
being outside the function definition in diag/psrf.m
. I was running the code on MATLAB R2019a with macOS 10.15.7, using the gpstuff dev branch. Could you look into this? Thanks a lot!
When I run lgpdens on the attached data, I get the error below. I am able to overcome this issue by choosing the speedup option ('speedup','on'), but resulting density is not smooth enough (this may be due to the fact the observed data is largely integer based with repeated observations). Is there another way around this error, or is there a better way to somehow get the smoothing parameters to produce a more realistic density estimate?
Here's the error:
Error using GPLA_E
The value of 'w' is invalid. It must satisfy the function:
@(x)isempty(x)||(ischar(x)&&strcmp(w,'init'))||isvector(x)&&isreal(x)&&all(isfinite(x))||all(isnan(x)).
Error in gpla_e (line 79)
ip.parse(w, gp, varargin{:});
Error in gpla_g (line 231)
[e, edata, eprior, p] = gpla_e(gp_pak(gp), gp, x, y, 'z', z);
Error in gp_g (line 38)
[g] = fh_g(w, gp, x, y, varargin{:});
Error in gp_eg (line 40)
g=gp_g(w, gp, x, y, varargin{:});
Error in gp_optim>@(ww)gp_eg(ww,gp,x,y,'z',z) (line 66)
fh_eg=@(ww) gp_eg(ww, gp, x, y, 'z', z);
Error in fminlbfgs>gradient_function (line 890)
[fval, grad]=feval(funfcn,reshape(x,data.xsizes));
Error in fminlbfgs>bracketingPhase (line 744)
[data,f_alpha, grad]=gradient_function(data.xInitial(:)+alpha*data.dir(:),funfcn, data, optim);
Error in fminlbfgs>linesearch (line 583)
data = bracketingPhase(funfcn, data,optim);
Error in fminlbfgs (line 279)
data=linesearch(funfcn, data,optim);
Error in gp_optim (line 121)
w = optimf(fh_eg, w, opt);
Error in lgpdens>gpsmooth (line 983)
gp=gp_optim(gp,xx,yy,'opt',opt, 'optimf', @fminlbfgs);
Error in lgpdens (line 163)
gp=gpsmooth(xxn,yy,gpcf,latent_method,int_method,display,speedup,gridn,cond_dens,basis,imp_sampling);
HERE ARE THE DATA:
18
15
18
16
17
15
15
14
15
14
22
18
16
17
19
18
18
19
18
17
15
18
21
14
15
13
18
16
18
18
23
18
19
15
20
11
20
19
15
16
16
18
19
18
15
15
16
15
18
21
20
13
20
18
19
23
22
22
22
24
22.5
20
18
18.5
17.5
20
19
19
16.5
13
13
17.5
17.5
20.5
19
18.5
22
when I am install the toolbox on windows, there is no winCsource in the file, is it delete in this file
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.