Setup
library("nimble")
set.seed(0)
Randomly generate some sinusoidal data
x <- 1:100
y <- sin(x/5) + rnorm(100, 0.1)  ## adds N(mean = 0.1, sd = 1) noise
ind <- sort(sample(1:100, 40))
The next three lines are the only inputs required.
xObs <- x[ind] ## input
yObs <- y[ind] ## input
xPred <- c(x, 101:120) ## input
Some initial processing
nObs <- length(xObs)
nPred <- length(xPred)
f <- function(xi, xj) (xi-xj)^2
diffOO <- outer(xObs,  xObs,  f)  ## squared distances: obs vs. obs
diffPP <- outer(xPred, xPred, f)  ## squared distances: pred vs. pred
diffPO <- outer(xPred, xObs,  f)  ## squared distances: pred vs. obs
NIMBLE function for GP prediction
Here’s the main function for doing the prediction from the GP model.
It takes the MCMC samples as a runtime argument, so prediction can be repeated after running the MCMC for different numbers of iterations, with different initial values, or even with different datasets!
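For reference, for each posterior sample the run function below computes the standard GP conditional moments,

E_i = SigPO %*% inverse(SigOO) %*% yObs
C_i = SigPP - SigPO %*% inverse(SigOO) %*% t(SigPO)

and then averages the E_i and C_i over all of the samples.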
gpPred <- nimbleFunction(
    setup = function(model, params) {
        ## deterministic dependencies of the GP parameters: the covariance matrices
        calcNodes <- model$getDependencies(params, determOnly = TRUE)
        nPred <- dim(model$SigPP)[1]
        E <- array(0, c(nPred, 1))
        C <- array(0, c(nPred, nPred))
    },
    run = function(samples = double(2)) {
        E <<- E * 0
        C <<- C * 0
        nSamples <- dim(samples)[1]
        for(i in 1:nSamples) {
            ## put the i-th posterior sample into the model,
            ## then recalculate the covariance matrices
            values(model, params) <<- samples[i, ]
            calculate(model, calcNodes)
            ## GP conditional mean and covariance for this sample
            intermediate <- model$SigPO %*% inverse(model$SigOO)
            Etemp <- intermediate %*% asCol(model$yObs)
            Ctemp <- model$SigPP - intermediate %*% t(model$SigPO)
            E <<- E + Etemp
            C <<- C + Ctemp
        }
        ## Monte Carlo average over the posterior samples
        E <<- E / nSamples
        C <<- C / nSamples
    },
    methods = list(
        getE = function() { returnType(double(1)); return(E[, 1]) },
        getC = function() { returnType(double(2)); return(C[, ]) }
    )
)
GP model
GP model defined here. Basically the same as your original, but I renamed some parameters more to my liking =)
This is not as elegant as I had hoped. It still requires repeating (essentially) the same code three times. I discovered some limitations of NIMBLE while trying other approaches.
Bottom line:
- This achieves relatively nice simplicity.
- There’s an efficiency hit to the MCMC sampling, but it doesn’t seem too bad.
- This works.
code <- nimbleCode({
    rho ~ dgamma(10, 1)
    sigGP ~ dunif(0, 1e5)
    sigOE ~ dunif(0, 1e5)
    SigOO[1:nObs, 1:nObs ] <- sigGP^2*exp(-1/2*diffOO[1:nObs, 1:nObs ]/rho^2) + sigOE^2*IOO[1:nObs, 1:nObs ]
    SigPP[1:nPred,1:nPred] <- sigGP^2*exp(-1/2*diffPP[1:nPred,1:nPred]/rho^2) + sigOE^2*IPP[1:nPred,1:nPred]
    SigPO[1:nPred,1:nObs ] <- sigGP^2*exp(-1/2*diffPO[1:nPred,1:nObs ]/rho^2)
    yObs[1:nObs] ~ dmnorm(mu[1:nObs], cov = SigOO[1:nObs,1:nObs])
})
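In other words, each covariance block uses a squared-exponential kernel, cov(y_i, y_j) = sigGP^2 * exp(-(x_i - x_j)^2 / (2 * rho^2)), with independent observation-error variance sigOE^2 added on the diagonals of SigOO and SigPP (via the identity matrices IOO and IPP).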
constants <- list(nObs=nObs, nPred=nPred, diffOO=diffOO, diffPP=diffPP, diffPO=diffPO,
                  IOO=diag(nObs), IPP=diag(nPred), mu=rep(0,nObs))
data <- list(yObs = yObs)
inits <- list(rho = 1, sigGP = 1, sigOE = 1)
Rmodel <- nimbleModel(code, constants, data, inits)
Create, compile, and run
## MCMC specification with no samplers
spec <- configureMCMC(Rmodel, nodes = NULL)
## this will be used for some checking, and setting up block sampler:
params <- Rmodel$getNodeNames(topOnly = TRUE)
## NOTE: the next line shouldn't work for you, since the MCMC API has changed.
## you can rebuild from source off nimble branch 'devel'.
## otherwise, using last public release of NIMBLE (0.3-1), use this line instead:
## spec$addSampler('RW_block', control = list(targetNodes = params))
spec$addSampler(params, 'RW_block')
[1] RW_block sampler: rho, sigGP, sigOE, adaptive: TRUE, adaptScaleOnly: FALSE, adaptInterval: 200, scale: 1, propCov: identity
We can debate about univariate vs. block sampling at some point.
## MCMC function
Rmcmc <- buildMCMC(spec)
## GP prediction function
## also uses the 'params' variable for specialization
Rpred <- gpPred(Rmodel, params)
## compile everything
Cmodel <- compileNimble(Rmodel)
Cmcmc <- compileNimble(Rmcmc, project = Rmodel)
Cpred <- compileNimble(Rpred, project = Rmodel)
## MCMC sampling
## 100,000 iterations
system.time(Cmcmc$run(100000))
   user  system elapsed
 49.068   0.016  49.057
About 50 seconds on my computer
That can be improved if necessary
samples <- as.matrix(Cmcmc$mvSamples)
## quick check: the sample columns must match the order of 'params',
## since values(model, params) assigns them in that order
if(!identical(params, colnames(samples))) stop('problem')
## predict from GP model using posterior MCMC samples
system.time(Cpred$run(samples))
   user  system elapsed
 64.236   0.000  64.188
About 65 seconds on my computer
Again, could streamline the gpPred() function if necessary; one idea is sketched below
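For example, one option would be to factor SigOO once per sample with a Cholesky decomposition rather than forming the explicit inverse. A sketch (untested, and assuming chol(), forwardsolve(), and backsolve() are available in the NIMBLE DSL, as in recent versions):

## inside run(), a possible replacement for the inverse() line (sketch, untested):
U <- chol(model$SigOO)   ## upper-triangular factor, so SigOO = t(U) %*% U
intermediate <- t(backsolve(U, forwardsolve(t(U), t(model$SigPO))))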
## extract predictions: E and C
E <- Cpred$getE()
C <- Cpred$getC()
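As noted above, since the samples are a runtime argument, prediction can be repeated without recompiling anything -- for example, after running the MCMC for more iterations:

## run more iterations, then predict again; no recompilation needed
Cmcmc$run(100000)
samples2 <- as.matrix(Cmcmc$mvSamples)
Cpred$run(samples2)
E2 <- Cpred$getE()
C2 <- Cpred$getC()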
Output
E
[1] 1.048953470 1.063690923 1.056571565 1.025025296 0.967727602
[6] 0.885088092 0.779046134 0.652750089 0.510403944 0.357145489
[11] 0.198827230 0.041501513 -0.109026506 -0.247533773 -0.369716151
[16] -0.472313950 -0.553105378 -0.610874554 -0.645461105 -0.657719603
[21] -0.649076810 -0.621198765 -0.575868588 -0.514737033 -0.439105417
[26] -0.349971602 -0.248371381 -0.135451833 -0.012707283 0.117353661
[31] 0.251084159 0.383920114 0.510567530 0.625303732 0.722419735
[36] 0.796925448 0.845259818 0.865448082 0.857113606 0.821872152
[41] 0.763535592 0.687360014 0.599297100 0.505480001 0.411259520
[46] 0.320288380 0.234175561 0.152695050 0.074286902 -0.003167262
[51] -0.081466127 -0.161638385 -0.243673849 -0.325876291 -0.404833854
[56] -0.476477835 -0.536583186 -0.580993801 -0.606153844 -0.609554981
[61] -0.590206477 -0.548974521 -0.488316837 -0.411928777 -0.324540013
[66] -0.231641132 -0.139114920 -0.052860394 0.021550784 0.079331411
[71] 0.117042000 0.132978712 0.127078948 0.100689033 0.056355647
[76] -0.002485485 -0.071873133 -0.147591543 -0.225248038 -0.300383786
[81] -0.368679284 -0.425886710 -0.467685571 -0.489815787 -0.488669528
[86] -0.462038368 -0.409587331 -0.332776473 -0.234523765 -0.119090537
[91] 0.008395849 0.142692615 0.278921123 0.412628846 0.539726698
[96] 0.656454950 0.759426449 0.845730518 0.912939607 0.959256645
[101] 0.984015990 0.987978885 0.972988593 0.941606557 0.896909969
[106] 0.842215301 0.780777080 0.715559458 0.649099436 0.583448236
[111] 0.520171088 0.460385520 0.404821033 0.353887176 0.307741235
[116] 0.266350221 0.229544512 0.197062243 0.168584714 0.143763668
diag(C)
[1] 1.2679882 1.2064883 1.1541432 1.1116728 1.0784248 1.0524884
[7] 1.0317576 1.0147224 1.0007923 0.9900904 0.9827438 0.9787769
[13] 0.9777564 0.9783522 0.9785871 0.9766328 0.9714366 0.9630053
[19] 0.9522229 0.9402332 0.9284669 0.9183767 0.9109672 0.9068412
[25] 0.9062843 0.9093517 0.9158928 0.9254797 0.9372421 0.9497874
[31] 0.9614992 0.9705821 0.9757483 0.9766685 0.9740220 0.9691475
[37] 0.9633160 0.9577090 0.9532083 0.9500621 0.9480639 0.9471172
[43] 0.9473233 0.9485876 0.9506984 0.9530274 0.9546481 0.9549896
[49] 0.9542985 0.9535398 0.9539394 0.9568409 0.9630692 0.9727686
[55] 0.9850166 0.9979048 1.0093360 1.0177498 1.0224490 1.0235779
[61] 1.0216894 1.0172186 1.0108482 1.0037543 0.9975825 0.9940451
[67] 0.9942810 0.9988192 1.0071728 1.0181157 1.0297591 1.0394083
[73] 1.0443944 1.0429170 1.0346484 1.0209019 1.0041130 0.9869394
[79] 0.9713413 0.9580174 0.9466908 0.9370265 0.9291060 0.9233148
[85] 0.9204774 0.9213396 0.9262437 0.9349512 0.9465634 0.9595347
[91] 0.9721773 0.9829212 0.9902883 0.9935894 0.9933386 0.9913798
[97] 0.9906718 0.9946403 1.0070336 1.0312845 1.0694759 1.1218007
[103] 1.1866870 1.2612294 1.3418003 1.4247225 1.5067538 1.5853383
[109] 1.6586720 1.7256456 1.7857277 1.8388290 1.8851737 1.9251909
[115] 1.9594282 1.9884878 2.0129814 2.0334994 2.0505931 2.0647637
Black dots are the original ‘xObs’ and ‘yObs’
Red dots are the predictions made at each ‘xPred’
Red lines are plus/minus one standard error
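The plotting code itself wasn't shown above; a minimal sketch that would produce the figure described (base graphics, taking sqrt(diag(C)) as the standard error):

se <- sqrt(diag(C))
plot(xObs, yObs, pch = 16, xlim = range(xPred), ylim = range(c(yObs, E - se, E + se)))
points(xPred, E, pch = 16, col = 'red', cex = 0.6)
lines(xPred, E + se, col = 'red')
lines(xPred, E - se, col = 'red')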