Conditional simulation of a spatial process
Generates exact (or approximate) random draws from the conditional distribution of a spatial process given specific observations. This is a useful way to characterize the uncertainty in the predicted process from data. This is known as conditional simulation in geostatistics or generating an ensemble prediction in the geosciences. sim.Krig.grid can generate a conditional sample for a large regular grid but is restricted to stationary correlation functions.
sim.spatialProcess(object, xp, M = 1, verbose = FALSE, ...) simSpatialData(object, M = 1, verbose = FALSE) sim.Krig(object, xp, M = 1, verbose = FALSE, ...) sim.Krig.approx(object, grid.list = NULL, M = 1, nx = 40, ny = 40, verbose = FALSE, extrap = FALSE,...) sim.mKrig.approx(mKrigObject, predictionPoints = NULL, predictionPointsList = NULL, simulationGridList = NULL, gridRefinement = 5, gridExpansion = 1 + 1e-07, M = 1, nx = 40, ny = 40, nxSimulation = NULL, nySimulation = NULL, delta = NULL, verbose = FALSE,...) sim.fastTps.approx(fastTpsObject, predictionPointsList, simulationGridList = NULL, gridRefinement = 5, gridExpansion = 1 + 1e-07, M = 1, delta = NULL, verbose=FALSE,...)
delta |
If the covariance has compact support the simulation method can
take advantage of this. This is the amount of buffer added for the simulation domain in the circulant embedding method.
A minimum size would be |
extrap |
If FALSE conditional process is not evaluated outside the convex hull of observations. |
fastTpsObject |
The output object returned by fastTps |
grid.list |
Grid information for evaluating the conditional surface as a grid.list. |
gridRefinement |
Amount to increase the number of grid points for the simulation grid. |
gridExpansion |
Amount to increase the size of teh simulation grid. This is used to increase the simulation domain so that the circulant embedding algorithm works. |
mKrigObject |
An mKrig Object |
M |
Number of draws from conditional distribution. |
nx |
Number of grid points in prediction locations for x coordinate. |
ny |
Number of grid points in prediction locations for x coordinate. |
nxSimulation |
Number of grid points in the circulant embedding simulation x coordinate. |
nySimulation |
Number of grid points in the circulant embedding simulation x coordinate. |
object |
A Krig object. |
predictionPoints |
A matrix of locations defining the points for evaluating the predictions. |
predictionPointsList |
A |
simulationGridList |
A |
xp |
Same as predictionPoints above. |
... |
Any other arguments to be passed to the predict function.
Usually this is the |
verbose |
If true prints out intermediate information. |
These functions generate samples from an unconditional or conditional multivariate (spatial)
distribution, or an approximate one. The unconditiona simulation function,
simSpatialData
,
is a handy way to generate synthetic observations from a fitted model.
Typically one would use these for a parametric bootstrap.
The functions that simulate conditional distributions arre much more invovled
in their coding. They are useful
for describing the uncertainty in predictions using the estimated spatial
process under Gaussian assumptions. An important assumption throughout
these functions is that all covariance parameters are fixed at their
estimated or prescribed values from the passed object. Although these functions might
be coded up easily by the users these
versions have the advantage hat theytake the mKrig
, spatialProcess
or
Krig
objects as a way to specify the
model in an unambiguous way.
Given a spatial process h(x)= P(x) + g(x) observed at
Y.k = Z(x.k)d + P(x.k) + g(x.k) + e.k
where P(x) is a low order, fixed polynomial and g(x) a Gaussian spatial process and Z(x.k) is a vector of covariates that are also indexed by space (such as elevation). Z(x.k)d is a linear combination of the the covariates with the parameter vector d being a component of the fixed part of the model and estimated in the usual way by generalized least squares.
With Y= Y.1, ..., Y.N, the goal is to sample the conditional distribution of the process.
[h(x) | Y ] or the full prediction Z(x)d + h(x)
For fixed a covariance this is just a multivariate normal sampling
problem. sim.Krig.standard
samples this conditional process at
the points xp
and is exact for fixed covariance parameters.
sim.Krig.grid
also assumes fixed covariance parameters and does
approximate sampling on a grid.
The outline of the algorithm is
0) Find the spatial prediction at the unobserved locations based on the actual data. Call this h.hat(x) and this is the conditional mean.
1) Generate an unconditional spatial process and from this process simluate synthetic observations. At this point the approximation is introduced where the field at the observation locations is approximated using interpolation from the nearest grid points.
2) Use the spatial prediction model ( using the true covariance) to estimate the spatial process at unobserved locations.
3) Find the difference between the simulated process and its prediction based on synthetic observations. Call this e(x).
4) h.hat(x) + e(x) is a draw from [h(x) | Y ].
sim.spatialProcess
Follows this algorithm exactly. For the case of an
addtional covariate this of course needs to be included. For a model
with covariates use drop.Z=TRUE
for the function to ignore prediction
using the covariate and generate conditional samples for just the spatial
process and any low order polynomial. Finally, it should be noted that this
function will also work with an mKrig
object because the essential
prediction information in the mKrig and spatialProcess objects are the same.
The naming is through convenience.
sim.Krig
Also follows this algorithm exactly but for the older Krig
object. Note the inclusion of
drop.Z=TRUE or FALSE will determine whether the conditional simulation
includes the covariates Z or not. (See example below.)
sim.Krig.approx
and sim.mKrig.approx
evaluate the
conditional surface on grid and simulates the values of h(x) off the
grid using bilinear interpolation of the four nearest grid
points. Because of this approximation it is important to choose the
grid to be fine relative to the spacing of the observations. The
advantage of this approximation is that one can consider conditional
simulation for large grids – beyond the size possible with exact
methods. Here the method for simulation is circulant embedding and so
is restricted to stationary fields. The circulant embedding method is
known to fail if the domain is small relative to the correlation
range. The argument gridExpansion
can be used to increase the
size of the domain to make the algorithm work.
sim.fastTps.approx
Is optimized for the approximate thin plate
spline estimator in two dimensions and k=2
. For efficiency the
ensemble prediction locations must be on a grid.
sim.Krig and sim.spatialProcess
a matrix with rows indexed by the locations in xp
and columns being the
M
independent draws.
sim.Krig.approx
a list with components x
, y
and z
.
x and y define the grid for the simulated field and z is a three dimensional array
with dimensions c(nx, ny, M)
where the
first two dimensions index the field and the last dimension indexes the draws.
sim.mKrig.approx
a list with predictionPoints
being the
locations where the field has been simulated.If these have been created from a
grid list that information is stored in the attributes
of predictionPoints
.
Ensemble
is a
matrix where rows index the simulated values of the field and columns
are the different draws, call
is the calling sequence. Not that if predictionPoints
has been omitted in the call or is created beforehand using make.surface.grid
it is
easy to reformat the results into an image format for ploting using as.surface
.
e.g. if simOut
is the output object then to plot the 3rd draw:
imageObject<- as.surface(simOut$PredictionGrid, simOut$Ensemble[,3] ) image.plot( imageObject)
sim.fastTps.approx
is a wrapper function that calls sim.mKrig.approx
.
Doug Nychka
sim.rf, Krig, spatialProcess
## Not run: ## A simple example for setting up a bootstrap ## M below should be ## set to much larger sample size ( e.g. M <- 1000) for better ## statistics data( ozone2) obj<- spatialProcess( ozone2$lon.lat,ozone2$y[16,] ) ######### boot strap set.seed(123) M<- 20 # create M indepedent copies of the observation vector ySynthetic<- simSpatialData( obj, M) bootSummary<- NULL for( k in 1:M){ # here the MLEs are found using the easy top level level wrapper # see mKrigMLEJoint for a more efficient strategy newSummary<- spatialProcess(obj$x,ySynthetic[,k], cov.params.start= list( theta = obj$theta.MLE, lambda = obj$lambda.MLE ) )$summary bootSummary<- rbind( bootSummary, newSummary) } # the results and 95 stats( bootSummary ) obj$summary confidenceInterval <- apply(bootSummary[,2:5],2, quantile, probs=c(0.025,0.975) ) # compare to estimates used as the "true" parameters obj$summary[2:5] print( confidenceInterval) ## End(Not run) ## Not run: # conditional simulation with covariates # colorado climate example data(COmonthlyMet) fit1E<- spatialProcess(CO.loc,CO.tmin.MAM.climate, Z=CO.elev ) # conditional simulation at missing data good<- !is.na(CO.tmin.MAM.climate ) infill<- sim.spatialProcess( fit1E, xp=CO.loc[!good,], Z= CO.elev[!good], M= 10) # get an elevation grid ... NGRID<- 50 gives a nicer image but takes longer NGRID <- 25 # get elevations on a grid COGrid<- list( x=seq( -109.5, -101, ,NGRID), y= seq(39, 41.5,,NGRID) ) COGridPoints<- make.surface.grid( COGrid) # elevations are a bilinear interpolation from the 4km # Rocky Mountain elevation fields data set. data( RMelevation) COElevGrid<- interp.surface( RMelevation, COGridPoints ) # NOTE call to sim.Krig treats the grid points as just a matrix # of locations the plot has to "reshape" these into a grid # to use with image.plot SEout<- sim.spatialProcess( fit1E, xp=COGridPoints, Z= COElevGrid, M= 30) # for just the smooth surface in lon/lat # SEout<- sim.spatialProcess( fit1E, xp=COGridPoints, drop.Z=TRUE, M= 30) # in practice M should be larger to reduce Monte Carlo error. surSE<- apply( SEout, 2, sd ) image.plot( as.surface( COGridPoints, surSE)) points( fit1E$x, col="magenta", pch=16) ## End(Not run) data( ozone2) set.seed( 399) # fit to day 16 from Midwest ozone data set. out<- Krig( ozone2$lon.lat, ozone2$y[16,], Covariance="Matern", theta=1.0,smoothness=1.0, na.rm=TRUE) # NOTE theta =1.0 is not the best choice but # allows the sim.rf circulant embedding algorithm to # work without increasing the domain. #six missing data locations xp<- ozone2$lon.lat[ is.na(ozone2$y[16,]),] # 5 draws from process at xp given the data # this is an exact calculation sim.Krig( out,xp, M=5)-> sim.out # Compare: stats(sim.out)[3,] to Exact: predictSE( out, xp) # simulations on a grid # NOTE this is approximate due to the bilinear interpolation # for simulating the unconditional random field. # also more grids points ( nx and ny) should be used sim.Krig.approx(out,M=5, nx=20,ny=20)-> sim.out # take a look at the ensemble members. predictSurface( out, grid= list( x=sim.out$x, y=sim.out$y))-> look zr<- c( 40, 200) set.panel( 3,2) image.plot( look, zlim=zr) title("mean surface") for ( k in 1:5){ image( sim.out$x, sim.out$y, sim.out$z[,,k], col=tim.colors(), zlim =zr) } ## Not run: data( ozone2) y<- ozone2$y[16,] good<- !is.na( y) y<-y[good] x<- ozone2$lon.lat[good,] O3.fit<- mKrig( x,y, Covariance="Matern", theta=.5,smoothness=1.0, lambda= .01 ) set.seed(122) O3.sim<- sim.mKrig.approx( O3.fit, nx=100, ny=100, gridRefinement=3, M=5 ) set.panel(3,2) surface( O3.fit) for ( k in 1:5){ image.plot( as.surface( O3.sim$predictionPoints, O3.sim$Ensemble[,k]) ) } # conditional simulation at missing data xMissing<- ozone2$lon.lat[!good,] O3.sim2<- sim.mKrig.approx( O3.fit, xMissing, nx=80, ny=80, gridRefinement=3, M=4 ) ## End(Not run) ## Not run: #An example for fastTps: data(ozone2) y<- ozone2$y[16,] good<- !is.na( y) y<-y[good] x<- ozone2$lon.lat[good,] O3FitMLE<- fastTpsMLE( x,y, theta=1.5 ) O3Obj<- fastTps( x,y, theta=1.5, lambda=O3FitMLE$lambda.MLE) # creating a quick grid list based on ranges of locations grid.list<- fields.x.to.grid( O3Obj$x, nx=100, ny=100) O3Sim<- sim.fastTps.approx( O3Obj,predictionPointsList=grid.list,M=5) # controlling the grids xR<- range( x[,1], na.rm=TRUE) yR<- range( x[,2], na.rm=TRUE) simulationGridList<- list( x= seq(xR[1],xR[2],,400), y= seq( yR[1],yR[2], ,400)) # very fine localized prediction grid O3GridList<- list( x= seq( -90.5,-88.5,,200), y= seq( 38,40,,200)) O3Sim<- sim.fastTps.approx( O3Obj, M=5, predictionPointsList=O3GridList, simulationGridList = simulationGridList) # check plot( O3Obj$x) US( add=TRUE) image.plot( as.surface( O3GridList,O3Sim$Ensemble[,1] ), add=TRUE) points( O3Obj$x, pch=16, col="magenta") ## End(Not run)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.