Prediction for Big Data Through Kriging: Small Sequential and One-Shot Designs

CentER Discussion Paper Series No. 2018-022

43 Pages. Posted: 30 Jul 2018

Jack P. C. Kleijnen

Tilburg University, CentER

Wim C. M. van Beers

Tilburg University - Tilburg University School of Economics and Management

Date Written: July 9, 2018

Abstract

Kriging, or Gaussian process (GP) modeling, is an interpolation method that assumes the outputs (responses) are more strongly correlated the closer their inputs (explanatory or independent variables) are. A GP has unknown (hyper)parameters that must be estimated; the standard estimation method uses the "maximum likelihood" criterion. However, big data make it hard to compute the estimates of these GP parameters, the resulting Kriging predictor, and the variance of this predictor. To solve this problem, some authors select a relatively small subset from the big set of previously observed "old" data; their method is sequential and depends on the variance of the Kriging predictor. The resulting designs turn out to be "local"; i.e., most design points are concentrated around the point to be predicted. We develop three alternative one-shot methods that do not depend on GP parameters: (i) select a small subset that still covers the original input space, albeit more coarsely; (ii) select a subset with relatively many, but not all, combinations close to the new combination that is to be predicted; and (iii) select a subset with the nearest neighbors (NNs) of this new combination. To evaluate these designs, we compare their squared prediction errors in several numerical (Monte Carlo) experiments. These experiments show that our NN design is a viable alternative to the more sophisticated sequential designs.
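
As a rough illustration of the nearest-neighbor idea in (iii), the following Python sketch picks the k old input combinations closest to a new combination, fits an ordinary Kriging predictor on that small subset only, and computes the squared prediction error. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `nearest_neighbor_design` and `ordinary_kriging_predict`, the Euclidean distance, the Gaussian correlation function, the fixed values of `theta` and `nugget`, and the test function are all hypothetical choices made for this example. In particular, `theta` is fixed here for brevity, whereas the abstract describes estimating the GP parameters by maximum likelihood.

```python
import numpy as np

def nearest_neighbor_design(X_old, x_new, k):
    """Indices of the k old input combinations closest to x_new.

    Assumption: closeness is measured by Euclidean distance.
    The selection uses no GP parameters, so it is a one-shot design.
    """
    dist = np.linalg.norm(X_old - x_new, axis=1)
    return np.argsort(dist)[:k]

def ordinary_kriging_predict(X, y, x_new, theta=1.0, nugget=1e-6):
    """Ordinary Kriging prediction at x_new with a Gaussian correlation function.

    Assumptions: theta is fixed rather than estimated by maximum likelihood,
    and a small nugget is added to the diagonal for numerical stability.
    """
    n = len(X)
    # Squared distances among design points and between design points and x_new.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    R = np.exp(-theta * d2) + nugget * np.eye(n)
    r = np.exp(-theta * ((X - x_new) ** 2).sum(axis=1))
    ones = np.ones(n)
    # Generalized least squares estimate of the constant mean.
    mu = (ones @ np.linalg.solve(R, y)) / (ones @ np.linalg.solve(R, ones))
    return mu + r @ np.linalg.solve(R, y - mu)

# Hypothetical "big" data set: 10,000 old combinations of two inputs.
rng = np.random.default_rng(0)
X_old = rng.uniform(0.0, 1.0, size=(10_000, 2))
y_old = np.sin(2 * np.pi * X_old[:, 0]) + X_old[:, 1] ** 2

# New combination whose output is to be predicted.
x_new = np.array([0.3, 0.7])
y_true = np.sin(2 * np.pi * x_new[0]) + x_new[1] ** 2

# One-shot NN design: keep only the 20 nearest old combinations.
idx = nearest_neighbor_design(X_old, x_new, k=20)
y_hat = ordinary_kriging_predict(X_old[idx], y_old[idx], x_new, theta=1000.0)
squared_error = (y_hat - y_true) ** 2
print(f"prediction {y_hat:.4f}, true value {y_true:.4f}")
```

Because the Kriging step only sees the 20 selected points, the linear systems solved are 20 x 20 rather than 10,000 x 10,000, which is the computational motivation for a small design.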

Keywords: Kriging; Gaussian Process; Big Data; Experimental Design; Nearest Neighbor

JEL Classification: C0; C1; C9; C15; C44

Suggested Citation

Kleijnen, Jack P.C. and Beers, Wim C. M. van, Prediction for Big Data Through Kriging: Small Sequential and One-Shot Designs (July 9, 2018). CentER Discussion Paper Series No. 2018-022, Available at SSRN: https://ssrn.com/abstract=3210567 or http://dx.doi.org/10.2139/ssrn.3210567

Jack P.C. Kleijnen (Contact Author)

Tilburg University, CentER

P.O. Box 90153
Tilburg, 5000 LE
Netherlands
+31 13 4662029 (Phone)
+31 13 4663377 (Fax)

HOME PAGE: https://sites.google.com/site/kleijnenjackpc/

Wim C. M. van Beers

Tilburg University - Tilburg University School of Economics and Management

P.O. Box 90153
Tilburg, 5000 LE
Netherlands
