A Bayes Linear Emulation

In the hmer package we adopt a Bayes linear approach to build emulators. While a full Bayesian analysis requires the specification of a full joint prior probability distribution to reflect beliefs about uncertain quantities, the Bayes linear approach takes expectation as primitive, and only first- and second-order specifications are needed to define the prior. Operationally, this means that one sets prior mean vectors and covariance matrices for the uncertain quantities, without having to decide exactly which distribution gives rise to the chosen mean and covariance. A Bayes linear analysis may therefore be viewed as a pragmatic approach to a full Bayesian analysis, in which the task of specifying beliefs has been simplified. As in any Bayesian approach, the priors (mean vectors and covariance matrices) are then adjusted by the observed data.

More formally, suppose that there are two collections of random quantities, \(B = (B_1, \dots, B_r)\) and \(D = (1, D_1, \dots, D_s)\); the unit constant is included in \(D\) so that the linear fits considered below may contain an intercept term. Bayes linear analysis involves updating subjective beliefs about \(B\) given an observation of \(D\). In order to do so, prior mean vectors and covariance matrices for \(B\) and \(D\) (that is, \(E[B]\), \(E[D]\), \(Var[B]\) and \(Var[D]\)), along with a covariance matrix between \(B\) and \(D\) (that is, \(Cov[B,D]\)), must be specified.
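
To make this specification task concrete, here is a minimal sketch in R (the language of the hmer package) of a first- and second-order prior specification for a single quantity \(B\) and two data quantities \(D_1, D_2\). All numerical values are illustrative assumptions, not hmer defaults, and we work with the random components of \(D\) only, since the constant element contributes no variance and would make \(Var[D]\) singular.

```r
# Illustrative first- and second-order prior specification (assumed values).
E_B    <- 1                           # prior expectation E[B]
E_D    <- c(0.5, 1.5)                 # prior expectation E[D] = (E[D1], E[D2])
Var_B  <- matrix(2)                   # prior variance Var[B]   (1 x 1)
Var_D  <- matrix(c(1, 0.4,
                   0.4, 1), 2, 2)     # prior variance Var[D]   (2 x 2)
Cov_BD <- matrix(c(0.8, 0.6), 1, 2)   # prior covariance Cov[B, D] (1 x 2)
```

Together these five objects constitute the entire prior: no distributional family needs to be named.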

The Bayes linear update formulae for a vector \(B\) given a vector \(D\) are:
\[\begin{align} 
E_D[B] &= E[B] + Cov[B,D]Var[D]^{-1}(D - E[D]) \\ 
Var_D[B] &= Var[B] - Cov[B,D]Var[D]^{-1}Cov[D,B] \\ 
Cov_D[B_1,B_2] &= Cov[B_1,B_2] - Cov[B_1,D]Var[D]^{-1}Cov[D,B_2]. 
\end{align}\]
\(E_D[B]\) and \(Var_D[B]\) are termed the adjusted expectation and variance of \(B\) given \(D\), and \(Cov_D[B_1,B_2]\) is termed the adjusted covariance of \(B_1\) and \(B_2\) given \(D\), where \(B_1\) and \(B_2\) are subcollections of \(B\). The formula for \(E_D[B]\) represents the best linear fit for \(B\) given \(D\): for each quantity \(B_k\), \(k = 1,\dots,r\), it minimises the expected squared loss \(E[(B_k - a_k^TD)^2]\) over choices of the coefficient vector \(a_k\), so that \(E_D[B]\) is the linear combination of \(D\) most informative for \(B\).
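
The following R sketch transcribes these formulae directly; it is not the hmer implementation, and the observed value d_obs, like the priors above, is an illustrative assumption.

```r
# Direct transcription of the Bayes linear update formulae (a sketch,
# not the hmer implementation). Inputs follow the prior specification above.
bayes_linear_adjust <- function(E_B, E_D, Var_B, Var_D, Cov_BD, d_obs) {
  K <- Cov_BD %*% solve(Var_D)                        # Cov[B,D] Var[D]^{-1}
  list(
    adjusted_expectation = E_B + K %*% (d_obs - E_D), # E_D[B]
    adjusted_variance    = Var_B - K %*% t(Cov_BD)    # Var_D[B]
  )
}

# With the illustrative priors, observing D = (0.7, 1.2) shifts beliefs
# about B and reduces its variance:
bayes_linear_adjust(E_B = 1, E_D = c(0.5, 1.5),
                    Var_B = matrix(2),
                    Var_D = matrix(c(1, 0.4, 0.4, 1), 2, 2),
                    Cov_BD = matrix(c(0.8, 0.6), 1, 2),
                    d_obs = c(0.7, 1.2))
#> adjusted expectation ~ 1.033; adjusted variance ~ 1.267
```

Note that the adjusted variance does not involve the observed value of \(D\): the reduction in uncertainty is determined entirely by the prior specification.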