4 ‘wave0’ - parameter ranges, targets and design points
In this section we set up the emulation task, defining the input parameter ranges, the calibration targets and all the data necessary to build the first wave of emulators.
For the sake of clarity, in this workshop we will adopt the word ‘data’ only when referring to the set of runs of the model that are used to train emulators. Real-world observations, that inform our choice of targets, will instead be referred to as ‘observations.’ Before we tackle the emulation, we need to define some objects. First of all, let us set the parameter ranges:
= list(
ranges b = c(1e-5, 1e-4), # birth rate
mu = c(1e-5, 1e-4), # rate of death from other causes
beta1 = c(0.2, 0.3), # infection rate at time t=0
beta2 = c(0.1, 0.2), # infection rates at time t=100
beta3 = c(0.3, 0.5), # infection rates at time t=270
epsilon = c(0.07, 0.21), # rate of becoming infectious after infection
alpha = c(0.01, 0.025), # rate of death from the disease
gamma = c(0.05, 0.08), # recovery rate
omega = c(0.002, 0.004) # rate at which immunity is lost following recovery
)
We then turn to the targets we will match: the number of infectious individuals \(I\) and the number of recovered individuals \(R\) at times \(t=25, 40, 100, 200, 200, 350\). The targets can be specified by a pair (val, sigma), where ‘val’ represents the measured value of the output and ‘sigma’ represents its standard deviation, or by a pair (min, max), where min represents the lower bound and max the upper bound for the target. Here we will use the former formulation (an example using the latter formulation can be found in Workshop 2):
<- list(
targets I25 = list(val = 115.88, sigma = 5.79),
I40 = list(val = 137.84, sigma = 6.89),
I100 = list(val = 26.34, sigma = 1.317),
I200 = list(val = 0.68, sigma = 0.034),
I300 = list(val = 29.55, sigma = 1.48),
I350 = list(val = 68.89, sigma = 3.44),
R25 = list(val = 125.12, sigma = 6.26),
R40 = list(val = 256.80, sigma = 12.84),
R100 = list(val = 538.99, sigma = 26.95),
R200 = list(val = 444.23, sigma = 22.21),
R300 = list(val = 371.08, sigma = 15.85),
R350 = list(val = 549.42, sigma = 27.47)
)
The ‘sigmas’ in our targets
list represent the uncertainty we have about the observations. Since our model is synthetic, we couldn’t rely on observations to define our targets. Instead, we chose the parameter set
<- list(b = 1/(76*365), mu = 1/(76*365), beta1 = 0.214,
chosen_params beta2 = 0.107, beta3 = 0.428, epsilon = 1/7,
alpha = 1/50, gamma = 1/14, omega = 1/365)
ran the model with it and used the relevant model outputs as the ‘val’ in targets
. The ‘sigma’ components were chosen to be \(5\%\) of the corresponding ‘val.’
Finally we need a set of wave0
data to start. When using your own model, you can create the dataframe ‘wave0’ following the steps below:
build a named dataframe
initial_points
containing a space filling set of points. A rule of thumb is to select at least \(10p\) parameter sets for the training set, where \(p\) is the number of parameters in the model. The dataframeinitial_points
can also contain parameter sets for the validation set, even though hmer can also work without such a set, using cross-validation. The columns ofinitial_points
should be named exactly as in the listranges
;run your model on the parameter sets in
initial_points
and define a dataframeinitial_results
in the following way: the nth row ofinitial_results
should contain the model outputs corresponding to the chosen targets for the parameter set in the nth row ofinitial_points
. The columns ofinitial_results
should be named exactly as in the listtargets
;bind
initial_points
andinitial_results
by their columns to form the dataframewave0
.
For this workshop, we created a helper function, space_filling_design
, that takes the number of parameter sets to generate, the number of parameters and the range of each parameter, and returns a space filling design with the required number of parameter sets. In this case, we request \(180\) parameter sets, \(90\) for the training set and \(90\) for the validation set.
<- space_filling_design(180, 9, ranges) initial_points
In initial_points
each row corresponds to a different parameter set and each column to a different parameter.
We then run the model for the parameter sets in initial_points
through the get_results
function. This is a helper function that acts as ode_results
, but has the additional feature of allowing us to decide which outputs and times should be returned: in our case we need the values of \(I\) and \(R\) at \(t=25, 40, 100, 200, 300, 350\).
<- setNames(data.frame(t(apply(initial_points, 1, get_results,
initial_results c(25, 40, 100, 200, 300, 350), c('I', 'R')))), names(targets))
Finally, we bind the parameter sets initial_points
to the model runs initial_results
and save everything in the data.frame wave0
:
<- cbind(initial_points, initial_results) wave0