1.7 Model equations
In all the examples so far, the predictors are simulated, multiplied by their respective beta value, and then added together. When simulating more complex models, we may want to prevent some of this behaviour or add in additional parameters, interactions or general complexity.
To introduce this increased complexity, we can specify a model formula. This explicitly tells simulate_population()
how to put the simulated predictors together to form the response variable. We can first demonstrate this with a simple linear model.
<- simulate_population(
squid_data parameters=list(
observation= list(
names = c("temperature", "rainfall"),
beta =c(0.5,0.3)
),residual = list(
names="residual",
vcov=1
)
),n=2000,
model = "y = temperature + rainfall + residual"
)
<- get_population_data(squid_data)
data
coef(lm(y ~ temperature + rainfall, data))
## (Intercept) temperature rainfall
## 0.004605396 0.506914670 0.288952532
In the formula, we write out how the variables are added up. Everything that you want exported needs to be defined and named (e.g. y=...
). By default, all predictors are multiplied by their respective beta values before this happens. Sometimes it is useful to prevent this multiplication (e.g. multiply two traits together without them being multiplied by their respective beta). We can do this by using I()
.
<- simulate_population(
squid_data parameters=list(
observation= list(
names = c("temperature", "rainfall"),
beta =c(0.5,0.3)
),residual = list(
names="residual",
vcov=1
)
),n=2000,
model = "y = temperature + I(rainfall) + residual"
)
<- get_population_data(squid_data)
data
coef(lm(y ~ temperature + rainfall, data))
## (Intercept) temperature rainfall
## -0.004985291 0.495841550 0.960772034
We can also add extra parameters to the parameter list, which we can call from within the function. In combination with I()
we can then customise the model formula a bit
<- simulate_population(
squid_data parameters=list(
observation= list(
names = c("temperature", "rainfall"),
beta =c(0.5,0.3),
extra_beta = 0.1
),residual = list(
names="residual",
vcov=1
)
),n=2000,
model = "y = temperature + extra_beta*I(rainfall) + residual"
)
<- get_population_data(squid_data)
data
coef(lm(y ~ temperature + rainfall, data))
## (Intercept) temperature rainfall
## -0.000221756 0.481460783 0.085752465
Finally, we can use []
to index the levels of the random effects within the formula.. An example of this is given Section 4.4, along with use of the index_link
argument.