2.4 Random Regression

Random regression (or a random intercepts and slopes model) essentially represents an interaction (or product) between predictors at different levels, with the random slopes being an unobserved, latent variable (\(u_2\)).

\[ y_{ij} = \beta_0 + \beta_1x_{i} + u_{1j} + u_{2j}x_{i} + \epsilon_{ij} \]

\[ x_i \sim \mathcal{N}(0,\sigma^2_{x}) \]

\[ \boldsymbol{u}_j \sim \mathcal{N}(0, \Sigma_u) \]

\[ \epsilon_{ij} \sim \mathcal{N}(0,\sigma^2_\epsilon) \]

We can specify random slopes by simulating a slope variable at the individual level (ind_slope, \(u_{2}\)). The mean environmental effect is specified as the slope of the environmental variable (\(\beta_1\)); \(u_{2}\) then represents the individual deviations from this mean slope (this is typically how it is modelled in a linear mixed effects model). Importantly, the beta parameter associated with ind_slope is specified as 0 (there is no ‘main effect’ of the slopes, just the interaction), and the beta parameter associated with the interaction is 1.

squid_data <- simulate_population(
  data_structure=make_structure("individual(300)",repeat_obs=10),
  parameters = list(
    individual = list(
      names = c("ind_int","ind_slope"), 
      beta = c(1,0),
      vcov = c(1,0.5)
    ),
    observation= list(
      names = c("environment"),
      beta = c(0.2)
    ), 
    residual = list(
      vcov = c(0.5)
    ),
    interactions = list(
      names = c("ind_slope:environment"),
      beta = c(1)
    )
  )
)

data <- get_population_data(squid_data)

short_summary(lmer(y ~ environment + (1+environment|individual),data))
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ environment + (1 + environment | individual)
##    Data: data
## 
## REML criterion at convergence: 8075.9
## 
## Random effects:
##  Groups     Name        Variance Cov 
##  individual (Intercept) 0.9964       
##             environment 0.5520   0.04
##  Residual               0.5060       
## Number of obs: 3000, groups:  individual, 300
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) -0.06715    0.05922  -1.134
## environment  0.24393    0.04546   5.366

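We can make the link between the code and the equation more explicit by expanding out the equation, writing the mean slope \(\beta_1\) as \(\beta_x\) and giving the individual-level effects and the interaction their own beta parameters: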

\[ y_{ij} = \beta_0 + \beta_xx_{i} + \boldsymbol{u}_j \boldsymbol{\beta}_u + \beta_{ux}u_{2j}x_{i} + \epsilon_{ij} \]

\[ \color{CornflowerBlue}{\boldsymbol{\beta}_u = \begin{bmatrix} 1 \\ 0 \end{bmatrix}} , \color{orange}{\beta_{ux}=1} \]
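
To see how the terms of the expanded equation combine, the same model can be written out by hand in base R. This is only a sketch, not a description of how squidSim works internally: it assumes an intercept \(\beta_0\) of 0 and an environmental variance \(\sigma^2_x\) of 1 (neither is set explicitly in the code above), and the seed and the object name `hand_data` are arbitrary choices for illustration.

```r
# Base R sketch of the random regression model (an illustration, not squidSim's internals)
set.seed(42)                       # arbitrary seed, for reproducibility only

n_ind <- 300                       # number of individuals
n_obs <- 10                        # repeated observations per individual
individual <- rep(seq_len(n_ind), each = n_obs)

# Individual-level effects: intercept deviations (u_1) and slope deviations (u_2),
# drawn independently, i.e. with zero covariance
ind_int   <- rnorm(n_ind, mean = 0, sd = sqrt(1))
ind_slope <- rnorm(n_ind, mean = 0, sd = sqrt(0.5))

# Observation-level predictor (variance of 1 is an assumption) and residual
environment <- rnorm(n_ind * n_obs, mean = 0, sd = 1)
residual    <- rnorm(n_ind * n_obs, mean = 0, sd = sqrt(0.5))

beta_0  <- 0        # intercept (assumed to be 0)
beta_x  <- 0.2      # mean slope of the environment
beta_u  <- c(1, 0)  # betas for ind_int and ind_slope (no 'main effect' of slopes)
beta_ux <- 1        # beta for the ind_slope:environment interaction

# Each row of the sum corresponds to one term of the expanded equation
y <- beta_0 +
  beta_x * environment +
  beta_u[1] * ind_int[individual] +
  beta_u[2] * ind_slope[individual] +
  beta_ux * ind_slope[individual] * environment +
  residual

hand_data <- data.frame(y, environment, individual)
```

Because `beta_u[2]` is 0, ind_slope only enters `y` through its product with environment, which is what makes it a random slope rather than a second random intercept.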

So far we have specified no correlation between intercepts and slopes, because the two values given to vcov are variances. To simulate a covariance (and therefore a correlation) between intercepts and slopes, we can give the vcov argument a covariance matrix instead of two variances:

squid_data <- simulate_population(
  data_structure=make_structure("individual(300)",repeat_obs=10),
  parameters = list(
    individual = list(
      names = c("ind_int","ind_slope"), 
      beta = c(1,0),
      vcov = matrix(c(1,0.3,0.3,0.5),ncol=2,nrow=2,byrow=TRUE)
    ),
    observation= list(
      names = c("environment"),
      beta = c(0.2)
    ), 
    residual = list(
      vcov = c(0.5)
    ),
    interactions = list(
      names = c("ind_slope:environment"),
      beta = c(1)
    )
  )
)

data <- get_population_data(squid_data)

short_summary(lmer(y ~ environment + (1+environment|individual),data))
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ environment + (1 + environment | individual)
##    Data: data
## 
## REML criterion at convergence: 8122.5
## 
## Random effects:
##  Groups     Name        Variance Cov 
##  individual (Intercept) 1.1279       
##             environment 0.4933   0.33
##  Residual               0.5244       
## Number of obs: 3000, groups:  individual, 300
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 0.009427   0.062869   0.150
## environment 0.266923   0.043233   6.174
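
The vcov matrix above is on the covariance scale, so it can be useful to check which correlation it implies, or to build the matrix from a target correlation. The following base R sketch does both; the target correlation of 0.5 and the object names are hypothetical values chosen for illustration.

```r
# Covariance matrix of intercepts and slopes used in the simulation above
vcov_u <- matrix(c(1, 0.3, 0.3, 0.5), nrow = 2, ncol = 2, byrow = TRUE)

# Implied correlation: covariance / (sd of intercepts * sd of slopes)
cor_u <- cov2cor(vcov_u)
cor_u[1, 2]

# The other direction: build a covariance matrix from variances and a correlation
variances  <- c(1, 0.5)
target_cor <- 0.5                  # hypothetical value, for illustration only
cor_mat <- matrix(c(1, target_cor, target_cor, 1), nrow = 2)
sd_mat  <- diag(sqrt(variances))
vcov_from_cor <- sd_mat %*% cor_mat %*% sd_mat

# A valid covariance matrix must be positive definite
all(eigen(vcov_from_cor)$values > 0)
```

For the matrix used in the simulation, the implied correlation is 0.3 / sqrt(1 × 0.5), or about 0.42. The resulting `vcov_from_cor` could then be passed as the vcov argument in the same way as the matrix above.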