3.1 Predictors affecting multiple responses

If we look a little at what simulate_population() assumes underneath with the formulation above (just random effects), we can understand more how we simulate predictors that affect multiple response variable. In the above code, we are essentially simulating a predictor for each trait (individual_effect1 and individual_effect2) with some covariance between them.

We can expand the equation above:

\[\boldsymbol{y}_{ij} = \boldsymbol{\beta}_0 + \boldsymbol{u}_j B_u + \boldsymbol{\epsilon}_{ij}B_{\epsilon}\] \[ B_u = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} , B_{\epsilon} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

to include matrices of \(\beta\)s, \(B\), the columns of which refer to the response variable, and the rows predictors. In this case they are all identity matrices, which essentially controlling which predictors affects which response (\(u_1\) affects \(y_1\) but not \(y_2\) and vice versa). Internally, simulate_population() does the same thing, and assigns beta as an identity matrix.

squid_data <- simulate_population(
  data_structure= make_structure(structure = "individual(100)",repeat_obs=10),
  n_response=2,
  response_name = c("body_mass","behaviour"),
  parameters=list(
    individual = list(
      vcov =matrix(c(1,0.5,0.5,1),nrow=2,ncol=2,byrow=TRUE),
      beta= diag(2)
    ),
    residual = list(
      vcov =matrix(c(1,0.5,0.5,1),nrow=2,ncol=2,byrow=TRUE),
      beta= diag(2)
    )
  )
)

How is this helpful? Well now we could simulate one (or many) predictor(s) that both responses. In the form of an equation we could have

\[\boldsymbol{y}_{ij} = \boldsymbol{\beta}_0 + \boldsymbol{x}_{i} B_x + \boldsymbol{u}_j + \boldsymbol{\epsilon}_{ij}\]

\[\boldsymbol{x}_i \sim \mathcal{N}(\boldsymbol{\mu}_x, \Sigma_x)\]

\[\boldsymbol{u}_i \sim \mathcal{N}(0, \Sigma_u)\]

\[\boldsymbol{\epsilon}_{i} \sim \mathcal{N}(0, \Sigma_{\epsilon})\]

where \(B\) is a \(p*q\) matrix, where \(p\) is number of predictors. So if we returned to our original example in Section 1, we have three predictors at the observation level - temperature, rainfall and wind. So \(B_x\) would be a 3*2 matrix, with 3 predictors and two responses.

Beta <- matrix(c(
  0.5, -0.1,
  0.2, -0.2,
  0.3, -0.1
  ),nrow=3,ncol=2,byrow=TRUE)
Beta
##      [,1] [,2]
## [1,]  0.5 -0.1
## [2,]  0.2 -0.2
## [3,]  0.3 -0.1

So here, the environment variables all positively affect body mass (response 1) and negatively affect behaviour (response 2). This then slots easily into our code.

squid_data <- simulate_population(
  data_structure= make_structure(structure = "individual(100)",repeat_obs=20),
  n_response=2,
  parameters= list(
    individual = list(
      vcov =matrix(c(1,0.5,0.5,1),nrow=2,ncol=2,byrow=TRUE)
    ),
    observation = list(
      names = c("temperature", "rainfall", "wind"),
      beta= Beta
    ),
    residual = list(
      vcov= matrix(c(1,0.5,0.5,1),nrow=2,ncol=2,byrow=TRUE)
    )
  )
)

data <- get_population_data(squid_data)
head(data)
##           y1         y2 individual_effect1 individual_effect2 temperature
## 1  0.2981452 -1.3155845          -0.311252          -0.897568  -1.1453264
## 2 -0.6873794 -0.6087159          -0.311252          -0.897568  -0.8578652
## 3  0.5970893 -0.6079079          -0.311252          -0.897568  -0.2261574
## 4 -1.4274647 -1.3401883          -0.311252          -0.897568   1.8213830
## 5  1.5506350 -0.6644989          -0.311252          -0.897568   1.1993774
## 6 -0.3039953 -1.9789316          -0.311252          -0.897568  -1.2945405
##    rainfall       wind  residual1  residual2 individual squid_pop
## 1 0.5882671  1.1457965  0.7206679 -0.3003161          1         1
## 2 0.2144061 -0.5558841  0.1766892  0.1903584          1         1
## 3 1.6741909  0.5743441  0.5142786  0.6593169          1         1
## 4 0.2989377 -0.8591160 -1.8289570 -0.2866060          1         1
## 5 0.4251985 -0.2197198  1.2430745  0.4160745          1         1
## 6 1.1467957  1.0675309  0.1049086 -0.8747054          1         1
# library(MCMCglmm)
# mod <- MCMCglmm(cbind(y1,y2)~1,random=~us(trait):individual, rcov=~us(trait):units,data=data,family=rep("gaussian",2),verbose=FALSE)
# summary(mod)

Equally if we want an interaction, we now have to expand the size of what we give to beta, with one \(\beta\) for each response, in a matrix, with \(q\) columns.

squid_data <- simulate_population(
  data_structure= make_structure(structure = "individual(100)",repeat_obs=20),
  n_response=2,
  
  parameters= list(
    individual = list(
      vcov =matrix(c(1,0.5,0.5,1),nrow=2,ncol=2,byrow=TRUE)
    ),
    observation = list(
      names = c("temperature", "rainfall", "wind"),
      beta= Beta
    ),
    interactions = list(
      names = c("temperature:rainfall"),
      beta=matrix(c(0.1,-0.3),ncol=2,byrow=TRUE)
    ),
    residual = list(
      vcov= matrix(c(1,0.5,0.5,1),nrow=2,ncol=2,byrow=TRUE)
    )
  )
)

data <- get_population_data(squid_data)
head(data)
##           y1          y2 individual_effect1 individual_effect2 temperature
## 1 -1.4752163 -0.53377659         -0.2872822         -0.6655179  0.03214917
## 2  0.1289671 -0.83203811         -0.2872822         -0.6655179 -0.08320398
## 3  0.7591382 -0.04450337         -0.2872822         -0.6655179 -0.28697095
## 4  2.1132225 -1.23146725         -0.2872822         -0.6655179  1.92678408
## 5 -0.8876520 -1.06185793         -0.2872822         -0.6655179  0.40335533
## 6  0.5234630 -0.94372291         -0.2872822         -0.6655179  0.55847261
##     rainfall       wind  residual1   residual2 temperature:rainfall individual
## 1 -0.4787747 -0.4815074 -0.9622622 -0.01356718          -0.01539221          1
## 2  0.5679376  0.2375696  0.2777184 -0.05167259          -0.04725466          1
## 3 -0.4349687 -0.8358394  1.5151691  0.45918672           0.12482338          1
## 4 -0.1022663  0.1218843  1.4407051 -0.44064937          -0.19704515          1
## 5  0.9295396 -0.7546885 -0.7990423 -0.13308504           0.37493476          1
## 6 -1.2836502 -1.3774485  1.2731618 -0.83189771          -0.71688346          1
##   squid_pop
## 1         1
## 2         1
## 3         1
## 4         1
## 5         1
## 6         1
# library(MCMCglmm)
# mod <- MCMCglmm(cbind(y1,y2)~1,random=~us(trait):individual, rcov=~us(trait):units,data=data,family=rep("gaussian",2),verbose=FALSE)
# summary(mod)