4.1 Additive genetics effects
In order to simulate breeding values (additive genetic effects), we can provide the simulate_population()
function with the relatedness structure in the population. The simplest way to do this is providing a pedigree using the the pedigree
argument (a genetic relatedness matrix could also be given to the cov_str
argument). The input to this argument needs to be a list, and the name of the pedigree in the list links it with the item in the parameter list.
NOTE the simulate_population
function has very little error checking of pedigree structure at the moment
When simulating breeding values, all individuals in pedigree need to be in the data_structure and vice versa. Having unsampled individuals (for example the base population) can be achieved in the sampling stage (not implemented yet).
Lets start by importing a pedigree from the pedtricks package
library(squidSim)
library(nadiv)
library(gremlin)
library(pedtricks)
data(gryphons)
<- fix_ped(gryphons[,1:3])
ped head(ped)
## id dam sire
## 1 204 <NA> <NA>
## 2 205 <NA> <NA>
## 3 206 <NA> <NA>
## 4 207 <NA> <NA>
## 5 208 <NA> <NA>
## 6 209 <NA> <NA>
names(ped)[1]<-"animal"
We can use this pedigree as a data_structure
<- simulate_population(
squid_data data_structure = ped[,1:3],
pedigree = list(animal=ped[,1:3]),
parameters =list(
animal = list(
vcov = 0.2
),residual = list(
vcov = 0.5
)
)
)
<- get_population_data(squid_data)
data head(data)
## y animal_effect residual animal dam sire squid_pop
## 1 -2.3553680 -0.2981992 -2.0571688 204 <NA> <NA> 1
## 2 -0.0831304 0.2218079 -0.3049383 205 <NA> <NA> 1
## 3 -0.1690798 -0.2820988 0.1130190 206 <NA> <NA> 1
## 4 1.6868459 0.6530291 1.0338168 207 <NA> <NA> 1
## 5 -0.4311466 -0.5338738 0.1027271 208 <NA> <NA> 1
## 6 -0.1819541 -0.4747195 0.2927654 209 <NA> <NA> 1
We can use the grelim pacakge to run a REML animal model
<- makeAinv(ped)$Ainv
Ainv
<- gremlin(y~1, random=~ animal,data=data,ginverse=list(animal=Ainv)) mod
## gremlin started: 18:14:56
## 'as(<dsyMatrix>, "dsCMatrix")' is deprecated.
## Use 'as(., "CsparseMatrix")' instead.
## See help("Deprecated") and help("Matrix-deprecated").
## 1 of max 20 lL:-7685.481081 took 0.0037 sec.
## 2 of max 20 lL:-1671.313492 took 0.0026 sec.
## 3 of max 20 lL:-1604.018169 took 0.0026 sec.
## 4 of max 20 lL:-1601.696981 took 0.0026 sec.
## 5 of max 20 lL:-1601.692845 took 0.0026 sec.
## 6 of max 20 lL:-1601.692845 took 0.0026 sec.
## 7 of max 20 lL:-1601.692845 took 0.0026 sec.
##
## *** REML converged ***
##
## gremlin ended: 18:14:56
summary(mod)
##
## Linear mixed model fit by REML [' gremlin ']
## REML log-likelihood: -1601.693
## lambda: FALSE
##
## elapsed time for model: 0.0564
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.05039 -0.54998 0.00371 0.54487 3.14710
##
## (co)variance parameters: ~animal
## rcov: ~units
## Estimate Std. Error
## G.animal 0.2854 0.02908
## ResVar1 0.4477 0.02554
##
## (co)variance parameter sampling correlations:
## G.animal ResVar1
## G.animal 1.0000 -0.8469
## ResVar1 -0.8469 1.0000
##
## Fixed effects: y ~ 1
## Estimate Std. Error z value
## (Intercept) -0.02358 0.01526 -1.545
We don’t have to simulate a phenotype for everyone in the pedigree, so can include a subset of IDs in the data strcuture
<- data.frame(animal =ped[sample(1:nrow(ped),1000,replace=FALSE),1])
ds
<- simulate_population(
squid_data data_structure = ds,
pedigree = list(animal=ped[,1:3]),
parameters =list(
animal = list(
vcov = 0.2
),residual = list(
vcov = 0.5
)
)
)
<- get_population_data(squid_data)
data head(data)
## y animal_effect residual animal squid_pop
## 1 -0.46968357 -0.17338301 -0.29630056 3380 1
## 2 -0.08352477 -0.17213302 0.08860825 362 1
## 3 -1.48977040 -0.85945395 -0.63031645 1918 1
## 4 -0.20485162 -0.50879642 0.30394481 2708 1
## 5 0.39879740 -0.08200417 0.48080157 4287 1
## 6 -0.59120933 -0.19983313 -0.39137620 3418 1
<- makeAinv(ped)$Ainv
Ainv
<- gremlin(y~1, random=~ animal,data=data,ginverse=list(animal=Ainv)) mod
## gremlin started: 11:24:44
## 'as(<dsyMatrix>, "dsCMatrix")' is deprecated.
## Use 'as(., "CsparseMatrix")' instead.
## See help("Deprecated") and help("Matrix-deprecated").
## 1 of max 20 lL:-1551.356413 took 0.0034 sec.
## 2 of max 20 lL:-418.535368 took 0.0023 sec.
## 3 of max 20 lL:-365.175306 took 0.0022 sec.
## 4 of max 20 lL:-359.174824 took 0.0022 sec.
## 5 of max 20 lL:-359.005605 took 0.0022 sec.
## 6 of max 20 lL:-359.005415 took 0.0022 sec.
## 7 of max 20 lL:-359.005415 took 0.0022 sec.
##
## *** REML converged ***
##
## gremlin ended: 11:24:44
summary(mod)
##
## Linear mixed model fit by REML [' gremlin ']
## REML log-likelihood: -359.0054
## lambda: FALSE
##
## elapsed time for model: 0.0457
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.43366 -0.45542 0.02354 0.43253 2.86515
##
## (co)variance parameters: ~animal
## rcov: ~units
## Estimate Std. Error
## G.animal 0.4207 0.09582
## ResVar1 0.3480 0.08755
##
## (co)variance parameter sampling correlations:
## G.animal ResVar1
## G.animal 1.0000 -0.9299
## ResVar1 -0.9299 1.0000
##
## Fixed effects: y ~ 1
## Estimate Std. Error z value
## (Intercept) -0.04197 0.03176 -1.321
We might want to simulate repeated measurements to allow estimation of permanent environment effects. The simplest way to do this is to create a duplicated column in the data structure of the individual IDs. Permanent environment effects that are not linked to the pedigree can then be simulated.
NOTICE!
The instructions given for simulating permanent environment effects using squidSim were incorrect in the vignette prior to version 0.2.0 (updated in September 2025).
## make data structure with two observations per individual, with ID duplicated in two columns, animal and individual, and link the animal column in the data structure to the pedigree
<- data.frame(animal=rep(ped[,1], 2),individual=rep(ped[,1], 2))
ds
<- simulate_population(
squid_data data_structure = ds,
pedigree=list(animal=ped),
parameters = list(
animal = list(
vcov = 0.2
),individual = list(
vcov = 0.3
),residual = list(
vcov = 0.5
)
)
)
<- get_population_data(squid_data)
data head(data)
## y animal_effect individual_effect residual animal individual
## 1 -0.7812849 0.4665620 -1.0389875 -0.2088594 204 204
## 2 -0.3499417 -0.9695562 0.2860528 0.3335617 205 205
## 3 -0.7490084 -0.5394729 0.2739732 -0.4835087 206 206
## 4 -1.1704516 -0.2701808 0.1314451 -1.0317159 207 207
## 5 0.2931743 0.1025247 -0.6129429 0.8035925 208 208
## 6 -0.1245506 0.7061392 -0.1828131 -0.6478767 209 209
## squid_pop
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 1
<- makeAinv(ped)$Ainv
Ainv
<- gremlin(y~1, random=~ animal + individual,data=data,ginverse=list(animal=Ainv)) mod_pe
## gremlin started: 11:24:44
## 1 of max 20 lL:-12123.358564 took 0.0051 sec.
## 2 of max 20 lL:-4380.793795 took 0.0043 sec.
## 3 of max 20 lL:-4212.077157 took 0.0042 sec.
## 4 of max 20 lL:-4204.524508 took 0.0042 sec.
## 5 of max 20 lL:-4204.496825 took 0.0042 sec.
## 6 of max 20 lL:-4204.496825 took 0.0042 sec.
## 7 of max 20 lL:-4204.496825 took 0.0041 sec.
##
## *** REML converged ***
##
## gremlin ended: 11:24:44
summary(mod_pe)
##
## Linear mixed model fit by REML [' gremlin ']
## REML log-likelihood: -4204.497
## lambda: FALSE
##
## elapsed time for model: 0.0596
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.8558 -0.5524 -0.0036 0.5431 3.0406
##
## (co)variance parameters: ~animal + individual
## rcov: ~units
## Estimate Std. Error
## G.animal 0.2237 0.029221
## G.individual 0.3022 0.027781
## ResVar1 0.4947 0.009977
##
## (co)variance parameter sampling correlations:
## G.animal G.individual ResVar1
## G.animal 1.000e+00 -0.8268 -1.794e-15
## G.individual -8.268e-01 1.0000 -1.796e-01
## ResVar1 -1.977e-15 -0.1796 1.000e+00
##
## Fixed effects: y ~ 1
## Estimate Std. Error z value
## (Intercept) 0.002897 0.01537 0.1885