Dr. Arthur Gilmour

24 May 2022

ASReml caters for a wide range of linear mixed models. Some models are quite complex, involving hundreds of diverse variance components; each model has its own characteristics and some involve special ‘tricks’ in their specification. In addition, some datasets are quite large, with millions of fixed and random effects, requiring the use of efficient computational tools, such as those exploiting sparsity. In the following sections, I describe some of these challenges and provide some details to consider.

Gilmour (2019, DOI: 10.1111/jbg.12398) describes the process of fitting linear mixed models as implemented in ASReml. That paper highlights the principle of minimising the computation by the use of **sparsity**, an important requirement for fast processing. A second requirement is the **efficiency** of numerical calculations and this has been significantly enhanced in ASReml 4.2 (build ng, May 2022) in particular by rearranging the computation. Consequently, this build of stand-alone ASReml is up to 100 times faster than ASReml 4.1 (available in 2019).

There are some widely used, well-founded mixed models such as the *animal model*, the *sire model*, the *maternal model* and *spatial analysis*, as well as basic regression, factorial and (in)complete block analyses. These become the building blocks for multi-trait and multi-environment analyses. Each data set has its own characteristics and these should always be recognised. For example, litters in extensively grazed sheep are not biologically equivalent to litters in intensively housed pigs! Potential issues include the distribution of the data (binary, normal, skewed), variance heterogeneity, outliers and missing values. In addition, there is the issue of the purpose of the analysis: to select individuals for breeding, test a treatment hypothesis, estimate heritability, or understand genetic and phenotypic correlations. All of these affect the type and structure of the model fitted.

It is important to perform univariate/single site analyses first before considering multi-variate/multi-environment analyses since the latter will fail to estimate covariance if the former fail to establish there is variance. In a recent case a user wanted to fit a trivariate animal model with repeated records. The trivariate analysis failed. The model involved additive genetic effects (animal effects estimated with a numerator relationship matrix), permanent environment (PE) effects (uncorrelated animal effects), fixed contemporary group effects and correlated residuals. However, for the second trait, there was no PE variance and for the third trait, there were no repeat observations although the data was set up with the animal effect repeated. Consequently, with no ‘residual’ (sampling) variance, univariate analysis of the third trait failed. After modifying the data file so that the third trait only appeared once for each animal, the trivariate analysis was performed with 3 variances and covariances at the genetic level, 2 variances and a covariance at the animal level (1 and 3) and 2 variances and a covariance at the sampling level (1 and 2).

Many datasets are quite big and consequently models become quite large. The issue then becomes one of sparsity. With field trials, a spatial analysis based on an autoregressive correlation of plots within rows and within columns has become common because it accommodates most field correlation patterns well and is sparse in that only immediate neighbours are connected in the inverse residual variance matrix. Similarly, very large data sets can be analysed using the numerator relationship matrix to define genetic links because the inverse is sparse with links between parents and their offspring and between parents with common offspring. This sparsity is exploited successfully by ASReml.

Similarly, if we want a covariance matrix across many traits/environments, it is likely to be over parameterised. The *Factor Analytic* variance structure allows the matrix to be estimated with reduced parameterization (in the spirit of principal components) and has been formulated to fit more sparsely as loadings plus specific variances. This formulation also helps avoid getting negative definite covariance matrices.
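To make the parameter saving concrete: a factor analytic structure models the covariance matrix as Σ = ΛΛ′ + Ψ, where Λ holds the loadings and Ψ the (diagonal) specific variances. The sketch below uses made-up loadings and variances for a hypothetical 6-environment, 1-factor case (not ASReml output) to show the construction and the parameter count:

```python
# Hypothetical 1-factor FA structure for 6 environments (made-up numbers,
# not ASReml output): Sigma = Lambda Lambda' + Psi.
lam = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]    # loadings (one factor)
psi = [0.2, 0.3, 0.25, 0.4, 0.35, 0.3]  # specific variances
p = len(lam)

# Build Sigma element-wise: lam[i]*lam[j], plus psi[i] on the diagonal
sigma = [[lam[i] * lam[j] + (psi[i] if i == j else 0.0)
          for j in range(p)] for i in range(p)]

# Parameter count: p loadings + p specific variances,
# versus p(p+1)/2 for an unstructured covariance matrix
n_fa = 2 * p             # 12 parameters
n_us = p * (p + 1) // 2  # 21 parameters

# With positive specific variances, Sigma is positive definite by
# construction (Lambda Lambda' is positive semi-definite), so it can
# never be negative definite; check a necessary condition here:
assert all(sigma[i][i] > 0 for i in range(p))
```

This also illustrates why the formulation avoids negative definite estimates: any Σ built this way from positive specific variances is positive definite by construction.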

In specifying a model in ASReml, the terms are organised in 3 groups. The first group is for small fixed factors for which the order of fitting is important because of the desire to perform significance testing using Wald F statistics. The equations for these are likely to be relatively dense, especially after adjusting for the other terms in the model, but there are typically fewer than 1000 of them. The second group is for random factors; a variance structure is declared for each. The third group is for large fixed effects (no variance structure). After forming the mixed model equations, ASReml reorders the equations in the last 2 groups to retain as much sparsity as possible. More sparsity means less computation (fewer elements of the sparse inverse to compute). Generally, this reordering has no impact on the answers (effects estimated) unless the fixed model is over-parameterised, in which case different equations may be declared singular.

More recent developments in ASReml include storing and processing the sparse matrices more efficiently and changing the order of computation to reduce reading and writing to memory. It can also successfully access more RAM, allowing even larger models to be fitted.

ASReml is designed to cater for a wide range of linear mixed models, being extensively used in plant and animal breeding, forestry and for analysis of human epidemiological data. That breadth of coverage is sometimes daunting to new users, as most options are only relevant to particular situations. VSNi is committed to helping new and experienced users identify the most appropriate model for their data. Although VSNi does not provide an analysis service as such, the developers of ASReml have broad experience as biometricians and statisticians.

The linear mixed model, as implemented in ASReml, has allowed immense advances in global food productivity and health (plant, animal and human) via quantitative and marker-based genetics over the last 25 years.

Arthur Gilmour obtained his BSc Agr from Sydney University majoring in Biometry in 1970 with a NSW Government traineeship. He served as a biometrician until his retirement from NSW Agriculture as a Principal Research Scientist in 2009. The generalized linear mixed model has been the major research interest of Dr Gilmour, motivated by problems arising in research data generated by agricultural scientists. From the outset, he was involved in software development to meet the current statistical analysis needs of his clients and colleagues. He obtained his PhD in animal breeding from Massey University in 1983 during which time he came into contact with Robin Thompson. This led to an ongoing collaboration also with Brian Cullis, resulting in the development of ASReml in 1996.

Related Reads

Dr. Vanessa Cave

31 May 2022

Further inference from an ANOVA table: residual variance to standard errors and confidence intervals

Below is an example of a 2-way Analysis of Variance (ANOVA) for a randomised complete block design. From the ANOVA table, we’re going to see how to calculate:

- The standard error of a mean (SEM).
- The confidence interval (CI) for a mean.
- The standard error of the difference between two means (SED).
- The least significant difference between two means (LSD).
- The confidence interval for the difference between two means.

The ANOVA table gives us an estimate of the *residual mean square* (s²) – also known as the mean square error, residual error or residual variation. This is the variation in the data unexplained by the ANOVA model. In the example above, this is the variation remaining after the block effects (block) and treatment effects (Nitrogen, Sulphur and the Nitrogen by Sulphur interaction, Nitrogen.Sulphur) have been accounted for.

The *standard error of a mean* (SEM) is calculated using the following formula:

SEM = √(s²/n)

where s² is the residual mean square and *n* is the number of replicates (or sample size).

The SEM describes the uncertainty in our estimate of the population mean for a treatment from the available data. The bigger the sample size (i.e., the larger *n*), the smaller the SEM and the more precise our estimate is.

In our example, there are 12 unique treatment combinations: the 3 levels of Nitrogen by the 4 levels of Sulphur. Note, we can obtain the number of levels of each treatment, and number of blocks, from the degrees of freedom in the ANOVA table.

Furthermore, the total number of degrees of freedom + 1, gives us the number of experimental units. Thus, in our example, there are 36 experimental units, with each of the 12 unique treatment combinations occurring exactly once in each of the 3 blocks.
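The degrees-of-freedom bookkeeping above can be checked with a few lines of arithmetic (a Python sketch of the design described in the text):

```python
# Degrees-of-freedom bookkeeping for the randomised complete block design
# described above: 3 blocks, 3 Nitrogen levels, 4 Sulphur levels.
blocks, nitrogen, sulphur = 3, 3, 4

df_block = blocks - 1                       # 2
df_n = nitrogen - 1                         # 2
df_s = sulphur - 1                          # 3
df_ns = df_n * df_s                         # 6
df_total = blocks * nitrogen * sulphur - 1  # 35

# residual df = total df minus all the model terms
df_resid = df_total - df_block - df_n - df_s - df_ns  # 22

# total df + 1 gives the number of experimental units
units = df_total + 1  # 36
```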

Therefore…

- The 12 Nitrogen by Sulphur means have replication of 3 (1 replicate per block).

For example, the 3 replicates of the treatment corresponding to the first level of Nitrogen and the second level of Sulphur (Nitrogen 1 Sulphur 2) are highlighted yellow in the schematic below of our randomised complete block design:

Thus, the standard error for the Nitrogen by Sulphur means is: SEM = √(s²/3)

- The 3 Nitrogen means, pooled over the 4 Sulphur levels, have replication of 12 (3 blocks x 4 levels of Sulphur).

For example, the replicates of the first level of Nitrogen (Nitrogen 1) are highlighted in yellow:

(Note, within each of the 3 blocks, a given level of Nitrogen corresponds to 4 unique treatment combinations: 1 at each level of Sulphur).

Thus, the standard error for the Nitrogen means is: SEM = √(s²/12)

- The 4 Sulphur means, pooled over the 3 Nitrogen levels, have replication of 9 (3 blocks x 3 levels of Nitrogen).

For example, the replicates of the second level of Sulphur (Sulphur 2) are highlighted yellow:

Thus, the standard error for the Sulphur means is: SEM = √(s²/9)
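The three SEM calculations follow the same pattern, SEM = √(s²/n), with only the replication changing. A short sketch, using a hypothetical residual mean square (the real value comes from your ANOVA table):

```python
import math

s2 = 1.2  # hypothetical residual mean square from the ANOVA table

sem_ns = math.sqrt(s2 / 3)   # Nitrogen x Sulphur means: replication 3
sem_n  = math.sqrt(s2 / 12)  # Nitrogen means: replication 12
sem_s  = math.sqrt(s2 / 9)   # Sulphur means: replication 9

# larger replication gives a smaller (more precise) standard error
assert sem_n < sem_s < sem_ns
```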

The *confidence interval (CI) for a mean* is:

mean ± t(1 − α/2; d) × SEM

where t(1 − α/2; d) is the critical value of the Student’s t distribution with d degrees of freedom. For a confidence interval of C%, α = 1 − C/100. For example, for a 95% confidence interval, α = 0.05 and the critical value is t(0.975; d). The d refers to the residual degrees of freedom. This can be read directly from the ANOVA table.

In a nutshell, a C% confidence interval for a mean is a range of values that you can be C% certain contains the true population mean. Although, strictly speaking, the confidence level C% represents a long-run percentage: the C% confidence interval gives an estimated range of values that we would expect the true, but unknown, population parameter to lie within C% of the times, should we repeat our experiment a large number of times.
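As a sketch, a 95% CI for a single treatment mean in this design (residual df = 35 − 2 − 2 − 3 − 6 = 22, so t(0.975; 22) ≈ 2.074; the mean and s² below are made-up illustration values, not the example data):

```python
import math

mean, s2, n = 25.0, 1.2, 3  # hypothetical mean, residual mean square, replication
t_crit = 2.074              # t(0.975; 22) from Student's t tables

sem = math.sqrt(s2 / n)
ci = (mean - t_crit * sem, mean + t_crit * sem)

# the interval is centred on the mean with half-width t_crit * sem
assert abs((ci[0] + ci[1]) / 2 - mean) < 1e-12
```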

The tables of means for our example are given below:

For example, the 95% confidence interval for the overall mean for Sulphur level 4 is:

Sulphur 4 mean ± t(0.975; d) × √(s²/9)

Similarly, the 99% confidence interval for the Nitrogen level 1, Sulphur level 3 mean is:

Nitrogen 1 Sulphur 3 mean ± t(0.995; d) × √(s²/3)

The *standard error of the difference between two means* (SED) is calculated using the following formula:

SED = √(2s²/n)

*Note*: The formula is different when the sample sizes of the means being compared are unequal (n₁ and n₂); it generalises to SED = √(s²(1/n₁ + 1/n₂)).

The SED describes the uncertainty in our estimate of the difference between two population means.

For our example, the SED between …

a) two Nitrogen by Sulphur means is:

SED = √(2s²/3)

b) two overall Nitrogen means is:

SED = √(2s²/12)

and

c) two overall Sulphur means is:

SED = √(2s²/9)
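These SEDs, including the unequal-replication generalisation SED = √(s²(1/n₁ + 1/n₂)), can be sketched as follows (hypothetical s²):

```python
import math

s2 = 1.2  # hypothetical residual mean square

def sed(s2, n1, n2):
    """SED for two means with (possibly unequal) replications n1 and n2."""
    return math.sqrt(s2 * (1 / n1 + 1 / n2))

sed_ns = sed(s2, 3, 3)    # two Nitrogen x Sulphur means
sed_n  = sed(s2, 12, 12)  # two overall Nitrogen means
sed_s  = sed(s2, 9, 9)    # two overall Sulphur means

# with equal replication this reduces to sqrt(2 * s2 / n)
assert abs(sed_ns - math.sqrt(2 * s2 / 3)) < 1e-12
```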

Two means can be compared using their *least significant difference* (LSD). The LSD is the smallest absolute difference between two means that is deemed statistically significant at the α level of significance. The LSD is given by:

LSD(α × 100%) = t(1 − α/2; d) × SED

Thus, for our example, the LSD(5%) for comparing …

a) two Nitrogen by Sulphur means is:

LSD(5%) = t(0.975; d) × √(2s²/3)

b) two overall Nitrogen means is:

LSD(5%) = t(0.975; d) × √(2s²/12)

c) two overall Sulphur means is:

LSD(5%) = t(0.975; d) × √(2s²/9)

Using the overall means for Nitrogen as an example, at the 5% significance level…

- there is statistical evidence that the Nitrogen 1 and Nitrogen 2 means differ: the absolute difference between these means is greater than the LSD(5%)

- there is statistical evidence that the Nitrogen 1 and Nitrogen 3 means differ: the absolute difference between these means is greater than the LSD(5%)

- there is NO statistical evidence that the Nitrogen 2 and Nitrogen 3 means differ: the absolute difference between these means is less than the LSD(5%)
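The decision rule used above can be sketched in a few lines: two means are declared different at the 5% level when the absolute difference between them exceeds the LSD(5%). The means and s² below are made-up values, not the example data:

```python
import math

t_crit, s2, n = 2.074, 1.2, 12  # t(0.975; 22); hypothetical s2; replication 12

lsd = t_crit * math.sqrt(2 * s2 / n)  # LSD(5%) for two overall Nitrogen means

mean_n1, mean_n2 = 23.1, 25.4  # hypothetical Nitrogen 1 and Nitrogen 2 means
differ = abs(mean_n1 - mean_n2) > lsd  # True here: |-2.3| exceeds the LSD
```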

The difference between two means can also be compared, and more fully described, using a *confidence interval for the difference between two means*. This is given by:

(x̄₁ − x̄₂) ± t(1 − α/2; d) × SED

or, equivalently,

(x̄₁ − x̄₂) ± LSD(α × 100%)

where x̄₁ and x̄₂ are the two means being compared.
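A sketch of the equivalence between the two forms (made-up means and s², not the example data): the CI for the difference excludes zero exactly when the absolute difference exceeds the LSD.

```python
import math

m1, m2 = 23.1, 25.4             # hypothetical means being compared
t_crit, s2, n = 2.074, 1.2, 12  # t(0.975; 22); hypothetical s2; replication

sed = math.sqrt(2 * s2 / n)
lsd = t_crit * sed
ci = (m1 - m2 - lsd, m1 - m2 + lsd)  # CI for the difference

# zero lies outside the CI exactly when |m1 - m2| > LSD
assert (ci[1] < 0 or ci[0] > 0) == (abs(m1 - m2) > lsd)
```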

Once again, using the overall means for Nitrogen as an example, the 95% confidence interval for the difference between:

- the Nitrogen 1 and Nitrogen 2 means is:

- the Nitrogen 1 and Nitrogen 3 means is:

- the Nitrogen 2 and Nitrogen 3 means is:

Notice that the CIs comparing Nitrogen 1 with Nitrogen 2, and Nitrogen 1 with Nitrogen 3, both exclude zero. Hence, we can conclude, at the 5% significance level, that the mean for Nitrogen 1 is significantly different from both the Nitrogen 2 and Nitrogen 3 means. (In this case the mean for Nitrogen 1 is lower than those of Nitrogen 2 and Nitrogen 3.) Conversely, as the CI comparing Nitrogen 2 and Nitrogen 3 includes zero, we conclude that there is no evidence of a difference between these two means (at the 5% significance level).

Luckily for us, we rarely need to calculate these quantities ourselves, as they are generated by most statistical software packages. However, it is useful to understand how they are calculated and how they are related – for example, in order to scrutinise reported results, or to calculate at a later date a quantity that you’ve forgotten to generate.

Genstat has a very powerful set of ANOVA tools that are straightforward and easy to use. In addition to the ANOVA table, you can readily output the treatment means, SEMs, SEDs, LSDs and CIs.

Dr. Vanessa Cave is an applied statistician interested in the application of statistics to the biosciences, in particular agriculture and ecology, and is a developer of the Genstat statistical software package. She has over 15 years of experience collaborating with scientists, using statistics to solve real-world problems. Vanessa provides expertise on experiment and survey design, data collection and management, statistical analysis, and the interpretation of statistical findings. Her interests include statistical consultancy, mixed models, multivariate methods, statistical ecology, statistical graphics and data visualisation, and the statistical challenges related to digital agriculture.

Vanessa is currently President of the Australasian Region of the International Biometric Society, past-President of the New Zealand Statistical Association, an Associate Editor for the Agronomy Journal, on the Editorial Board of The New Zealand Veterinary Journal and an honorary academic at the University of Auckland. She has a PhD in statistics from the University of St Andrews.

Dr. Salvador A. Gezan

09 March 2022

Meta-analysis using linear mixed models

Meta-analysis is a statistical tool that allows us to combine information from related, but independent, studies that all aim to estimate or compare the same effects from contrasting treatments. Meta-analysis is widely used in many research areas where an extensive literature review is performed to identify studies that had a similar research question. These are later combined using meta-analysis to estimate a single combined effect. Meta-analyses are commonly used to answer healthcare and medical questions, where they are widely accepted, but they are also used in many other scientific fields.

By combining several sources of information, meta-analyses have the advantage of greater statistical power, therefore increasing our chance of detecting a significant difference. They also allow us to assess the variability between studies, and help us to understand potential differences between the outcomes of the original studies.

The underlying premise in meta-analysis is that we are collecting information from a group of, say *n*, studies that individually estimated a parameter of interest, say *θ*.

In meta-analysis, the target population parameter *θ* can correspond to any of several statistics, such as a treatment mean or a difference between treatments; or, more commonly in clinical trials, the log-odds ratio or relative risk.

There are two models that are commonly used to perform meta-analyses: the fixed-effect model and the random-effects model. For the fixed-effect model, it is assumed that there is only a single unique true effect, our single *θ* above, which is estimated from a random sample of studies. That is, the fixed-effect model assumes that there is a single population effect, and the deviations obtained from the different studies are only due to sampling error or random noise. The linear model (LM) used to describe this process can be written as:

y_i = θ + e_i

where y_i is the observed response from study i, θ is the population parameter (also often known as *μ*, the overall mean), and e_i is a random error with the assumption e_i ~ N(0, σ²).

For the random-effects model we still assume that there is a common true effect between studies, but in addition, we allow this effect to vary between studies. Variation between these effects is a reasonable assumption as no two studies are identical, differing in many aspects; for example, different demographics in the data, slightly differing measurement protocols, etc. Because we have a random sample of studies, we have a random sample of effects, and therefore we define a linear mixed model (LMM) using the following expression:

y_i = θ_i + e_i

where, as before, y_i is the observed response, θ_i is the study-specific population parameter, with the assumption θ_i ~ N(θ, σ²_s), and e_i is a random error or residual with the same normality assumptions as before. Alternatively, the above LMM can be written as:

y_i = θ + s_i + e_i

where θ is the overall effect mean and s_i is a random deviation from θ with assumptions s_i ~ N(0, σ²_s).

This is a LMM because, besides the residual, we have an additional random component with a variance component associated with it, σ²_s. This variance is a measurement of the variability ‘between’ studies, and it reflects the level of uncertainty of observing a specific θ_i. These LMMs can be fitted, and variance components estimated, under many linear mixed model routines, such as **nlme** in R, **proc mixed** in SAS, Genstat or ASReml.

Both fixed-effect and random-effects models are often estimated using summary information, instead of the raw data collected from the original study. This summary information corresponds to estimated mean effects together with their variances (or standard deviations) and the number of samples or experimental units considered per treatment. Since the different studies provide different amounts of information, weights should be used when fitting LM or LMM to summary information in a meta-analysis, similar to weighted linear regression. In meta-analysis, each study has a different level of importance, due to, for example, differing number of experimental units, slightly different methodologies, or different underlying variability due to inherent differences between the studies. The use of weights allows us to control the influence of each observation in the meta-analysis resulting in more accurate final estimates.

Different statistical software will manage these weights slightly differently, but most packages will consider the following general expression of weights:

w_i = 1 / v_i

where w_i is the weight and v_i is the variance of observation i. For example, if the response corresponds to an estimated treatment mean, then its variance is *MSE*/*n*, with *MSE* being the mean square error reported for the given study and *n* the number of experimental units for that treatment.

Therefore, after we collect the summary data, we fit our linear or linear mixed model with weights and request from its output an estimation of its parameters and their standard errors. This will allow us to make inference, and construct, for example, a 95% confidence interval around an estimate to evaluate if this parameter/effect is significantly different from zero. This will be demonstrated in the example below.
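For the fixed-effect model, the weighted analysis has a closed form: the estimate is the inverse-variance weighted mean of the study effects, and its standard error is √(1/Σwᵢ). A minimal sketch with made-up study effects and variances (not the STROKE data):

```python
import math

# Hypothetical study mean differences and their variances (not the STROKE data)
y = [-20.0, -2.0, -55.0, 10.0]
v = [50.0, 4.0, 400.0, 30.0]

w = [1 / vi for vi in v]                                   # weights w_i = 1/v_i
theta_hat = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)  # weighted estimate
se = math.sqrt(1 / sum(w))                                 # its standard error
ci95 = (theta_hat - 1.96 * se, theta_hat + 1.96 * se)      # normal-based 95% CI
```

Note how the precisely estimated studies (small vᵢ) dominate the estimate, which is exactly the behaviour the weights are meant to produce.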

The dataset we will use to illustrate meta-analyses was presented and previously analysed by Normand (1999). The dataset contains information from nine independent studies where the length of hospitalisation (measured in days) was recorded for stroke patients under two different treatment regimes. The main objective was to evaluate if specialist inpatient stroke care (**sc**) resulted in shorter stays when compared to the conventional non-specialist (or routine management) care (**rm**).

The complete dataset is presented below, and it can also be found in the file STROKE.txt. In this table, the columns present for each study are the sample size (**n.sc** and **n.rm**), their estimated mean value (**mean.sc** and **mean.rm**) together with their standard deviation (**sd.sc** and **sd.rm**) for both the specialist care and routine management care, respectively.

We will use the statistical package R to read and manipulate the data, and then the library ASReml-R (Butler *et al*. 2017) to fit the models.

First, we read the data in R and make some additional calculations, as shown in the code below:

```r
STROKE <- read.table("STROKE.TXT", header = TRUE)

STROKE$diff  <- STROKE$mean.sc - STROKE$mean.rm
STROKE$Vdiff <- (STROKE$sd.sc^2 / STROKE$n.sc) + (STROKE$sd.rm^2 / STROKE$n.rm)
STROKE$WT    <- 1 / STROKE$Vdiff
```

The new column **diff** contains the difference between treatment means (as reported from each study). We have estimated the variance of this mean difference, **Vdiff**, by taking from each treatment its individual **MSE** (mean square error), dividing it by the sample size, and then summing the terms of both treatments. This estimate assumes that, for a given study, the samples from both treatments are independent, and for this reason we did not include a covariance. Finally, we have calculated a weight (**WT**) for each study as the inverse of the variance of the mean difference (*i.e.*, **1/Vdiff**).

We can take another look at this data with these extra columns:

The above table shows a wide range of values between the studies in the mean difference of length of stay between the two treatments, ranging from as low as **−71.0** to **11.0**, with a raw average of **−15.9**. Also, the variances of these differences vary considerably, which is also reflected in their weights.

The code to fit the fixed-effect linear model using ASReml-R is shown below:

```r
library(asreml)

meta_f <- asreml(fixed = diff ~ 1,
                 weights = WT,
                 family = asr_gaussian(dispersion = 1),
                 data = STROKE)
```

In the above model, our response variable is **diff**, and the weights are indicated by the variate **WT**. As the precisions are contained within the weights, the **family** argument is required to fix the residual variance (**MSE**) to exactly **1.0**; hence, it will not be estimated.

The model generates output that can be used for inference. We will start by exploring our target parameter, *i.e.* * θ*, by looking at the estimated fixed effect mean and its standard error. This is done with the code:

```r
meta_effect <- summary(meta_f, coef = TRUE)$coef.fixed
```

Resulting in the output:

The estimate of *θ*, together with its standard error, can be read from this output. An approximate ANOVA for this parameter is obtained with the code:

```r
wald.asreml(meta_f)
```

Note that this test is only approximate: given that the weights are considered to be known, the degrees of freedom are assumed to be infinite; hence, this will be a liberal test.

The results from this ANOVA table indicate that this parameter (*θ*) is highly significant, with a very small approximated p-value.

However, as indicated earlier, a random-effects model might seem more reasonable given the inherent differences in the studies under consideration. Here, we extend the model to include the random effect of study. In order to do this, first we need to ensure that this is treated as a factor in the model by running the code:

```r
STROKE$study <- as.factor(STROKE$study)
```

The LMM to be fitted using ASReml-R is:

```r
meta_r <- asreml(fixed = diff ~ 1,
                 random = ~ study,
                 weights = WT,
                 family = asr_gaussian(dispersion = 1),
                 data = STROKE)
```

Note that in this example the only difference from the previous code is the inclusion of the line **random=~study**, which includes the factor study as a random effect. An important result from running this model is the set of variance component estimates. These are obtained with the command:

```r
summary(meta_r)$varcomp
```

In this example, the variance associated with the differences in the target parameter (*θ*) between the studies is **684.62**.

We can output the fixed and random effects using the following commands:

```r
meta_effect <- summary(meta_r, coef = TRUE)$coef.fixed
BLUP <- summary(meta_r, coef = TRUE)$coef.random
```

Note that our estimated mean difference now corresponds to **−15.106** days with a standard error of **8.943**, and that the approximate 95% confidence interval [**−32.634; 2.423**] now contains zero. An approximated ANOVA based on the following code:

```r
wald.asreml(meta_r)
```

results in the output:

We have a p-value of **0.0912**, indicating that there is no significant difference in length of stay between the treatments evaluated. Note that the estimates of the random effects of study, also known as BLUPs (best linear unbiased predictions) are large, ranging from **−45.8** to **22.9**, and widely variable. The lack of significance in the random-effects model, when there is a difference of **−15.11** days, is mostly due to the large variability of **684.62** found between the different studies, resulting in a substantial standard error for the estimated mean difference.

In the following graph we can observe the 95% confidence intervals for each of the nine studies together with the final parameter estimate under the *random-effects* model. Some of these confidence intervals contain the value zero, including the one for the random-effects model. However, it can be observed that the confidence interval from the random-effects model is an adequate summary of the nine studies, representing a sensible compromise between them.

An important aspect to consider is the difference in results between the fixed-effect and the random-effects models, which is associated, as indicated earlier, with their different inferential approaches. One way to understand this is by considering what will happen if a new random study is included. Because we have a large variability in the study effects (as denoted by the between-study variance component σ²_s), we expect this new study to have a difference between treatments that falls randomly within this wide range. This, in turn, is expressed by the large standard error of the fixed effect *θ*, and by its large 95% confidence interval, which ensures that for ‘any’ new study we cover the parameter estimate 95% of the time. Therefore, as shown by the data, it seems more reasonable to consider the random-effects model than the fixed-effect model, as it is an inferential approach that deals with several sources of variation.
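ASReml estimates the between-study variance by REML. A common moment-based alternative that illustrates the same idea is the DerSimonian–Laird estimator, which computes τ² from the heterogeneity statistic Q and then down-weights every study by 1/(vᵢ + τ²). A sketch with made-up data (not the STROKE dataset):

```python
import math

y = [-20.0, -2.0, -55.0, 10.0]  # hypothetical study effects
v = [50.0, 4.0, 400.0, 30.0]    # hypothetical within-study variances
k = len(y)

# fixed-effect (inverse-variance) estimate, needed for Q
w = [1 / vi for vi in v]
theta_f = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

# DerSimonian-Laird moment estimator of the between-study variance tau2
q = sum(wi * (yi - theta_f) ** 2 for wi, yi in zip(w, y))
c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
tau2 = max(0.0, (q - (k - 1)) / c)

# random-effects weights and estimate
w_star = [1 / (vi + tau2) for vi in v]
theta_r = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
se_r = math.sqrt(1 / sum(w_star))

# the random-effects standard error is never smaller than the fixed-effect one
assert se_r >= math.sqrt(1 / sum(w))
```

This mirrors the pattern seen above: the larger the between-study variance, the larger the standard error of the combined estimate, and the more evenly the studies are weighted.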

In summary, we have used the random-effects model to perform a meta-analysis on a medical research question of treatment differences by combining nine independent studies. Under this approach we assumed that all studies describe the same effect, but we allowed the model to express different effect sizes through the inclusion of a random effect that varies from study to study. The main aim of this analysis was not to explain why these differences occur; rather, our aim was to incorporate a measure of this uncertainty into the estimation of the final effect of treatment differences.

There are several extensions to meta-analysis with different types of responses and effects. Some of the relevant literature recommended to the interested reader is van Houwelingen *et al*. (2002) and Vesterinen *et al*. (2014). Also, a clear presentation with further details of the differences between fixed-effect and random-effects models is given by Borenstein *et al*. (2010).

Dataset: STROKE.txt

R code: STROKE_METAA.R

Borenstein, M; Hedges, LV; Higgins, JPT; Rothstein, HR. 2010. *A basic introduction to fixed-effect and random-effects models for meta-analysis.* Research Synthesis Methods 1: 97-111.

Butler, DG; Cullis, BR; Gilmour, AR; Gogel, BG; Thompson, R. 2017. *ASReml-R Reference Manual Version 4.* VSN International Ltd, Hemel Hempstead, HP2 14TP, UK.

Normand, ST. 1999. *Meta-analysis: Formulating, evaluating, combining, and reporting.* Statistics in Medicine 18: 321-359.

van Houwelingen, HC; Arends, LR; Stijnen, T. 2002. *Advanced methods in meta-analysis: multivariate approach and meta-regression.* Statistics in Medicine 21: 589-624.

Vesterinen, HM; Sena, ES; Egan, KJ; Hirst, TC; Churilov, L; Currie, GL; Antonic, A; Howells, DW; Macleod, MR. 2014. *Meta-analysis of data from animal studies: a practical guide.* Journal of Neuroscience Methods 221: 92-102.

Salvador Gezan is a statistician/quantitative geneticist with more than 20 years’ experience in breeding, statistical analysis and genetic improvement consulting. He currently works as a Statistical Consultant at VSN International, UK. Dr. Gezan started his career at Rothamsted Research as a biometrician, where he worked with Genstat and ASReml statistical software. Over the last 15 years he has taught ASReml workshops for companies and university researchers around the world.

Dr. Gezan has worked on agronomy, aquaculture, forestry, entomology, medical, biological modelling, and with many commercial breeding programs, applying traditional and molecular statistical tools. His research has led to more than 100 peer reviewed publications, and he is one of the co-authors of the textbook “Statistical Methods in Biology: Design and Analysis of Experiments and Regression”.
