What’s the difference between fixed and random effects meta-analysis?

If you’re familiar with the meta-analysis literature you will have heard of ‘fixed’ and ‘random’ effect meta-analysis (Borenstein et al., 2009; Gurevitch and Hedges, 1999; Gurevitch et al., 2018; Koricheva et al., 2013; Nakagawa and Santos, 2012; Nakagawa et al., 2017). Understanding the differences between these two models is fundamentally important to understanding meta-analysis more generally. Also, once you understand the difference you’ll realise why fixed-effect meta-analyses are not something we should normally apply in ecology and evolution. The assumptions are not too sensible when we are talking about highly variable meta-analytic data taken from many, many species. In fact, even a random effects meta-analysis is not going to cut it in most cases (more on that later).

Despite the limitations, these models have an important place in the meta-analytic literature. They are also excellent stepping stones to understanding the assumptions inherent to meta-analytic models and how we can build these models more appropriately for our kind of data.

This tutorial will walk you through the differences between fixed and random effect models with a focus on understanding how we are synthesising different effect sizes across studies. At the end of this tutorial I hope you understand:

  1. The distinct difference between fixed and random effect meta-analytic models
  2. How to fit these models using metafor (the most popular package for meta-analysis in R)
  3. How to interpret the output from each model
  4. How you can reproduce the results of the metafor models yourself

Simulating some meta-analytic data

To demonstrate, we can simulate some data. This is also quite useful for getting a handle on what the difference between fixed and random effects meta-analytic models looks like. We’ll explore the data in depth as we move through the tutorial.

# Generate some simulated effect size data with known sampling variance assumed
# to come from a common underlying distribution
set.seed(86)  # Set seed so that we all get the same simulated results
# We will have 5 studies
stdy <- 1:5
# We know the variance for each effect
Ves <- c(0.05, 0.1, 0.02, 0.1, 0.09)
# We'll need this later but these are weights
W <- 1/Ves
# We assume they are sampled from a normal distribution with a mean effect size
# of 2
es <- rnorm(length(Ves), 2, sqrt(Ves))
# Data for our fixed effect meta-analysis
dataFE <- data.frame(stdy = stdy, es, Ves)

# Generate a second set of effect sizes, but now assume that each study effect
# does not come from the same distribution, but from a population of effect
# sizes.

# Adding 0.8 inside sqrt() introduces between-study variability: each study's
# true effect is drawn from a distribution of effect sizes with a between-study
# variance of 0.8, on top of its own sampling variance.
esRE <- rnorm(length(Ves), 2, sqrt(Ves + 0.8))
# Data for our random effect meta-analysis
dataRE <- data.frame(stdy = stdy, esRE, Ves)

We can take a look at what the two datasets we simulated above look like (Figure 1). The black circles are the effect sizes and their standard errors (square root of the sampling error variance) from our fixed effect meta-analysis (FE) data set. In contrast, the red circles are the effect sizes and standard errors from the data generated in the random effect meta-analysis (RE) dataset. The black line is the average, true effect size (which we set to the value 2).


Figure 1: Mean (arrows are sampling standard deviation) effect size for each study. Data simulated under a fixed effect model in black and data simulated under a random effect model in red.
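If you would like to recreate something like Figure 1 yourself, here is a minimal base R sketch (the styling of the original figure may differ; the small horizontal offset simply stops the two datasets overplotting):

# Sketch of Figure 1: effect sizes with arrows showing +/- 1 sampling standard deviation
se <- sqrt(Ves)  # sampling standard deviations (identical for both datasets)
plot(stdy, dataFE$es, pch = 19, xlab = "Study", ylab = "Effect size",
     ylim = range(c(dataFE$es - se, dataFE$es + se, dataRE$esRE - se, dataRE$esRE + se)))
arrows(stdy, dataFE$es - se, stdy, dataFE$es + se,
       angle = 90, code = 3, length = 0.05)
points(stdy + 0.1, dataRE$esRE, pch = 19, col = "red")
arrows(stdy + 0.1, dataRE$esRE - se, stdy + 0.1, dataRE$esRE + se,
       angle = 90, code = 3, length = 0.05, col = "red")
abline(h = 2)  # the true mean effect size we simulated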

We notice a few important differences here. In the RE data set the variance across the studies is a lot larger compared to the FE data set. This is what we would expect because we have added in a between study variance. Now, let’s use a common meta-analysis package, metafor, to analyse these data sets.

Fixed Effect Meta-Analysis

What is a fixed effect meta-analysis anyway? If it wasn’t clear from the above simulated data, let’s be more formal. We simulated data according to the following model, a fixed effect meta-analytic model:

\[ y_{i} = \mu + m_{i} \\ m_{i} \sim N(0, v_{i}) \] where \(y_{i}\) is the ith effect size estimate and \(m_{i}\) is the sampling error (deviation from \(\mu\)) for effect size i. Sampling deviations are assumed to be drawn from a normal distribution (N) with a mean of 0 and a sampling variance, \(v_{i}\). You may notice from this model, and the simulated data, that we are assuming there is a single overall mean, \(\mu\), underlying the pooled effect sizes, that effects are independently and identically distributed, and that the only deviation from \(\mu\) is the result of sampling error, which is related to the sample size (and hence power) of each study. There are no additional sources of variance added to effect size estimates (Figure 2a).

Fixed effect meta-analysis with metafor

Now that we understand the fixed effect model, both from formal statistical notation and from looking at how the data were generated, let’s fit a fixed effect model in metafor.

# Run a fixed effect meta-analysis using the FE dataset.
metafor::rma(yi = es, vi = Ves, method = "FE", data = dataFE)
## 
## Fixed-Effects Model (k = 5)
## 
## I^2 (total heterogeneity / total variability):   0.00%
## H^2 (total variability / sampling variability):  0.56
## 
## Test for Heterogeneity:
## Q(df = 4) = 2.2340, p-val = 0.6928
## 
## Model Results:
## 
## estimate      se     zval    pval   ci.lb   ci.ub     ​ 
##   2.0731  0.0994  20.8459  <.0001  1.8782  2.2680  *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

How to interpret the model output?

Task

What do the model results and Q values (under Test for Heterogeneity) mean?


Answer

The output suggests that the pooled estimate is 2.07 and that it has a standard error of 0.1. It also provides us with a Q statistic, which measures the amount of heterogeneity among effect sizes. A large Q with a small p-value would suggest substantial heterogeneity among effect sizes, beyond what we would expect if these effects were generated from a common underlying distribution. A common underlying distribution is the major assumption of a fixed effect meta-analysis, and here the assumption holds: Q is small and the p-value is large. This is a good thing, because we specifically generated these data from a single distribution. We can see this if we look at how we generated the data: rnorm(length(Ves), 2, sqrt(Ves)). This draws random effect sizes from a normal distribution with a standard deviation (sqrt(Ves)) for each effect size that is defined only by its sampling variability.


Fixed effect meta-analysis by hand!

What is the model above doing, though? And what is the logic behind the calculations? The best way to understand this is to hand calculate these values. Basically, the effect sizes are weighted by the inverse of their sampling error variance when deriving the pooled estimate and its variance.
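In equation form, the pooled estimate and its standard error under a fixed effect model are (this is standard inverse-variance pooling, with weights \(w_{i} = 1/v_{i}\), the W we defined when simulating the data):

\[ \hat{\mu} = \frac{\sum_{i} w_{i} y_{i}}{\sum_{i} w_{i}}, \qquad SE(\hat{\mu}) = \sqrt{\frac{1}{\sum_{i} w_{i}}} \]

Let’s calculate this by hand and see what’s happening: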

# Calculate pooled effect size
EsP.FE <- sum(W * dataFE$es)/sum(W)
EsP.FE
## [1] 2.073
# Calculate the pooled variance around estimate
VarEsP.FE <- 1/sum(W)
VarEsP.FE
## [1] 0.00989
# Calculate the standard error around estimate
SE.EsP.FE <- sqrt(VarEsP.FE)
SE.EsP.FE
## [1] 0.09945

Wow! This is so cool. We just did a fixed effect meta-analysis by hand…and, look, it matches the model output perfectly. The math is not so scary after all! But what about this extra statistic, Q? How do we derive this?
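Q is simply the weighted sum of squared deviations of each effect size from the pooled estimate; the second form below is the computational shortcut used in the code:

\[ Q = \sum_{i} w_{i} (y_{i} - \hat{\mu})^{2} = \sum_{i} w_{i} y_{i}^{2} - \frac{\left(\sum_{i} w_{i} y_{i}\right)^{2}}{\sum_{i} w_{i}} \]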

# Now let's calculate our Q. Q is the total amount of heterogeneity in our data.
Q.fe <- sum(W * (dataFE$es^2)) - (sum(W * dataFE$es)^2/sum(W))
Q.fe
## [1] 2.234
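As an aside, under the assumption of homogeneity Q follows a chi-squared distribution with k - 1 degrees of freedom, so we can also recover the p-value that metafor reported:

# Under homogeneity Q is approximately chi-squared with k - 1 degrees of freedom
pchisq(Q.fe, df = length(dataFE$es) - 1, lower.tail = FALSE)
## [1] 0.6928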

Cool. So this value also matches up nicely. If you didn’t catch all that, we can summarise it in Table 1 below.

Random Effect Meta-Analysis

Now let’s turn to the formal definition of a random effects meta-analysis. At the outset it looks very similar to our fixed effect model, but with a critically important addition: \[ y_{i} = \mu + s_{i} + m_{i} \\ m_{i} \sim N(0, v_{i}) \\ s_{i} \sim N(0, \tau^2) \] Again, \(y_{i}\) is the ith effect size estimate and \(m_{i}\) is the sampling error (deviation from \(\mu\)) for effect size i. Sampling deviations are assumed to be drawn from a normal distribution (N) with a mean of 0 and a sampling variance, \(v_{i}\). However, you will notice that we now have an additional source of variation, \(s_{i}\), the study-specific deviation. We are now assuming that every study has its own mean effect size, which is itself sampled from a normal distribution with a between-study variance, \(\tau^2\) (read, tau-squared).

You’ll notice from this model, and the simulated data, that we are again assuming there is a single overall mean and that effects are independently and identically distributed. However, now any deviation from the true mean \(\mu\) is the result not only of sampling error (\(m_{i}\)) but also of studies varying in their true mean effect sizes (\(s_{i}\)). Beyond these two sources, there are no additional sources of variance added to effect size estimates (Figure 2b).
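A useful consequence of this model (and the reason the weights change in the hand calculations below) is that the total variance of each effect size is now the sum of the between-study and sampling variances, which is exactly how we simulated the RE data (sqrt(Ves + 0.8)):

\[ \mathrm{Var}(y_{i}) = \tau^2 + v_{i} \]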


Figure 2: Visualisation of a fixed (a) and random effects (b) meta-analysis. From Nakagawa et al. 2017.

Random effect meta-analysis using metafor

Now let’s move on to a more complicated situation: a random effect meta-analysis using the RE dataset. Remember, in this data set each study’s true effect is itself drawn from a distribution of effect sizes. How might Q change? (Hint: We expect it to get larger!)

metafor::rma(yi = esRE, vi = Ves, method = "REML", data = dataRE)
## 
## Random-Effects Model (k = 5; tau^2 estimator: REML)
## 
## tau^2 (estimated amount of total heterogeneity): 0.3331 (SE = 0.2843)
## tau (square root of estimated tau^2 value):      0.5771
## I^2 (total heterogeneity / total variability):   85.22%
## H^2 (total variability / sampling variability):  6.76
## 
## Test for Heterogeneity:
## Q(df = 4) = 24.4015, p-val < .0001
## 
## Model Results:
## 
## estimate      se    zval    pval   ci.lb   ci.ub     ​ 
##   2.0199  0.2837  7.1199  <.0001  1.4639  2.5759  *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interpreting what happened!

Task

What does it mean that Q increased in our random effect model compared to our fixed effect model?


Answer

That’s because we added extra variability to the mix: each effect size now comes from a distribution of true effect sizes. Because every study can have its own true mean, effect sizes can differ substantially from one another, and this adds more variance (i.e. heterogeneity) to our data overall.


Random effect meta-analysis by hand!

Now let’s see if we can reproduce these results. For this model we need to add the between-study variance to our weighting of the effect sizes. To do this, we need to estimate how much heterogeneity there is between studies above what we would expect from sampling variability alone. This is the \(\tau^2\) statistic, the variance in the ‘true’ effect sizes. To estimate it we first calculate Q using the same inverse sampling-variance weights as before (these don’t change because the sampling variances are known).
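The estimator we use below is the DerSimonian and Laird (method of moments) estimator:

\[ \hat{\tau}^{2} = \frac{Q - (k - 1)}{C}, \qquad C = \sum_{i} w_{i} - \frac{\sum_{i} w_{i}^{2}}{\sum_{i} w_{i}} \]

where \(k\) is the number of effect sizes (so \(k - 1\) is the degrees of freedom). In practice, \(\hat{\tau}^{2}\) is set to zero if this quantity comes out negative.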

# Calculate our Q statistic again
Q <- sum(W * (dataRE$esRE^2)) - (sum(W * dataRE$esRE)^2/sum(W))
Q
## [1] 24.4
# Calculate tau2
C <- sum(W) - ((sum(W^2))/sum(W))
C
## [1] 69.23
# Calculate df
df <- nrow(dataRE) - 1
T2 <- (Q - df)/C
T2
## [1] 0.2947

Now we can re-calculate our weights by adding in the between-study heterogeneity.
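These are the same inverse-variance formulas as before, but with \(\hat{\tau}^{2}\) added to each sampling variance:

\[ w_{i}^{*} = \frac{1}{\hat{\tau}^{2} + v_{i}}, \qquad \hat{\mu} = \frac{\sum_{i} w_{i}^{*} y_{i}}{\sum_{i} w_{i}^{*}}, \qquad SE(\hat{\mu}) = \sqrt{\frac{1}{\sum_{i} w_{i}^{*}}} \]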

# RE weights
W.re <- 1/(T2 + dataRE$Ves)

# Pooled effect size for random effect meta-analysis
esPoolRE <- sum(W.re * dataRE$esRE)/sum(W.re)
esPoolRE
## [1] 2.016
# Calculate the pooled variance around estimate
VarES <- 1/sum(W.re)

# Calculate the standard error around estimate
SE.ES.RE <- sqrt(VarES)
SE.ES.RE
## [1] 0.2697

Now we can compare the metafor model output with our hand calculations (Table 2).

Different estimators give different answers

Those of you who were paying close attention will have noticed that the numbers from our hand calculations and our metafor model don’t actually match up (Table 2). Well spotted!

What happened above, then? Why don’t our numbers match the metafor output? I thought we knew what we were doing! Our Q statistic is correct, but \(\tau^2\), our mean estimate, and its standard error are all different. Why?

This is because the metafor model we ran uses a different estimation method (REML, restricted maximum likelihood) to estimate these statistics than our hand calculations do. But we are awfully close on all our estimates in any case (see Table 2)! However, we can re-run the model using the same estimator as our hand calculations (the DerSimonian and Laird estimator, method = "DL") and get a nice match between our calculations and the model.

metafor::rma(yi = esRE, vi = Ves, method = "DL", data = dataRE)
## 
## Random-Effects Model (k = 5; tau^2 estimator: DL)
## 
## tau^2 (estimated amount of total heterogeneity): 0.2947 (SE = 0.2731)
## tau (square root of estimated tau^2 value):      0.5429
## I^2 (total heterogeneity / total variability):   83.61%
## H^2 (total variability / sampling variability):  6.10
## 
## Test for Heterogeneity:
## Q(df = 4) = 24.4015, p-val < .0001
## 
## Model Results:
## 
## estimate      se    zval    pval   ci.lb   ci.ub     ​ 
##   2.0163  0.2697  7.4753  <.0001  1.4876  2.5449  *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now if we compare the metafor model with our hand calculations we see they match exactly! (Table 3).

Conclusions

Hopefully you now have a better understanding of the differences between fixed and random effect meta-analysis. It should also be clear that meta-analytic models can be understood in detail, but that results can vary depending on the assumptions and estimators we use to obtain pooled estimates. It’s also comforting to know that we can do the analysis by hand!

References

Borenstein, M., Hedges, L. V., Higgins, J. P. T. and Rothstein, H. R. (2009). Introduction to meta-analysis. John Wiley & Sons, Ltd, West Sussex, UK.
Gurevitch, J. and Hedges, L. V. (1999). Statistical issues in ecological meta-analyses. Ecology 80, 1142–1149.
Gurevitch, J., Koricheva, J., Nakagawa, S. and Stewart, G. (2018). Meta-analysis and the science of research synthesis. Nature 555, 176–182.
Koricheva, J., Gurevitch, J. and Mengersen, K. (2013). Handbook of meta-analysis in ecology and evolution. Princeton University Press, Princeton, New Jersey.
Nakagawa, S. and Santos, E. S. (2012). Methodological issues and advances in biological meta-analysis. Evolutionary Ecology 26, 1253–1274.
Nakagawa, S., Noble, D. W. A., Senior, A. M. and Lagisz, M. (2017). Meta-evaluation of meta-analysis: Ten appraisal questions for biologists. BMC Biology 15, 18. DOI: 10.1186/s12915-017-0357-7.


Session Information

R version 4.0.3 (2020-10-10)

Platform: x86_64-apple-darwin17.0 (64-bit)

attached base packages: stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: magick(v.2.7.2), vembedr(v.0.1.5), equatags(v.0.1.0), mathjaxr(v.1.6-0), pander(v.0.6.4), orchaRd(v.2.0), forcats(v.0.5.1), stringr(v.1.4.0), dplyr(v.1.0.9), purrr(v.0.3.4), readr(v.2.1.2), tidyr(v.1.2.0), tibble(v.3.1.6), ggplot2(v.3.3.5), tidyverse(v.1.3.1), flextable(v.0.6.6), metafor(v.3.4-0), metadat(v.0.2-0) and Matrix(v.1.3-4)

loaded via a namespace (and not attached): nlme(v.3.1-152), fs(v.1.5.2), lubridate(v.1.8.0), httr(v.1.4.2), tools(v.4.0.3), backports(v.1.4.1), bslib(v.0.3.1), utf8(v.1.2.2), R6(v.2.5.1), DBI(v.1.1.2), colorspace(v.2.0-3), withr(v.2.5.0), tidyselect(v.1.1.2), curl(v.4.3.2), compiler(v.4.0.3), cli(v.3.3.0), rvest(v.1.0.2), formatR(v.1.11), pacman(v.0.5.1), xml2(v.1.3.3), officer(v.0.3.18), bookdown(v.0.22), sass(v.0.4.1), scales(v.1.2.0), rappdirs(v.0.3.3), systemfonts(v.1.0.2), digest(v.0.6.29), rmarkdown(v.2.13), base64enc(v.0.1-3), pkgconfig(v.2.0.3), htmltools(v.0.5.2), highr(v.0.9), dbplyr(v.2.1.1), fastmap(v.1.1.0), rlang(v.1.0.2), readxl(v.1.4.0), rstudioapi(v.0.13), jquerylib(v.0.1.4), generics(v.0.1.2), jsonlite(v.1.8.0), zip(v.2.2.0), magrittr(v.2.0.3), Rcpp(v.1.0.8.3), munsell(v.0.5.0), fansi(v.1.0.3), gdtools(v.0.2.3), lifecycle(v.1.0.1), stringi(v.1.7.6), yaml(v.2.3.5), grid(v.4.0.3), crayon(v.1.5.1), lattice(v.0.20-44), haven(v.2.4.3), hms(v.1.1.1), knitr(v.1.39), klippy(v.0.0.0.9500), pillar(v.1.7.0), uuid(v.1.0-4), reprex(v.2.0.1), xslt(v.1.4.3), glue(v.1.6.2), evaluate(v.0.15), data.table(v.1.14.2), remotes(v.2.4.2), modelr(v.0.1.8), png(v.0.1-7), vctrs(v.0.4.1), tzdb(v.0.3.0), cellranger(v.1.1.0), gtable(v.0.3.0), assertthat(v.0.2.1), locatexec(v.0.1.1), xfun(v.0.30), broom(v.0.7.12) and ellipsis(v.0.3.2)