Meta-analysis quantitatively aggregates effect size data collected from existing research. The effect size used to address a study question needs to be chosen carefully, because each one makes certain assumptions about the underlying data. It’s important to be wary of these assumptions and to identify when something might have gone wrong.
The effect size chosen must satisfy two important requirements. First, it must be comparable across studies: effect sizes must be placed on the same scale so that they can be aggregated. Some effect size estimates can be converted into one another (e.g., Zr, Hedges’ g), but this isn’t the case for all of them. Second, we must be able to derive or approximate the sampling variance for a given effect size. As already indicated, the sampling variance is essential if one wishes to use meta-analytic models that account for it, and it is needed to report certain measures of heterogeneity.
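As a minimal sketch of both requirements, the base R snippet below places a correlation on the Zr scale, approximates its sampling variance, and converts the same correlation to a standardised mean difference. The values of `r` and `n` are hypothetical. The r-to-d conversion and the small-sample correction `J` are the standard textbook formulas, assumed here for illustration and not taken from this tutorial.

```r
# Hypothetical inputs: a correlation from one study and its sample size
r <- 0.5
n <- 30

# Fisher's z transformation and its approximate sampling variance
zr   <- atanh(r)          # identical to 0.5 * log((1 + r) / (1 - r))
v_zr <- 1 / (n - 3)

# Convert r to a standardised mean difference (d), then apply the
# small-sample correction J to approximate Hedges' g.
# Assumption: degrees of freedom are n - 2 (two groups, n in total)
d <- 2 * r / sqrt(1 - r^2)
J <- 1 - 3 / (4 * (n - 2) - 1)
g <- J * d

# Back-transform Zr to recover the original correlation
r_back <- tanh(zr)

round(c(Zr = zr, v_Zr = v_zr, d = d, g = g, r = r_back), 3)
```

Because both Zr and its variance depend only on `r` and `n`, any study reporting a correlation and a sample size can be placed on this common scale.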
Meta-analyses in comparative physiology, and indeed in ecology and evolution more generally, often use highly heterogeneous data (Noble et al., 2022; Senior et al., 2016). This poses further challenges when aggregating effect size data and can make their comparability questionable. We’ll discuss some of these issues in later tutorials and identify ways in which some of this heterogeneity can be dealt with when deriving the effect size itself.
By the end of this tutorial, we hope that you will feel comfortable:
What is an effect size? The answer to this question depends on what kind of study one wishes to synthesise. For example, an effect size could be a contrast between experimental treatments. We might extract mean differences from experiments manipulating, say, bisphenol-A (BPA) and looking at the effect of BPA on aquatic ectotherm phenotype relative to some control (Wu and Seebacher, 2020). Alternatively, the effect of interest might be the magnitude and direction of a correlation between two observed variables, such as metabolism and behaviour (Holtmann et al., 2016). At times, we might even just be interested in analysing the mean or variance itself.
“An effect size is a statistical parameter that can be used to compare, on the same scale, the results of different studies in which a common effect of interest has been measured” (Rosenberg et al., 2013)
There are many types of effect sizes that are commonplace in ecological and evolutionary research (Table 1 describes many of the more common ones from Noble et al., 2022).
| Effect Measure | Definition | Sampling Variance | Examples |
|---|---|---|---|
| Mean | | | CTmin, CTmax, LC50, LT50, Metabolic Rate (MR) |
| Log Standard Deviation, lnSD | | | Variability in CTmin, CTmax, Metabolic Rate (MR) |
| Log Response Ratio, lnRR | | | Ratio between pollutant-exposed treatment (e.g., BPA-exposed group) and control (no pollutant) |
| Standardised Mean Difference, SMD | | | Difference in immune response between males and females; performance difference in the presence of a stressor compared to its absence |
| Zr (Fisher transformation of the correlation coefficient, r) | | | Relationship between sex hormones and immune responses, or between metabolic rate and behaviour |
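For reference, the effect measures in Table 1 are commonly written as follows. This is a sketch using the standard textbook forms, which are assumed (not confirmed) to match the notation of Noble et al. (2022). Here $\bar{x}$, $SD$ and $n$ are a group's mean, standard deviation and sample size, subscripts 1 and 2 denote the two groups being contrasted, and $r$ is a correlation coefficient.

```latex
\begin{aligned}
\text{Mean} &= \bar{x}, &
v_{\text{Mean}} &= \frac{SD^2}{n} \\[4pt]
\ln SD &= \ln(SD) + \frac{1}{2(n-1)}, &
v_{\ln SD} &= \frac{1}{2(n-1)} \\[4pt]
\ln RR &= \ln\!\left(\frac{\bar{x}_1}{\bar{x}_2}\right), &
v_{\ln RR} &= \frac{SD_1^2}{n_1 \bar{x}_1^2} + \frac{SD_2^2}{n_2 \bar{x}_2^2} \\[4pt]
SMD &= \frac{\bar{x}_1 - \bar{x}_2}{SD_p}\, J, &
v_{SMD} &= \frac{n_1 + n_2}{n_1 n_2} + \frac{SMD^2}{2(n_1 + n_2)} \\[4pt]
Zr &= \frac{1}{2}\ln\!\left(\frac{1+r}{1-r}\right), &
v_{Zr} &= \frac{1}{n-3}
\end{aligned}

\qquad\text{where}\qquad
SD_p = \sqrt{\frac{(n_1-1)SD_1^2 + (n_2-1)SD_2^2}{n_1+n_2-2}},
\qquad
J = 1 - \frac{3}{4(n_1+n_2-2)-1}
```

Several of these variances have alternative approximations in the literature, so the exact form used by a given software package may differ slightly.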
We’ll use metafor to demonstrate how to calculate commonly used effect size estimates in the field of comparative physiology, including Z-transformed correlation coefficients (Zr), log response ratios and Hedges’ g.
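As a rough preview, the sketch below shows how metafor's `escalc()` function can compute these three effect sizes from summary statistics. The data frame `dat` and its column names are hypothetical and purely illustrative; they are not the tutorial's data.

```r
library(metafor)

# Hypothetical summary data for three studies (illustrative values only)
dat <- data.frame(
  study = c("A", "B", "C"),
  m1 = c(12.1, 10.4, 15.0), sd1 = c(2.0, 1.8, 3.1), n1 = c(10, 12, 8),  # treatment
  m2 = c(10.0,  9.9, 12.5), sd2 = c(2.2, 1.5, 2.9), n2 = c(10, 12, 8),  # control
  r  = c(0.30, 0.10, 0.45), n  = c(20, 24, 16)                          # correlations
)

# Log response ratio (lnRR); escalc() appends yi (effect size) and
# vi (sampling variance) columns to the data
lnrr <- escalc(measure = "ROM",
               m1i = m1, sd1i = sd1, n1i = n1,
               m2i = m2, sd2i = sd2, n2i = n2, data = dat)

# Standardised mean difference with small-sample correction (Hedges' g)
smd <- escalc(measure = "SMD",
              m1i = m1, sd1i = sd1, n1i = n1,
              m2i = m2, sd2i = sd2, n2i = n2, data = dat)

# Fisher's z-transformed correlation coefficient (Zr)
zr <- escalc(measure = "ZCOR", ri = r, ni = n, data = dat)

lnrr[, c("study", "yi", "vi")]
```

Each call returns the original data with `yi` and `vi` appended, which is the format that metafor's model-fitting functions expect.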
To start, we’ll need to load some packages and data. The data are all stored on GitHub and are downloaded directly from the web, so you won’t need to have the files on hand. We’ll first load the packages that we need.
# Install the pacman package if you haven't already done so
if (!requireNamespace("pacman", quietly = TRUE)) install.packages("pacman")

# Load packages (pacman installs any that are missing and available on CRAN)
pacman::p_load(metafor, flextable, tidyverse, orchaRd, pander, mathjaxr, equatags)

# To render equations you may first need to run equatags::mathjax_install()