Meta-analysts have worked hard to develop tools for understanding different forms of publication practices and biases within the scientific literature. Such biases can occur if studies reporting non-significant results, or results opposite to what was predicted, are never published and so are not found in systematic searches (i.e., the ‘file-drawer’ problem; Jennions et al., 2013). Alternatively, biases can result from selective reporting or ‘p-hacking’.
Visual and quantitative tools have been developed to try to identify and ‘correct’ for such biases in meta-analytic results (Jennions et al., 2013; Nakagawa et al., 2022; Rothstein et al., 2005). Having said that, aside from working hard to incorporate ‘gray literature’ (unpublished theses, government reports, etc.) and to include work published in languages other than English, there is little one can truly do to counteract publication bias beyond a few simple tools (all of which have limitations of their own). In many cases we cannot know for certain what has not been published, or how a sample of existing work on a topic might be biased. Nonetheless, exploring the possibility of publication bias and its possible effects on conclusions is a core component of meta-analysis (O’Dea et al., 2021).
In this tutorial, we’ll overview some ways we can attempt to understand whether publication bias is present using visual tools. In the next tutorial, we will cover some analytical approaches that can be used as sensitivity analyses to explicitly test whether publication bias is present and to estimate what the effect size would be if it did not exist. Of course, we will often never know whether such biases truly exist, and high heterogeneity can produce apparent publication bias when none exists. The goal here is to run a formal thought experiment: if publication bias were to exist, what form would it be expected to take, and how would our conclusions change if we had access to all available studies regardless of significance or power?
We’re going to have a look at a meta-analysis by Arnold et al. (2021) that explores the relationship between resting metabolic rate and fitness in animals. Publication bias is somewhat subtle in this particular meta-analysis, but it does appear to be present in some form, both visually and analytically. We’ll start off this tutorial by visually exploring for evidence of publication bias and discussing what it might look like and why.
# Packages
pacman::p_load(tidyverse, metafor, orchaRd)
# Download the data from the paper's GitHub repository
arnold_data <- read.csv("https://raw.githubusercontent.com/pieterarnold/fitness-rmr-meta/main/MR_Fitness_Data_revised.csv")
# Exclude rows with NAs in the sample size (n.rep) or correlation (r) columns
arnold_data <- arnold_data[complete.cases(arnold_data$n.rep) & complete.cases(arnold_data$r), ]
# Calculate the effect size: Fisher's z-transformed correlation (ZCOR) and its sampling variance
arnold_data <- metafor::escalc(measure = "ZCOR", ri = r, ni = n.rep, data = arnold_data,
    var.names = c("Zr", "Zr_v"))
# Let's subset to endotherms (Mammalia and Aves) for demonstration purposes
arnold_data_endo <- arnold_data %>%
    mutate(endos = ifelse(Class %in% c("Mammalia", "Aves"), "endo", "ecto")) %>%
    filter(endos == "endo" & Zr <= 3)  # Note: one extreme outlier (Zr > 3) was removed in the paper
# Add an observation-level identifier (used as a residual random effect)
arnold_data_endo$obs <- 1:nrow(arnold_data_endo)
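To see what the observation-level identifier is for, here is a minimal sketch of fitting a multilevel meta-analytic model with metafor::rma.mv(), using obs as a residual (observation-level) random effect. This is not the full model reported in the paper, which likely includes additional random effects (e.g., species or study); it is only meant to show how the column is used.
# Minimal sketch (not the paper's full model): multilevel meta-analysis of the
# Fisher's z effect sizes with an observation-level (residual) random effect
mlm_endo <- metafor::rma.mv(yi = Zr, V = Zr_v,
    random = ~1 | obs,
    data = arnold_data_endo)
summary(mlm_endo)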
Funnel plots are by far the most common visual tool for assessing the possibility of publication bias (Nakagawa et al., 2022). Like any exploratory analysis, they are just that: visual tools. Let’s have a look at a funnel plot of the data. Funnel plots plot the effect size (x-axis) against some measure of uncertainty around the effect size, such as its sampling variance or precision (y-axis). While many other types of plots exist for exploring the possibility of publication bias (Jennions et al., 2013; Nakagawa et al., 2022; Rothstein et al., 2005), we will only cover this more common type.
If no publication bias exists, we would expect the plot to look fairly symmetrical and funnel shaped (hence the name funnel plot!). The shape is a funnel because the sampling variance is expected to decrease (or the precision to increase) as the sample size, and thus power, increases. These ‘high-powered’ studies sit at the top of the funnel, in the narrow-necked region, so to speak, because we expect the effect sizes from these studies to fluctuate very little due to the sampling process. In contrast, as the power of studies decreases, and therefore their sampling variance increases, we expect the spread of effect sizes to increase, simply because small sample sizes result in greater variability of effects and in effects that are larger in magnitude (by chance alone).
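Before looking at the real data, a quick simulation can make this intuition concrete. The sketch below assumes a true correlation of r = 0.2, no publication bias, and uses the fact that the sampling variance of Fisher’s z is 1/(n − 3); the simulated values are not from Arnold et al. (2021). The points should form a roughly symmetrical funnel that narrows as precision increases.
# Simulation sketch: effect sizes under no publication bias (assumed true r = 0.2)
set.seed(42)
n_sim <- 500
n_i <- sample(10:300, n_sim, replace = TRUE)  # simulated sample sizes
zr_true <- atanh(0.2)  # true effect on Fisher's z scale
vi_i <- 1/(n_i - 3)  # sampling variance of Zr
zr_i <- rnorm(n_sim, mean = zr_true, sd = sqrt(vi_i))  # sampling error shrinks with n
metafor::funnel(x = zr_i, vi = vi_i, yaxis = "seinv", las = 1,
    xlab = "Simulated Fisher's z (Zr)")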
# Let's make a funnel plot to visualize the data in relation to precision
# (the inverse sampling standard error)
metafor::funnel(x = arnold_data_endo$Zr, vi = arnold_data_endo$Zr_v, yaxis = "seinv",
    digits = 2, level = c(0.1, 0.05, 0.01), shade = c("white", "gray55", "gray75"),
    las = 1, xlab = "Correlation Coefficient (r)", atransf = tanh, legend = TRUE)
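The choice of y-axis affects how any asymmetry appears. Purely as a point of comparison (this figure is not part of the original analysis), the same data can also be plotted with the sampling standard error on the y-axis, which is metafor’s default.
# Same data, but with the sampling standard error on the y-axis (metafor's default),
# shown only to compare how the funnel's appearance changes with the axis choice
metafor::funnel(x = arnold_data_endo$Zr, vi = arnold_data_endo$Zr_v, yaxis = "sei",
    digits = 2, level = c(0.1, 0.05, 0.01), shade = c("white", "gray55", "gray75"),
    las = 1, xlab = "Correlation Coefficient (r)", atransf = tanh, legend = TRUE)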