We have already introduced the power of multilevel meta-analytic models for dealing with non-independence arising from, for example, shared species, phylogeny and study (Hadfield and Nakagawa, 2009; Nakagawa and Santos, 2012; Noble et al., 2017), but meta-analyses often contain even more complex forms of non-independence.

We might have many effect size estimates from a single study because many traits are measured on the **same sample of organisms**, or because many treatments are applied and each is compared with the same control, giving **effect sizes that share a common control** (Noble et al., 2017). This adds complexity to the data set and also results in special types of non-independence that are somewhat unique to meta-analysis, especially when using contrast-based effect sizes such as Hedges’ g, log response ratios and log odds ratios (Noble et al., 2017).

In this tutorial, we’ll review some of these unique forms of non-independence and discuss some of the ways they can be dealt with. Often there are no simple solutions, because the information collected to derive effect sizes is frequently missing the details that would allow some forms of dependence to be handled. Thankfully, many meta-analysts have thought about these problems for some time and there are solutions that work reasonably well (Gleser and Olkin, 2009; Hedges, 2019; Hedges et al., 2010; Lajeunesse, 2011; Pustejovsky and Tipton, 2021; Tipton, 2013).

Let’s briefly review why non-independence can be an issue and why simple multilevel meta-analytic models do not always deal with the problem sufficiently well. Recall our multilevel meta-analytic (MLMA) model from the last tutorial:

\[ y_{i} = \mu + s_{j[i]} + spp_{k[i]} + e_{i} + m_{i} \\ m_{i} \sim N(0, v_{i}) \\ s_{j} \sim N(0, \tau^2) \\ spp_{k} \sim N(0, \sigma_{k}^2) \\ e_{i} \sim N(0, \sigma_{e}^2) \]

We hid some important notation to avoid drawing attention to it initially. The notation we neglected to mention is now included below:

\[ m_{i} \sim N(0, v_{i}\textbf{I}) \\ s_{j} \sim N(0, \tau^2\textbf{I}) \\ spp_{k} \sim N(0, \sigma_{k}^2\textbf{I}) \\ e_{i} \sim N(0, \sigma_{e}^2\textbf{I}) \]

You’ll notice that we added in \(\bf{I}\), but what is \(\bf{I}\)? \(\bf{I}\) is called the identity matrix. It is a matrix that has rows and columns equal to the number of effect size values (if at the effect level) or the number of species or studies (if at the study or species-level) in the data set. The identity matrix contains 1’s on the diagonal and 0’s in the off-diagonals. For a data set with 3 effect size estimates the identity matrix looks like this:

\[ \textbf{I} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

What is the significance of multiplying the \(\bf{I}\) matrix by the variance for each random effect above?

The short story is that we are assuming the random effects at each level are independent and identically distributed. To see why, think about what happens when we multiply one of the variances above, say \(\sigma_{e}^2\), by this matrix. Along the diagonal, we get the exact same variance for every single effect size (or, at the study and species levels, for every cluster). This tells us that they are all sampled from the same distribution, one with mean 0 and variance \(\sigma_{e}^2\).

When we multiply the off-diagonals by the variance, however, we get zero. These zero off-diagonals tell us straight away that the random effects being sampled are assumed to be independent of one another; otherwise the off-diagonals would be non-zero (i.e., there would be a covariance, and the corresponding correlation would differ from 0). This seems sensible. All effect size estimates from the same study have the same study ‘effect’ added to them, which makes them more similar to each other by virtue of being from the same study (Fig. 1b from Noble et al. (2017)). This does, in some sense, deal with non-independence, but not completely!
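To make the role of the identity matrix concrete, here is a minimal sketch in plain Python. The function names, the variance values and the three-effect, two-study layout are all hypothetical and chosen only for illustration. It first builds \(\tau^2\textbf{I}\) for the study-level effects, then shows the covariance among effect sizes that the shared study effect and the effect-level term imply (sampling error is left out to keep the example small).

```python
def scaled_identity(n, variance):
    """Return variance * I as an n x n nested list."""
    return [[variance if r == c else 0.0 for c in range(n)] for r in range(n)]


def implied_covariance(study_of_effect, tau2, sigma2_e):
    """Covariance among effect sizes implied by a shared study effect
    (variance tau2) plus an independent effect-level term (variance sigma2_e).
    Sampling error is deliberately ignored here."""
    n = len(study_of_effect)
    cov = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            if study_of_effect[a] == study_of_effect[b]:
                cov[a][b] += tau2      # same study: both share s_j
            if a == b:
                cov[a][b] += sigma2_e  # effect-level variance, diagonal only
    return cov


tau2 = 0.25      # hypothetical between-study variance
sigma2_e = 0.5   # hypothetical effect-level variance

# Study-level random effects: tau2 on the diagonal, zeros elsewhere.
study_cov = scaled_identity(2, tau2)

# Effects 0 and 1 come from study 0; effect 2 comes from study 1.
study_of_effect = [0, 0, 1]
effect_cov = implied_covariance(study_of_effect, tau2, sigma2_e)

for row in effect_cov:
    print(row)
```

Every pair of effect sizes within a study receives the same covariance (\(\tau^2\)), no matter which traits, time points or controls they involve. That is the sense in which the study random effect handles non-independence only partially.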

Our independence assumption would likely be wrong if, for example, we have effect sizes measured on different traits of the same animals and those traits vary in how strongly they are correlated with each other. The same is true if our effect sizes were collected at different times or spatial locations that vary in the extent to which they are correlated. In fact, species themselves resemble each other to different degrees depending on how long ago they shared a common ancestor. This is why we include evolutionary relationships among species in our meta-analytic models (Fig. 1a from Noble et al. (2017)); we’ll discuss this in the phylogeny tutorial (Chamberlain et al., 2012; Hadfield and Nakagawa, 2009; Nakagawa and Santos, 2012). In these cases, we need to properly model the correlation among effect size values induced by these processes.

But, wait, there’s more. In fact, things can get very complicated, as outlined by Noble et al. (2017) (see their Fig. 1). If we have used the same data to calculate multiple effect size values, then we are also inducing a correlation among their sampling errors (Gleser and Olkin, 2009; Hedges, 2019; Lajeunesse, 2011; Noble et al., 2017). That could be a serious problem because the inverse sampling variances are what we use as weights in our analysis. If there are correlations among sampling errors, then we need to account for them in our sampling variance matrix. We can do this by modifying the distribution of our sampling errors:

\[ m_{i} \sim N(0, \textbf{M}) \]

The new sampling variance matrix, \(\textbf{M}\), has the sampling variances along the diagonal, which we know because we can calculate them. But how do we know what covariances between the sampling errors to put in the off-diagonals of \(\textbf{M}\)? Luckily, there are some approximations that we can use to construct the entire \(\textbf{M}\) sampling (co)variance matrix (Gleser and Olkin, 2009; Hedges, 2019; Lajeunesse, 2011).
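As a rough illustration of what such a matrix looks like, the sketch below fills the off-diagonals under one simple working assumption: sampling errors from the same study share a single, user-chosen correlation `rho`, so that the covariance between effects \(i\) and \(j\) is \(\rho\sqrt{v_{i}v_{j}}\). This is a simplification for illustration, not the exact formulas derived in the papers cited above, and the function name, the value of `rho` and the example data are hypothetical.

```python
import math


def build_sampling_vcov(variances, study_ids, rho):
    """Build a sampling (co)variance matrix M.

    Diagonal: the known sampling variances v_i.
    Off-diagonal: rho * sqrt(v_i * v_j) when two effects come from the
    same study (an assumed constant correlation), otherwise 0.
    """
    n = len(variances)
    M = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            if a == b:
                M[a][b] = variances[a]
            elif study_ids[a] == study_ids[b]:
                M[a][b] = rho * math.sqrt(variances[a] * variances[b])
    return M


v = [0.04, 0.09, 0.16]        # hypothetical sampling variances
study = ["A", "A", "B"]       # first two effects share a study
M = build_sampling_vcov(v, study, rho=0.5)  # rho = 0.5 is an assumption

for row in M:
    print([round(x, 4) for x in row])
```

Because `rho` is rarely known, it is sensible to repeat the analysis across a range of plausible values and check how much the conclusions change.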

You might be thinking, so what? Yes, we have all these sources of non-independence, but who cares? Why is it important that we seriously consider the dependency? There are a few reasons:

- First, we want to make sure we get the ‘weight’ matrix correct so that we get a good estimate of the overall meta-analytic mean.
- Second, ignoring the dependency structure will typically result in narrower confidence intervals, giving us the false impression that we have much greater confidence in the mean estimate than we actually have.
- Third, and this relates to the second point, it will distort our inferential statistics. We are more likely to make Type I errors. Smaller standard errors result in larger test statistics, and this, in combination with our inflated degrees of freedom, will tell us our results are significant when in fact they are not.
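The second and third points can be illustrated with a small numerical sketch. It assumes a deliberately simplified setting: an inverse-variance weighted mean with no random effects, two effect sizes and a hypothetical sampling-error correlation of 0.8. The function name and all values are made up for illustration. The 'naive' standard error treats the two estimates as independent, while the 'actual' one uses the full sampling (co)variance matrix for the same weights.

```python
import math


def se_of_weighted_mean(variances, M):
    """Return (naive_se, actual_se) for an inverse-variance weighted mean.

    naive_se  assumes independent sampling errors: sqrt(1 / sum(w)).
    actual_se uses the full matrix M: sqrt(w' M w) / sum(w).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    n = len(weights)
    naive_var = 1.0 / total
    quad = sum(weights[a] * M[a][b] * weights[b]
               for a in range(n) for b in range(n))
    actual_var = quad / total ** 2
    return math.sqrt(naive_var), math.sqrt(actual_var)


v = [0.04, 0.04]                      # hypothetical sampling variances
rho = 0.8                             # hypothetical correlation
cov = rho * math.sqrt(v[0] * v[1])    # covariance between sampling errors
M = [[v[0], cov],
     [cov, v[1]]]

naive_se, actual_se = se_of_weighted_mean(v, M)
print(f"naive SE:  {naive_se:.3f}")
print(f"actual SE: {actual_se:.3f}")
```

With these numbers the naive standard error is roughly 0.14 and the one that accounts for the correlation is roughly 0.19, so ignoring the dependence makes the estimate look more precise than it is.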