The R package mc.heterogeneity provides functions for testing between-study heterogeneity in meta-analysis of standardized mean differences (d), Fisher-transformed Pearson’s correlations (r), and natural-logarithm-transformed odds ratios (OR).

Inclusion of moderators is an option for researchers who are interested in measuring the between-study heterogeneity per se and exploring factors that can explain the systematic between-study heterogeneity.

In the following examples, we describe how to use our package mc.heterogeneity to test the between-study heterogeneity for each of the effect sizes considered in the current study. *Datasets*, *R code*, and *output* are provided for each example so that applied researchers can easily replicate the examples and modify the code for their own datasets.

The three example datasets are internal datasets in our package, but researchers can load them using `mc.heterogeneity:::[dataset name]`. In each of the example datasets, the rows correspond to studies in the meta-analysis, and the columns correspond to the required inputs for each study, which include, but are not limited to, effect sizes, sample size(s), and moderators.

The example R code adopts the default values for some of the arguments (e.g., the default nominal alpha level is 0.05). To change the defaults, use `help()` to see more details for each of the functions.

The outputs are formatted to have the same layout.

We also include an “Empirical Illustration” section in the main text of the article to discuss the following examples.

Install the released version of mc.heterogeneity from CRAN with:

`install.packages("mc.heterogeneity")`

And load the library:

`library(mc.heterogeneity)`

`#> IMPORTANT: Please note that functions in this package will soon be deprecated and replaced by functions in package boot.heterogeneity.`

`mc.d()` is the function to test the between-study heterogeneity in meta-analysis of standardized mean differences (d).

Load the example dataset `selfconcept` first:

`selfconcept <- mc.heterogeneity:::selfconcept`

Extract the required arguments from `selfconcept`:

```
# n1 and n2 are vectors of sample sizes in the two groups
n1 <- selfconcept$n1
n2 <- selfconcept$n2
# g is a vector of effect sizes
g <- selfconcept$g
```

If `g` is a vector of biased estimates of standardized mean differences in the meta-analytical study, a small-sample adjustment must be applied:

```
cm <- (1-3/(4*(n1+n2-2)-1)) # correction factor to compensate for small-sample bias (Hedges, 1981)
d <- cm*g
```
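To make the size of this adjustment concrete, the following minimal sketch applies the same correction factor to a few made-up studies. The sample sizes and effect sizes are hypothetical and are not taken from `selfconcept`.

```r
# Hypothetical sample sizes and uncorrected effect sizes (illustration only)
n1 <- c(10, 25, 60)
n2 <- c(12, 25, 55)
g  <- c(0.50, 0.30, -0.20)

# Same correction factor as above: 1 - 3 / (4 * (n1 + n2 - 2) - 1)
cm <- 1 - 3 / (4 * (n1 + n2 - 2) - 1)
d  <- cm * g

# The factor is always below 1 and approaches 1 as the total sample size grows
round(cbind(n1, n2, g, cm, d), 4)
```

The correction shrinks each estimate toward zero, and the shrinkage is largest for the smallest study.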

Run the heterogeneity test using `mc.d()`:

`mc.run <- mc.d(n1, n2, est = d, model = 'random', p_cut = 0.05)`

Alternatively, such an adjustment could be performed within the function by specifying `adjust = TRUE`:

`mc.run2 <- mc.d(n1, n2, est = g, model = 'random', adjust = TRUE, p_cut = 0.05)`

`mc.run` and `mc.run2` will return the same results:

```
mc.run
#>              stat  p_value Heterogeneity
#> Qtest   23.391659 0.136929           n.s
#> mc.ML    1.610239 0.051200           n.s
#> mc.REML  2.037578 0.053100           n.s
```

```
mc.run2
#>              stat  p_value Heterogeneity
#> Qtest   23.391659 0.136929           n.s
#> mc.ML    1.610239 0.051200           n.s
#> mc.REML  2.037578 0.053100           n.s
```

The first line presents the results of Q-test of a random-effects model. The Q-statistic is Q(df = 17) = 23.39 and the associated p-value is 0.137. Using a cutoff alpha level (i.e., nominal alpha level) of either 0.05 or 0.1, this statistic is n.s (not significant). The homogeneity assumption is not rejected.
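As a quick check on the Q-test line, the p-value can be recovered from the reported statistic and degrees of freedom. This sketch assumes the Q-test p-value is the upper-tail probability of a chi-square distribution, which is the usual reference distribution for Q. The variable names are illustrative.

```r
Q  <- 23.391659  # Q-statistic reported in the output above
df <- 17         # degrees of freedom reported in the text

# Upper-tail chi-square probability
p_q <- pchisq(Q, df = df, lower.tail = FALSE)
p_q

# Decision at the two nominal alpha levels discussed above
p_q < c(alpha_0.05 = 0.05, alpha_0.10 = 0.10)
```

Both comparisons are `FALSE`, which matches the n.s label at either alpha level.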

The second line presents the results of MC-ML-LRT. The ML-LRT statistic is 1.61 and the Monte-Carlo based p-value is 0.051. The assumption of homogeneity is not rejected with an alpha level of 0.05 but will be rejected at an alpha level of 0.1.

The third line presents the results of MC-REML-LRT. The REML-LRT statistic is 2.04 and the Monte-Carlo based p-value is 0.053. Similar to the results of MC-ML-LRT, the assumption of homogeneity is not rejected with an alpha level of 0.05 but will be rejected at an alpha level of 0.1.
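The general idea behind a Monte Carlo p-value for a likelihood ratio test can be sketched in base R: compute the statistic on the observed data, simulate data under homogeneity, and report the share of simulated statistics at least as large as the observed one. The sketch below is a simplified illustration for the ML case only. The data, the helper names `loglik()` and `lrt_stat()`, and the number of replications `B` are all hypothetical, and the package's internal procedure may differ.

```r
set.seed(1)

# Hypothetical effect sizes and known sampling variances (not package data)
y <- c(0.10, 0.35, -0.05, 0.60, 0.25, 0.48, -0.12, 0.30)
v <- c(0.04, 0.03, 0.05, 0.02, 0.06, 0.03, 0.05, 0.04)

# ML log-likelihood (up to a constant) of the random-effects model for a
# given tau2, with the mean replaced by its inverse-variance weighted estimate
loglik <- function(tau2, y, v) {
  w  <- 1 / (v + tau2)
  mu <- sum(w * y) / sum(w)
  -0.5 * sum(log(v + tau2) + w * (y - mu)^2)
}

# Likelihood ratio statistic for H0: tau2 = 0
lrt_stat <- function(y, v) {
  ll0 <- loglik(0, y, v)
  opt <- optimize(loglik, interval = c(0, 10 * var(y) + 1),
                  y = y, v = v, maximum = TRUE)
  # Guard against the optimizer stopping just short of the boundary tau2 = 0
  2 * (max(opt$objective, ll0) - ll0)
}

obs <- lrt_stat(y, v)

# Simulate the statistic under homogeneity (tau2 = 0)
mu0 <- sum(y / v) / sum(1 / v)
B   <- 2000
null_stats <- replicate(B, lrt_stat(rnorm(length(y), mu0, sqrt(v)), v))

# Monte Carlo p-value: share of simulated statistics >= the observed one
p_mc <- mean(null_stats >= obs)
c(statistic = obs, p_value = p_mc)
```

Because the p-value is a proportion over `B` simulated datasets, it changes slightly with the random seed and takes values on a grid of width `1 / B`.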

Load a hypothetical dataset `hypo_moder` first:

`hypo_moder <- mc.heterogeneity:::hypo_moder`

Three moderators (cov.z1, cov.z2, cov.z3) are included:

```
head(hypo_moder)
#>    n1  n2          d       cov.z1      cov.z2     cov.z3
#> 1  59  65  0.8131324 -0.005767173  0.80418951  1.2383041
#> 2 166 165  1.0243732  2.404653389 -0.05710677 -0.2793463
#> 3  68  68  1.5954236  0.763593461  0.50360797  1.7579031
#> 4  44  31  0.6809888 -0.799009249  1.08576936  0.5607461
#> 5  98  95 -1.3017946 -1.147657009 -0.69095384 -0.4527840
#> 6  44  31 -1.9398508 -0.289461574 -1.28459935 -0.8320433
```

Again, run the heterogeneity test using `mc.d()` with all moderators included in a matrix `mods` and model type specified as `model = 'mixed'`:

```
mc.run3 <- mc.d(n1 = hypo_moder$n1,
                n2 = hypo_moder$n2,
                est = hypo_moder$d,
                model = 'mixed',
                mods = cbind(hypo_moder$cov.z1, hypo_moder$cov.z2, hypo_moder$cov.z3),
                p_cut = 0.05)
```

In the presence of moderators, the results in `mc.run3` will be different from those in `mc.run` and `mc.run2`:

```
mc.run3
#>              stat      p_value Heterogeneity
#> Qtest   31.849952 0.0008061727           sig
#> mc.ML    5.187700 0.0004000000           sig
#> mc.REML  9.283428 0.0004000000           sig
```

In the presence of moderators, the function above tests whether the variability in the true standardized mean differences after accounting for the moderators included in the model is larger than sampling variability alone (Viechtbauer, 2010).
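The residual heterogeneity idea can be illustrated with a base-R sketch of a residual Q-statistic: fit the moderators by inverse-variance weighted least squares, then compare the weighted residual sum of squares with a chi-square distribution whose degrees of freedom equal the number of studies minus the number of model coefficients. The data and variable names below are hypothetical, and this is not the package's internal code.

```r
set.seed(42)

# Hypothetical meta-analytic data: k studies and two moderators
k  <- 12
z1 <- rnorm(k)
z2 <- rnorm(k)
v  <- runif(k, 0.02, 0.08)                    # known sampling variances
y  <- 0.3 + 0.5 * z1 + rnorm(k, 0, sqrt(v))   # observed effect sizes

X <- cbind(1, z1, z2)   # design matrix with an intercept column
W <- diag(1 / v)        # inverse-variance weights

# Weighted least squares coefficients (no between-study variance)
beta <- solve(t(X) %*% W %*% X, t(X) %*% W %*% y)

# Residual Q-statistic and its chi-square p-value
res  <- as.numeric(y - X %*% beta)
QE   <- sum(res^2 / v)
df   <- k - ncol(X)
p_QE <- pchisq(QE, df = df, lower.tail = FALSE)

c(QE = QE, df = df, p_value = p_QE)
```

A large residual Q relative to its degrees of freedom suggests that the moderators do not account for all of the variability in the true effect sizes.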

In the first line, the Q-statistic is Q(df = 11) = 31.85 and the associated p-value is 0.0008. This statistic is significant (sig) at an alpha level of 0.05, meaning that the true effect sizes after accounting for the moderators are heterogeneous.

In the second line, the ML-LRT statistic is 5.19 and the Monte-Carlo based p-value is 0.0004. This means that the true effect sizes after accounting for the moderators are heterogeneous at an alpha level of 0.05.

In the third line, the REML-LRT statistic is 9.28 and the Monte-Carlo based p-value is 0.0004. This means that the true effect sizes after accounting for the moderators are heterogeneous at an alpha level of 0.05.

`mc.fcor()` is the function to test the between-study heterogeneity in meta-analysis of Fisher-transformed Pearson’s correlations (r).

Load the example dataset `sensation` first:

`sensation <- mc.heterogeneity:::sensation`

Extract the required arguments from `sensation`:

```
# n is a vector of sample sizes
n <- sensation$n
# Pearson's correlation
r <- sensation$r
# Fisher's transformation
z <- 1/2*log((1+r)/(1-r))
```
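The transformation above is the inverse hyperbolic tangent, so base R's `atanh()` gives the same values and `tanh()` maps them back to correlations. The short sketch below checks this on made-up correlations and also shows the usual large-sample variance of z, `1 / (n - 3)`. The values of `r` and `n` are hypothetical and are not taken from `sensation`.

```r
# Hypothetical correlations and sample sizes (illustration only)
r <- c(0.10, 0.35, -0.20, 0.60)
n <- c(40, 120, 75, 200)

z_manual <- 1/2 * log((1 + r) / (1 - r))  # formula used above
z_atanh  <- atanh(r)                      # base R equivalent

all.equal(z_manual, z_atanh)

# Approximate sampling variance of z and back-transformation to r
v_z    <- 1 / (n - 3)
r_back <- tanh(z_manual)
```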

Run the heterogeneity test using `mc.fcor()`:

`mc.run <- mc.fcor(n, z, model = 'random', p_cut = 0.05)`

The test of between-study heterogeneity has the following results:

```
mc.run
#>              stat    p_value Heterogeneity
#> Qtest   29.060970 0.00385868           sig
#> mc.ML    5.204299 0.00420000           sig
#> mc.REML  6.133111 0.00400000           sig
```

In the first line, the Q-statistic is Q(df = 12) = 29.06 and the associated p-value is 0.004. This statistic is significant (sig) at an alpha level of 0.05, meaning that the true effect sizes are heterogeneous.

In the second line, the ML-LRT statistic is 5.20 and the Monte-Carlo based p-value is 0.004. This means that the true effect sizes are heterogeneous at an alpha level of 0.05.

In the third line, the REML-LRT statistic is 6.13 and the Monte-Carlo based p-value is 0.004. This means that the true effect sizes are heterogeneous at an alpha level of 0.05.

`mc.lnOR()` is the function to test the between-study heterogeneity in meta-analysis of natural-logarithm-transformed odds ratios (OR).

Load the example dataset `smoking` from R package `HSAUR3`:

```
library(HSAUR3)
#> Loading required package: tools
data(smoking)
```

Extract the required arguments from `smoking`:

```
# Y1: receive treatment; Y2: stop smoking
n_00 <- smoking$tc - smoking$qc # did not receive treatment and did not stop smoking
n_01 <- smoking$qc # did not receive treatment but stopped smoking
n_10 <- smoking$tt - smoking$qt # received treatment but did not stop smoking
n_11 <- smoking$qt # received treatment and stopped smoking
```

The log odds ratios can be computed, but they are not needed by `mc.lnOR()`:

```
lnOR <- log(n_11*n_00/n_01/n_10)
lnOR
#>  [1]  0.6151856 -0.0235305  0.5658078  0.4274440  1.0814445  0.9109288
#>  [7]  0.9647431  0.7103890  1.0375520 -0.1407277  0.7747272  1.7924180
#> [13]  1.2021192  0.3607987  0.2876821  0.2110139  1.2591392  0.1549774
#> [19]  1.3411739  0.2963470  0.6116721  0.3786539  0.5389965  0.7532417
#> [25]  0.5653138  0.3786539
```
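For readers who want to see how the four cell counts relate to an effect size and its precision, the sketch below computes log odds ratios together with the standard large-sample variance (the sum of the reciprocal cell counts) and a Q-statistic based on them. The counts are hypothetical and are not taken from `smoking`. Whether `mc.lnOR()` uses this variance approximation internally is not stated here, so treat this as an illustration only.

```r
# Hypothetical 2x2 cell counts for three studies (illustration only)
n_00 <- c(50, 80, 30)  # no treatment, did not stop smoking
n_01 <- c(10, 15,  5)  # no treatment, stopped smoking
n_10 <- c(40, 70, 25)  # treatment, did not stop smoking
n_11 <- c(20, 25, 12)  # treatment, stopped smoking

# Log odds ratio and its large-sample sampling variance
lnOR   <- log(n_11 * n_00 / (n_01 * n_10))
v_lnOR <- 1 / n_00 + 1 / n_01 + 1 / n_10 + 1 / n_11

# Inverse-variance weighted mean and Q-statistic
w      <- 1 / v_lnOR
mu_hat <- sum(w * lnOR) / sum(w)
Q      <- sum(w * (lnOR - mu_hat)^2)

round(cbind(lnOR, v_lnOR), 4)
```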

Run the heterogeneity test using `mc.lnOR()`:

`mc.run <- mc.lnOR(n_00, n_01, n_10, n_11, model = 'random', p_cut = 0.05)`

The test of between-study heterogeneity has the following results:

```
mc.run
#>              stat    p_value Heterogeneity
#> Qtest   34.873957 0.09050857           n.s
#> mc.ML    2.557171 0.02160000           sig
#> mc.REML  3.071329 0.02240000           sig
```

In the first line, the Q-statistic is Q(df = 25) = 34.87 and the associated p-value is 0.091. This statistic is not significant (n.s) at an alpha level of 0.05, meaning that the assumption of homogeneity cannot be rejected.

In the second line, the ML-LRT statistic is 2.56 and the Monte-Carlo based p-value is 0.022. This means that the assumption of homogeneity is rejected and the true effect sizes are heterogeneous at an alpha level of 0.05.

In the third line, the REML-LRT statistic is 3.07 and the Monte-Carlo based p-value is 0.022. This means that the assumption of homogeneity is rejected and the true effect sizes are heterogeneous at an alpha level of 0.05.