Usage of brunnermunzel package

Toshiaki Ara

2022-08-07

Introduction

This brunnermunzel package is to perform (permuted) Brunner-Munzel test for stochastic equality of two samples, which is also known as the Generalized Wilcoxon test.

For Brunner-Munzel test (Brunner and Munzel 2000), brunner.munzel.test function in lawstat package is very famous. This function is extended to enable to use formula, matrix, and table as an argument.

Also, the function brunnermunzel.permutation.test for permuted Brunner-Munzel test (Neubert and Brunner 2007) was provided.

Usage of functions in brunnermunzel package

Default and Formula class

Sample data

In this section, we will use sample data from Hollander & Wolfe (1973), 29f. – Hamilton depression scale factor measurements in 9 patients with mixed anxiety and depression, taken at the first (x) and second (y) visit after initiation of a therapy (administration of a tranquilizer)“.

x <- c(1.83,  0.50,  1.62,  2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)

For formula interface, data was converted to data.frame.

dat <- data.frame(
    value = c(x, y),
    group = factor(rep(c("x", "y"), c(length(x), length(y))),
                   levels = c("x", "y")))
library(dplyr)
dat %>%
    group_by(group) %>%
    summarize_all(list(mean = mean, median = median))
#> # A tibble: 2 x 3
#>   group  mean median
#>   <fct> <dbl>  <dbl>
#> 1 x      1.77   1.68
#> 2 y      1.33   1.06

Analysis

analysis with Brunner-Munzel test

library(brunnermunzel)

brunnermunzel.test(x, y)
#> 
#>  Brunner-Munzel Test
#> 
#> data:  x and y
#> Brunner-Munzel Test Statistic = -1.4673, df = 15.147, p-value = 0.1628
#> 95 percent confidence interval:
#>  -0.02962941  0.59753064
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>        0.2839506

brunnermunzel.test(value ~ group, data = dat)
#> 
#>  Brunner-Munzel Test
#> 
#> data:  value by group
#> Brunner-Munzel Test Statistic = -1.4673, df = 15.147, p-value = 0.1628
#> 95 percent confidence interval:
#>  -0.02962941  0.59753064
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>        0.2839506

analysis with permuted Brunner-Munzel test

To perform permuted Brunner-Munzel test, use brunnermunzel.test with “perm = TRUE” option, or brunnermunzel.permutation.test function. This “perm” option is used in also formula interface, matrix, and table.

When perm is TRUE, brunnermunzel.test calls brunnermunzel.permutation.test in internal.

brunnermunzel.test(x, y, perm = TRUE)
#> 
#>  permuted Brunner-Munzel Test
#> 
#> data:  x and y
#> p-value = 0.1581
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>        0.2839506

brunnermunzel.permutation.test(x, y)
#> 
#>  permuted Brunner-Munzel Test
#> 
#> data:  x and y
#> p-value = 0.1581
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>        0.2839506

Because statistics in all combinations are calculated in permuted Brunner-Munzel test (\({}_{n_{x}+n_{y}}C_{n_{x}}\) where \(n_{x}\) and \(n_{y}\) are sample size of \(x\) and \(y\), respectively), it takes a long time to obtain results.

Therefore, when sample size is too large [the number of combination is more than 40116600 (\(=\) choose(28, 14))], it switches to Brunner-Munzel test automatically.

# sample size is 30
brunnermunzel.permutation.test(1:15, 3:17)
#> Warning in brunnermunzel.permutation.test.default(1:15, 3:17): Sample number is too large. Using 'brunnermunzel.test'
#> 
#>  Brunner-Munzel Test
#> 
#> data:  x and y
#> Brunner-Munzel Test Statistic = 1.1973, df = 28, p-value = 0.2412
#> 95 percent confidence interval:
#>  0.4115330 0.8373559
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>        0.6244444

using force option

When you want to perform permuted Brunner-Munzel test regardless sample size, you add “force = TRUE” option to brunnermunzel.permutation test.

brunnermunzel.permutation.test(1:15, 3:17, force = TRUE)
#>
#>  permuted Brunner-Munzel Test
#>
#> data:  1:15 and 3:17
#> p-value = 0.2341

using alternative option

brunnermunzel.test also can use “alternative” option as well as t.test and wilcox.test functions.

To test whether the average rank of group \(x\) is greater than that of group \(y\), alternative = "greater" option is added. In contrast, to test whether the average rank of group \(x\) is lesser than that of group \(y\), alternative = "less" option is added.

The results of Brunner-Munzel test and Wilcoxon sum-rank test (Mann-Whitney test) with alternative = "greater" option are shown. In this case, median of \(x\) is 1.68, and median of \(y\) is 1.06.

brunnermunzel.test(x, y, alternative = "greater")
#> 
#>  Brunner-Munzel Test
#> 
#> data:  x and y
#> Brunner-Munzel Test Statistic = -1.4673, df = 15.147, p-value = 0.08138
#> 95 percent confidence interval:
#>  -0.02962941  0.59753064
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>        0.2839506

wilcox.test(x, y, alternative = "greater")
#> Warning in wilcox.test.default(x, y, alternative = "greater"): cannot compute
#> exact p-value with ties
#> 
#>  Wilcoxon rank sum test with continuity correction
#> 
#> data:  x and y
#> W = 58, p-value = 0.06646
#> alternative hypothesis: true location shift is greater than 0

When using formula, brunnermunzel.test with alternative = "greater" option tests an alternative hypothesis “1st level is greater than 2nd level”.

In contrast, brunnermunzel.test with alternative = "less" option tests an alternative hypothesis “1st level is lesser than 2nd level”.

dat$group
#>  [1] x x x x x x x x x y y y y y y y y y
#> Levels: x y
brunnermunzel.test(value ~ group, data = dat, alternative = "greater")$p.value
#> [1] 0.08137809

wilcox.test(value ~ group, data = dat, alternative = "greater")$p.value
#> Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...): cannot
#> compute exact p-value with ties
#> [1] 0.06645973
brunnermunzel.test(x, y, alternative = "less")$p.value
#> [1] 0.9186219

wilcox.test(x, y, alternative = "less")$p.value
#> Warning in wilcox.test.default(x, y, alternative = "less"): cannot compute exact
#> p-value with ties
#> [1] 0.9442044

using est option

Normally, brunnermunzel.test and brunnermunzel.permutation test return the estimate \(P(X<Y) + 0.5 \times P(X=Y)\). When ‘est = "difference"’ option is used, these functions return mean difference [\(P(X<Y) - P(X>Y)\)] in estimate and confidence interval.

Note that \(P(X<Y) - P(X>Y) = 2p - 1\) when \(p = P(X<Y) + 0.5 \times P(X=Y)\).

This change is proposed by Dr. Julian D. Karch.

brunnermunzel.test(x, y, est = "difference")
#> 
#>  Brunner-Munzel Test
#> 
#> data:  x and y
#> Brunner-Munzel Test Statistic = -1.4673, df = 15.147, p-value = 0.1628
#> 95 percent confidence interval:
#>  -1.0592588  0.1950613
#> sample estimates:
#> P(X<Y)-P(X>Y) 
#>    -0.4320988

brunnermunzel.permutation.test(x, y, est = "difference")
#> 
#>  permuted Brunner-Munzel Test
#> 
#> data:  x and y
#> p-value = 0.1581
#> sample estimates:
#> P(X<Y)-P(X>Y) 
#>    -0.4320988

Matrix and Table class

In some case, data is provided as aggregated table. Both brunnermunzel.test and brunnermunzel.permutation.test accept data of matirix and table class.

Fictional data
Normal Moderate Severe
A 5 3 2
B 1 3 6

Sample data

dat1 <- matrix(c(5, 3, 2, 1, 3, 6), nr = 2, byrow = TRUE)
dat2 <- as.table(dat1)
colnames(dat2) <- c("Normal", "Moderate", "Severe")
dat1  # matrix class
#>      [,1] [,2] [,3]
#> [1,]    5    3    2
#> [2,]    1    3    6

dat2  # table class
#>   Normal Moderate Severe
#> A      5        3      2
#> B      1        3      6

Analysis

analysis with Brunner-Munzel test

brunnermunzel.test(dat1)
#> 
#>  Brunner-Munzel Test
#> 
#> data:  Group1 and Group2
#> Brunner-Munzel Test Statistic = 2.4447, df = 17.394, p-value = 0.02542
#> 95 percent confidence interval:
#>  0.5359999 0.9840001
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>             0.76

brunnermunzel.test(dat2)
#> 
#>  Brunner-Munzel Test
#> 
#> data:  A and B
#> Brunner-Munzel Test Statistic = 2.4447, df = 17.394, p-value = 0.02542
#> 95 percent confidence interval:
#>  0.5359999 0.9840001
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>             0.76

analysis with permuted Brunner-Munzel test

brunnermunzel.permutation.test(dat1)
#> 
#>  permuted Brunner-Munzel Test
#> 
#> data:  Group1 and Group2
#> p-value = 0.05116
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>             0.76

brunnermunzel.permutation.test(dat2)
#> 
#>  permuted Brunner-Munzel Test
#> 
#> data:  A and B
#> p-value = 0.05116
#> sample estimates:
#> P(X<Y)+.5*P(X=Y) 
#>             0.76

About program

brunnermunzel.test function

brunnermunzel.test function is derived from brunner.munzel.test function in lawstat package (Maintainer of this package is Vyacheslav Lyubchich; License is GPL-2 or GPL-3) with modification. The authors of this function are Wallace Hui, Yulia R. Gel, Joseph L. Gastwirth and Weiwen Miao.

combination subroutine by FORTRAN77

FORTRAN subroutine combination in combination.f is derived from the program by shikino (http://slpr.sakura.ne.jp/qp/combination)(CC-BY-4.0) with slight modification.

Without this subroutine, I could not make brunnermunzel.permutation.test. Thanks to shikono for your useful subroutine.

References

Brunner, E, and Munzel. 2000. “The Nonparametric Behrens-Fisher Problem: Asymptotic Theory and a Small-Sample Approximation.” Biometrical Journal 42 (1): 17–25.
Neubert, K, and E Brunner. 2007. “A Studentized Permutation Test for the Non-Parametric Behrens-Fisher Problem.” Computational Statistics and Data Analysis 51 (10): 5192–5204.