cofad

Markus Burkhardt & Johannes Titz

02 March, 2020

Citation

If you use cofad, please cite it in your work as:

Burkhardt, M., & Titz, J. (2019). cofad: Contrast analysis for factorial designs. R package version 0.1.0. https://CRAN.R-project.org/package=cofad

Introduction

Cofad is an R package for conducting COntrast analysis in FActorial Designs like ANOVAs. If contrast analysis were to win a prize, it would be the one for the most underestimated, most underused statistical technique. This is a pity because a contrast analysis is in every case at least as good as an ANOVA, and in most cases it is actually better. The simplest answer to the question why is that contrast analysis gets rid of the unspecific omnibus hypothesis ("there are differences somewhere") and substitutes it with a specific numerical hypothesis. Furthermore, contrast analysis focuses on effects instead of significance. This focus is expressed in two ways: First, there are three different effect sizes for contrast analysis: \(r_\mathrm{effectsize}\), \(r_\mathrm{contrast}\) and \(r_\mathrm{alerting}\). Second, the effect size refers not to the data but to the tested hypothesis. The larger the effect, the more it speaks for the hypothesis. One can even compare different hypotheses against each other (hello, experimentum crucis) by looking at the effect size for each hypothesis.

If we have sparked your interest, we recommend reading some introductory literature such as Rosenthal, Rosnow, & Rubin (2000), Furr (2004), or, for the German-speaking audience, Sedlmeier & Renkewitz (2018). Contrast analysis is actually easy to understand if you know what a correlation is. In this vignette we assume that you know the basics of contrast analysis and want to apply it to your own data. We start with between-subjects designs and then show applications for within-subjects and mixed designs.

Between-Subjects Designs

Let us first load the package:

library(cofad)

Now we need some data and hypotheses. We can simply take the data from Furr (2004), which contains empathy ratings of students from four different majors.

d <- data.frame(empathy = c(51, 56, 61, 58, 54, 62, 67, 57, 65, 59, 50, 49, 47, 45,
                            44, 50, 45, 40, 49, 41),
                major = as.factor(
                  rep(c("psychology", "education", "business", "chemistry"),
                      each = 5)))
head(d)
#>   empathy      major
#> 1      51 psychology
#> 2      56 psychology
#> 3      61 psychology
#> 4      58 psychology
#> 5      54 psychology
#> 6      62  education

Furr states three hypotheses:

  1. Psychology majors have higher empathy scores than education majors (\(\lambda_\mathrm{psych} = 1, \lambda_\mathrm{edu} = -1, \lambda_\mathrm{bus} = 0, \lambda_\mathrm{chem} = 0\)).
  2. Business majors have higher empathy scores than chemistry majors (\(\lambda_\mathrm{psych} = 0, \lambda_\mathrm{edu} = 0, \lambda_\mathrm{bus} = 1, \lambda_\mathrm{chem} = -1\)).
  3. Psychology and education majors have higher empathy scores than business and chemistry majors (\(\lambda_\mathrm{psych} = 1, \lambda_\mathrm{edu} = 1, \lambda_\mathrm{bus} = -1, \lambda_\mathrm{chem} = -1\)).

These hypotheses are simple mean comparisons, but they are a good way to start. Let's use cofad to conduct the contrast analysis for the first hypothesis:

ca <- calc_contrast(dv = empathy, between = major,
                    lambda_between = c("psychology" = 1, "education" = -1,
                                       "business" = 0, "chemistry" = 0),
                    data = d)
ca
#> 
#> Contrast Analysis for between factor design
#> 
#> F(1,16) = 6.154; p = 0.02461087
#> Contrasts:  business = 0; chemistry = 0; education = -1; psychology = 1
#> r_effectsize = -0.276  CAVE: F-Value for opposite contrast

The print method only shows some basic information, but we can use the summary method for more details:

summary(ca)
#> $`F-Table`
#>            SS df     MS     F     p
#> contrast   90  1 90.000 6.154 0.025
#> within    234 16 14.625    NA    NA
#> total    1179 19     NA    NA    NA
#> 
#> $Effects
#>              effects
#> r_effectsize  -0.276
#> r_contrast     0.527
#> r_alerting    -0.309

From this table, \(r_\mathrm{effectsize}\) is probably the most useful statistic. It is just the correlation between the lambdas and the dependent variable, which can also easily be calculated manually:

lambdas <- rep(c(1, -1, 0, 0), each = 5)
cor(d$empathy, lambdas)
#> [1] -0.2762895
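
The other two effect sizes can be recovered from the F-table: \(r_\mathrm{effectsize} = \sqrt{SS_\mathrm{contrast}/SS_\mathrm{total}}\), \(r_\mathrm{contrast} = \sqrt{SS_\mathrm{contrast}/(SS_\mathrm{contrast}+SS_\mathrm{within})}\) and \(r_\mathrm{alerting} = \sqrt{SS_\mathrm{contrast}/SS_\mathrm{between}}\) (Rosenthal et al., 2000), with the sign taken from the direction of the contrast. A quick check against the summary above:

ss_contrast <- 90
ss_within <- 234
ss_total <- 1179
ss_between <- ss_total - ss_within
sqrt(ss_contrast / ss_total)                  # |r_effectsize| = 0.276
sqrt(ss_contrast / (ss_contrast + ss_within)) # r_contrast = 0.527
sqrt(ss_contrast / ss_between)                # |r_alerting| = 0.309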

The other two hypotheses can be tested accordingly:

ca <- calc_contrast(dv = empathy, between = major,
                    lambda_between = c("psychology" = 0, "education" = 0,
                                       "business" = 1, "chemistry" = -1),
                    data = d)
ca
#> 
#> Contrast Analysis for between factor design
#> 
#> F(1,16) = 0.684; p = 0.42045621
#> Contrasts:  business = 1; chemistry = -1; education = 0; psychology = 0
#> r_effectsize = 0.092
ca <- calc_contrast(dv = empathy, between = major,
                    lambda_between = c("psychology" = 1, "education" = 1,
                                       "business" = -1, "chemistry" = -1),
                    data = d)
ca
#> 
#> Contrast Analysis for between factor design
#> 
#> F(1,16) = 57.778; p = 1.07e-06
#> Contrasts:  business = -1; chemistry = -1; education = 1; psychology = 1
#> r_effectsize = 0.847

You will find that the numbers are identical to the ones presented in Furr (2004). Now, imagine we have a more interesting hypothesis than simple mean differences. From an elaborate theory we could derive that the means should be 73, 61, 51 and 38. We can test this with cofad directly because cofad converts the chosen lambdas into proper lambdas (the mean of the lambdas has to be 0):

ca <- calc_contrast(dv = empathy, between = major,
                    lambda_between = c("psychology" = 73, "education" = 61,
                                       "business" = 51, "chemistry" = 38),
                    data = d)
#> Warning in calc_contrast(dv = empathy, between = major, lambda_between =
#> c(psychology = 73, : lambdas are centered and rounded to 3 digits
ca
#> 
#> Contrast Analysis for between factor design
#> 
#> F(1,16) = 37.466; p = 1.475e-05
#> Contrasts:  business = -4.75; chemistry = -17.75; education = 5.25; psychology = 17.25
#> r_effectsize = 0.682

The manual test shows the same effect size:

lambdas <- rep(c(73, 61, 51, 38), each = 5)
cor(d$empathy, lambdas)
#> [1] 0.6817294
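
The centered lambdas shown in the output can be reproduced by subtracting the mean of the hypothesized means:

means <- c("psychology" = 73, "education" = 61, "business" = 51, "chemistry" = 38)
means - mean(means) # 17.25, 5.25, -4.75, -17.75, as in the output above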

Let us now do an analysis for within-subjects designs.

Within-Subjects Designs

For within-subjects designs the calculations are quite different, but cofad takes care of this and we just have to use the parameters within and lambda_within instead of their between equivalents. As an example we use Table 16.5 from Sedlmeier & Renkewitz (2018): reading ability was assessed for eight participants under four different conditions. The hypothesis is that you can read best without music, that white noise reduces your reading ability, and that music (independent of type) reduces it even further.

d <- data.frame(reading_test = c(27, 25, 30, 29, 30, 33, 31, 35,
                                 25, 26, 32, 29, 28, 30, 32, 34,
                                 21, 25, 23, 26, 27, 26, 29, 31, 
                                 23, 24, 24, 28, 24, 26, 27, 32),
                participant = as.factor(rep(1:8, 4)),
                music = as.factor(rep(c("without music", "white noise",
                                        "classic", "jazz"), each = 8)))
head(d)
#>   reading_test participant         music
#> 1           27           1 without music
#> 2           25           2 without music
#> 3           30           3 without music
#> 4           29           4 without music
#> 5           30           5 without music
#> 6           33           6 without music
calc_contrast(dv = reading_test, within = music,
              lambda_within = c("without music" = 1.25, "white noise" = 0.25,
                                "classic" = -0.75, "jazz" = -0.75),
              ID = participant, data = d)
#> 
#> Contrast Analysis for within factor design
#> 
#> L-Values: Mean =  5.875 ; SD =  3.154
#> t(7) = 5.269; p = 0.00058101
#> Contrasts:  classic = -0.75; jazz = -0.75; white noise = 0.25; without music = 1.25
#> g_contrast = 1.863

You can see that the significance test is just a \(t\)-test and the reported effect size is also one for a mean comparison (\(g\)). (The \(t\)-test is one-tailed, because contrast analysis always tests a specific hypothesis.) When conducting the analysis manually, we can see why:

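# reshape into a participants x conditions matrix (columns in the order
# without music, white noise, classic, jazz) and build one L-value each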
mtr <- matrix(d$reading_test, ncol = 4)
lambdas <- c(1.25, 0.25, -0.75, -0.75)
lc1 <- mtr %*% lambdas
t.test(lc1)
#> 
#>  One Sample t-test
#> 
#> data:  lc1
#> t = 5.2689, df = 7, p-value = 0.001162
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  3.238361 8.511639
#> sample estimates:
#> mean of x 
#>     5.875

Only the linear combination (L-value) of the dependent variable and the contrast weights is needed for each participant. With these values an ordinary \(t\)-test against 0 is conducted. While you can do this manually, using cofad is quicker and also gives you more information, such as the different effect sizes.
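
The reported effect size follows from the same L-values: \(g_\mathrm{contrast}\) is their mean divided by their standard deviation (equivalently, \(t/\sqrt{n}\)):

mean(lc1) / sd(lc1) # 5.875 / 3.154 = 1.863 = g_contrast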

Mixed Designs

Mixed designs combine between and within factors. In this case cofad calculates the L-values for the within factor and treats them as a new dependent variable, which is then used for the between analysis. Rosenthal et al. (2000), in their Table 5.3, give an illustrative example: nine children are measured four times (within), and they belong to different age groups (between).

There are two hypotheses:

  1. a linear increase over time (within): \(\lambda_1 = -3\), \(\lambda_2 = -1\), \(\lambda_3 = 1\), \(\lambda_4 = 3\)
  2. a linear increase over age (between): \(\lambda_\mathrm{Age 8} = -1\), \(\lambda_\mathrm{Age 10} = 0\), \(\lambda_\mathrm{Age 12} = 1\)

Let’s have a look at the data and calculation:

tab53 <- data.frame(
    Var = c(3, 1, 4, 4, 5, 5, 6, 5, 7, 2, 2, 5,
            5, 6, 7, 6, 6, 8, 3, 1, 5, 4, 5, 6,
            7, 6, 8, 3, 2, 5, 6, 6, 7, 8, 8, 9),
    age = as.factor(
      rep(rep(c("Age 8", "Age 10", "Age 12"), c(3, 3, 3)), 4)
      ),
    time = as.factor(rep(1:4, c(9, 9, 9, 9))),
    ID = as.factor(rep(1:9, 4))
    )
head(tab53)
#>   Var    age time ID
#> 1   3  Age 8    1  1
#> 2   1  Age 8    1  2
#> 3   4  Age 8    1  3
#> 4   4 Age 10    1  4
#> 5   5 Age 10    1  5
#> 6   5 Age 10    1  6
lambda_within <- c("1" = -3, "2" = -1, "3" = 1, "4" = 3)
lambda_between <- c("Age 8" = -1, "Age 10" = 0, "Age 12" = 1)

contr_mx <- calc_contrast(dv = Var, 
                          between = age,
                          lambda_between = lambda_between,
                          within = time,
                          lambda_within = lambda_within,
                          ID = ID, 
                          data = tab53
                          )
contr_mx
#> 
#> Contrast Analysis for Mixed-Design:
#> 
#> F(1,6) = 20.211; p = 0.004
#> Contrasts:  Age 10 = 0 Age 12 = 1 Age 8 = -1
#> r_effectsize = 0.871

The results look like those of a contrast analysis for between-subjects designs. Again, the summary gives some more details: the effect sizes as well as the group means and standard errors of the L-values.

summary(contr_mx)
#> $F_Table
#>              SS df     MS      F     p
#> contrast 42.667  1 42.667 20.211 0.004
#> within   12.667  6  2.111     NA    NA
#> total    56.222  8     NA     NA    NA
#> 
#> $Effects
#>              effect
#> r_effectsize  0.871
#> r_contrast    0.878
#> r_alerting    0.990
#> 
#> $Within_Groups
#>               M        SE
#> Age 10 4.000000 1.0000000
#> Age 12 7.333333 0.8819171
#> Age 8  2.000000 0.5773503
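
To make the mechanics transparent, here is a minimal sketch of the two internal steps, reusing the objects defined above: first one L-value per child is computed, then the group means of these L-values enter the between analysis.

# step 1: one L-value per child (linear trend over the four time points)
lvals <- tapply(tab53$Var * lambda_within[as.character(tab53$time)], tab53$ID, sum)
# step 2: the L-values are the new dependent variable for the between
# contrast; their group means match the $Within_Groups output (4, 7.33, 2)
age_per_child <- tab53$age[!duplicated(tab53$ID)]
tapply(lvals, age_per_child, mean)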

This was the introduction to cofad. If you have any suggestions, you can post them on our GitLab site: https://gitlab.hrz.tu-chemnitz.de/burma--tu-chemnitz.de/cofad.git

References

Furr, R. M. (2004). Interpreting effect sizes in contrast analysis. Understanding Statistics, 3(1), 1–25. https://doi.org/10.1207/s15328031us0301_1

Rosenthal, R., Rosnow, R. L., & Rubin, D. B. (2000). Contrasts and effect sizes in behavioral research: A correlational approach. Cambridge, England: Cambridge University Press.

Sedlmeier, P., & Renkewitz, F. (2018). Forschungsmethoden und Statistik für Psychologen und Sozialwissenschaftler (3., aktualisierte und erweiterte Auflage). Hallbergmoos, Germany: Pearson Studium.