```
library(MASS)
library(kableExtra)
```

In the previous vignette mash
common baseline, we estimate the change in some quantity computed in
multiple conditions over a **common control** condition.
However, there might be **no common control** condition in
a study. In this case, we define the reference condition as the
**mean** over different conditions. Deviation in any
condition is defined as a difference in the quantity over the mean. We
want to estimate the change in some quantity computed in multiple
conditions over their mean.

For example, we measure the gene expression under multiple conditions. We want to estimate the change in expression in multiple conditions over their mean.

As in the mash common baseline vignette, we include the additional
burden of comparing all conditions to the same reference condition. To
deal with these additional correlations, mashr allows the user to
specify the reference condition using `mash_update_data`

with
`ref = 'mean'`

, after setting up the data in
`mash_set_data`

.

```
= function(n, p, V, Utrue, err_sd=0.01, pi=NULL){
generate_data if (is.null(pi)) {
= rep(1, length(Utrue)) # default to uniform distribution
pi
}::are_equal(length(pi), length(Utrue))
assertthat
for (j in 1:length(Utrue)) {
::are_equal(dim(Utrue[j]), c(p, p))
assertthat
}
<- pi / sum(pi) # normalize pi to sum to one
pi <- sample(1:length(pi), n, replace=TRUE, prob=pi)
which_U
= matrix(0, nrow=n, ncol=p)
Beta for(i in 1:n){
= mvrnorm(1, rep(0, p), Utrue[[which_U[i]]])
Beta[i,]
}= matrix(err_sd, nrow=n, ncol=p, byrow = TRUE)
Shat = mvrnorm(n, rep(0, p), Shat[1,]^2 * V)
E = Beta + E
Bhat return(list(B = Beta, Bhat=Bhat, Shat = Shat, whichU = which_U))
}
```

Here we simulate data for illustration. This simulation routine creates a dataset with 5 conditions and 2000 samples. Half of the samples have equal expression among conditions. In the rest samples, half have higher and equal expression in the first 2 conditions, half have higher expression in the last condition.

```
set.seed(1)
= 2000
n = 5
R = diag(R)
V = matrix(0, R, R)
U0 = matrix(1, R, R)
U1 = U0; U2[1:2,1:2] = 4
U2 = U0; U3[5,5] = 4
U3 = generate_data(n, R, V, list(U0=U0, U1=U1, U2=U2, U3 = U3), err_sd = 1) simdata
```

- Read in the data, and set the reference condition as mean

```
library(mashr)
= mash_set_data(simdata$Bhat, simdata$Shat)
data
= mash_update_data(data, ref = 'mean') data.L
```

The updated mash data object (data.L) includes the induced correlation internally.

- We proceed the analysis using the simple canonical covariances as in the initial introductory vignette, and the data driven covariances as in the Introduction to mash: data-driven covariances.

- Canonical covariances

`= cov_canonical(data.L) U.c `

- Data-driven covariances

```
.1by1 = mash_1by1(data.L)
m= get_significant_results(m.1by1)
strong = cov_pca(data.L,2,subset=strong)
U.pca = cov_ed(data.L, U.pca, subset=strong) U.ed
```

- Fit mash model

`= mash(data.L, c(U.c,U.ed), algorithm.version = 'R') m `

```
# - Computing 2000 x 181 likelihood matrix.
# - Likelihood calculations took 0.10 seconds.
# - Fitting model with 181 mixture components.
# - Model fitting took 0.92 seconds.
# - Computing posterior matrices.
# - Computation allocated took 0.02 seconds.
```

The log likelihood is

`print(get_loglik(m),digits=10)`

`# [1] -10893.2688`

Use `get_significant_results`

to find the indices of
effects that are â€˜significantâ€™:

`length(get_significant_results(m))`

`# [1] 139`

The number of false positive is 1.