Useful Info

Anna Hutchinson

This vignette contains supplementary information regarding the usage of the corrcoverage R package.


Key functions

The two main functions are:

  1. corrcov (or analogously corrcov_bhat): Provides a corrected coverage estimate of credible sets obtained using the Bayesian approach for fine-mapping (see “Corrected Coverage” vignette).
  1. corrected_cs (or analogously corrected_cs_bhat): Finds a corrected credible set, which is the smallest set of variants such that the corrected coverage is above some user defined “desired coverage” (see "Corrected credible set vignette).

Additional useful functions


Conversion functions

The Bayesian method for fine-mapping involves finding the posterior probability of causality for each SNP, before sorting these into descending order and adding variants to a ‘credible set’ until the combined posterior probabilities of these SNPs exceed some threshold. The supplementary text of Maller’s paper (available here) shows that these posterior probabilities are normalised Bayes factors.

Asymptotic Bayes factors (Wakefield, 2009) are commonly used in genetic association studies as these only require the specification of \(Z\)-scores (or equivalently the effect size coefficients, \(\beta\), and their standard errors, \(V\)), the standard errors of the effect sizes (\(V\)) and the prior variance of the estimated effect size (\(W^2\)), thus only requiring summary data from genetic association studies plus an estimate for the \(W\) parameter.

Consequently, the corrcoverage package contains functions for converting between \(P\)-values, \(Z\)-scores, asymptotic Bayes factors (ABFs) and posterior probabilities of causality (PPs). The following table shows what input these conversion functions require and what output they produce. The ‘include null model’ column is for whether the null model of no genetic effect is included in the calculation (PPs obtained using the standard Bayesian approach ignore this).

Function Include null model? Input Output
approx.bf.p YES \(P\)-values log(ABF)
pvals_pp YES \(P\)-values Posterior Probabilities
z0_pp YES Marginal \(Z\)-scores Posterior Probabilities
ppfunc NO Marginal \(Z\)-scores Posterior Probabilities

Marginal and joint Z scores

Functions are also provided to simulate marginal \(Z\)-scores from joint \(Z\)-scores (\(Z_j\)). The joint \(Z\)-scores are all 0, except at the causal variant where it is the “true effect”, \(\mu\).

\(\mu\) can be estimated using the est_mu function which requires sample sizes, marginal \(Z\)-scores and minor allele frequencies. We estimate \(\mu\) by \[\hat\mu=\sum_{j}|Z_j|\times PP_j\]

The z_sim function simulates marginal \(Z\)-scores from joint \(Z\)-scores, whilst zj_pp can be used to simulate posterior probability systems from a joint \(Z\)-score vector. These functions first calculate the expected marginal \(Z\) scores, \(E(Z)\), \[E(Z)=Z_j \times \Sigma\] where \(\Sigma\) is the correlation matrix between SNPs.

We can then simulate more \(Z\)-score systems from a multivariate normal distribution with mean \(E(Z)\) and variance \(\Sigma\). This is a key step in our corrected coverage method.

Summary

\[E(Z)=Z_j \times \Sigma\]

\[Z \sim MVN(E(Z),\Sigma)\]