1 Introduction

The locuszoomr package allows users to produce publication-ready gene locus plots very similar to those produced by the web interface ‘locuszoom’ (http://locuszoom.org), but running purely locally in R. Plots can easily be customised, labelled and stacked.

These gene annotation plots are produced via R base graphics or ‘ggplot2’. A ‘plotly’ version can also be generated.

2 Installation

Two things are required before installation: the Bioconductor package ensembldb, and an Ensembl database, either installed as a package or obtained through the Bioconductor package AnnotationHub. To run the examples in this vignette, the ‘EnsDb.Hsapiens.v75’ Ensembl database package needs to be installed from Bioconductor.

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("ensembldb")
BiocManager::install("EnsDb.Hsapiens.v75")
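As an alternative to an installed database package, an Ensembl database can be fetched through AnnotationHub. The lines below are a minimal sketch of that route, not the package's prescribed method: they need network access, the record picked at the end (simply the last one listed) is an arbitrary choice made for illustration, and the genome build of the chosen record should match the build of your data.

```r
# Install AnnotationHub from Bioconductor if it is missing
if (!requireNamespace("AnnotationHub", quietly = TRUE))
  BiocManager::install("AnnotationHub")

library(AnnotationHub)
ah <- AnnotationHub()                          # fetches hub metadata on first use
hits <- query(ah, c("EnsDb", "Homo sapiens"))  # human EnsDb records
hits                                           # inspect the available hub IDs

# Retrieve one record by its hub ID (downloaded and cached on first use).
# Assumption: the last record listed is taken here purely for illustration;
# pick the ID whose Ensembl version and genome build suit your data.
edb <- ah[[tail(names(hits), 1)]]
```

The resulting object should be usable as the ens_db argument of locus() in place of a database package name.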

Install from CRAN

install.packages("locuszoomr")

Install from GitHub

devtools::install_github("myles-lewis/locuszoomr")

locuszoomr can use the LDlinkR package to query 1000 Genomes for linkage disequilibrium (LD) across SNPs. To use this API function you need a personal access token (see the LDlinkR vignette), available from the LDlink website: https://ldlink.nih.gov/?tab=apiaccess.

Requests to LDlink are cached using the memoise package to reduce the number of API requests. This is helpful when a plot is redrawn several times to adjust its appearance.
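A minimal sketch of adding LD information to a locus is shown here, using the demo dataset shipped with the package. It assumes the token is stored in an environment variable called LDLINK_TOKEN (a hypothetical name chosen for this sketch) and that the ‘EnsDb.Hsapiens.v75’ database package is installed. Nothing is run if either is missing.

```r
library(locuszoomr)

# Assumption: the personal access token is kept in an environment variable
# named LDLINK_TOKEN (hypothetical name), so it is not hard-coded in scripts
token <- Sys.getenv("LDLINK_TOKEN")

if (nzchar(token) && require(EnsDb.Hsapiens.v75)) {
  data(SLE_gwas_sub)
  loc <- locus(data = SLE_gwas_sub, gene = 'UBE2L3', flank = 1e5,
               ens_db = "EnsDb.Hsapiens.v75")
  loc <- link_LD(loc, token = token)  # query LDlink for LD within the locus
  locus_plot(loc)
}
```

Keeping the token outside the script avoids committing it to version control by accident.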

3 Example locus plot

The quick example below uses a small subset (3 loci) of a GWAS dataset included in the package as a demo. The dataset is from a genetic study on Systemic Lupus Erythematosus (SLE) by Bentham et al. (2015). The full GWAS summary statistics can be downloaded from https://www.ebi.ac.uk/gwas/studies/GCST003156. The data format is shown below.

library(locuszoomr)
data(SLE_gwas_sub)  ## limited subset of data from SLE GWAS
head(SLE_gwas_sub)
##   chrom       pos        rsid other_allele effect_allele           p
## 1     2 191794580 rs193239665            A             T 0.000723856
## 2     2 191794978  rs72907256            C             T 0.000481744
## 3     2 191795546   rs6434429            C             G 0.156723000
## 4     2 191795869 rs148265823            A             G 0.606197000
## 5     2 191799600  rs60202309            T             G 0.100580000
## 6     2 191800180 rs114544034            T             C 0.022496800
##          beta         se   OR  OR_lower  OR_upper    r2
## 1  0.32930375 0.09741618 1.39 1.1483981 1.6824305 0.037
## 2  0.39877612 0.11423935 1.49 1.1910878 1.8639264 0.034
## 3 -0.09431068 0.06659515 0.91 0.7986462 1.0368796 0.004
## 4 -0.04082199 0.07918766 0.96 0.8219877 1.1211846 0.004
## 5  0.07696104 0.04686893 1.08 0.9852084 1.1839119 0.001
## 6 -0.16251893 0.07122170 0.85 0.7392542 0.9773364 0.019
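The relationship between the effect-size columns can be checked with a few lines of base R. The sketch below copies the first two rows from the output above and assumes, rather than takes from the package documentation, that beta is the natural log of OR and that OR_lower and OR_upper are a 95% Wald interval. It also computes -log10(p), the usual y axis of a locus plot.

```r
# First two rows of SLE_gwas_sub, copied from the head() output above
gwas <- data.frame(
  rsid     = c("rs193239665", "rs72907256"),
  p        = c(0.000723856, 0.000481744),
  beta     = c(0.32930375, 0.39877612),
  se       = c(0.09741618, 0.11423935),
  OR       = c(1.39, 1.49),
  OR_lower = c(1.1483981, 1.1910878),
  OR_upper = c(1.6824305, 1.8639264)
)

# Assumption: beta is the natural log of the odds ratio
beta_ok <- all.equal(gwas$beta, log(gwas$OR), tolerance = 1e-6)

# Assumption: the bounds are a 95% Wald interval on the log-odds scale
z <- qnorm(0.975)
lower_ok <- all.equal(gwas$OR_lower, exp(gwas$beta - z * gwas$se),
                      tolerance = 1e-4)
upper_ok <- all.equal(gwas$OR_upper, exp(gwas$beta + z * gwas$se),
                      tolerance = 1e-4)

# -log10 transform of the p-values
gwas$logP <- -log10(gwas$p)
gwas[, c("rsid", "p", "logP")]
```

For these two rows the checks agree with both assumptions to within rounding.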

We plot a locus from this dataset by first extracting a subset of the data with the locus() function. Make sure you load the correct Ensembl database, i.e. one whose genome build matches your data.

if (require(EnsDb.Hsapiens.v75)) {
  loc <- locus(data = SLE_gwas_sub, gene = 'UBE2L3', flank = 1e5,
               ens_db = "EnsDb.Hsapiens.v75")
  summary(loc)
  locus_plot(loc)
}
## Gene UBE2L3 
## Chromosome 22, position 21,803,736 to 22,078,323
## 514 SNPs/datapoints
## 19 gene transcripts
## 8 protein_coding, 3 snoRNA, 2 lincRNA, 2 miRNA, 2 misc_RNA, 1 pseudogene, 1 sense_intronic