Customizing incidence plots

This vignette provides some tips for the most common customisations of graphics produced by plot.incidence(). Our graphics use ggplot2, which is a distinct graphical system from base graphics. If you want advanced customisation of your incidence plots, we recommend following an introduction to ggplot2 such as R for Data Science.
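As a rough orientation for readers coming from base graphics, the sketch below shows the additive style that ggplot2 uses. It does not call plot.incidence(); the data frame toy and its values are made up for illustration. It only assumes that ggplot2 is installed.

```r
library(ggplot2)

# Toy weekly counts, made up for illustration (not the Ebola data).
toy <- data.frame(
  week  = as.Date("2014-04-07") + 7 * 0:5,
  count = c(1, 1, 5, 4, 9, 12)
)

# A ggplot object is built up by adding components with `+`.
p <- ggplot(toy, aes(x = week, y = count)) +
  geom_col(fill = "grey40", colour = "white")

# Customisation is additive too: labels and theme settings are further components.
p2 <- p +
  labs(x = "Week of onset", y = "Cases") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p2)
```

If the object returned by plot() on an incidence object is a regular ggplot object, as is usual for ggplot2-based plotting functions, the same `+` syntax should apply to it.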


Example data: simulated Ebola outbreak

This example uses the simulated Ebola Virus Disease (EVD) outbreak from the package outbreaks: ebola_sim_clean.

First, we load the data, compute the weekly incidence and initially group by gender and hospital:

library(outbreaks)
library(incidence2)
library(magrittr)

dat <- ebola_sim_clean$linelist
str(dat)
#> 'data.frame':    5829 obs. of  11 variables:
#>  $ case_id                : chr  "d1fafd" "53371b" "f5c3d8" "6c286a" ...
#>  $ generation             : int  0 1 1 2 2 0 3 3 2 3 ...
#>  $ date_of_infection      : Date, format: NA "2014-04-09" ...
#>  $ date_of_onset          : Date, format: "2014-04-07" "2014-04-15" ...
#>  $ date_of_hospitalisation: Date, format: "2014-04-17" "2014-04-20" ...
#>  $ date_of_outcome        : Date, format: "2014-04-19" NA ...
#>  $ outcome                : Factor w/ 2 levels "Death","Recover": NA NA 2 1 2 NA 2 1 2 1 ...
#>  $ gender                 : Factor w/ 2 levels "f","m": 1 2 1 1 1 1 1 1 2 2 ...
#>  $ hospital               : Factor w/ 5 levels "Connaught Hospital",..: 2 1 3 NA 3 NA 1 4 3 5 ...
#>  $ lon                    : num  -13.2 -13.2 -13.2 -13.2 -13.2 ...
#>  $ lat                    : num  8.47 8.46 8.48 8.46 8.45 ...

i <- incidence(dat, date_index = date_of_onset, interval = 7,
               groups = c(gender, hospital))
i
#> An incidence2 object: 601 x 4
#> 5829 cases from 2014-04-07 to 2015-05-03
#> interval: 7 days
#> cumulative: FALSE
#> 
#>    date_index              gender hospital                                 count
#>    <period>                <fct>  <fct>                                    <int>
#>  1 2014-04-07 to 2014-04-… f      Military Hospital                            1
#>  2 2014-04-14 to 2014-04-… m      Connaught Hospital                           1
#>  3 2014-04-21 to 2014-04-… f      <NA>                                         2
#>  4 2014-04-21 to 2014-04-… f      other                                        2
#>  5 2014-04-21 to 2014-04-… m      other                                        1
#>  6 2014-04-28 to 2014-05-… f      <NA>                                         1
#>  7 2014-04-28 to 2014-05-… f      Connaught Hospital                           1
#>  8 2014-04-28 to 2014-05-… f      Princess Christian Maternity Hospital (…     1
#>  9 2014-04-28 to 2014-05-… f      Rokupa Hospital                              1
#> 10 2014-05-05 to 2014-05-… f      <NA>                                         1
#> # … with 591 more rows

plot.incidence()

When plot() is called on an object returned by incidence(), the method plot.incidence() is used implicitly. To access its documentation, use ?plot.incidence. In this section, we illustrate the existing customisations.
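This implicit call is ordinary S3 method dispatch. The sketch below illustrates the mechanism with a toy class; the class name "toy" and the method plot.toy() are hypothetical and unrelated to the package.

```r
# A toy S3 object, purely illustrative and unrelated to incidence objects.
x <- structure(list(counts = c(3, 5, 2)), class = "toy")

# Defining a function named plot.<class> provides a method for the plot() generic.
plot.toy <- function(x, ...) {
  barplot(x$counts, ...)  # extra arguments are passed on to barplot()
  invisible(x)            # return the object invisibly, as many plot methods do
}

# plot() looks at class(x) and calls plot.toy() for us.
plot(x, col = "grey")
```

The help page for a method is looked up under the method's own name, which is why the documentation is reached through ?plot.incidence rather than ?plot.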

Default behaviour

By default, the function uses colors from the color palette vibrant(). If no fill is specified, groups will all be filled with the same color and a message will be displayed due to multiple groups being present:

i %>% plot()
#> plot() can only stack/dodge by one variable.
#> For multi-facet plotting try facet_plot()

Note, however, that the groups are still present; they are just hidden by the default border coloring:

i %>% plot(color = "white")
#> plot() can only stack/dodge by one variable.
#> For multi-facet plotting try facet_plot()

Additional arguments

plot.incidence() is designed for a high level of customization without needing knowledge of ggplot2. To this end, there are multiple arguments that can be provided which have been set to sensible defaults. Some common changes you may wish to make are illustrated below (for a full description of arguments check out the accompanying help file ?plot.incidence).

We can easily fill the plot according to one of the groups present, or with a single color of our choosing:

i %>% plot(fill = gender)
#> plot() can only stack/dodge by one variable.
#> For multi-facet plotting try facet_plot()

i %>% plot(fill = hospital, legend = "bottom")
#> plot() can only stack/dodge by one variable.
#> For multi-facet plotting try facet_plot()

i %>% regroup() %>% plot(fill = "red", color = "white")

Sometimes we may wish to rotate the x-axis labels. This can be done by passing an additional angle argument:

i %>% plot(angle = 45)
#> plot() can only stack/dodge by one variable.
#> For multi-facet plotting try facet_plot()

Applying the style of the European Programme for Intervention Epidemiology Training (EPIET)

For small datasets, the EPIET convention is to display individual cases as rectangles. This can be done by setting show_cases = TRUE and using a white border. Additionally, setting coord_equal = TRUE forces each case to be drawn as a square.

i_epiet <- incidence(dat[160:180, ], date_index = date_of_onset)
i_epiet %>% plot(color = "white", show_cases = TRUE, angle = 45, size = 10, n_breaks = 20)

facet_plot()

With stacked plots it can be difficult to ascertain differences between groups. For this reason, we provide the facet_plot() function for making faceted plots. This function takes similar arguments to plot() but includes an additional facets argument:

i %>% facet_plot(facets = gender, n_breaks = 3)

i %>% facet_plot(facets = hospital, fill = gender, n_breaks = 3, nrow = 4)

i %>% regroup(gender) %>% facet_plot(facets = gender, fill = "grey", color = "white")

Miscellaneous

Changing colors

The default palette

A color palette is a function which outputs a specified number of colors. By default, the palette used by incidence2 is called vibrant(). Its behaviour differs from that of usual palettes in that the first 6 colors are returned as they are, without interpolation; interpolation is used only when more than 6 colors are requested:

par(mfrow = c(2, 1), mar = c(4,2,1,1))
barplot(1:6, col = vibrant(6))
barplot(1:20, col = vibrant(20))
#> Using more colors (20) than this palette can handle (6); some colors will be interpolated.
#> Consider using `muted` palette instead?

We also provide a second palette called muted() where the first 9 colours are not interpolated:

par(mfrow = c(2,1), mar = c(4,2,1,1))
barplot(1:9, col = muted(9))
barplot(1:20, col = muted(20))
#> Using more colors (20) than this palette can handle (9); some colors will be interpolated.
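To make the "fixed first, interpolated afterwards" behaviour concrete, here is a minimal sketch of a palette function written in that spirit. It is not the implementation of vibrant() or muted(): the name sketch_pal() and the six base colors are hypothetical, and the interpolation uses colorRampPalette() from base R as a stand-in.

```r
# Hypothetical base colors; these are NOT the colors used by vibrant() or muted().
base_cols <- c("#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#66a61e", "#e6ab02")

sketch_pal <- function(n) {
  if (n <= length(base_cols)) {
    # Enough fixed colors are available: return the first n unchanged.
    return(base_cols[seq_len(n)])
  }
  # More colors requested than available: tell the user, then interpolate.
  message("Using more colors (", n, ") than this palette can handle (",
          length(base_cols), "); some colors will be interpolated.")
  grDevices::colorRampPalette(base_cols)(n)
}

sketch_pal(3)            # the first three base colors, untouched
length(sketch_pal(20))   # 20 colors, interpolated between the base colors
```

The point of this design is that small numbers of groups always get the same, clearly distinguishable colors, while larger numbers still produce a result.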

Using different palettes

Other color palettes can be provided via col_pal. Various palettes are part of the base R distribution, and many more are provided in additional packages. We provide a couple of examples:

i %>% regroup(hospital) %>% plot(fill = hospital, col_pal = rainbow, n_breaks = 3) # see ?rainbow

i %>% regroup(gender) %>% plot(fill = gender, col_pal = cm.colors)   # see ?cm.colors
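Since a palette is simply a function of the number of colors, a custom one can be built with base R. The sketch below uses colorRampPalette(); the name blue_red and the choice of colors are arbitrary and only for illustration.

```r
# colorRampPalette() returns a function of n, which is what a palette is.
blue_red <- grDevices::colorRampPalette(c("navy", "grey90", "firebrick"))

blue_red(5)                       # five hex color codes running from navy to firebrick
barplot(1:5, col = blue_red(5))   # quick visual check with base graphics
```

Assuming col_pal accepts any function of this form, as it does for rainbow and cm.colors above, such a palette could be supplied as col_pal = blue_red.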