Handling incidence objects

We try to make incidence() objects easy to work with by providing helper functions for manipulating and accessing the data they contain, and by integrating with tidyverse verbs.

Modifying incidence objects

regroup

Sometimes you may find you’ve created a grouped incidence() object but now want to change the internal grouping. Provided the grouping you want is a subset of the one already generated, you can use the regroup() function to get the desired aggregation:

library(outbreaks)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(incidence2)

# load data
dat <- ebola_sim_clean$linelist

# generate the incidence object with 3 groups
inci <- incidence(dat, date_index = date_of_onset,
                  groups = c(gender, hospital, outcome),
                  interval = "week")
inci
#> An incidence2 object: 1,448 x 5
#> 5829 cases from 2014-W15 to 2015-W18
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>    date_index gender hospital                                     outcome count
#>    <yrwk>     <fct>  <fct>                                        <fct>   <int>
#>  1 2014-W15   f      Military Hospital                            <NA>        1
#>  2 2014-W16   m      Connaught Hospital                           <NA>        1
#>  3 2014-W17   f      <NA>                                         <NA>        1
#>  4 2014-W17   f      <NA>                                         Death       1
#>  5 2014-W17   f      other                                        Recover     2
#>  6 2014-W17   m      other                                        Recover     1
#>  7 2014-W18   f      <NA>                                         Recover     1
#>  8 2014-W18   f      Connaught Hospital                           Recover     1
#>  9 2014-W18   f      Princess Christian Maternity Hospital (PCMH) Death       1
#> 10 2014-W18   f      Rokupa Hospital                              Recover     1
#> # … with 1,438 more rows

# regroup to just two groups
inci %>% regroup(c(gender, outcome))
#> An incidence2 object: 320 x 4
#> 5829 cases from 2014-W15 to 2015-W18
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>    date_index gender outcome count
#>    <yrwk>     <fct>  <fct>   <int>
#>  1 2014-W15   f      <NA>        1
#>  2 2014-W16   m      <NA>        1
#>  3 2014-W17   f      <NA>        1
#>  4 2014-W17   f      Death       1
#>  5 2014-W17   f      Recover     2
#>  6 2014-W17   m      Recover     1
#>  7 2014-W18   f      Death       1
#>  8 2014-W18   f      Recover     3
#>  9 2014-W19   f      <NA>        4
#> 10 2014-W19   f      Death       2
#> # … with 310 more rows

# drop all groups
inci %>% regroup()
#> An incidence2 object: 56 x 2
#> 5829 cases from 2014-W15 to 2015-W18
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>    date_index count
#>    <yrwk>     <int>
#>  1 2014-W15       1
#>  2 2014-W16       1
#>  3 2014-W17       5
#>  4 2014-W18       4
#>  5 2014-W19      12
#>  6 2014-W20      17
#>  7 2014-W21      15
#>  8 2014-W22      19
#>  9 2014-W23      23
#> 10 2014-W24      21
#> # … with 46 more rows
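
To make the aggregation concrete, here is a minimal base R sketch of the same idea on a plain data frame. It is an illustration only, not how regroup() is implemented; the `counts` data frame, its column names and its numbers are made up for the example.

```r
# Toy weekly counts with two grouping variables (made-up numbers).
counts <- data.frame(
  week     = c("2014-W17", "2014-W17", "2014-W17", "2014-W18", "2014-W18"),
  gender   = c("f", "f", "m", "f", "m"),
  hospital = c("A", "B", "A", "A", "B"),
  count    = c(1L, 2L, 1L, 3L, 4L)
)

# Keep only `gender` as a group: sum counts over the dropped `hospital` column.
by_gender <- aggregate(count ~ week + gender, data = counts, FUN = sum)

# Drop all groups: one row per week.
overall <- aggregate(count ~ week, data = counts, FUN = sum)

# Neither aggregation changes the total number of cases.
stopifnot(sum(by_gender$count) == sum(counts$count))
stopifnot(sum(overall$count) == sum(counts$count))
```

This mirrors the output above, where every regrouped object still reports 5829 cases. One difference to keep in mind: the formula method of aggregate() drops rows with NA in a grouping column by default, whereas the regroup() output shown above keeps <NA> groups.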

cumulate

We also provide a helper function, cumulate(), to easily generate cumulative incidences:

inci %>% 
  regroup(hospital) %>% 
  cumulate() %>% 
  facet_plot(n_breaks = 4)
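
As a rough illustration of what a cumulative incidence is, the base R sketch below computes a running total of counts within each group of a toy data frame. The data and the `cumulative` column name are assumptions made for the example, and this is not the package's implementation of cumulate().

```r
# Toy weekly counts for two hospitals (made-up numbers).
counts <- data.frame(
  week     = c("2014-W17", "2014-W18", "2014-W19", "2014-W17", "2014-W18"),
  hospital = c("A", "A", "A", "B", "B"),
  count    = c(1L, 3L, 2L, 4L, 1L)
)

# Order by group and then by date so the running total follows time.
counts <- counts[order(counts$hospital, counts$week), ]

# ave() applies cumsum() separately within each hospital.
counts$cumulative <- ave(counts$count, counts$hospital, FUN = cumsum)

counts$cumulative
# hospital A: 1 4 6, hospital B: 4 5
```

The running total restarts for each hospital, which is why the data is regrouped by hospital before cumulate() is called in the pipeline above.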

keep_first and keep_last

Once your data is grouped by date, you may want to keep only the first or last few date groupings using keep_first() and keep_last(). The argument counts date groupings rather than rows, so the result can contain more rows than the number you ask for:

inci %>% keep_first(3)
#> An incidence2 object: 6 x 5
#> 7 cases from 2014-W15 to 2014-W17
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>   date_index gender hospital           outcome count
#>   <yrwk>     <fct>  <fct>              <fct>   <int>
#> 1 2014-W15   f      Military Hospital  <NA>        1
#> 2 2014-W16   m      Connaught Hospital <NA>        1
#> 3 2014-W17   f      <NA>               <NA>        1
#> 4 2014-W17   f      <NA>               Death       1
#> 5 2014-W17   f      other              Recover     2
#> 6 2014-W17   m      other              Recover     1

inci %>% keep_last(3)
#> An incidence2 object: 63 x 5
#> 103 cases from 2015-W16 to 2015-W18
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>    date_index gender hospital           outcome count
#>    <yrwk>     <fct>  <fct>              <fct>   <int>
#>  1 2015-W16   f      <NA>               <NA>        1
#>  2 2015-W16   f      <NA>               Death       7
#>  3 2015-W16   f      <NA>               Recover     1
#>  4 2015-W16   f      Connaught Hospital <NA>        1
#>  5 2015-W16   f      Connaught Hospital Death       5
#>  6 2015-W16   f      Connaught Hospital Recover     3
#>  7 2015-W16   f      Military Hospital  Recover     1
#>  8 2015-W16   f      other              <NA>        1
#>  9 2015-W16   f      other              Death       2
#> 10 2015-W16   f      other              Recover     1
#> # … with 53 more rows
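
The outputs above have more rows than the number requested (6 rows for 3 weeks, 63 rows for 3 weeks) because the selection is made on dates, not rows. A minimal base R sketch of that behaviour, using a made-up `counts` data frame rather than an incidence() object:

```r
# Toy weekly counts; 2014-W17 appears twice because it has two groups.
counts <- data.frame(
  week   = c("2014-W15", "2014-W16", "2014-W17", "2014-W17", "2014-W18", "2014-W19"),
  gender = c("f", "m", "f", "m", "f", "f"),
  count  = c(1L, 1L, 4L, 1L, 4L, 12L)
)

# Distinct weeks in chronological order.
weeks <- sort(unique(counts$week))

# All rows falling in the first 3 weeks: 4 rows, not 3.
first3 <- counts[counts$week %in% head(weeks, 3), ]

# All rows falling in the last 2 weeks.
last2 <- counts[counts$week %in% tail(weeks, 2), ]

nrow(first3)  # 4
nrow(last2)   # 2
```

Sorting the week labels as text works here only because they are zero-padded ISO-style strings; the sketch makes no claim about how keep_first() and keep_last() order dates internally.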

Tidyverse compatibility

incidence2 has been written with tidyverse compatibility (in particular dplyr) at the forefront of our design choices. By this we mean that if a dplyr operation is applied to an incidence() object, the result is still an incidence() object as long as the invariants of the object (i.e. groups, interval and uniqueness of rows) are preserved. If the invariants are not preserved, a tibble is returned instead. Some examples of these behaviours are given below:

library(dplyr)

# create incidence object
inci <-
  dat %>%
  incidence(
    date_index = date_of_onset,
    interval = "week",
    groups = c(hospital, gender)
  )

# filtering preserves class
inci %>% filter(gender == "f", hospital == "Rokupa Hospital")
#> An incidence2 object: 48 x 4
#> 210 cases from 2014-W18 to 2015-W18
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>    date_index hospital        gender count
#>    <yrwk>     <fct>           <fct>  <int>
#>  1 2014-W18   Rokupa Hospital f          1
#>  2 2014-W20   Rokupa Hospital f          1
#>  3 2014-W22   Rokupa Hospital f          1
#>  4 2014-W23   Rokupa Hospital f          1
#>  5 2014-W25   Rokupa Hospital f          1
#>  6 2014-W27   Rokupa Hospital f          1
#>  7 2014-W28   Rokupa Hospital f          4
#>  8 2014-W29   Rokupa Hospital f          2
#>  9 2014-W30   Rokupa Hospital f          1
#> 10 2014-W31   Rokupa Hospital f          1
#> # … with 38 more rows

# slice operations preserve class
inci %>% slice_sample(n = 10)
#> An incidence2 object: 10 x 4
#> 99 cases from 2014-W25 to 2015-W12
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>    date_index hospital                                     gender count
#>    <yrwk>     <fct>                                        <fct>  <int>
#>  1 2014-W30   Military Hospital                            m          1
#>  2 2014-W30   Princess Christian Maternity Hospital (PCMH) f          4
#>  3 2014-W46   Connaught Hospital                           f         29
#>  4 2015-W12   other                                        f          5
#>  5 2015-W08   <NA>                                         m          9
#>  6 2014-W49   Connaught Hospital                           m         25
#>  7 2015-W07   Rokupa Hospital                              m          4
#>  8 2014-W34   <NA>                                         f         13
#>  9 2014-W25   other                                        m          2
#> 10 2014-W28   <NA>                                         m          7

inci %>% slice(1, 5, 10)
#> An incidence2 object: 3 x 4
#> 3 cases from 2014-W15 to 2014-W19
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>   date_index hospital          gender count
#>   <yrwk>     <fct>             <fct>  <int>
#> 1 2014-W15   Military Hospital f          1
#> 2 2014-W17   other             m          1
#> 3 2014-W19   <NA>              f          1

# mutate preserves class
inci %>% mutate(future = date_index + 999)
#> An incidence2 object: 601 x 5
#> 5829 cases from 2014-W15 to 2015-W18
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>    date_index hospital                                     gender count future  
#>    <yrwk>     <fct>                                        <fct>  <int> <yrwk>  
#>  1 2014-W15   Military Hospital                            f          1 2033-W22
#>  2 2014-W16   Connaught Hospital                           m          1 2033-W23
#>  3 2014-W17   <NA>                                         f          2 2033-W24
#>  4 2014-W17   other                                        f          2 2033-W24
#>  5 2014-W17   other                                        m          1 2033-W24
#>  6 2014-W18   <NA>                                         f          1 2033-W25
#>  7 2014-W18   Connaught Hospital                           f          1 2033-W25
#>  8 2014-W18   Princess Christian Maternity Hospital (PCMH) f          1 2033-W25
#>  9 2014-W18   Rokupa Hospital                              f          1 2033-W25
#> 10 2014-W19   <NA>                                         f          1 2033-W26
#> # … with 591 more rows

# rename preserves class
inci %>% rename(left_bin = date_index)
#> An incidence2 object: 601 x 4
#> 5829 cases from 2014-W15 to 2015-W18
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>    left_bin hospital                                     gender count
#>    <yrwk>   <fct>                                        <fct>  <int>
#>  1 2014-W15 Military Hospital                            f          1
#>  2 2014-W16 Connaught Hospital                           m          1
#>  3 2014-W17 <NA>                                         f          2
#>  4 2014-W17 other                                        f          2
#>  5 2014-W17 other                                        m          1
#>  6 2014-W18 <NA>                                         f          1
#>  7 2014-W18 Connaught Hospital                           f          1
#>  8 2014-W18 Princess Christian Maternity Hospital (PCMH) f          1
#>  9 2014-W18 Rokupa Hospital                              f          1
#> 10 2014-W19 <NA>                                         f          1
#> # … with 591 more rows


# select returns a tibble unless all date, count and group variables are preserved
inci %>% select(-1)
#> # A tibble: 601 x 3
#>    hospital                                     gender count
#>    <fct>                                        <fct>  <int>
#>  1 Military Hospital                            f          1
#>  2 Connaught Hospital                           m          1
#>  3 <NA>                                         f          2
#>  4 other                                        f          2
#>  5 other                                        m          1
#>  6 <NA>                                         f          1
#>  7 Connaught Hospital                           f          1
#>  8 Princess Christian Maternity Hospital (PCMH) f          1
#>  9 Rokupa Hospital                              f          1
#> 10 <NA>                                         f          1
#> # … with 591 more rows

inci %>% select(everything())
#> An incidence2 object: 601 x 4
#> 5829 cases from 2014-W15 to 2015-W18
#> interval: 1 monday week
#> cumulative: FALSE
#> 
#>    date_index hospital                                     gender count
#>    <yrwk>     <fct>                                        <fct>  <int>
#>  1 2014-W15   Military Hospital                            f          1
#>  2 2014-W16   Connaught Hospital                           m          1
#>  3 2014-W17   <NA>                                         f          2
#>  4 2014-W17   other                                        f          2
#>  5 2014-W17   other                                        m          1
#>  6 2014-W18   <NA>                                         f          1
#>  7 2014-W18   Connaught Hospital                           f          1
#>  8 2014-W18   Princess Christian Maternity Hospital (PCMH) f          1
#>  9 2014-W18   Rokupa Hospital                              f          1
#> 10 2014-W19   <NA>                                         f          1
#> # … with 591 more rows
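
To give a feel for the "uniqueness of rows" invariant, the base R sketch below checks whether each combination of key columns appears at most once in a plain data frame. It assumes that the invariant means one row per date and group combination. The helper `has_unique_rows()` is hypothetical and is not part of incidence2; it only illustrates why dropping a column can break the invariant.

```r
# Toy counts keyed by week and hospital (made-up numbers).
counts <- data.frame(
  week     = c("2014-W17", "2014-W17", "2014-W18"),
  hospital = c("A", "B", "A"),
  count    = c(2L, 1L, 4L)
)

# Hypothetical helper (not part of incidence2): TRUE when all key columns
# are present and every combination of their values appears at most once.
has_unique_rows <- function(df, keys) {
  all(keys %in% names(df)) && !anyDuplicated(df[keys])
}

# All key columns present and no repeated combination.
has_unique_rows(counts, c("week", "hospital"))               # TRUE

# With the date column dropped, hospital "A" appears twice.
has_unique_rows(counts[c("hospital", "count")], "hospital")  # FALSE
```

The second call is the toy analogue of select(-1) above: once the date column is gone, the remaining rows no longer identify a unique date and group combination, so falling back to a tibble is the safer result.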

Accessing variable information

We provide multiple accessors to easily access information about an incidence() object's structure: