---
title: "Introduction to spatsoc"
author: "Alec L. Robitaille, Quinn Webber and Eric Vander Wal"
output: 
  rmarkdown::html_vignette:
    number_sections: true
    toc: true
vignette: >
  %\VignetteIndexEntry{Introduction to spatsoc}
  %\VignetteEngine{knitr::knitr}
  %\VignetteEncoding{UTF-8}
---

```{r knitropts, include = FALSE}
knitr::opts_chunk$set(message = TRUE, 
                      warning = FALSE,
                      eval = FALSE, 
                      echo = FALSE)
```

The `spatsoc` package provides functionality for analyzing animal relocation data in time and space to identify potential interactions among individuals and build gambit-of-the-group data for constructing social networks. 

The package contains grouping and edge-list generating functions that are used for identifying spatially and temporally explicit groups from input data. In addition, we provide social network analysis functions for randomizing individual identifiers within groups, designed to test whether social networks generated from animal relocation data were based on non-random social proximity among individuals and for generating group by individual matrices.

The functions were developed for application across animal relocation data, for example, proximity based social network analyses and spatial and temporal clustering of points.

---

See the other vignettes for further information:

- [Introduction to spatsoc](https://docs.ropensci.org/spatsoc/articles/intro.html)
  - temporal grouping
  - spatiotemporal grouping with `group_pts`, `group_lines`, `group_polys`
  - distance based edge-list generation with `edge_dist`
- [Frequently asked questions about spatsoc](https://docs.ropensci.org/spatsoc/articles/faq.html)
  - install
  - function details for `group_times`, `group_pts`, `group_lines`, `group_polys`,
  `edge_dist`, `edge_nn`, and `randomizations`
  - package design including modify-by-reference, data.table column allocation
  - calculating summary information
- [Using spatsoc in social network analysis](https://docs.ropensci.org/spatsoc/articles/using-in-sna.html)
  - generating gambit-of-the-group data
  - generating observed networks
  - data stream randomization, randomized networks
  - network metrics
- [Using distance based edge-lists generating functions, dyad_id, and fusion_id](https://docs.ropensci.org/spatsoc/articles/using-edge-and-dyad.html)
  - generate distance based edge-lists with `edge_dist` and `edge_nn`
  - generate dyad identifiers for edge-lists with `dyad_id`
  - identify fusion events with `fusion_id`
- [Geometry interface](https://docs.ropensci.org/spatsoc/articles/geometry-interface-and-spatial-measures.html)
  - using `get_geometry` to setup a geometry column and use the geometry interface
  - details of underlying distance, direction and centroid spatial measures
  - converting to and from related packages
- [Interspecific interactions](https://docs.ropensci.org/spatsoc/articles/interspecific-interactions.html)
  - combine two movement datasets
  - identify interspecific interactions

## Data preparation

```{r, eval = TRUE, echo = TRUE}
# Load packages
library(spatsoc)
library(data.table)
```

```{r, echo = FALSE, eval = TRUE}
data.table::setDTthreads(1)
```

```{r, eval = TRUE, echo = TRUE}
# Read in spatsoc's example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))

# Use subset of individuals
DT <- DT[ID %in% c('H', 'I', 'J')]

# Cast character column 'datetime' as POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
DT <- DT[ID %chin% c('H', 'I', 'J')]
```

```{r, eval = TRUE, echo = FALSE}
knitr::kable(DT[, .SD[1:2], ID][order(ID)])
```


`spatsoc` expects a `data.table` for all of its functions. If you have a
`data.frame`, you can use `data.table::setDT()` to convert it by reference. If
your data is a CSV, you can use `data.table::fread()` to import it as a
`data.table`.

The data consist of relocations of `r DT[, uniqueN(ID)]` individuals over `r
DT[, max(yday(datetime)) - min(yday(datetime))]` days. Using these data, we can
compare the various grouping methods available in `spatsoc`. Note: these
examples will use a subset of the data, only individuals H, I and J.

## Temporal grouping
The `group_times` function is used to group relocations temporally. It is
flexible to a threshold provided in units of minutes, hours or days. Since GPS
fixes taken at regular intervals have some level of variability, we will provide
a time threshold (`threshold`), to consider all fixes within this threshold
taken at the same time. Alternatively, we may want to understand different
scales of grouping, perhaps daily movement trajectories or seasonal home range
overlap.

```{r groupmins, echo = TRUE}
group_times(DT, datetime = 'datetime', threshold = '5 minutes')
```

```{r tableSetUp, eval = TRUE}
nRows <- 9
```

```{r tabgroupmins, eval = TRUE}
knitr::kable(
  group_times(DT, threshold = '5 minutes', datetime = 'datetime')[, 
    .(ID, X, Y, datetime, minutes, timegroup)][
      order(datetime)][
        1:nRows])
```


A message is returned when `group_times` is run again on the same `DT`, as the columns already exist in the input `DT` and will be overwritten. 

```{r grouphours, echo = TRUE}
group_times(DT, datetime = 'datetime', threshold = '2 hours')
```

```{r tabgrouphours, eval = TRUE}
knitr::kable(
  group_times(DT, threshold = '2 hours', datetime = 'datetime')[, 
    .(ID, X, Y, datetime, hours, timegroup)][
      order(datetime)][
        1:nRows])
```

```{r groupdays, echo = TRUE}
group_times(DT, datetime = 'datetime', threshold = '5 days')
```

```{r tabgroupdays, eval = TRUE}
knitr::kable(
  group_times(
    DT, 
    threshold = '5 days', 
    datetime = 'datetime')[,
      .SD[sample(.N, 3)], by = .(timegroup, block)][order(datetime)][
        1:nRows, .(ID, X, Y, datetime, block, timegroup)]
)
```

## Spatial grouping
The `group_pts` function compares the relocations of all individuals in each
timegroup and groups individuals based on a distance threshold provided by the
user. The `group_pts` function uses the "chain rule" where three or more
individuals that are all within the defined threshold distance of at least one
other individual are considered in the same group. For point based spatial
grouping with a distance threshold that does not use the chain rule, see
`edge_dist` below.

```{r grouppts, echo = TRUE}
group_times(DT = DT, datetime = 'datetime', threshold = '15 minutes')
group_pts(DT, threshold = 50, id = 'ID', coords = c('X', 'Y'), timegroup = 'timegroup')
```

```{r fakegrouppts, eval = TRUE}
DT <- group_times(DT = DT, datetime = 'datetime', 
                     threshold = '15 minutes')
DT <- group_pts(
    DT = DT,
    threshold = 50, id = 'ID', coords = c('X', 'Y'),
    timegroup = 'timegroup')

knitr::kable(
  DT[
      between(group,  771, 774)][order(timegroup)][
        1:nRows, .(ID, X, Y, timegroup, group)]
)
```


The `group_lines` function groups individuals whose trajectories intersect in a
specified time interval. This represents a coarser grouping method than
`group_pts` which can help understand shared space at daily, weekly or other
temporal resolutions.

```{r fakegrouplines, echo = TRUE}
utm <- 32736

group_times(DT = DT, datetime = 'datetime', threshold = '1 day')
group_lines(DT, threshold = 50, crs = utm, 
            id = 'ID', coords = c('X', 'Y'),
            timegroup = 'timegroup', sortBy = 'datetime')
```

```{r grouplines, eval = TRUE}
utm <- 32736

DT <- group_times(DT = DT, datetime = 'datetime', 
                threshold = '1 day')
DT <- group_lines(DT,
                  threshold = 50, crs = utm, 
                  id = 'ID', coords = c('X', 'Y'), 
                  sortBy = 'datetime', timegroup = 'timegroup')
knitr::kable(
  unique(DT[, .(ID, timegroup, group)])[, .SD[1:3], ID][order(timegroup)]
)
```


The `group_polys` function groups individuals whose home ranges intersect. This
represents the coarsest grouping method, to provide a measure of overlap across
seasons, years or all available relocations. It can either return the proportion
of home range area overlapping between individuals or simple groups. Home ranges
are calculated using `adehabitatHR::kernelUD` or `adehabitatHR::mcp`.
Alternatively, a simple features POLYGON or MULTIPOLYGON object can be provided
to the `sfPolys` argument along with an id column.

```{r fakegrouppolys, echo = TRUE}
utm <- 32736
group_times(DT = DT, datetime = 'datetime', threshold = '8 days')
group_polys(DT = DT, area = TRUE, hrType = 'mcp',
           hrParams = list('percent' = 95),
           crs = utm,
           coords = c('X', 'Y'), id = 'ID')
```

```{r grouppolys, eval = TRUE}
utm <- 32736
DT <- group_times(DT = DT, datetime = 'datetime', threshold = '8 days')
knitr::kable(
  data.frame(group_polys(
    DT, 
    area = TRUE, hrType = 'mcp',
           hrParams = list('percent' = 95),
           crs = utm,
           coords = c('X', 'Y'), id = 'ID')[
             , .(ID1, ID2, area, proportion)])
)
```

## Edge-list generation
The `edge_dist` function calculates the geographic distance between between
individuals within each timegroup and returns all paired relocations within the
spatial threshold. `edge_dist` uses a distance matrix like group_pts, but, in
contrast, does not use the chain rule to group relocations.

```{r edgedist, echo = TRUE}
group_times(DT = DT, datetime = 'datetime', threshold = '15 minutes')
edge_dist(DT, threshold = 50, id = 'ID', 
          coords = c('X', 'Y'), timegroup = 'timegroup', fillNA = TRUE)
```

```{r fakeedgedist, eval = TRUE}
DT <- group_times(DT = DT, datetime = 'datetime', 
                     threshold = '15 minutes')
edges <- edge_dist(DT, threshold = 50, id = 'ID', 
                   coords = c('X', 'Y'), timegroup = 'timegroup', fillNA = TRUE)

knitr::kable(
  edges[between(timegroup, 158, 160)]
)
```

The `edge_nn` function calculates the nearest neighbour to each individual
within each time group. If the optional distance threshold is provided, it is
used to limit the maximum distance between neighbours. `edge_nn` returns an edge
list of each individual and their nearest neighbour.

```{r edgenn, echo = TRUE}
group_times(DT = DT, datetime = 'datetime', threshold = '15 minutes')
edge_nn(DT, id = 'ID', coords = c('X', 'Y'), timegroup = 'timegroup')
```

```{r fakeedgenn, eval = TRUE}
DT <- group_times(DT = DT, datetime = 'datetime', 
                     threshold = '15 minutes')
edges <- edge_nn(DT, id = 'ID', coords = c('X', 'Y'), timegroup = 'timegroup')

knitr::kable(
  edges[1:6]
)
```

