---
title: "04 Credible intervals for transition probabilities: Cypripedium calceolus"
author: "Raymond Tremblay"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{04 Credible intervals for transition probabilities: Cypripedium calceolus}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>",
  fig.width  = 7,
  fig.height = 5
)
```

## Introduction

This vignette demonstrates how to use `transition_CrI()` and
`plot_transition_CrI()` to compute and visualise Bayesian credible intervals
for transition probabilities, and how the choice of prior weight influences
the posterior estimates.

We use a published transition matrix for the lady's slipper orchid
*Cypripedium calceolus*, a long-lived, clonal species of conservation concern
in Europe. The matrix was extracted from the COMPADRE Plant Matrix Database
(MatrixID 242623; Salguero-Gómez et al. 2015), originally published in
Shefferson et al. (2001, *Conservation Biology*,
DOI: 10.1111/j.1523-1739.2010.01466.x).

We use **matU** — the survival-and-growth sub-matrix — which contains only
transitions between living stages. This is the appropriate input for
`raretrans`, which models transition probabilities using a
Dirichlet-multinomial model.

## The *Cypripedium calceolus* transition matrix (matU)

The population is structured into six stages:
**Dormant**, **Smallest**, **Small**, **Intermediate**, **Large**, and
**Extra Large**.
```{r data}
library(raretrans)

stage_names <- c("Dormant", "Smallest", "Small",
                 "Intermediate", "Large", "Extra Large")

# matU: survival and growth transitions only (matA = matU + matF + matC)
# Source: COMPADRE MatrixID 242623
# Shefferson et al. (2001) Conservation Biology
# DOI: 10.1111/j.1523-1739.2010.01466.x
matU <- matrix(
  c(0.78, 0.00, 0.00, 0.00, 0.00, 0.00,
    0.06, 0.42, 0.03, 0.00, 0.00, 0.00,
    0.00, 0.24, 0.62, 0.05, 0.00, 0.00,
    0.00, 0.00, 0.21, 0.73, 0.06, 0.00,
    0.00, 0.00, 0.00, 0.12, 0.74, 0.07,
    0.00, 0.00, 0.00, 0.00, 0.11, 0.83),
  nrow = 6, ncol = 6, byrow = TRUE,
  dimnames = list(stage_names, stage_names)
)

# Construct TF list (no fecundity in matU so F is all zeros)
F_mat <- matrix(0, nrow = 6, ncol = 6,
                dimnames = list(stage_names, stage_names))
TF <- list(T = matU, F = F_mat)

# Observed stage distribution (number of individuals per stage)
N <- c(15, 12, 28, 34, 22, 10)
names(N) <- stage_names

matU
```

Note that the columns of `matU` do not sum to 1 — the remainder represents
individuals that died during the census interval and is handled internally
by `raretrans` as the implicit "dead" fate.
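
The implied mortality can be recovered directly from the matrix. The chunk below is a minimal base-R sketch rather than a call into `raretrans`: it repeats the matrix values under the illustrative names `stages_demo` and `U_demo` (both hypothetical) and computes one minus each column sum.

```{r sketch_mortality}
# Illustrative copy of matU; the names stages_demo and U_demo are hypothetical
stages_demo <- c("Dormant", "Smallest", "Small",
                 "Intermediate", "Large", "Extra Large")
U_demo <- matrix(
  c(0.78, 0.00, 0.00, 0.00, 0.00, 0.00,
    0.06, 0.42, 0.03, 0.00, 0.00, 0.00,
    0.00, 0.24, 0.62, 0.05, 0.00, 0.00,
    0.00, 0.00, 0.21, 0.73, 0.06, 0.00,
    0.00, 0.00, 0.00, 0.12, 0.74, 0.07,
    0.00, 0.00, 0.00, 0.00, 0.11, 0.83),
  nrow = 6, byrow = TRUE,
  dimnames = list(stages_demo, stages_demo)
)

survival  <- colSums(U_demo)  # probability of surviving, per source stage
mortality <- 1 - survival     # the implicit "dead" fate
round(rbind(survival, mortality), 2)
```

Columns are source stages, so each column sum is the total survival probability of that stage over one census interval.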

## Computing credible intervals

`transition_CrI()` computes the marginal posterior beta credible interval
for every entry of the transition matrix, including the probability of dying.
By default it uses a uniform (uninformative) Dirichlet prior.
```{r cri_default}
cri_uniform <- transition_CrI(TF, N, stage_names = stage_names)
head(cri_uniform, 10)
```

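Each row describes one transition, identified by its source (`from_stage`)
and fate (`to_stage`), and gives the posterior `mean` transition probability
together with the `lower` and `upper` bounds of its 95% credible interval.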
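
To make "marginal posterior beta" concrete, the sketch below works through the Dormant column by hand. It assumes that the fate counts are the `matU` proportions multiplied by the 15 Dormant individuals, and that the Dirichlet prior has a total weight of 1 spread equally over the seven fates. Both are assumptions made for illustration, and `beta_cri()` is a hypothetical helper, so the numbers need not match the output of `transition_CrI()` exactly.

```{r sketch_beta_marginal}
# Hypothetical helper, not part of raretrans:
# marginal beta summaries of a Dirichlet posterior for one source stage
beta_cri <- function(counts, prior, level = 0.95) {
  a    <- counts + prior        # Dirichlet posterior parameters
  a0   <- sum(a)
  tail <- (1 - level) / 2
  data.frame(
    fate  = names(counts),
    mean  = a / a0,
    lower = qbeta(tail,     a, a0 - a),  # component i is Beta(a_i, a0 - a_i)
    upper = qbeta(1 - tail, a, a0 - a),
    row.names = NULL
  )
}

# Assumed fate counts for the 15 Dormant individuals (proportions from matU)
dormant_counts <- 15 * c(Dormant = 0.78, Smallest = 0.06, Small = 0,
                         Intermediate = 0, Large = 0, `Extra Large` = 0,
                         dead = 0.16)

# Assumed prior: total weight 1, equal across the seven fates
dormant_cri <- beta_cri(dormant_counts, prior = rep(1 / 7, 7))
dormant_cri
```

The marginal of each Dirichlet component is a beta distribution, which is why the interval for a single fate can be read from `qbeta()` without simulating the whole column.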

## Visualising with `plot_transition_CrI()`

### Including the dead fate (default)
```{r plot_with_dead, fig.cap = "Posterior transition probabilities with 95% credible intervals for all fates including mortality."}
plot_transition_CrI(cri_uniform,
                    title = "Cypripedium calceolus — uniform prior")
```

Each panel shows the fate distribution from one source stage. Points are
posterior means; vertical bars are 95% credible intervals. Wide intervals
indicate stages with few observed individuals.
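
The link between sample size and interval width can be checked with a few lines of base R. This sketch holds the observed proportion fixed at an arbitrary 0.75, assumes a flat Beta(1, 1) prior, and varies only the number of individuals using the stage counts from `N` (copied here as the hypothetical `N_demo`).

```{r sketch_width_vs_n}
# Stage sample sizes copied from N above; N_demo is an illustrative name
N_demo <- c(Dormant = 15, Smallest = 12, Small = 28,
            Intermediate = 34, Large = 22, `Extra Large` = 10)

p_obs <- 0.75                      # arbitrary fixed proportion
a <- p_obs * N_demo + 1            # assumed flat Beta(1, 1) prior
b <- (1 - p_obs) * N_demo + 1

# Width of the 95% equal-tailed interval for each sample size
width <- setNames(qbeta(0.975, a, b) - qbeta(0.025, a, b), names(N_demo))
round(sort(width, decreasing = TRUE), 3)
```

With the proportion held constant, the stage with the fewest individuals (Extra Large) has the widest interval and the stage with the most (Intermediate) has the narrowest.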

### Excluding the dead fate
```{r plot_no_dead, fig.cap = "Posterior transition probabilities excluding the dead fate."}
plot_transition_CrI(cri_uniform,
                    include_dead = FALSE,
                    title = "Cypripedium calceolus — transitions only")
```

Excluding the dead fate focuses attention on survival transitions between
living stages, which is often more informative for life-history comparisons.

## Effect of prior weight

The `priorweight` argument controls how strongly the prior pulls the estimates
toward equal probabilities across fates. A value of `-1` (the default) keeps
the minimally informative prior used in the sections above. Positive values
express the prior weight as a percentage of the observed sample size: for
example, `priorweight = 50` means the prior contributes half as many
pseudo-observations as the data.

This matters most for rare stages with few observed individuals, where the
prior can have a strong regularising effect.
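
The arithmetic behind the percentage scale can be sketched as follows. The code assumes the description above (prior pseudo-observations equal to `priorweight / 100` times the stage sample size) and a prior mean of 1/7 for each of the seven fates; `N_demo` and the other names are illustrative and are not `raretrans` objects.

```{r sketch_pseudo_counts}
# Stage sample sizes copied from N above; N_demo is an illustrative name
N_demo <- c(Dormant = 15, Smallest = 12, Small = 28,
            Intermediate = 34, Large = 22, `Extra Large` = 10)
weights <- c(25, 50, 100)          # prior weight as % of sample size

# Pseudo-observations contributed by the prior, per stage and weight
pseudo_n <- outer(N_demo, weights / 100)
colnames(pseudo_n) <- paste0(weights, "%")
pseudo_n

# Posterior mean for Dormant -> Dormant as a weighted average of the
# observed proportion (0.78) and the assumed prior mean (1/7)
p_obs   <- 0.78
p_prior <- 1 / 7
n_obs   <- N_demo[["Dormant"]]
n_prior <- pseudo_n["Dormant", ]
post_mean <- (n_obs * p_obs + n_prior * p_prior) / (n_obs + n_prior)
round(post_mean, 3)
```

As the weight grows, the posterior mean moves away from the observed 0.78 and toward the prior mean, which is the shrinkage that the comparison below visualises.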
```{r prior_comparison}
# Uninformative prior (default)
cri_uninf <- transition_CrI(TF, N,
                             priorweight  = -1,
                             stage_names  = stage_names)

# Weakly informative prior (25% of sample size)
cri_weak  <- transition_CrI(TF, N,
                             priorweight  = 25,
                             stage_names  = stage_names)

# Strongly informative prior (100% of sample size)
cri_strong <- transition_CrI(TF, N,
                              priorweight = 100,
                              stage_names = stage_names)

# Mean 95% CrI width across all fates leaving one source stage
mean_cri_width <- function(cri, stage = "Dormant") {
  rows <- cri$from_stage == stage
  mean(cri$upper[rows] - cri$lower[rows])
}

# Compare interval widths for the Dormant stage
comp <- data.frame(
  prior      = c("Uninformative", "Weak (25%)", "Strong (100%)"),
  mean_width = c(mean_cri_width(cri_uninf),
                 mean_cri_width(cri_weak),
                 mean_cri_width(cri_strong))
)
comp
```
```{r plot_prior_comparison, fig.height = 10, fig.cap = "Effect of prior weight on credible interval width. Stronger priors narrow the intervals and pull means toward equal transition probabilities."}
library(ggplot2)

cri_uninf$prior  <- "Uninformative"
cri_weak$prior   <- "Weak (25%)"
cri_strong$prior <- "Strong (100%)"

cri_all <- rbind(cri_uninf, cri_weak, cri_strong)
cri_all$prior <- factor(cri_all$prior,
                        levels = c("Uninformative",
                                   "Weak (25%)",
                                   "Strong (100%)"))

ggplot(cri_all,
       aes(x = to_stage, y = mean, ymin = lower, ymax = upper,
           colour = prior)) +
  geom_pointrange(position = position_dodge(width = 0.5)) +
  facet_wrap(~from_stage, scales = "free_x") +
  scale_y_continuous(limits = c(0, 1)) +
  scale_colour_manual(values = c("Uninformative" = "grey40",
                                 "Weak (25%)"    = "steelblue",
                                 "Strong (100%)" = "firebrick")) +
  labs(x      = "Destination stage",
       y      = "Transition probability",
       colour = "Prior weight",
       title  = "Effect of prior weight on credible intervals",
       subtitle = "Cypripedium calceolus — matU") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```

A stronger prior (red) shrinks credible intervals and pulls posterior means
toward equal probabilities. For well-sampled stages (e.g. Intermediate,
Large) the effect is small. For rare stages (e.g. Dormant, Extra Large)
the prior has a more pronounced influence — a good reason to think
carefully about prior choice when sample sizes are small.

## Visualising full posterior densities

`plot_transition_density()` shows the complete marginal posterior beta
distribution for every transition, arranged as an (n+1) x n grid mirroring
the structure of the projection matrix. Columns are source stages (from) and
rows are destination stages (to), with the dead fate as the bottom row.
The shaded region shows the 95% credible interval.

### With uninformative prior

```{r density_uninf, fig.width = 9, fig.height = 8, fig.cap = "Full posterior beta densities for all transitions with uninformative prior. Shaded region = 95% credible interval."}
plot_transition_density(TF, N,
                        stage_names  = stage_names,
                        title        = "Cypripedium calceolus — uninformative prior")
```

Wide, flat densities indicate high uncertainty (few observations).
Narrow, peaked densities indicate well-estimated transitions.
Panels for transitions that are zero in `matU` show a density spike at 0.
This is expected: with no observed events, the posterior mass piles up
against the lower bound.
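
For a single panel, the same kind of picture can be drawn with base graphics. The sketch below is not how `plot_transition_density()` is implemented. It assumes a beta posterior for the Dormant to Dormant transition built from 15 individuals, the `matU` proportion 0.78, and a prior of total weight 1 spread over seven fates, then shades the 95% credible interval.

```{r sketch_density, fig.width = 6, fig.height = 4}
n_dormant <- 15
n_fates   <- 7

# Assumed posterior shape parameters for Dormant -> Dormant
a <- 0.78 * n_dormant + 1 / n_fates
b <- (1 - 0.78) * n_dormant + (n_fates - 1) / n_fates
cri <- qbeta(c(0.025, 0.975), a, b)

x    <- seq(0, 1, length.out = 501)
dens <- dbeta(x, a, b)
plot(x, dens, type = "l",
     xlab = "Transition probability", ylab = "Posterior density",
     main = "Dormant to Dormant (sketch)")

# Shade the region between the 2.5% and 97.5% quantiles
inside <- x >= cri[1] & x <= cri[2]
polygon(c(cri[1], x[inside], cri[2]), c(0, dens[inside], 0),
        col = "grey80", border = NA)
lines(x, dens)

# A fate never observed from Dormant keeps only its prior pseudo-count,
# so its first shape parameter is below 1 and the density rises toward 0
a_zero <- 1 / n_fates
b_zero <- n_dormant + 1 - a_zero
spike  <- dbeta(c(0.1, 0.01, 0.001), a_zero, b_zero)
spike
```

The last three values increase as the evaluation point approaches 0, which is the spike seen in the panels for unobserved transitions.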

### Effect of prior weight on densities

A stronger prior pulls each density toward the prior's equal-probability
value and narrows it, particularly for rare stages.

```{r density_strong, fig.width = 9, fig.height = 8, fig.cap = "Posterior beta densities with a strong prior (100% of sample size)."}
plot_transition_density(TF, N,
                        priorweight  = 100,
                        stage_names  = stage_names,
                        title        = "Cypripedium calceolus — strong prior (100%)")
```

### Excluding the dead fate

```{r density_no_dead, fig.width = 9, fig.height = 7, fig.cap = "Posterior densities for survival transitions only (dead fate excluded)."}
plot_transition_density(TF, N,
                        stage_names  = stage_names,
                        include_dead = FALSE,
                        title        = "Cypripedium calceolus — transitions only")
```

## Summary

| Function | Purpose |
|---|---|
| `transition_CrI()` | Compute posterior beta credible intervals for all transitions |
| `plot_transition_CrI()` | Point-range plot of posterior means and 95% credible intervals, one panel per source stage |
| `plot_transition_density()` | Full posterior density curves arranged as a matrix plot |

For the full posterior density visualisation see `?plot_transition_density`.

## References


Salguero-Gómez, R., Jones, O.R., Archer, C.R., et al. (2015).
The COMPADRE Plant Matrix Database: an open online repository for plant
demography. *Journal of Ecology*, 103, 202–218.