The brglm2 R
package provides the brnb()
function for fitting negative
binomial regression models (see Agresti
(2015), Section 7.3, for a recent account on negative binomial
regression models) using either maximum likelihood or any of the various
bias reduction and adjusted estimating functions methods provided by
brglmFit()
(see ?brglmFit
for resources).
This vignette demonstrates the use of brnb()
and of the
associated methods, using the case studies in Kenne Pagui, Salvan, and Sartori (2020).
Margolin, Kim, and Risko (1989) provide
data from an Ames salmonella reverse mutagenicity assay. The response
variable corresponds to the number of revertant colonies observed
(freq
) on each of three replicate plates
(plate
), and the covariate (dose
) is the dose
level of quinoline on the plate in micro-grams. The code chunk below
sets up a data frame with the data from replicate 1 in Margolin, Kim, and Risko (1989, Table 1).
freq <- c(15, 16, 16, 27, 33, 20,
21, 18, 26, 41, 38, 27,
29, 21, 33, 60, 41, 42)
dose <- rep(c(0, 10, 33, 100, 333, 1000), 3)
plate <- rep(1:3, each = 6)
(salmonella <- data.frame(freq, dose, plate))
#> freq dose plate
#> 1 15 0 1
#> 2 16 10 1
#> 3 16 33 1
#> 4 27 100 1
#> 5 33 333 1
#> 6 20 1000 1
#> 7 21 0 2
#> 8 18 10 2
#> 9 26 33 2
#> 10 41 100 2
#> 11 38 333 2
#> 12 27 1000 2
#> 13 29 0 3
#> 14 21 10 3
#> 15 33 33 3
#> 16 60 100 3
#> 17 41 333 3
#> 18 42 1000 3
The following code chunks reproduces Kenne Pagui, Salvan, and Sartori (2020, Table 2) by estimating the negative binomial regression model with log link and model formula
using the various estimation methods that brnb()
supports.
library("brglm2")
ames_ML <- brnb(ames_f, link = "log", data = salmonella,
transformation = "identity", type = "ML")
## Estimated regression and dispersion parameters
est <- coef(ames_ML, model = "full")
## Estimated standard errors for the regression parameters
sds <- sqrt(c(diag(ames_ML$vcov.mean), ames_ML$vcov.dispersion))
round(cbind(est, sds), 4)
#> est sds
#> (Intercept) 2.1976 0.3246
#> dose -0.0010 0.0004
#> log(dose + 10) 0.3125 0.0879
#> identity(dispersion) 0.0488 0.0281
The following code chunks updates the model fit using asymptotic mean-bias correction for estimating the model parameters
ames_BC <- update(ames_ML, type = "correction")
## Estimated regression and dispersion parameters
est <- coef(ames_BC, model = "full")
## Estimated standard errors for the regression parameters
sds <- sqrt(c(diag(ames_BC$vcov.mean), ames_BC$vcov.dispersion))
round(cbind(est, sds), 4)
#> est sds
#> (Intercept) 2.2098 0.3482
#> dose -0.0010 0.0004
#> log(dose + 10) 0.3105 0.0947
#> identity(dispersion) 0.0626 0.0328
The corresponding fit using mean-bias reducing adjusted score equations is
ames_BRmean <- update(ames_ML, type = "AS_mean")
## Estimated regression and dispersion parameters
est <- coef(ames_BRmean, model = "full")
## Estimated standard errors for the regression parameters
sds <- sqrt(c(diag(ames_BRmean$vcov.mean), ames_BRmean$vcov.dispersion))
round(cbind(est, sds), 4)
#> est sds
#> (Intercept) 2.2155 0.3515
#> dose -0.0010 0.0004
#> log(dose + 10) 0.3092 0.0956
#> identity(dispersion) 0.0647 0.0334
The corresponding fit using median-bias reducing adjusted score equations is
ames_BRmedian <- update(ames_ML, type = "AS_median")
## Estimated regression and dispersion parameters
est <- coef(ames_BRmedian, model = "full")
## Estimated standard errors for the regression parameters
sds <- sqrt(c(diag(ames_BRmedian$vcov.mean), ames_BRmedian$vcov.dispersion))
round(cbind(est, sds), 4)
#> est sds
#> (Intercept) 2.2114 0.3592
#> dose -0.0010 0.0004
#> log(dose + 10) 0.3091 0.0978
#> identity(dispersion) 0.0692 0.0350
As is done in Kosmidis, Kenne Pagui, and Sartori (2020, sec. 4) for generalized linear models, we can exploit the Fisher orthogonality of the regression parameters and the dispersion parameter and use a composite bias reduction adjustment to the score functions. Such an adjustment delivers mean-bias reduced estimates for the regression parameters and a median-bias reduced estimate for the dispersion parameter. The resulting estimates of the regression parameters are invariant in terms of their mean bias properties under arbitrary contrasts, and that of the dispersion parameter is invariant in terms of its median bias properties under monotone transformations.
Fitting the model using mixed-bias reducing adjusted score equations gives
ames_BRmixed <- update(ames_ML, type = "AS_mixed")
## Estimated regression and dispersion parameters
est <- coef(ames_BRmixed, model = "full")
## Estimated standard errors for the regression parameters
sds <- sqrt(c(diag(ames_BRmixed$vcov.mean), ames_BRmixed$vcov.dispersion))
round(cbind(est, sds), 4)
#> est sds
#> (Intercept) 2.2170 0.3591
#> dose -0.0010 0.0004
#> log(dose + 10) 0.3088 0.0978
#> identity(dispersion) 0.0693 0.0350
The differences between reduced-bias estimation and maximum likelihood are particularly pronounced for the dispersion parameter. Improved estimation of the dispersion parameter results to larger estimated standard errors than maximum likelihood. Hence, the estimated standard errors based on the maximum likelihood estimates appear to be smaller than they should be, which is also supported by the simulation results in Kenne Pagui, Salvan, and Sartori (2020, sec. 5).
?brglmFit
and ?brglm_control
contain quick
descriptions of the various bias reduction methods supported in
brglm2. The iteration
vignette describes the iteration and gives the mathematical details for
the bias-reducing adjustments to the score functions for generalized
linear models.
If you found this vignette or brglm2, in general,
useful, please consider citing brglm2 and the
associated paper. You can find information on how to do this by typing
citation("brglm2")
.