BLE_Categorical

library(BayesSampling)

Application of the BLE to categorical data

(From Section 4 of Gonçalves, Moura and Migon, “Bayes linear estimation for finite population with emphasis on categorical data”)

When a population can be divided into mutually exclusive categories, the BLE_Categorical() function computes the Bayes Linear Estimator for the proportion of individuals in each category. As used in the examples below, it receives the sample proportions (ys), the sample size (n), the population size (N), the prior proportions (m) and the prior correlation coefficients (rho).

Vague Prior Distribution

Letting \(\rho_{ii} \to 1\), that is, assuming prior ignorance, the resulting point estimate will be the same as the one seen in the design-based context for categorical data. 

This can be achieved with the BLE_Categorical() function by omitting the prior proportions, the parameter rho, or both.
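A minimal sketch of such a call, reusing the data from the first example below. It assumes that passing NULL is how an argument is marked as omitted (as the rho = NULL call later in this page suggests); the object name Estimator_vague is only illustrative.

```r
library(BayesSampling)

# Sample proportions, sample size and population size (same data as the first example)
ys <- c(0.2614, 0.7386)
n <- 153
N <- 15288

# Assumption: NULL marks an argument as not supplied
Estimator_vague <- BLE_Categorical(ys, n, N, m = NULL, rho = NULL)

Estimator_vague$est.prop
```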

R and Vs Matrices

If the calculation of matrices R and Vs results in non-positive definite matrices, a warning will be displayed. In general this does not affect the proportion estimates, but it can make the associated variance estimates incorrect or inconsistent. If the warning appears, review the prior correlation coefficients (parameter rho).
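When reviewing a prior specification, a generic positive-definiteness check can help. The sketch below uses base R only; is_positive_definite() is a hypothetical helper, not part of BayesSampling, and a positive definite rho does not by itself guarantee that R and Vs are positive definite.

```r
# Hypothetical helper: TRUE when a matrix is symmetric with all eigenvalues above tol
is_positive_definite <- function(mat, tol = 1e-8) {
  if (!isSymmetric(mat)) {
    return(FALSE)
  }
  eigenvalues <- eigen(mat, symmetric = TRUE, only.values = TRUE)$values
  all(eigenvalues > tol)
}

# Prior correlation matrix used in the three-category example
rho <- matrix(c(0.4, 0.1, 0.1,
                0.1, 0.2, 0.1,
                0.1, 0.1, 0.6), 3, 3)
rho_ok <- is_positive_definite(rho)  # TRUE

# Illustrative symmetric matrix with a negative eigenvalue (eigenvalues 3 and -1)
bad <- matrix(c(1, 2, 2, 1), 2, 2)
bad_ok <- is_positive_definite(bad)  # FALSE
```

The same helper can be applied to any symmetric matrix one wants to inspect.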

Examples

  1. Example presented in the article cited above (2 categories)
ys <- c(0.2614, 0.7386)
n <- 153
N <- 15288
m <- c(0.7, 0.3)
rho <- matrix(0.1, 1)
Estimator <- BLE_Categorical(ys,n,N,m,rho)

Estimator$est.prop
#> [1] 0.2855228 0.7144772
Estimator$Vest.prop
#>              [,1]         [,2]
#> [1,]  0.001155671 -0.001155671
#> [2,] -0.001155671  0.001155671

Below we can see that the greater the correlation coefficient, the closer the estimate gets to the sample proportions.

ys <- c(0.2614, 0.7386)
n <- 153
N <- 15288
m <- c(0.7, 0.3)
rho <- matrix(0.5, 1)
Estimator <- BLE_Categorical(ys,n,N,m,rho)

Estimator$est.prop
#> [1] 0.2642195 0.7357805
Estimator$Vest.prop
#>               [,1]          [,2]
#> [1,]  0.0006750388 -0.0006750388
#> [2,] -0.0006750388  0.0006750388
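
The effect of rho can be checked directly from the two outputs above. This is a small base R sketch using the printed values; the variable names are illustrative.

```r
# Values copied from the two outputs above
ys <- c(0.2614, 0.7386)
est_rho_01 <- c(0.2855228, 0.7144772)  # estimate with rho = 0.1
est_rho_05 <- c(0.2642195, 0.7357805)  # estimate with rho = 0.5

# Largest absolute gap between the estimate and the sample proportions
gap_rho_01 <- max(abs(est_rho_01 - ys))
gap_rho_05 <- max(abs(est_rho_05 - ys))

# TRUE: the estimate with rho = 0.5 is closer to the sample proportions
gap_rho_05 < gap_rho_01
```

With rho = 0.1 the estimate is pulled further towards the prior proportions m = c(0.7, 0.3).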

  2. Example from the help page (3 categories)
ys <- c(0.2, 0.5, 0.3)
n <- 100
N <- 10000
m <- c(0.4, 0.1, 0.5)
mat <- c(0.4, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.6)
rho <- matrix(mat, 3, 3)

Estimator <- BLE_Categorical(ys,n,N,m,rho)

Estimator$est.prop
#> [1] 0.2221967 0.4785131 0.2992902
Estimator$Vest.prop
#>               [,1]          [,2]          [,3]
#> [1,]  0.0013711226 -0.0004980297 -0.0008730929
#> [2,] -0.0004980297  0.0006722052 -0.0001741755
#> [3,] -0.0008730929 -0.0001741755  0.0010472684
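
Reading Vest.prop as the variance matrix associated with the estimated proportions, its diagonal gives a standard error for each category. The base R sketch below works from the printed values; Vest, std_err and row_totals are illustrative names.

```r
# Matrix copied from the Vest.prop output above
Vest <- matrix(c( 0.0013711226, -0.0004980297, -0.0008730929,
                 -0.0004980297,  0.0006722052, -0.0001741755,
                 -0.0008730929, -0.0001741755,  0.0010472684),
               nrow = 3, byrow = TRUE)

# Standard error of each estimated proportion
std_err <- sqrt(diag(Vest))

# The estimated proportions sum to one, so each row of Vest sums to (numerically) zero
row_totals <- rowSums(Vest)
```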

Same example, but with no prior correlation coefficients supplied (non-informative prior)

ys <- c(0.2, 0.5, 0.3)
n <- 100
N <- 10000
m <- c(0.4, 0.1, 0.5)

Estimator <- BLE_Categorical(ys,n,N,m,rho=NULL)
#> parameter 'rho' not informed, non informative prior correlation coefficients used in estimations
#> Warning in BLE_Categorical(ys, n, N, m, rho = NULL): 'Vest.prop' should have
#> only positive diagonal values. Review prior specification and verify calculated
#> matrices 'R' and 'Vs'.

Estimator$est.prop
#> [1] 0.2017585 0.4996729 0.2985685