The `independenceWeights`

package constructs weights
designed to minimize the weighted statistical dependence between a
continuous exposure variable and a vector of confounder variables and
implements the methods of Huling, Greifer, and Chen (2021) for doing so.
In estimating a causal dose-response function, confounding bias is a
function of the dependence between confounders and the
exposure/treatment, so weights that minimize the dependence aim to
mitigate confounding bias directly.

Install the development version from GitHub with:

```
# install.packages("devtools")
::install_github("jaredhuling/independenceWeights") devtools
```

This is a basic example which shows how to estimate and utilize the distance covariance optimal weights (DCOWs) of Huling, Greifer, and Chen (2021):

`library(independenceWeights)`

Simulate data with a continuous treatment that has a confounded relationship with a response. Data are simulated according to the simulation setup of Vegetabile et al. (2021).

```
<- simulate_confounded_data(seed = 999, nobs = 500)
simdat <- simdat$data$Y ## response
y <- simdat$data$A ## treatment
A <- as.matrix(simdat$data[c("Z1", "Z2", "Z3", "Z4", "Z5")]) ## confounders X
```

Now estimate weights to adjust for confounders using the distance
covariance optimal weights (DCOWs), which aim to mitigate the dependence
between *A* and *X*:

```
<- independence_weights(A, X)
dcows
dcows#> Unweighted distance covariance: 0.3963
#> Optimized weighted dependence distance: 0.0246
#> Effective sample size: 264.0099
#>
#> Weight ranges:
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.0000 0.2767 0.8215 1.0000 1.4120 5.7360
```

Alternatively, information about any set of weights can be printed
via `weighted_energy_stats()`

. The unweighted distance
covariance is a measure of the unadjusted dependence between confounders
and treatment. The weighted dependence distance is a measure of this
dependence *after* weighting. The weighted energy distance
between A and weighted A measures how well the marginal distribution of
A is preserved after weighting. Similarly for X. The effective sample
size estimates the effective sample size after weighting.

```
weighted_energy_stats(A, X, dcows$weights)
#> Unweighted distance covariance: 0.3963
#> Weighted dependence distance: 0.0246
#> Weighted energy distance(A, weighted A): 0.0014
#> Weighted energy distance(X, weighted X): 0.0025
#> Effective sample size: 264.0099
```

We can assess traditional measures of balance such as marginal
weighted correlations with `cobalt`

:

`library(cobalt)`

Even though the DCOWs aim to mitigate joint dependence between X and
A, they result in very small marginal weighted correlations between them
as well. (changing `poly`

to 10 below would reveal that even
up to 10th degree polynomials of covariates as well-decorrelated):

```
bal.tab(A ~ Z1 + Z2 + Z3 + Z4 + Z5, data = data.frame(A = A, X),
weights = list(DCOW = dcows$weights),
un = TRUE, int = TRUE, poly = 1, stats = "cor")
#> Balance Measures
#> Type Corr.Un Corr.Adj
#> Z1 Contin. -0.1351 -0.0021
#> Z2 Contin. 0.4445 0.0161
#> Z3 Contin. -0.0397 -0.0284
#> Z4 Contin. 0.0267 -0.0057
#> Z5 Binary 0.0691 0.0009
#> Z1 * Z2 Contin. -0.0944 0.0016
#> Z1 * Z3 Contin. -0.1539 -0.0080
#> Z1 * Z4 Contin. -0.0168 -0.0007
#> Z1 * Z5_0 Contin. -0.1445 -0.0101
#> Z1 * Z5_1 Contin. 0.0438 0.0101
#> Z2 * Z3 Contin. 0.2647 -0.0129
#> Z2 * Z4 Contin. 0.0488 -0.0027
#> Z2 * Z5_0 Contin. -0.0266 -0.0005
#> Z2 * Z5_1 Contin. 0.0884 0.0027
#> Z3 * Z4 Contin. 0.0260 -0.0077
#> Z3 * Z5_0 Contin. -0.0730 -0.0064
#> Z3 * Z5_1 Contin. 0.0669 0.0019
#> Z4 * Z5_0 Contin. -0.0016 -0.0038
#> Z4 * Z5_1 Contin. 0.0486 -0.0037
#>
#> Effective sample sizes
#> Total
#> Unadjusted 500.
#> Adjusted 264.01
```

We can also minimize the weighted dependence subject to the constraint that marginal moments of X and A are decorrelated:

```
<- independence_weights(A, X,
dcows_decor decorrelate_moments = TRUE)
dcows_decor#> Unweighted distance covariance: 0.3963
#> Optimized weighted dependence distance: 0.0251
#> Effective sample size: 260.8907
#>
#> Weight ranges:
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.0000 0.2491 0.8344 1.0000 1.4330 5.7290
```

```
bal.tab(A ~ Z1 + Z2 + Z3 + Z4 + Z5, data = data.frame(A = A, X),
weights = list(DCOW = dcows$weights,
DCOW_decor = dcows_decor$weights),
un = TRUE, int = TRUE, stats = "cor")
#> Balance Measures
#> Type Corr.Un Corr.DCOW Corr.DCOW_decor
#> Z1 Contin. -0.1351 -0.0021 0.0000
#> Z2 Contin. 0.4445 0.0161 -0.0004
#> Z3 Contin. -0.0397 -0.0284 -0.0002
#> Z4 Contin. 0.0267 -0.0057 -0.0002
#> Z5 Binary 0.0691 0.0009 -0.0000
#> Z1 * Z2 Contin. -0.0944 0.0016 0.0023
#> Z1 * Z3 Contin. -0.1539 -0.0080 -0.0021
#> Z1 * Z4 Contin. -0.0168 -0.0007 0.0058
#> Z1 * Z5_0 Contin. -0.1445 -0.0101 -0.0087
#> Z1 * Z5_1 Contin. 0.0438 0.0101 0.0104
#> Z2 * Z3 Contin. 0.2647 -0.0129 -0.0024
#> Z2 * Z4 Contin. 0.0488 -0.0027 0.0019
#> Z2 * Z5_0 Contin. -0.0266 -0.0005 -0.0012
#> Z2 * Z5_1 Contin. 0.0884 0.0027 0.0012
#> Z3 * Z4 Contin. 0.0260 -0.0077 -0.0011
#> Z3 * Z5_0 Contin. -0.0730 -0.0064 -0.0020
#> Z3 * Z5_1 Contin. 0.0669 0.0019 0.0019
#> Z4 * Z5_0 Contin. -0.0016 -0.0038 0.0018
#> Z4 * Z5_1 Contin. 0.0486 -0.0037 -0.0032
#>
#> Effective sample sizes
#> Total
#> All 500.
#> DCOW 264.01
#> DCOW_decor 260.89
```

Now use the weights to estimate the causal average dose response function (ADRF)

```
## create grid
<- seq(min(simdat$data$A), 50, length.out=500)
trt_vec
## estimate ADRF
<- weighted_kernel_est(A, y, dcows$weights, trt_vec)$est
adrf_hat
## estimate naively without weights
<- weighted_kernel_est(A, y, rep(1, length(y)), trt_vec)$est
adrf_hat_unwtd
<- c(-4.75, 4.75)
ylims plot(x = simdat$data$A, y = simdat$data$Y, ylim = ylims,
xlim = c(0,50),
xlab = "A", ylab = "Y")
## true ADRF
lines(x = trt_vec, y = simdat$true_adrf(trt_vec), col = "blue", lwd=2)
## estimated ADRF
lines(x = trt_vec, y = adrf_hat, col = "red", lwd=2)
## naive estimate
lines(x = trt_vec, y = adrf_hat_unwtd, col = "green", lwd=2)
legend("bottomleft", c("True ADRF", "Unweighted Est.", "DCOW-weighted Est."),
col = c("blue", "green", "red"), lty = 1, lwd = 2)
```

Huling, Jared D, Noah Greifer, and Guanhua Chen. 2021. “Independence
Weights for Causal Inference with Continuous Exposures.” *arXiv
Preprint arXiv:2107.07086*. https://arxiv.org/abs/2107.07086.

Vegetabile, Brian G, Beth Ann Griffin, Donna L Coffman, Matthew Cefalu,
Michael W Robbins, and Daniel F McCaffrey. 2021. “Nonparametric
Estimation of Population Average Dose-Response Curves Using Entropy
Balancing Weights for Continuous Exposures.” *Health Services and
Outcomes Research Methodology* 21 (1): 69–110.