Spatial dynamic panel data modeling

library(SDPDmod)

This vignette gives a few examples on how to use the blmpSDPD and SDPDmod functions from the SDPDmod package.

Introduction

The general spatial static panel model takes the form:

\[\begin{equation} \begin{aligned} y_{t}&=\rho W y_{t} + X_{t} \beta + W X_{t} \theta + u_{t}, \\ u_{t} &=\lambda W u_{t}+\epsilon_{t} \end{aligned} \label{eq:mod1}\tag{1} \end{equation}\]

where the \(N \times 1\) vector \(y_{t}\) is the dependent variable, \(X_{t}\) is a \(N \times k\) matrix of \(k\) explanatory variables and \(W\) is a spatial weights matrix. \(N\) represents the number of units and \(t=1,...,T\) are the time points. The spatial lags for the vector of covariates is denoted with \(WX_t\). Spatial interaction in the error term \(u_{it}\) is included with the \(\lambda\) coefficient and it is assumed that \(\epsilon_t\) is independently and identically distributed error term for all \(t\) with zero mean and variance \(\sigma^2\). \(\rho\) is the spatial autoregressive coefficient, \(\lambda\) the spatial autocorrelation coefficient, \(\beta\) is a vector of response parameters for the covariates.

Figure 1. The different spatial panel models and their dependencies.
Figure 1. The different spatial panel models and their dependencies.

Note: SAC - spatial autoregressive combined, SDM - spatial Durbin model, SDEM - spatial Durbin error model, SAR - spatial autogregressive model (or Spatial lag model), SEM - spatial error model, SLX - spatially lagged X model.

However, model (1) suffers from identification problems (Elhorst 2010). Figure 1 shows the identifiable models, which are derived by imposing some restrictions on model (1). If all spatial coefficients are zero (\(\rho = \theta = \lambda = 0\)), then the corresponding model will be the ordinary least-squares model (OLS). If in each of the models in figure (1) \(\eta W y_{t-1} +\tau y_{t-1}\) is added, we get the corresponding dynamic panel data models. \(y_{(t-1)}\) denotes the time lag of the dependent variable and \(Wy_{(t-1)}\) is the time-space lag. \(\tau\) and \(\eta\) are the response parameters for the lagged variable.

LeSage and Parent (2007) and LeSage (2014) suggest a Bayesian approach for selecting an appropriate model. The calculation of the posterior probabilities for 6 models (SAR, SEM, SDM, SDEM, SLX, OLS) is possible with the function blmpSDPD.

The function SDPMm provides estimation of a SAR and SDM model with the Lee-Yu transformation approach (Yu, De Jong, and Lee (2008), Lee and Yu (2010b), Lee and Yu (2010a)).

Data

The Cigar data set (Baltagi and Levin (1992)) from the plm package (Croissant and Millo (2008)) will be used for describing the use of the blmpSDPD and SDPDm functions. It contains panel data of cigarette consumption in 46 states in the USA over the period from 1963 to 1992. The binary contiguity matrix of the 46 states is included in the SDPDmod package.

data("Cigar",package = "plm")
head(Cigar)
#>   state year price  pop  pop16  cpi      ndi sales pimin
#> 1     1   63  28.6 3383 2236.5 30.6 1558.305  93.9  26.1
#> 2     1   64  29.8 3431 2276.7 31.0 1684.073  95.4  27.5
#> 3     1   65  29.8 3486 2327.5 31.5 1809.842  98.5  28.9
#> 4     1   66  31.5 3524 2369.7 32.4 1915.160  96.4  29.5
#> 5     1   67  31.6 3533 2393.7 33.4 2023.546  95.5  29.6
#> 6     1   68  35.6 3522 2405.2 34.8 2202.486  88.4  32.0
data1<- Cigar
data1$logc<-log(data1$sales)
data1$logp<-log(data1$price/data1$cpi)
data1$logy<-log(data1$ndi/data1$cpi)
data1$lpm<-log(data1$pimin/data1$cpi)

data("usa46",package="SDPDmod") ## binary contiguity matrix of 46 USA states
str(usa46)
#>  num [1:46, 1:46] 0 0 0 0 0 0 0 1 1 0 ...
#>  - attr(*, "dimnames")=List of 2
#>   ..$ : chr [1:46] "Alabama" "Arizona" "Arkansas" "California" ...
#>   ..$ : chr [1:46] "Alabama" "Arizona" "Arkansas" "California" ...
W <- rownor(usa46) ## row-normalization
isrownor(W) ## check if W is row-normalized
#> [1] TRUE

Bayesian posterior probabilities

Static panel model with spatial fixed effects

If the option dynamic is not set, then the default model is static. For spatial fixed effects effect should be set to “individual”.

res1<-blmpSDPD(formula = logc ~ logp+logy, data = data1, W = W,
               index = c("state","year"),
               model = list("ols","sar","sdm","sem","sdem","slx"), 
               effect = "individual")
res1
#>                    ols      sar       sdm     sem      sdem      slx
#> Log marginals 884.7551 938.6934 1046.4868 993.192 1039.6707 930.0585
#> Model probs     0.0000   0.0000    0.9989   0.000    0.0011   0.0000

Static panel model with time fixed effects

res2<-blmpSDPD(formula = logc ~ logp+logy, data = data1, W = W,
               index = c("state","year"),
               model = list("ols","sar","sdm","sem","sdem","slx"), 
               effect = "time")

Static panel model with both spatial and time fixed effects for 4 models (SAR, SDM, SEM, SDEM)

The default prior is uniform. With prior="beta" the beta prior is used.

res3<-blmpSDPD(formula = logc ~ logp+logy, data = data1, W = W,
               index = c("state","year"),
               model = list("sar","sdm","sem","sdem"), 
               effect = "twoways",
               prior = "beta")

Dynamic panel model with both spatial and time fixed effects with uniform prior

res4<-blmpSDPD(formula = logc ~ logp+logy, data = data1, W = W,
               index = c("state","year"),
               model = list("sar","sdm","sem","sdem","slx"), 
               effect = "twoways",
               ldet = "mc", ## log-determinant calculated with mcmc procedure
               dynamic = TRUE,
               prior = "uniform")

Dynamic panel model with both spatial fixed effects with beta prior, where the lagged dependent variable is included in the data

d2 <- plm::pdata.frame(data1, index=c('state', 'year'))
d2$llogc<-plm::lag(d2$logc) ## add lagged variable
data2<-d2[which(!is.na(d2$llogc)),]
rownames(data2)<-1:nrow(data2)
kk<-which(colnames(data2)=="llogc")
kk
#> [1] 14

res5<-blmpSDPD(formula = logc ~ logp+logy, data = data2, W = W,
               index = c("state","year"),
               model = list("sar","sdm","sem","sdem"), 
               effect = "individual",
               ldet = "full",
               dynamic = TRUE,
               tlaginfo = list(ind=kk),
               prior = "beta")

Spatial (dynamic) panel data modeling

Static spatial lag model with spatial fixed effects

mod1<-SDPDm(formula = logc ~ logp+logy, data = data1, W = W,
            index = c("state","year"),
            model = "sar", 
            effect = "individual")
summary(mod1)
#> sar panel model with individual fixed effects
#> 
#> Call:
#> SDPDm(formula = logc ~ logp + logy, data = data1, W = W, index = c("state", 
#>     "year"), model = "sar", effect = "individual")
#> 
#> Spatial autoregressive coefficient:
#>     Estimate Std. Error t-value  Pr(>|t|)    
#> rho 0.297576   0.028444  10.462 < 2.2e-16 ***
#> 
#> Coefficients:
#>        Estimate Std. Error  t-value Pr(>|t|)    
#> logp -0.5320053  0.0254445 -20.9085   <2e-16 ***
#> logy -0.0007088  0.0152139  -0.0466   0.9628    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mod1$rsqr
#> [1] 0.8677243
mod1$sige
#>        sige 
#> 0.006667779

Dynamic spatial lag model with spatial fixed effects

dynamic should be set to TRUE as well as the option tl in tlaginfo for inclusion of time lag. For space-time lag effect, the option stl in tlaginfo should also be set to TRUE.

mod2<-SDPDm(formula = logc ~ logp+logy, data = data1, W = W,
            index = c("state","year"),
            model = "sar", 
            effect = "individual",
            dynamic = T,
            tlaginfo = list(ind = NULL, tl = T, stl = T))
summary(mod2)
#> sar dynamic panel model with individual fixed effects
#> 
#> Call:
#> SDPDm(formula = logc ~ logp + logy, data = data1, W = W, index = c("state", 
#>     "year"), model = "sar", effect = "individual", dynamic = T, 
#>     tlaginfo = list(ind = NULL, tl = T, stl = T))
#> 
#> Spatial autoregressive coefficient:
#>     Estimate Std. Error t-value  Pr(>|t|)    
#> rho 0.305592   0.031396  9.7334 < 2.2e-16 ***
#> 
#> Coefficients:
#>               Estimate Std. Error t-value Pr(>|t|)    
#> logc(t-1)    0.8697327  0.0130098 66.8520  < 2e-16 ***
#> W*logc(t-1) -0.2796636  0.0336333 -8.3151  < 2e-16 ***
#> logp        -0.1147081  0.0138649 -8.2733  < 2e-16 ***
#> logy        -0.0206479  0.0079911 -2.5839  0.00977 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Dynamic spatial model with spatial fixed effects (with Lee-Yu transformation)

mod3<-SDPDm(formula = logc ~ logp+logy, data = data1, W = W,
            index = c("state","year"),
            model = "sar", 
            effect = "individual",
            LYtrans = T,
            dynamic = T,
            tlaginfo = list(ind = NULL, tl = T, stl = T))
summary(mod3)
#> sar dynamic panel model with individual fixed effects
#> 
#> Call:
#> SDPDm(formula = logc ~ logp + logy, data = data1, W = W, index = c("state", 
#>     "year"), model = "sar", effect = "individual", dynamic = T, 
#>     tlaginfo = list(ind = NULL, tl = T, stl = T), LYtrans = T)
#> 
#> Spatial autoregressive coefficient:
#>     Estimate Std. Error t-value  Pr(>|t|)    
#> rho 0.310875   0.031508  9.8667 < 2.2e-16 ***
#> 
#> Coefficients:
#>               Estimate Std. Error t-value  Pr(>|t|)    
#> logc(t-1)    0.9287971  0.0132188 70.2634 < 2.2e-16 ***
#> W*logc(t-1) -0.3030634  0.0350687 -8.6420 < 2.2e-16 ***
#> logp        -0.0864323  0.0138446 -6.2430 4.292e-10 ***
#> logy        -0.0217271  0.0081269 -2.6735  0.007507 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mod4<-SDPDm(formula = logc ~ logp+logy, data = data2, W = W,
            index = c("state","year"),
            model = "sar",
            effect = "individual",
            LYtrans = T,
            dynamic = T,
            tlaginfo = list(ind = kk, tl = T, stl = F))
summary(mod4)
#> sar dynamic panel model with individual fixed effects
#> 
#> Call:
#> SDPDm(formula = logc ~ logp + logy, data = data2, W = W, index = c("state", 
#>     "year"), model = "sar", effect = "individual", dynamic = T, 
#>     tlaginfo = list(ind = kk, tl = T, stl = F), LYtrans = T)
#> 
#> Spatial autoregressive coefficient:
#>     Estimate Std. Error t-value Pr(>|t|)    
#> rho 0.095318   0.016432  5.8008  6.6e-09 ***
#> 
#> Coefficients:
#>            Estimate Std. Error t-value  Pr(>|t|)    
#> logc(t-1)  0.920917   0.013658 67.4253 < 2.2e-16 ***
#> logp      -0.051035   0.014055 -3.6311 0.0002822 ***
#> logy      -0.031962   0.008385 -3.8118 0.0001380 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Dynamic spatial Durbin model with spatial and time fixed effects (with Lee-Yu transformation)

mod5<-SDPDm(formula = logc ~ logp+logy, data = data1, W = W,
            index = c("state","year"),
            model = "sdm", 
            effect = "twoways",
            LYtrans = T,
            dynamic = T,
            tlaginfo = list(ind = NULL, tl = T, stl = T))
summary(mod5)
#> sdm dynamic panel model with twoways fixed effects
#> 
#> Call:
#> SDPDm(formula = logc ~ logp + logy, data = data1, W = W, index = c("state", 
#>     "year"), model = "sdm", effect = "twoways", dynamic = T, 
#>     tlaginfo = list(ind = NULL, tl = T, stl = T), LYtrans = T)
#> 
#> Spatial autoregressive coefficient:
#>     Estimate Std. Error t-value Pr(>|t|)    
#> rho 0.162189   0.036753  4.4129 1.02e-05 ***
#> 
#> Coefficients:
#>              Estimate Std. Error  t-value  Pr(>|t|)    
#> logc(t-1)    0.864412   0.012879  67.1163 < 2.2e-16 ***
#> W*logc(t-1) -0.096270   0.038810  -2.4805 0.0131186 *  
#> logp        -0.270872   0.023145 -11.7031 < 2.2e-16 ***
#> logy         0.104262   0.029783   3.5007 0.0004641 ***
#> W*logp       0.195595   0.043870   4.4585 8.254e-06 ***
#> W*logy      -0.032464   0.039520  -0.8215 0.4113891    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Calculating impacts.

imp  <- impactsSDPDm(mod5)
summary(imp)
#> 
#> Impact estimates for spatial dynamic model
#> ========================================================
#> Short-term
#> 
#> Direct:
#>       Estimate Std. Error t-value  Pr(>|t|)    
#> logp -0.261569   0.022830 -11.457 < 2.2e-16 ***
#> logy  0.101759   0.029667   3.430 0.0006035 ***
#> 
#> Indirect:
#>       Estimate Std. Error t-value  Pr(>|t|)    
#> logp  0.178932   0.046861  3.8183 0.0001344 ***
#> logy -0.015109   0.042210 -0.3579 0.7203812    
#> 
#> Total:
#>       Estimate Std. Error t-value Pr(>|t|)  
#> logp -0.082637   0.052143 -1.5848   0.1130  
#> logy  0.086650   0.037890  2.2868   0.0222 *
#> ========================================================
#> Long-term
#> 
#> Direct:
#>      Estimate Std. Error t-value  Pr(>|t|)    
#> logp -1.92836    0.20580 -9.3702 < 2.2e-16 ***
#> logy  0.80149    0.22655  3.5378 0.0004034 ***
#> 
#> Indirect:
#>      Estimate Std. Error t-value Pr(>|t|)
#> logp  0.91054    0.58271  1.5626   0.1181
#> logy  0.48361    1.54612  0.3128   0.7544
#> 
#> Total:
#>      Estimate Std. Error t-value Pr(>|t|)
#> logp -1.01783    0.66733 -1.5252   0.1272
#> logy  1.28510    1.59825  0.8041   0.4214
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

References

Baltagi, Badi H, and Dan Levin. 1992. “Cigarette Taxation: Raising Revenues and Reducing Consumption.” Structural Change and Economic Dynamics 3 (2): 321–35.
Croissant, Yves, and Giovanni Millo. 2008. “Panel Data Econometrics in r: The Plm Package.” Journal of Statistical Software 27 (2).
Elhorst, J Paul. 2010. “Applied Spatial Econometrics: Raising the Bar.” Spatial Economic Analysis 5 (1): 9–28.
Lee, Lung-fei, and Jihai Yu. 2010a. “A Spatial Dynamic Panel Data Model with Both Time and Individual Fixed Effects.” Econometric Theory 26 (2): 564–97. https://doi.org/10.1017/S0266466609100099.
———. 2010b. “Estimation of Spatial Autoregressive Panel Data Models with Fixed Effects.” Journal of Econometrics 154 (2): 165–85. https://doi.org/10.1016/j.jeconom.2009.08.001.
LeSage, James P. 2014. “Spatial Econometric Panel Data Model Specification: A Bayesian Approach.” Spatial Statistics 9: 122–45. https://doi.org/10.1016/j.spasta.2014.02.002.
LeSage, James P, and Olivier Parent. 2007. “Bayesian Model Averaging for Spatial Econometric Models.” Geographical Analysis 39 (3): 241–67.
Yu, Jihai, Robert De Jong, and Lung-fei Lee. 2008. “Quasi-Maximum Likelihood Estimators for Spatial Dynamic Panel Data with Fixed Effects When Both n and t Are Large.” Journal of Econometrics 146 (1): 118–34. https://doi.org/10.1016/j.jeconom.2008.08.002.