Getting Started

if(!requireNamespace("fabricatr", quietly = TRUE)) {
  install.packages("fabricatr")
}

library(CausalQueries)
library(fabricatr)
library(knitr)

Make a model

Generating: To make a model, provide a DAG statement to make_model. For instance:

# examples of models
xy_model <- make_model("X -> Y")
iv_model <- make_model("Z -> X -> Y <-> X")

Graphing: Once you have made a model you can inspect the DAG:

plot(iv_model)

Inspecting: The model has a set of parameters and a default distribution over these.

xy_model |> grab("parameters_df") |> kable()
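|param_names |node | gen|param_set |nodal_type |given | param_value| priors|
|:-----------|:----|---:|:---------|:----------|:-----|-----------:|------:|
|X.0         |X    |   1|X         |0          |      |        0.50|      1|
|X.1         |X    |   1|X         |1          |      |        0.50|      1|
|Y.00        |Y    |   2|Y         |00         |      |        0.25|      1|
|Y.10        |Y    |   2|Y         |10         |      |        0.25|      1|
|Y.01        |Y    |   2|Y         |01         |      |        0.25|      1|
|Y.11        |Y    |   2|Y         |11         |      |        0.25|      1|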
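
To see how these parameters translate into a distribution over observable data, here is a minimal base R sketch. It is not the package's internal code; it assumes that the label Y.ab means Y = a when X = 0 and Y = b when X = 1, and the helper event_prob is a hypothetical name introduced only for illustration.

```r
# Base R sketch (not CausalQueries internals): map parameters to data probabilities.
# Assumption: Y.ab means Y = a when X = 0 and Y = b when X = 1.
lambda_x <- c(X.0 = 0.5, X.1 = 0.5)
lambda_y <- c(Y.00 = 0.25, Y.10 = 0.25, Y.01 = 0.25, Y.11 = 0.25)

# Value of Y produced by each nodal type under X = 0 and under X = 1
y_if_x0 <- c(Y.00 = 0, Y.10 = 1, Y.01 = 0, Y.11 = 1)
y_if_x1 <- c(Y.00 = 0, Y.10 = 0, Y.01 = 1, Y.11 = 1)

# Hypothetical helper: probability of observing X = x together with Y = y
event_prob <- function(x, y) {
  y_given_x <- if (x == 0) y_if_x0 else y_if_x1
  unname(lambda_x[paste0("X.", x)] * sum(lambda_y[y_given_x == y]))
}

# 2 x 2 table of event probabilities (rows: X, columns: Y)
event_probs <- outer(0:1, 0:1, Vectorize(event_prob))
dimnames(event_probs) <- list(X = c("0", "1"), Y = c("0", "1"))
event_probs
```

With the default parameter values in the table, every combination of X and Y is equally likely.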

Tailoring: These features can be edited using set_restrictions, set_priors and set_parameters. Here is an example of setting a monotonicity restriction (see ?set_restrictions for more):

iv_model <- 
  iv_model |> set_restrictions(decreasing('Z', 'X'))
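
The idea behind this restriction can be sketched in base R. The snippet below is only an illustration of the logic, not the package's implementation; it assumes that the label X.ab means X = a when Z = 0 and X = b when Z = 1, and the object names are hypothetical.

```r
# Base R sketch of the idea, not the package's implementation.
# Assumption: X.ab means X = a when Z = 0 and X = b when Z = 1.
x_types <- data.frame(
  nodal_type = c("X.00", "X.10", "X.01", "X.11"),
  x_if_z0 = c(0, 1, 0, 1),
  x_if_z1 = c(0, 0, 1, 1),
  stringsAsFactors = FALSE
)

# A type is "decreasing" in Z if moving Z from 0 to 1 lowers X
is_decreasing <- x_types$x_if_z1 < x_types$x_if_z0

# Restricting on decreasing types removes them from the set of allowed types
kept_types <- x_types$nodal_type[!is_decreasing]
kept_types
```

Under this reading, only the X.10 type is dropped, which corresponds to the usual "no defiers" assumption in instrumental variable settings.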

Here is an example of setting priors (see ?set_priors for more):

iv_model <- 
  iv_model |> set_priors(distribution = "jeffreys")
#> No specific parameters to alter values for specified. Altering all parameters.

Simulation: Data can be drawn from a model like this:

data <- make_data(iv_model, n = 4) 

data |> kable()
|  Z|  X|  Y|
|--:|--:|--:|
|  0|  0|  0|
|  0|  0|  1|
|  1|  1|  0|
|  1|  1|  0|
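
As a rough picture of what drawing data from a model involves, here is a base R sketch for the simpler X -> Y model with its default parameter values. It is not make_data itself; it assumes that Y.ab means Y = a when X = 0 and Y = b when X = 1, and all object names are illustrative.

```r
# Base R sketch (not make_data itself): simulate data from X -> Y
# by sampling one nodal type per unit for each node.
set.seed(1)
n <- 1000
lambda_x <- c(X.0 = 0.5, X.1 = 0.5)
lambda_y <- c(Y.00 = 0.25, Y.10 = 0.25, Y.01 = 0.25, Y.11 = 0.25)

# Draw nodal types with probabilities given by the parameters
x_type <- sample(names(lambda_x), n, replace = TRUE, prob = lambda_x)
y_type <- sample(names(lambda_y), n, replace = TRUE, prob = lambda_y)

# X is read directly from its type; Y.ab gives Y = a if X = 0 and b if X = 1
X <- as.integer(sub("X.", "", x_type, fixed = TRUE))
y_if_x0 <- as.integer(substr(y_type, 3, 3))
y_if_x1 <- as.integer(substr(y_type, 4, 4))
Y <- ifelse(X == 0, y_if_x0, y_if_x1)

sim_data <- data.frame(X = X, Y = Y)
head(sim_data, 4)
```

The observed data reveal only X and Y; the nodal types that generated them stay hidden, which is why the model has to be updated rather than read off the data.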

Model updating

Updating: Fit the model to data with update_model. You can pass any rstan argument through update_model; below, refresh = 0 suppresses Stan's progress output.

df <- fabricatr::fabricate(N = 100, X = rbinom(N, 1, .5), Y = rbinom(N, 1, .25 + X*.5))

xy_model <- 
  xy_model |> 
  update_model(df, refresh = 0)
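
To give some intuition for what updating does, the sketch below approximates the posterior for the Y parameters by importance sampling from a flat prior in base R. This is a swapped-in technique chosen for transparency; update_model itself relies on Stan. The sketch assumes flat Dirichlet(1, 1, 1, 1) priors and that Y.ab means Y = a when X = 0 and Y = b when X = 1; rdirichlet_sketch is a hypothetical helper.

```r
# Base R sketch: importance sampling from the prior (NOT what update_model does).
set.seed(2)
N <- 100
X <- rbinom(N, 1, 0.5)
Y <- rbinom(N, 1, 0.25 + 0.5 * X)

# Hypothetical helper: each row is one draw from a Dirichlet(alpha) distribution
rdirichlet_sketch <- function(n, alpha) {
  g <- matrix(rgamma(n * length(alpha), shape = alpha), nrow = n, byrow = TRUE)
  g / rowSums(g)
}

# Prior draws for Y's nodal types. The X parameters enter the likelihood
# as a separate factor, so they are omitted here.
n_draws <- 20000
lam_y <- rdirichlet_sketch(n_draws, c(1, 1, 1, 1))
colnames(lam_y) <- c("Y.00", "Y.10", "Y.01", "Y.11")

# Probability that Y = 1 given X, implied by each draw
p_y1_x0 <- lam_y[, "Y.10"] + lam_y[, "Y.11"]
p_y1_x1 <- lam_y[, "Y.01"] + lam_y[, "Y.11"]

# Log-likelihood of the observed Y values given X under each draw
n_x1 <- sum(X)
n_x0 <- N - n_x1
n_y1_x0 <- sum(Y[X == 0])
n_y1_x1 <- sum(Y[X == 1])
log_lik <- n_y1_x0 * log(p_y1_x0) + (n_x0 - n_y1_x0) * log(1 - p_y1_x0) +
  n_y1_x1 * log(p_y1_x1) + (n_x1 - n_y1_x1) * log(1 - p_y1_x1)

# Normalized importance weights
w <- exp(log_lik - max(log_lik))
w <- w / sum(w)

# Approximate posterior mean share of units with a positive effect (type Y.01)
post_mean_y01 <- sum(w * lam_y[, "Y.01"])
post_mean_y01
```

Importance sampling from the prior becomes inefficient as data accumulate, which is one reason to prefer a dedicated sampler for real applications.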

Inspecting: You can access the posterior distribution over the model parameters directly:

xy_model |> grab("posterior_distribution") |> 
  head() |> kable()
|       X.0|       X.1|      Y.00|      Y.10|      Y.01|      Y.11|
|---------:|---------:|---------:|---------:|---------:|---------:|
| 0.4466199| 0.5533801| 0.1636397| 0.1492105| 0.6489364| 0.0382134|
| 0.4789333| 0.5210667| 0.1054836| 0.1500004| 0.6119007| 0.1326153|
| 0.4718292| 0.5281708| 0.3269378| 0.0828981| 0.5177509| 0.0724132|
| 0.4542422| 0.5457578| 0.3249261| 0.0082074| 0.5356523| 0.1312142|
| 0.4871812| 0.5128188| 0.2682652| 0.0363204| 0.5857549| 0.1096594|
| 0.4524378| 0.5475622| 0.1585780| 0.0936094| 0.7336056| 0.0142069|

where each row is a draw of parameters.
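
Because each row is a complete parameter vector, a quantity of interest can be computed draw by draw and then summarized. The base R sketch below re-enters the Y columns of the six draws shown above by hand; these are only the first rows of the posterior, so the result is illustrative and will not match a summary of the full posterior. It assumes that Y.01 is the type with a positive effect of X on Y.

```r
# The Y columns of the six posterior draws shown above, entered by hand
draws <- data.frame(
  Y.00 = c(0.1636397, 0.1054836, 0.3269378, 0.3249261, 0.2682652, 0.1585780),
  Y.10 = c(0.1492105, 0.1500004, 0.0828981, 0.0082074, 0.0363204, 0.0936094),
  Y.01 = c(0.6489364, 0.6119007, 0.5177509, 0.5356523, 0.5857549, 0.7336056),
  Y.11 = c(0.0382134, 0.1326153, 0.0724132, 0.1312142, 0.1096594, 0.0142069)
)

# Within each draw, the shares of Y's nodal types sum to 1 (up to rounding)
row_totals <- rowSums(draws)

# Share of units with a positive effect, averaged over these six draws only
mean_positive <- mean(draws$Y.01)
mean_positive
```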

Query model

Querying: You can ask arbitrary causal queries of the model.

An example of an unconditional query:

xy_model |> 
  query_model("Y[X=1] > Y[X=0]", using = c("priors", "posteriors")) |>
  kable()
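|query           |given |using      |case_level |     mean|        sd|  cred.low| cred.high|
|:---------------|:-----|:----------|:----------|--------:|---------:|---------:|---------:|
|Y[X=1] > Y[X=0] |-     |priors     |FALSE      | 0.248373| 0.1904903| 0.0076384| 0.6923843|
|Y[X=1] > Y[X=0] |-     |posteriors |FALSE      | 0.587883| 0.0888942| 0.4039806| 0.7455046|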
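
The prior row can be checked by hand. Assuming the flat Dirichlet(1, 1, 1, 1) prior over Y's four nodal types implied by the priors column of the parameter table, the share of positive-effect types (Y.01) has a Beta(1, 3) marginal distribution. The base R sketch below computes its mean, standard deviation and central 95% interval analytically; the simulated values reported by query_model should be close to these but will differ slightly from run to run.

```r
# Assumption: flat Dirichlet(1, 1, 1, 1) prior over Y's four nodal types,
# so the share of Y.01 types has a Beta(1, 3) marginal distribution.
a <- 1
b <- 3

prior_mean <- a / (a + b)
prior_sd   <- sqrt(a * b / ((a + b)^2 * (a + b + 1)))
prior_ci   <- qbeta(c(0.025, 0.975), a, b)

round(c(mean = prior_mean, sd = prior_sd,
        cred.low = prior_ci[1], cred.high = prior_ci[2]), 3)
```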

An example of a conditional query:

xy_model |> 
  query_model("Y[X=1] > Y[X=0]", using = c("priors", "posteriors"),
              given = "X==1 & Y == 1") |>
  kable()
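|query           |given         |using      |case_level |      mean|        sd|  cred.low| cred.high|
|:---------------|:-------------|:----------|:----------|---------:|---------:|---------:|---------:|
|Y[X=1] > Y[X=0] |X==1 & Y == 1 |priors     |FALSE      | 0.4963863| 0.2859801| 0.0251328| 0.9727251|
|Y[X=1] > Y[X=0] |X==1 & Y == 1 |posteriors |FALSE      | 0.8572202| 0.0989188| 0.6380308| 0.9948683|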
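
A conditional query restricts attention to the units consistent with the condition. In the X -> Y model, a unit with X = 1 and Y = 1 must be of type Y.01 or Y.11, so within each parameter draw the conditional probability of a positive effect is the share of Y.01 among those two types. The base R sketch below illustrates this for the prior only, assuming a flat Dirichlet(1, 1, 1, 1) prior and the labeling Y.ab (Y = a when X = 0, Y = b when X = 1); it is not the package's own computation.

```r
# Base R sketch: conditional query under a flat prior (illustrative only)
set.seed(3)
n_draws <- 10000

# Dirichlet(1, 1, 1, 1) draws built from normalized gamma variates
g <- matrix(rgamma(4 * n_draws, shape = 1), ncol = 4,
            dimnames = list(NULL, c("Y.00", "Y.10", "Y.01", "Y.11")))
lam_y <- g / rowSums(g)

# Units with X = 1 and Y = 1 are of type Y.01 or Y.11;
# only Y.01 has a positive effect
cond <- lam_y[, "Y.01"] / (lam_y[, "Y.01"] + lam_y[, "Y.11"])

c(mean = mean(cond), sd = sd(cond))
```

Under the flat prior this ratio is uniformly distributed on the unit interval, which is consistent with a prior mean near 0.5.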

Queries can even be conditional on counterfactual quantities. Here is the probability of a positive effect given that there is some effect:

xy_model |> 
  query_model("Y[X=1] > Y[X=0]", using = c("priors", "posteriors"),
              given = "Y[X=1] != Y[X=0]") |>
  kable()
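|query           |given            |using      |case_level |      mean|        sd|  cred.low| cred.high|
|:---------------|:----------------|:----------|:----------|---------:|---------:|---------:|---------:|
|Y[X=1] > Y[X=0] |Y[X=1] != Y[X=0] |priors     |FALSE      | 0.5037974| 0.2908795| 0.0276229| 0.9754840|
|Y[X=1] > Y[X=0] |Y[X=1] != Y[X=0] |posteriors |FALSE      | 0.8650345| 0.0733055| 0.7222992| 0.9929217|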