GPCM scope and current limitations

mfrmr includes a bounded implementation of the Generalized Partial Credit Model (GPCM; Muraki 1992). The estimator is fully functional, but several downstream reporting helpers remain restricted because score-side semantics under free discrimination differ from the Rasch-family case. This vignette documents which helpers are available, which are not, and what to use as a substitute when a helper is restricted.

Before fitting: model-choice triage

Do not choose GPCM only because it is the most flexible model in the menu. Start with the score interpretation.

| Model | Use when | Main risk if over-used |
| --- | --- | --- |
| RSM | The rating scale is intended to share one category-threshold structure across the step facet. | Real threshold differences can be hidden in residual diagnostics. |
| PCM | Thresholds may differ by item, criterion, task, or another designated step facet, but rating events should still contribute equally after conditioning on the modeled facets. | It can absorb threshold heterogeneity without asking whether some levels are more discriminating. |
| bounded GPCM | The analysis explicitly allows discrimination-based reweighting and treats slopes as part of the substantive sensitivity question. | Better statistical fit can be mistaken for a better operational scoring rule. |

This ordering matters for reporting. RSM and PCM are the package’s equal-weighting reference route; bounded GPCM is a slope-aware extension. If equal contribution of items, criteria, or raters is part of the validity argument, a better-fitting bounded GPCM should be reported as sensitivity evidence rather than as an automatic replacement.
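
For orientation, the contrast between the equal-weighting route and the slope-aware extension can be written out using the generic single-item form of Muraki's model. The symbols below (a_j for the slope, b_jv for the step parameters, m_j for the highest category) are notation assumed here for illustration; the many-facet parameterization and the slope bounds used inside mfrmr are not spelled out in this vignette and may differ.

```latex
P(X_j = k \mid \theta)
  = \frac{\exp\left(\sum_{v=1}^{k} a_j(\theta - b_{jv})\right)}
         {\sum_{c=0}^{m_j} \exp\left(\sum_{v=1}^{c} a_j(\theta - b_{jv})\right)},
  \qquad k = 0, 1, \dots, m_j,
```

where the empty sum (k = 0 or c = 0) is taken to be zero. Fixing a_j = 1 for every j gives the PCM, and additionally sharing one set of step parameters across j gives the RSM, which is why those two models weight every rating event equally while GPCM does not.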

Report wording templates

Use wording that matches the model actually fitted:

Avoid wording that says bounded GPCM “improves the score” solely because it improves log-likelihood, AIC, or BIC. The model can fit better while changing the scoring contract.
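
A minimal base-R sketch of what "changing the scoring contract" can mean. In a GPCM with known slopes, the statistic that carries the information about the person measure is the slope-weighted sum of ratings rather than the raw total. The slope values and both rating patterns below are hypothetical and chosen only for illustration; nothing here calls mfrmr.

```r
# Hypothetical slopes for four criteria (illustrative values only).
slopes <- c(0.8, 1.0, 1.15, 1.05)

# Two hypothetical rating patterns with the same raw total.
pattern_a <- c(3, 1, 1, 1)
pattern_b <- c(1, 1, 3, 1)

# Raw totals: what an equal-weighting (RSM/PCM) reading treats as equivalent.
raw_totals <- c(a = sum(pattern_a), b = sum(pattern_b))

# Slope-weighted totals: what a GPCM reading distinguishes.
weighted_totals <- c(
  a = sum(slopes * pattern_a),
  b = sum(slopes * pattern_b)
)

print(raw_totals)
print(weighted_totals)
```

The two patterns tie on the raw total but separate on the weighted total, so a report that adopts bounded GPCM is also adopting a different rule for which score patterns count as equivalent.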

Checking the support boundary

gpcm_capability_matrix() is the canonical reference. It returns one row per helper family with a Status column drawn from supported, supported_with_caveat, blocked, and deferred, plus the rationale and the evidence trail behind each classification. The RecommendedRoute column states what to do instead when a helper is blocked or deferred, and NextValidationStep records what evidence would be needed before broadening that route.

library(mfrmr)
gpcm_capability_matrix("supported")[, c("Area", "Status")]
gpcm_capability_matrix("supported_with_caveat")[, c("Area", "Status")]
gpcm_capability_matrix("blocked")[, c("Area", "Status", "RecommendedRoute")]
gpcm_capability_matrix("deferred")[, c("Area", "Status", "NextValidationStep")]

The matrix is intentionally conservative. A row stays in blocked or deferred even when some lower-level component already runs, because the scope statement reflects the validation evidence rather than the raw code path.
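
As a quick orientation before reading individual rows, the four status filters shown above can be counted in one pass. This sketch uses only the call form already shown (one status string per call) and assumes the return value is a data frame or matrix, so that nrow() applies. The guard skips the block when mfrmr is not installed.

```r
# The four status labels named in this vignette.
statuses <- c("supported", "supported_with_caveat", "blocked", "deferred")

status_counts <- NULL
if (requireNamespace("mfrmr", quietly = TRUE)) {
  # One call per status, mirroring the filtered calls shown above.
  status_counts <- vapply(
    statuses,
    function(s) nrow(mfrmr::gpcm_capability_matrix(s)),
    integer(1)
  )
  print(status_counts)
}
```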

Source-grounded recovery interpretation

The bounded GPCM route follows Muraki’s generalized partial credit model and its information-function extension. The package-specific slope_regime labels are narrower than that model theory: they summarize the centered log-slope spread of the simulation generator so recovery evidence can be read against a declared stress condition. They are not model-fit tests and they are not literature-derived adequacy cut points.
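
To make "centered log-slope spread" concrete, here is a minimal base-R sketch. The slope values are hypothetical, and the two spread summaries (standard deviation and range) are assumptions for illustration: this vignette does not state which statistic or which cut points the package uses when it assigns a slope_regime label.

```r
# Hypothetical generating slopes for four criteria.
slopes <- c(0.8, 1.0, 1.15, 1.05)

# Work on the log scale and center so the log-slopes average to zero.
log_slopes <- log(slopes)
centered_log_slopes <- log_slopes - mean(log_slopes)

# Two candidate spread summaries (assumed, not the package's definition).
log_slope_spread <- c(
  sd = sd(centered_log_slopes),
  range = diff(range(centered_log_slopes))
)

print(round(centered_log_slopes, 3))
print(round(log_slope_spread, 3))
```

Whatever summary is used, it describes how strongly the simulated design departs from equal slopes; it says nothing about whether a fitted model is adequate.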

For simulation reporting, read direct recovery checks in an ADEMP-style order: the data-generating mechanism first, then the estimands and performance measures, and only then the row-level recovery diagnostics. In practice, this means:

  1. Build or extract an explicit mfrm_sim_spec.
  2. Run evaluate_mfrm_recovery() for the direct parameter-recovery question.
  3. Run assess_mfrm_recovery() with practical RMSE/bias limits.
  4. Read the review outputs in this order: summary(recovery_review); recovery_review$condition_reporting_notes and recovery_review$condition_review; recovery_review$diagnostic_reporting_notes and recovery_review$diagnostic_review (when optional diagnostics were retained); plot(recovery_review, type = "status"); and finally plot(recovery_review, type = "metrics", metric = "rmse").

For release-scale checks, the packaged recovery-validation.R protocol separates core release evidence from extended sensitivity cases. Read topline_release_decision before condition_reporting_notes, condition_summary, or row-level case tables, and treat ExtendedSensitivityStatus as sensitivity evidence rather than as the core release gate by itself. Fit/separation operating characteristics belong in the diagnostic summary; they are not part of the top-line release-recovery gate. Read diagnostic_reporting_notes first when deciding whether zero separation, reliability collapse, or df-sensitive ZSTD flags need explicit report language.

What works today

The following routes are validated for bounded GPCM:

What works with caveats

The following are exposed for GPCM but should be read as exploratory screens rather than as Rasch-style invariance evidence:

The dashboard marks the fair-average panel as unavailable under GPCM. Use fair_average_table() directly for the slope-aware element-conditional table, and fair_average_table(fair_se = TRUE) when you need structural fair-average SEs for non-person rows.

What is intentionally restricted

The slope-aware fair_average_table() route and package-native scorefile route are available under GPCM, including native expected-score uncertainty and score-side delta SEs where the required MML diagnostics support them. Full FACETS-style score-side compatibility remains restricted because free discrimination changes the relationship between the latent measure and operational score-side summaries. Specifically:

A worked example

The example_core dataset includes a small synthetic block that supports a bounded GPCM fit. This example uses compact quadrature and iteration settings to keep optional local execution short; for final evidence, rerun with the package default or a higher quadrature setting and a larger recovery design.

library(mfrmr)
toy <- load_mfrmr_data("example_core")

fit_gpcm <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  step_facet = "Criterion",
  score = "Score",
  model = "GPCM",
  method = "MML",
  quad_points = 7,
  maxit = 20
)

summary(fit_gpcm)

diag_gpcm <- diagnose_mfrm(fit_gpcm)
summary(diag_gpcm)

info <- compute_information(fit_gpcm)
plot_information(info)

rec_gpcm <- evaluate_mfrm_recovery(
  sim_spec = build_mfrm_sim_spec(
    n_person = 30,
    n_rater = 3,
    n_criterion = 4,
    raters_per_person = 2,
    model = "GPCM",
    step_facet = "Criterion",
    slope_facet = "Criterion",
    slopes = c(0.8, 1.0, 1.15, 1.05)
  ),
  reps = 10,
  model = "GPCM",
  fit_method = "MML",
  quad_points = 7,
  maxit = 20,
  include_diagnostics = TRUE,
  diagnostic_fit_df_method = "both",
  seed = 1
)
review_gpcm <- assess_mfrm_recovery(
  rec_gpcm,
  max_rmse = c(facet = 0.5, step = 0.5, slope = 0.25),
  max_abs_bias = c(default = 0.25)
)

summary(review_gpcm)$overview
summary(review_gpcm)$reading_order
review_gpcm$condition_reporting_notes[, c(
  "ConditionArea", "ReportingAttention", "ConditionFinding"
)]
review_gpcm$condition_review[, c(
  "Model", "GPCMSlopeRegime", "StressLevel", "ScoreSupportStatus"
)]
review_gpcm$diagnostic_reporting_notes[, c(
  "Facet", "ReportingAttention", "DiagnosticFinding"
)]
summary(review_gpcm)$diagnostic_review
plot(review_gpcm, type = "status")
plot(review_gpcm, type = "metrics", metric = "rmse")

# For a release-scale smoke read:
# source(system.file("validation", "recovery-validation.R", package = "mfrmr"))
# validation <- mfrmr_run_recovery_validation(
#   case_ids = c("gpcm_slope_profile", "gpcm_high_dispersion_sparse"),
#   quick = TRUE,
#   verbose = FALSE
# )
# validation_summary <- summary(validation)
# validation_summary$reading_order
# validation_summary$topline_release_decision
# validation_summary$condition_reporting_notes
# validation_summary$condition_summary
# validation_summary$diagnostic_reporting_notes
# build_summary_table_bundle(validation_summary)$tables$reading_order
# build_summary_table_bundle(validation_summary)$tables$domain_decision_table

The fit, summary, residual diagnostics, information, recovery, fair-average, and conditional bias-screening helpers all run under GPCM with the caveats listed above. Calling build_apa_outputs(fit_gpcm) raises an explicit message that points back to gpcm_capability_matrix() rather than producing a partial output.
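
In a script, the blocked call can be wrapped so that the run continues and the message is kept for the report. This is a sketch under two assumptions: that fit_gpcm from the worked example is in the workspace, and that the helper signals an ordinary R condition. The text above does not say whether it is an error or a message, so the handler catches either.

```r
blocked_msg <- NULL
if (requireNamespace("mfrmr", quietly = TRUE) && exists("fit_gpcm")) {
  blocked_msg <- tryCatch(
    {
      mfrmr::build_apa_outputs(fit_gpcm)
      NA_character_  # reached only if no condition was signalled
    },
    # Catches errors, warnings, and messages alike.
    condition = function(cond) conditionMessage(cond)
  )
  print(blocked_msg)
}
```

If blocked_msg comes back as NA, the helper ran without signalling anything, and gpcm_capability_matrix() should be checked again for the installed version.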

Roadmap

The boundary above is a release-scope statement, not a permanent design choice. Score-side semantics for free-discrimination polytomous models are on the roadmap for a future release. Until then, the matrix returned by gpcm_capability_matrix() is the binding contract.