mfrmr includes a bounded implementation of the
Generalized Partial Credit Model (GPCM; Muraki 1992). The estimator is
fully functional, but several downstream reporting helpers remain
restricted because score-side semantics under free discrimination differ
from the Rasch-family case. This vignette documents which helpers are
available, which are not, and what to use as a substitute when a helper
is restricted.
Do not choose GPCM only because it is the most flexible
model in the menu. Start with the score interpretation.
| Model | Use when | Main risk if over-used |
|---|---|---|
| RSM | The rating scale is intended to share one category-threshold structure across the step facet. | Real threshold differences can be hidden in residual diagnostics. |
| PCM | Thresholds may differ by item, criterion, task, or another designated step facet, but rating events should still contribute equally after conditioning on the modeled facets. | It can absorb threshold heterogeneity without asking whether some levels are more discriminating. |
| bounded GPCM | The analysis explicitly allows discrimination-based reweighting and treats slopes as part of the substantive sensitivity question. | Better statistical fit can be mistaken for a better operational scoring rule. |

This ordering matters for reporting. RSM and
PCM are the package’s equal-weighting reference route;
bounded GPCM is a slope-aware extension. If equal
contribution of items, criteria, or raters is part of the validity
argument, a better-fitting bounded GPCM should be reported
as sensitivity evidence rather than as an automatic replacement.
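
To make the equal-weighting versus slope-aware distinction concrete, the following base-R sketch computes category probabilities from Muraki's GPCM form for a single item. It is an illustration under stated assumptions, not mfrmr's internal code: the function name gpcm_category_probs and the threshold values are hypothetical, and the many-facet linear predictor is collapsed into a single location theta. Setting slope = 1 recovers the PCM case.

```r
# Minimal sketch (not mfrmr internals): GPCM category probabilities for one
# item at a single latent location. `gpcm_category_probs` is a hypothetical
# helper name; the thresholds below are illustrative values only.
gpcm_category_probs <- function(theta, thresholds, slope = 1) {
  # Cumulative logits: 0 for the lowest category, then running sums of
  # slope * (theta - threshold_k) for each higher category.
  z <- c(0, cumsum(slope * (theta - thresholds)))
  p <- exp(z - max(z))  # subtract the maximum for numerical stability
  p / sum(p)
}

thresholds <- c(-1, 0, 1)  # illustrative step thresholds (4 categories)
theta <- 0.5

p_pcm  <- gpcm_category_probs(theta, thresholds, slope = 1)     # PCM case
p_gpcm <- gpcm_category_probs(theta, thresholds, slope = 1.15)  # free slope

# Expected scores on the 0..3 category scale
expected_pcm  <- sum(0:3 * p_pcm)
expected_gpcm <- sum(0:3 * p_gpcm)
```

The same latent location and thresholds yield a different expected score once the slope departs from 1, which is the sense in which a free slope reweights the item's contribution.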
Use wording that matches the model actually fitted:
- RSM: “We fit a many-facet rating-scale Rasch model, treating category thresholds as common across the step facet.”
- PCM: “We fit a many-facet partial-credit Rasch model, allowing thresholds to vary by the designated step facet while retaining equal discrimination.”
- GPCM: “We fit a bounded generalized partial-credit many-facet model as a slope-aware sensitivity analysis; interpretation focused on whether discrimination-based reweighting changed the substantive conclusions.”

Avoid wording that says bounded GPCM “improves the score” solely because it improves log-likelihood, AIC, or BIC. The model can fit better while changing the scoring contract.
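
One way to see what a changed scoring contract means: in Rasch-family models the raw total is the sufficient statistic for the person measure, whereas under a GPCM with given slopes it is the slope-weighted total. The sketch below uses the slopes from the simulation example in this vignette; the criterion labels and the two response patterns are hypothetical and chosen only so that both patterns share the same raw total.

```r
# Slopes from the simulation example in this vignette. The labels C1..C4 and
# both response patterns are hypothetical, for illustration only.
slopes    <- c(C1 = 0.8, C2 = 1.0, C3 = 1.15, C4 = 1.05)
pattern_a <- c(C1 = 3, C2 = 1, C3 = 1, C4 = 1)
pattern_b <- c(C1 = 1, C2 = 1, C3 = 3, C4 = 1)

# Equal-weighting view: both patterns carry the same evidence.
raw_total <- c(a = sum(pattern_a), b = sum(pattern_b))

# Slope-aware view: the same ratings are weighted by criterion slope.
weighted_total <- c(
  a = sum(slopes * pattern_a),
  b = sum(slopes * pattern_b)
)
```

Two rating profiles that are interchangeable under RSM or PCM are ordered differently once slopes are free, regardless of which model has the lower AIC or BIC.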
gpcm_capability_matrix() is the canonical reference. It
returns one row per helper family with a Status column
drawn from supported, supported_with_caveat,
blocked, and deferred, plus the rationale and
the evidence trail behind each classification. The
RecommendedRoute column states what to do instead when a
helper is blocked or deferred, and NextValidationStep
records what evidence would be needed before broadening that route.
The matrix is intentionally conservative. A row stays in
blocked or deferred even when some lower-level
component already runs, because the scope statement reflects the
validation evidence rather than the raw code path.
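
As a sketch of how the matrix might be read in practice, the helper below keeps only the blocked and deferred rows together with the routing columns named above. It assumes the matrix behaves like a data frame with Status, RecommendedRoute, and NextValidationStep columns; the helper name restricted_routes and the demo rows are hypothetical placeholders, not the package's actual matrix contents.

```r
# Hypothetical helper: keep only restricted rows and the columns named in the
# text. Assumes `cap` is a data frame with Status, RecommendedRoute, and
# NextValidationStep columns.
restricted_routes <- function(cap) {
  keep <- cap$Status %in% c("blocked", "deferred")
  cap[keep, c("Status", "RecommendedRoute", "NextValidationStep"), drop = FALSE]
}

# Placeholder rows for illustration only; these are NOT the package's matrix.
cap_demo <- data.frame(
  Status = c("supported", "supported_with_caveat", "blocked", "deferred"),
  RecommendedRoute = c(NA, NA, "placeholder route A", "placeholder route B"),
  NextValidationStep = c(NA, NA, "placeholder step A", "placeholder step B"),
  stringsAsFactors = FALSE
)
restricted_demo <- restricted_routes(cap_demo)

# With mfrmr installed, the same helper would be applied to the real matrix:
# restricted_routes(mfrmr::gpcm_capability_matrix())
```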
The bounded GPCM route follows Muraki’s generalized
partial credit model and its information-function extension. The
package-specific slope_regime labels are narrower than that
model theory: they summarize the centered log-slope spread of the
simulation generator so recovery evidence can be read against a declared
stress condition. They are not model-fit tests and they are not
literature-derived adequacy cut points.
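
The quantity behind those labels can be sketched in base R. The text above does not state which spread statistic the package uses or how it maps spreads to labels, so the standard deviation and range below are assumptions chosen for illustration; the slopes are the ones used in the simulation example in this vignette.

```r
# Generator slopes from the simulation example in this vignette.
slopes <- c(0.8, 1.0, 1.15, 1.05)

# Centered log-slopes: log scale, shifted so they average to zero.
log_slopes <- log(slopes)
centered   <- log_slopes - mean(log_slopes)

# Two candidate spread summaries (assumed here, not confirmed as the
# package's definition of a slope regime).
spread <- c(sd = sd(centered), range = diff(range(centered)))
```

Either summary describes the generator, not the fitted model, which is why it cannot serve as a fit test.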
For simulation reporting, read direct recovery checks in an ADEMP-style order: the data-generating mechanism first, then the estimands and performance measures, and only then the row-level recovery diagnostics. In practice, this means:
1. Start from the mfrm_sim_spec, which declares the data-generating mechanism.
2. Run evaluate_mfrm_recovery() for the direct parameter-recovery question.
3. Run assess_mfrm_recovery() with practical RMSE/bias limits.
4. Read summary(recovery_review), then recovery_review$condition_reporting_notes and recovery_review$condition_review, then recovery_review$diagnostic_reporting_notes and recovery_review$diagnostic_review when optional diagnostics were retained, then plot(recovery_review, type = "status"), then plot(recovery_review, type = "metrics", metric = "rmse").

For release-scale checks, the packaged
recovery-validation.R protocol separates core release
evidence from extended sensitivity cases. Read
topline_release_decision before
condition_reporting_notes, condition_summary,
or row-level case tables, and treat
ExtendedSensitivityStatus as sensitivity evidence rather
than as the core release gate by itself. Fit/separation operating
characteristics belong in the diagnostic summary; they are not part of
the top-line release-recovery gate. Read
diagnostic_reporting_notes first when deciding whether zero
separation, reliability collapse, or df-sensitive ZSTD flags need
explicit report language.
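
For readers less familiar with the performance measures behind the practical RMSE/bias limits, the sketch below computes bias and RMSE for one parameter across replications and compares them with limits. It uses the standard definitions only; how assess_mfrm_recovery() aggregates across parameters and conditions is not described here. The generating value and the noisy estimates are simulated placeholders, not package results, and the limits mirror the facet-level values used in the worked example in this vignette.

```r
set.seed(1)

# Hypothetical generating value and simulated estimates for 10 replications.
# These are placeholder numbers, not output from mfrmr.
true_value <- 0.4
estimates  <- true_value + rnorm(10, mean = 0, sd = 0.2)

# Standard definitions of the two performance measures.
bias <- mean(estimates - true_value)
rmse <- sqrt(mean((estimates - true_value)^2))

# Practical limits, matching the values used in this vignette's example.
max_rmse     <- 0.5
max_abs_bias <- 0.25
within_limits <- rmse <= max_rmse && abs(bias) <= max_abs_bias
```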
The following routes are validated for bounded GPCM:
- fit_mfrm(model = "GPCM", step_facet = ...). The validated default keeps slope_facet == step_facet, with the direct MML engine.
- predict_mfrm_units(), sample_mfrm_plausible_values(), compute_information(), and plot_information().
- plot(fit, type = c("wright", "pathway", "ccc", "ccc_surface")), category_structure_report(), and category_curves_report().
- build_mfrm_sim_spec() and simulate_mfrm_data().
- evaluate_mfrm_recovery() and assess_mfrm_recovery(), including fitted bounded-GPCM slope recovery on the log-slope scale.

The following are exposed for GPCM but should be read as
exploratory screens rather than as Rasch-style invariance evidence:
- diagnose_mfrm() and the residual and unexpected-response stack: unexpected_response_table(), displacement_table(), measurable_summary_table(), rating_scale_table(), interrater_agreement_table(), facet_quality_dashboard(), plot_qc_dashboard(), plot_marginal_fit(), plot_marginal_pairwise().
- reporting_checklist() and precision_review_report() route to the supported direct tables and plots. The broader APA/QC/export family is available as caveated sensitivity-reporting output with explicit gpcm_boundary rows.
- build_misfit_casebook() inherits the exploratory screening framing of its underlying sources.
- estimate_bias() now provides bounded-GPCM conditional screening rows with slope-aware information and profile-likelihood columns. Treat these rows as screening evidence for follow-up, not as standalone confirmatory fairness tests.
- analyze_dff(), analyze_dif(), dif_interaction_table(), dif_report(), plot_dif_heatmap(), and plot_dif_summary() provide bounded-GPCM DFF/DIF screening and reporting surfaces with explicit gpcm_boundary rows.
- build_apa_outputs(), build_visual_summaries(), run_qc_pipeline(), build_mfrm_manifest(), build_mfrm_replay_script(), export_mfrm_bundle(), package-native scorefile export, and build_linking_review() return caveated bounded-GPCM reporting or exploratory-review objects with explicit gpcm_boundary rows. The package-native scorefile can include native structural delta-method expected-score SEs and score-side delta SEs selected by score_se_method when the required MML diagnostics are available, but those SEs are not FACETS-equivalent score-side uncertainty.
- evaluate_mfrm_design(), predict_mfrm_population(), evaluate_mfrm_diagnostic_screening(), and evaluate_mfrm_signal_detection() are available as caveated role-based repeated simulation/refit routes. Treat their outputs as design-level or screening sensitivity evidence, not as operational scoring, calibrated inferential testing, or arbitrary-facet planning validation.

The dashboard marks the fair-average panel unavailable under GPCM; use fair_average_table() directly for the slope-aware element-conditional table and fair_average_table(fair_se = TRUE) when you need structural fair-average SEs for non-person rows.

The slope-aware fair_average_table() route and
package-native scorefile route are available under GPCM,
including native expected-score uncertainty and score-side delta SEs
where the required MML diagnostics support them. Full FACETS-style
score-side compatibility remains restricted because free discrimination
changes the relationship between the latent measure and operational
score-side summaries. Specifically:
- facets_output_contract_review() still depends on FACETS-style compatibility semantics that are not generalized to free discrimination.
- Caveated reporting outputs must keep the gpcm_boundary wording visible and must not imply FACETS-equivalent score-side uncertainty, operational scoring, calibrated screening gates, or arbitrary-facet planning validation.

When a restricted helper is needed for a GPCM report,
the practical paths are:
- Refit with model = "PCM" if the equal-discrimination assumption is defensible for the data. The full APA / output-contract / fit-based export stack becomes available, and compare_mfrm() quantifies the loss in fit.
- Keep the GPCM fit itself but draft the manuscript section manually around the supported tables: summary(fit) for parameters, diagnose_mfrm() for residual fit, facet_quality_dashboard() for the per-facet quality summary, and compute_information() for precision evidence.
- Pair the GPCM fit with an RSM or PCM baseline fit. The two fits can be reported side by side in the same document, with the GPCM fit footnoted as the discrimination-aware counterpart.

Those restricted helpers use the same capability matrix at runtime. A
blocked or deferred bounded-GPCM call stops with the
relevant capability row, recommended route, and next validation step
instead of producing a partial score-side or unsupported backend result.
The condition class is mfrmr_gpcm_scope_error, and the
condition object carries helper, area,
status, recommended_route, and
next_validation_step fields so wrappers can catch and route
the failure without parsing the message text. The release-readiness
protocol checks that the blocked/deferred rows in
gpcm_capability_matrix() are represented in the runtime
guard coverage table or explicitly marked as roadmap-only. Call
gpcm_runtime_guard_coverage() to inspect that table. Use
mfrmr_output_guide("gpcm") when you want the shorter
user-facing route map that points to both the support matrix and guard
coverage.
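
The catch-and-route pattern described above can be sketched without the package installed. The stand-in function below signals a condition with the mfrmr_gpcm_scope_error class and the fields listed in the text; the field values are placeholders, and both function names are hypothetical. Whether the real condition also inherits from "error" is an assumption based on the description that the call stops.

```r
# Stand-in for a restricted helper (hypothetical). It signals a condition
# with the class and fields described in the text; values are placeholders.
restricted_helper_demo <- function() {
  cond <- structure(
    list(
      message = "placeholder: helper is outside the bounded-GPCM scope",
      call = NULL,
      helper = "restricted_helper_demo",
      area = "placeholder area",
      status = "blocked",
      recommended_route = "placeholder route",
      next_validation_step = "placeholder step"
    ),
    class = c("mfrmr_gpcm_scope_error", "error", "condition")
  )
  stop(cond)
}

# Hypothetical wrapper: route on the condition class and read its fields
# instead of parsing the message text.
route_scope_error <- function(expr) {
  tryCatch(
    expr,
    mfrmr_gpcm_scope_error = function(e) {
      list(
        ok = FALSE,
        helper = e$helper,
        status = e$status,
        route = e$recommended_route
      )
    }
  )
}

result <- route_scope_error(restricted_helper_demo())
```

Matching on the class keeps the wrapper stable if the message wording changes between releases.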
The example_core dataset includes a small synthetic
block that supports a bounded GPCM fit. This example uses
compact quadrature and iteration settings to keep optional local
execution short; for final evidence, rerun with the package default or a
higher quadrature setting and a larger recovery design.
```r
library(mfrmr)

toy <- load_mfrmr_data("example_core")

# Fit the bounded GPCM with compact quadrature and iteration settings.
fit_gpcm <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  step_facet = "Criterion",
  score = "Score",
  model = "GPCM",
  method = "MML",
  quad_points = 7,
  maxit = 20
)
summary(fit_gpcm)

# Residual diagnostics (exploratory screens under GPCM).
diag_gpcm <- diagnose_mfrm(fit_gpcm)
summary(diag_gpcm)

# Precision evidence.
info <- compute_information(fit_gpcm)
plot_information(info)

# Small recovery design; enlarge for final evidence.
rec_gpcm <- evaluate_mfrm_recovery(
  sim_spec = build_mfrm_sim_spec(
    n_person = 30,
    n_rater = 3,
    n_criterion = 4,
    raters_per_person = 2,
    model = "GPCM",
    step_facet = "Criterion",
    slope_facet = "Criterion",
    slopes = c(0.8, 1.0, 1.15, 1.05)
  ),
  reps = 10,
  model = "GPCM",
  fit_method = "MML",
  quad_points = 7,
  maxit = 20,
  include_diagnostics = TRUE,
  diagnostic_fit_df_method = "both",
  seed = 1
)

# Review recovery against practical RMSE/bias limits.
review_gpcm <- assess_mfrm_recovery(
  rec_gpcm,
  max_rmse = c(facet = 0.5, step = 0.5, slope = 0.25),
  max_abs_bias = c(default = 0.25)
)
summary(review_gpcm)$overview
summary(review_gpcm)$reading_order
review_gpcm$condition_reporting_notes[, c(
  "ConditionArea", "ReportingAttention", "ConditionFinding"
)]
review_gpcm$condition_review[, c(
  "Model", "GPCMSlopeRegime", "StressLevel", "ScoreSupportStatus"
)]
review_gpcm$diagnostic_reporting_notes[, c(
  "Facet", "ReportingAttention", "DiagnosticFinding"
)]
summary(review_gpcm)$diagnostic_review
plot(review_gpcm, type = "status")
plot(review_gpcm, type = "metrics", metric = "rmse")

# For a release-scale smoke read:
# source(system.file("validation", "recovery-validation.R", package = "mfrmr"))
# validation <- mfrmr_run_recovery_validation(
#   case_ids = c("gpcm_slope_profile", "gpcm_high_dispersion_sparse"),
#   quick = TRUE,
#   verbose = FALSE
# )
# validation_summary <- summary(validation)
# validation_summary$reading_order
# validation_summary$topline_release_decision
# validation_summary$condition_reporting_notes
# validation_summary$condition_summary
# validation_summary$diagnostic_reporting_notes
# build_summary_table_bundle(validation_summary)$tables$reading_order
# build_summary_table_bundle(validation_summary)$tables$domain_decision_table
```

The fit, summary, residual diagnostics, information, recovery,
fair-average, and conditional bias-screening helpers all run under
GPCM with the caveats listed above. Calling a restricted helper such as
facets_output_contract_review() on the fit stops with an explicit
mfrmr_gpcm_scope_error condition pointing back at
gpcm_capability_matrix() rather than producing a partial output.
The boundary above is a release-scope statement, not a permanent
design choice. Score-side semantics for free-discrimination polytomous
models are on the roadmap for a future release. Until then, the matrix
returned by gpcm_capability_matrix() is the binding
contract.