predict.hazard(type = "hazard") now works for
multiphase models, returning the instantaneous additive
hazard h(t|x) = sum_j mu_j(x) phi_j'(t) (previously only
single-distribution models supported "hazard", via
exp(eta)). Like "survival" /
"cumulative_hazard" it is time-based (requires
newdata$time), supports covariate newdata, and
se.fit = TRUE (delta-method limits on the log scale via a
numeric Jacobian of the hazard evaluator). decompose = TRUE
is not supported for "hazard". This gives the multiphase
instantaneous hazard a public route (it was previously reachable only
through internal functions).
predict.hazard(..., se.fit = TRUE, conf.type = "logit")
selects the survival confidence-limit transform. The default
"log-log" builds limits on log(-log S) (the
survival::survfit standard); "logit" builds
them on logit(1 - S), reproducing SAS HAZARD’s
HAZPRED survival limits. With the full-information vcov for
CoE fits, conf.type = "logit" matches the SAS
hp.death.AVC survival CLs to ~1e-5. Hazard /
cumulative-hazard limits are unaffected (their log scale already matches
HAZPRED).
predict.hazard(type = "cumulative_hazard", decompose = TRUE, se.fit = TRUE)
now returns per-phase and total delta-method confidence
limits for multiphase models, as a long data frame (time,
component, fit, se.fit,
lower, upper). Each phase’s CL uses only that
phase’s parameters, so per-phase limits do not sum to the total.
Previously this combination raised an error.
hzr_deciles() now matches the SAS
deciles.hazard macro exactly. Previously it
excluded subjects censored before the horizon and defined the expected
count as sum(1 - S(horizon)). It now follows the SAS
method: all subjects are ranked into equal-sized risk
groups by predicted survival at the horizon, and the expected count per
group is the sum of predicted cumulative hazard at each
subject’s own follow-up time (so group totals sum to the total
observed events under conservation of events). The time
argument now only stratifies subjects into risk groups; it no longer
restricts or excludes any subject, and the expected/observed totals are
horizon-independent. Verified to reproduce the
hm.death.AVC.deciles SAS decile table
(CASES/EXPECTED/ACTUAL) to print precision. The output columns are
unchanged; their definitions are updated in
?hzr_deciles.Conservation-of-Events fits now report the
full-information variance. CoE removes one phase’s
log_mu from the optimizer search (its score
equation is the CoE constraint), but the previous code also dropped it
from the uncertainty – the conserved phase got an
NA standard error, and anything depending on it (other SEs,
se(H), prediction confidence limits) was understated
wherever that phase contributed. At the optimum the CoE solution is the
unconstrained MLE, so vcov() is now recomputed from the
unconstrained-objective Hessian over the full free set (including the
conserved log_mu), matching an all-mu-free
(conserve = FALSE) fit at the same point. On
hz.death.AVC every parameter SE now matches the SAS HAZARD
reference (e.g. the conserved early log_mu: 0.133 vs the
previous ~0.059). The recomputation uses numDeriv
(Suggests) and an invertible Hessian; if either is unavailable the fit
emits a warning and the conserved log_mu retains an
NA standard error (as before).
Conservation of Events ignored left-truncation
(counting-process entry times). For multiphase fits on
Surv(start, stop, event) data, the CoE reparameterization
conserved Sum H(stop) while the likelihood scores the
intercepts on the entry-time scale,
Sum E = Sum [H(stop) - H(start)]. The conserved phase
therefore absorbed the spurious Sum H(start), biasing its
intercept and lowering the attained log-likelihood (the
hz.te123.OMC fit-1 parity offset, gap-list P1 #6).
.hzr_conserve_events() and
.hzr_select_fixmu_phase() now subtract the per-phase
entry-time cumulative hazard, matching the likelihood and C HAZARD
setcoe under LCENSOR/ STARTTME.
Plain right-censored fits (no start time) are
unaffected.
vcov() was unusable for multiphase fits and
returned an unnamed matrix. vcov.hazard()
collapsed the entire matrix to a scalar NA whenever any
cell was NA. Multiphase fits legitimately have
NA variance rows – for parameters held fixed (e.g. early
shapes) and for the Conservation-of-Events-conserved phase
log_mu – so the finite free-parameter block was discarded
for almost every multiphase model. The method now returns the full
matrix with NA rows preserved and labels rows and columns
with the coefficient names (phase-prefixed for multiphase, e.g.
early.x vs constant.x), so a covariate shared
across phases resolves to distinct, name-addressable slots. A scalar
NA is returned only when no covariance matrix is
available.
Weibull analytic gradient produced NaN for
right-censored time = 0 rows.
.hzr_gradient_weibull() used an unguarded
log(time) in the shape (nu) score; a legal
right-censored row at time = 0 made
0 * -Inf = NaN, which poisoned the entire summed
shape-gradient component (then silently zeroed by the optimizer, harming
convergence). log(time) is now guarded with
log(pmax(time, .Machine$double.xmin)), matching the
analytic Hessian. The other families were audited: exponential (no
log(time) in the score), log-normal (rejects
time = 0), and multiphase (the decomposition clamps
time) are unaffected.
Weibull event hazard was inconsistent with its cumulative
hazard. .hzr_logl_weibull() defined the event
hazard as mu*nu*t^(nu-1)*exp(eta) while the cumulative
hazard was (mu*t)^nu*exp(eta); the former is missing a
mu^(nu-1) factor (the exact derivative is
nu*mu^nu*t^(nu-1)*exp(eta) = (nu/t)*H, Form A as in the
C/SAS HAZARD reference). The natural-scale log-likelihood and its
analytic gradient (d/dmu, d/dnu event terms)
are corrected to match. Pure event/right-censored fits were already
correct (they use the self-consistent internal reparameterization); the
visible effect is on mixed event + interval/left-censored
Weibull fits, which delegate to this likelihood and previously
optimized a slightly mis-specified event term.
Weibull gradient attribute ignored observation
weights.
.hzr_logl_weibull(..., return_gradient = TRUE) attached an
unweighted gradient even when weights were supplied (the
analytic gradient was off by the weight scale, e.g. halved under
weights = 2). weights is now forwarded to the
score computation. The model-fitting path was unaffected (it uses a
separate internal weighted gradient); this only changes callers reading
the return_gradient = TRUE attribute on weighted
data.
hzr_bootstrap() was non-functional for
weighted fits (Phase 7c). The resample loop rewired only
data in the refit call, leaving the original
weights argument bound to a symbol in the caller’s
frame. The internal eval() could not resolve that symbol,
so every replicate of a weighted model errored out
(n_success == 0) regardless of fraction; even
had it resolved, the un-resampled weights would have been misaligned
with the bootstrapped rows. weights is now evaluated once
and resampled in lockstep with the data on each replicate (mirroring how
data is handled). Unweighted bootstraps are unaffected. A
regression test covers both the fraction < 1 and
full-size weighted paths in test-diagnostics.R. Follow-up:
hzr_bootstrap() now resamples the weights already stored on
the fitted object (object$data$weights) rather than
re-evaluating the call’s weights expression in
parent.frame(), which fails when the original symbol is no
longer in scope (e.g. the fit was built inside a helper that has
returned). Caller-frame evaluation remains a fallback for objects fitted
before weights were stored. The same fragility applied to the call’s
data argument: hazard() now stores the
evaluated data argument (the data frame passed to
hazard(), not a model.frame() result) on the
fitted object (object$data$frame), and
hzr_bootstrap() resamples that stored frame instead of
re-evaluating cl$data in parent.frame(), so
bootstrap succeeds even when the original data symbol is
out of scope. Caller-frame evaluation remains a fallback for objects
fitted before the frame was stored.
4-phase CoE fixmu-phase selection (Phase 7d).
.hzr_select_fixmu_phase() used which.max()
over raw per-phase cumhaz at the starting theta. G3 late phases with
typical shape parameters have unnormalized cumhaz orders of magnitude
larger than other phases, causing CoE to pin the G3 log_mu
away from its true near-zero MLE. Fixed by excluding phases whose cumhaz
contribution exceeds 10× the median before selecting (falls back to
which.max when all phases are outliers). On the 4-phase
CABGKUL fit the CoE vs no-CoE LL gap closes from 6.9 to < 0.1 units.
Six new tests cover the 4-phase code path in
test-conservation-of-events.R.
time_lower dual-use bug in Weibull and
multiphase likelihoods. When time_lower was
supplied for a mixed interval-censored + right-censored dataset, the
Weibull LL interpreted time_lower as the counting-process
entry time for right-censored rows, computing H(stop) −
H(start) = 0 and silently zeroing those rows’ likelihood contribution.
Fixed in likelihood-weibull.R (4 sites: LL, gradient,
L-BFGS-B internal LL/gradient) and likelihood-multiphase.R:
start_vec is now set from time_lower only for
genuine epoch rows (status %in% c(0L, 1L) and
time_lower < time). Two regression tests added to
test-interval-censoring-weibull.R.
hzr_decompos() Case 3 corrected and
nu = 0, m >= 0 now fails loud (Phase 7d). Two
issues in the early-phase (G1) sign dispatch:
m > 0, nu < 0, “bounded
cumulative”) carried a spurious factor of m. Its
rho used a bare (2^m - 1)^nu instead of the
((2^m - 1)/m)^nu form used by Case 1, leaving an
m factor on the bt^(-1/nu) term. The CDF
diverged from the C HAZARD G1 evaluator (g1flag = 5) by up
to ~0.2 and was discontinuous with its m -> 0 limit
(Case 3L). Adding the /m divisor makes the m
factors cancel, reproducing the C evaluator exactly and restoring
continuity (verified against
src/common/hzd_ln_G1_and_SG1.c). No shipped phase uses Case
3, so fitted models are unaffected; the synthetic 3-phase golden fixture
was regenerated because its free-shape optimizer path crosses Case 3
territory.nu = 0 with m >= 0
fell through every dispatch branch, leaving the CDF unassigned and
raising the cryptic object 'G' not found. The
nu -> 0 limit is defined only for m < 0;
for m >= 0 it is degenerate. The function now raises a
clear, explanatory error. New test-decompos-boundary.R
locks in continuity of all limiting branches (Case 1 -> 1L, 2 ->
1L, 2 -> 2L, 3 -> 3L), Case 3 <-> C g1flag=5
parity, g = dG/dt internal consistency, CDF sanity, and
stability at extreme t_half.Hardened Hessian inversion for standard errors (Phase
7c). Post-fit variance-covariance estimation now symmetrizes
the Hessian, checks its reciprocal condition number, inverts via
Cholesky with a solve() fallback for non-positive-definite
Hessians, and guards non-positive variances instead of silently emitting
NaN standard errors. Ill-conditioned,
non-positive-definite, and non-finite Hessians now raise specific, named
warnings, and fits carry rcond / pd
diagnostics that summary() surfaces as a note when a fit is
flagged. This closes the “12+-parameter Hessian stability” hardening
item for the inversion layer; analytic Hessians (more accurate standard
errors) follow in subsequent releases.
Analytic Hessian for exponential standard errors (Phase
7c, Layer 2). The exponential distribution now computes its
post-fit Hessian in closed form (X~' diag(wH) X~ over event
+ right-censored rows) rather than numerically, giving more accurate
standard errors. The shared optimizer gained a hessian_fn
hook that analytic Hessians for the remaining families will reuse;
left/interval-censored exponential fits fall back to the numerical
Hessian.
Analytic Hessian for Weibull standard errors (Phase 7c,
Layer 2). The Weibull distribution now computes its post-fit
Hessian in closed form on the internal (alpha, psi, beta)
optimization scale (then mapped to the natural scale by the existing
delta method) rather than numerically, giving more accurate standard
errors. Covers event + right-censored data (including counting-process
start times); left/interval-censored fits fall back to the numerical
Hessian.
Analytic Hessian for log-logistic standard errors (Phase
7c, Layer 2). The log-logistic distribution now computes its
post-fit Hessian in closed form on the internal
(log alpha, log beta, beta_coef) scale rather than
numerically, giving more accurate standard errors. Covers event +
right-censored data; left/interval-censored fits fall back to the
numerical Hessian.
Analytic Hessian for log-normal standard errors (Phase
7c, Layer 2). The log-normal distribution now computes its
post-fit Hessian in closed form on the internal
(mu, log_sigma, beta_coef) scale rather than numerically,
giving more accurate standard errors. Covers event + right-censored
data; left/interval-censored fits fall back to the numerical
Hessian.
Analytic Hessian for multiphase standard errors (Phase 7c, Layer 2 PR-6). Post-fit standard errors for all multiphase fits now come from a closed-form Hessian of the negative log-likelihood rather than a numerical Richardson approximation. The Hessian is assembled from three terms: (A) a phase-block-diagonal curvature of Σᵢ wᵢ H(tᵢ), (B) a dense Fisher information outer product Σₑ (wᵢ/hᵢ²) ∇h ∇hᵀ capturing cross-phase parameter interactions, and (C) a phase-block-diagonal curvature of −Σₑ wᵢ log h(tᵢ). μ/β parameters use fully closed-form expressions; shape parameters (t_half, ν, m, and G3 parameters) use second-order central differences. The Conservation-of-Events full-information vcov path also switches to the analytic Hessian. Left/interval-censored fits fall back to the numerical Hessian. Completes the 6-PR analytic-Hessian rollout across all five families.
vignette("fitting-hazard-models") gains an
Interval and left censoring section covering: status
coding reference
(-1/0/1/2), a
cardiac clinic-visit simulation with right- and interval-censored
observations, the direct time_lower/time_upper
API, and a comparison showing the interval-censored fit recovering
nu close to 1.0 (true value) while the naive exact-at-upper
fit incurs a shape bias of ~+0.45. Includes a callout note on the
correct use of time_lower = 0 for right-censored rows.vignette("fitting-hazard-models") gains a
Convergence troubleshooting section covering: reading
the KM cumulative hazard for Weibull starting values (log-log plot),
when to fix shape parameters vs. estimate freely, diagnosing
overparameterization via near-zero phase scales and NA from
vcov(), and control options
(n_starts, maxit).?TemporalHazard) giving the additive multiphase model, the
phase-type vocabulary, the SAS/C HAZARD bridge, and a map of the main
entry points.randomForestSRC: explicit display equations for
the generalized temporal decomposition G(t)
(?hzr_decompos), the additive cumulative-hazard model on
?hzr_phase and ?hazard, and defining formulas
plus the Mächler (2012) reference for the numerical primitives
(?hzr_log1pexp, ?hzr_log1mexp,
?hzr_clamp_prob).hzr_phase() phase-type help. ?hazard gains a
Baseline distributions section describing each
dist value ("weibull",
"exponential", "loglogistic",
"lognormal", "multiphase") by its hazard shape
and when to use it; ?hzr_stepwise gains a Selection
direction and criterion section explaining each
direction
("forward"/"backward"/"both") and
criterion ("wald"/"aic"),
including how Wald selection differs from C/SAS HAZARD’s score-statistic
path.Patient-specific HAZPRED prediction parity
(Group A fixtures hp.death.AVC.hm1 / hm2). New
test-sas-parity.R blocks predict survival and instantaneous
hazard – with logit survival CLs and log hazard CLs at the SAS 1-SD
level – from the saved multivariable both-phase model
(hm.death.AVC final fit, “HMDEATH”) for two covariate
profiles each (hm1: with/without an associated cardiac anomaly; hm2:
complete vs partial canal by date of repair), matching SAS to ~5e-4
(survival) / ~8e-3 (hazard; the looser hazard tolerance reflects the
near-singular 9-coefficient fit and the steep early-phase times). Adds a
header-driven .hzr_parse_sas_nomogram_mv() (parses the
BY-group “digital nomogram” whose rows each carry their own covariate
vector) and a shared .hzr_fit_avc_hmdeath()
helper.
Stratified HAZPRED calibration parity (Group A
fixture hs.death.AVC.hm1). New
test-sas-parity.R blocks reproduce the population-averaged,
stratified-by-COM_IV outputs from the same HMDEATH model:
(1) the observed-vs-expected “predict number of deaths” table – per
stratum, EXPECTED = sum of predicted cumulative hazard at each subject’s
own follow-up, PEXPECT = sum of predicted death probability, ACTUAL =
observed deaths (totals conserve events, 14.76 + 55.24 = 70), to ~5e-3;
and
.hzr_parse_sas_calibration() and
.hzr_parse_sas_strata_survival().hm.death.AVC stepwise documented as a
non-parity gap (Group A). The phase-aware forward
SELECTION SLE=0.2 SLS=0.1 fit’s final selected
model is the saved “HMDEATH” fit already verified by the
hm.death.AVC.deciles / hp.death.AVC.hm1 /
hm2 parity tests; its selection path cannot be
reproduced (SAS uses approximate variances during selection while R’s
full Hessian is near-singular here; SAS’s /I
/S flags are phase-level but R’s force_in is
phase-blind; R oscillates at p ~ slstay and lands in a worse basin – the
same divergence already documented for hm.deadp.VALVES).
test-sas-parity.R gains a regression-guard test that
exercises the multiphase phase-aware stepwise path end-to-end on real
data without asserting path parity; see
inst/dev/FIXTURE-GAP-LIST.md.
bs.death.AVC bootstrap documented as a
non-parity gap (Group A). SAS %HAZBOOT runs a
fresh stepwise selection on each bootstrap resample and reports a
variable-selection frequency; R’s hzr_bootstrap() resamples
and refits a fixed model (no embedded-selection mode), and
reimplementing the SAS procedure would inherit the documented
hm.death.AVC stepwise divergence.
test-sas-parity.R adds
.hzr_parse_sas_bootstrap() and asserts the SAS reference
selection frequencies in parseable form (so the parity test is
half-written for a future bootstrap-with-selection capability), plus a
regression guard that R’s fixed-model bootstrap runs on the cohort; see
inst/dev/FIXTURE-GAP-LIST.md.
Phase-specific covariate recovery tests (Phase
7d). New test-phase-specific-covariates.R confirms that
hzr_phase(formula = ~ ...) is correct, not just runnable:
simulation-based recovery tests verify that a covariate entered into one
phase recovers its true coefficient, that the same covariate carries
independent (here opposite-sign) effects across two phases, and that a
covariate confined to one phase does not leak into another. This is the
honest substitute for a SAS parity fixture and guards against the
“accepts the formal but never applies it” regression that has surfaced
before with weights and counting-process times.
Added fractional (non-integer) weight coverage to close the
roadmap 7a gap. Prior weight tests verified weighting only via integer
row duplication, which cannot express fractional
(e.g. inverse-probability) weights. The new tests assert the two
properties that define a correct per-row weighted log-likelihood: an
additive split (a row of weight a + b
equals two identical copies of weights a and
b) and linear scaling
(L(theta; c*w) = c * L(theta; w), gradient likewise, MLE
invariant), across the Weibull, exponential, and
multiphase-with-covariates paths.
Made the single-distribution weighted-fit tests exercise a real
fit. They previously omitted theta start values, so
hazard(fit = TRUE) took its unfitted branch and the
assertions compared NULL/NA vacuously; they
now supply starts and genuinely compare the weighted MLE to the
duplicated-row MLE.
Added interval-censoring coverage under the multiphase model
(roadmap 7c). The multiphase likelihood’s interval-/left-censored branch
had a working code path but no isolated test. New R-only
self-consistency invariants in
test-interval-censoring-multiphase.R verify the interval
contribution equals log(S(lower) - S(upper)), the
left-censored term equals log(1 - exp(-H(u))),
right-censoring stays -(H(stop) - H(start)) (including left
truncation), invalid bounds (lower > upper) yield
-Inf, integer weights match row duplication on interval
rows, and an interval-censored multiphase fit converges.
Added a SAS fractional-weight parity capture scaffold under
inst/extdata/weights-fixtures/ (roadmap 7a /
FIXTURE-GAP-LIST B5): a PROC HAZARD ... WEIGHT IPW
template, a deterministic non-integer weight dataset, a
.lst parser, and test-weights-sas-parity.R.
The parity test re-fits the SAS specification in R and compares
covariate estimates and log-likelihood; it skips when the capture
fixture is absent (as it is by default), so CI and installation are
unaffected until a SAS run is dropped in. R-side fractional-weight
correctness is already proven by the invariants above; this is the
drop-in external SAS confirmation.
hzr_bootstrap() no longer touches
.GlobalEnv directly. The 1.0.2
oldseed/on.exit()/assign(".Random.seed", ...)
save-restore wrapper added in 1.0.2 violated CRAN policy on writing to
.GlobalEnv and has been removed. When seed is
supplied the function simply calls set.seed(seed) (the
documented R API for seeded reproducibility); the
@param seed documentation now notes that the caller’s RNG
state is not restored on exit. With seed = NULL (the
default) the function does not call set.seed() at entry, so
it starts from the caller’s current RNG state; the bootstrap still
consumes random numbers and advances that state in the usual way..hzr_create_*_golden_fixture(), previously
R/golden_fixtures.R) have been moved out of the package to
data-raw/golden_fixtures.R. They are maintainer-only
helpers for regenerating the bundled inst/fixtures/*.rds
reference outputs and are not part of the installed package, so they are
no longer shipped, checked, or user-reachable. This resolves the
home-filespace concern at its root: the earlier fallback resolved to
system.file("fixtures", ...) — i.e. the installed package
directory — whenever the package was installed, so the 1.0.1 “falls back
to tempdir()” fix did not actually prevent writing to the
user library. The bundled .rds fixtures still ship and the
parity tests still read them via system.file()..hzr_generate_golden_fixture() (the C-binary reference
writer in R/parity-helpers.R, which shares a file with
test-time helpers and so was kept in the package) now takes a required
output_dir argument with no default path.seed = 42 literals from
the relocated generators; recorded fixture metadata reflects the actual
seed argument passed (NULL by default, so no
seed is set inside the function).hzr_bootstrap() no longer leaves the caller’s
random-number stream altered when seed is supplied: the
global .Random.seed is saved before set.seed()
and restored via on.exit(), matching the fixture
generators. Bootstrap reproducibility under a given seed is
unchanged.\value documentation to all exported functions
that were missing it: hazard(), coef.hazard(),
vcov.hazard(), print.hzr_calibrate(),
print.hzr_deciles(), print.hzr_gof(), and
print.hzr_kaplan().R/golden_fixtures.R) no
longer set a specific seed unconditionally. Generators now accept an
optional seed argument; when provided, the global RNG state
is saved and restored via on.exit().output_dir for fixture generators falls back to
tempdir() instead of the package source directory, keeping
the home filespace unmodified.predict.hazard() — Phase 4g of the development
plan lands. Two new arguments: se.fit = FALSE and
level = 0.95. When se.fit = TRUE, the return
value becomes a data frame with columns fit,
se.fit, lower, upper.
dH/dtheta, dexp(eta)/dtheta,
deta/dtheta); exponential / log-logistic / log-normal fall
back to numDeriv::jacobian on a per-call cumhaz
closure.hzp_calc_haz_CL.c / hzp_calc_srv_CL.c):
hazard and cumulative_hazard use log-scale
CLs; survival uses log(-log S) CLs (equivalent to
log-cumhaz) so 0 <= lower <= upper <= 1;
linear_predictor is symmetric on the natural scale.vcov, treating fixed parameters
as known-with-zero-variance.se.fit = FALSE (default) preserves
the pre-0.9.8 scalar-vector / decompose-data-frame return shape.Surv(start, stop, event) with any start > 0
is now accepted. The Weibull and multiphase log-likelihoods apply
H(stop) - H(start) to event and right-censored terms; the
trivial start = 0 case degenerates to H(stop)
and recovers the plain-Surv fit exactly. Splitting each row into
contiguous epochs preserves both the log-likelihood and the MLE to
optimizer tolerance (split-invariance).-d H(start)/d theta term per row (guarded at
start = 0). The multiphase analytic gradient computes
per-phase Phi_j(start) and its shape derivatives, then adds
+w_H_start * mu_j * dPhi_j(start) to each parameter’s
score; G3 phase derivatives at start use the same
finite-difference machinery as at stop.hazard()
guard that rejected counting-process
Surv(start, stop, event) with any start > 0
is gone.weights now supported for all
distributions — Phase 4e of the development plan lands. The
exponential, log-logistic, and log-normal likelihoods and their analytic
gradients now apply row weights to every censoring term (event,
right-censored, left-censored, interval-censored). The 0.9.5 guard in
hazard() that rejected weights for
dist %in% c("exponential", "loglogistic", "lognormal")
has been removed. Fits with integer weights reproduce the row-duplicated
fit to optimizer tolerance across all five distributions..hzr_conserve_events() and
.hzr_select_fixmu_phase() take an optional
weights argument; the multiphase optimizer threads it
through so per-phase cumulative hazards are summed on the same scale as
the (weighted) observed event count. CoE no longer auto-disables when
weights are non-uniform — the dimension reduction stays on and the MLE
matches the full-dim path.weights. .hzr_gradient_multiphase()
accepted neither weights nor its downstream equivalents:
the per-row score weights w_H / inv_h were set
to ±1 and the interval-censored finite-difference correction summed an
unweighted LL. Weighted multiphase fits therefore optimised a weighted
objective with an unweighted score; BFGS line search still converged
near the correct MLE but the final gradient norm did not go to zero. All
three paths now honour row weights, and the optimizer’s
gradient_fn wrapper (including the all-zero numeric
fallback and the CoE wrapper) forwards weights
consistently. Regression test covers weighted analytic vs numerical
gradient parity. Surfaced by Copilot review on PR #18.hzr_stepwise() runs forward, backward, or two-way stepwise
selection on an existing hazard fit using Wald p-values or
AIC deltas as the entry / retention criterion. Phase-specific entry is
supported for multiphase models: a covariate can enter one phase and not
another. Defaults match SAS PROC HAZARD
(SLENTRY = 0.30, SLSTAY = 0.20); AIC mode uses
ΔAIC < 0 uniformly. SAS-style MOVE
oscillation guard freezes variables that enter + exit more than
max_move times. Returns an object of class
c("hzr_stepwise", "hazard") with a $steps
selection trace, scope record, and elapsed timer. Implements the core
algorithm from C HAZARD stepw.c /
backw.c.weights
threading (dup-arg collision in the multiphase / Weibull closures,
positional-arg corruption in every distribution’s gradient call) made
every optimizer iteration error silently inside tryCatch.
Diagnosed and fixed via commit 73b4657.weights — both
.hzr_gradient_weibull() and the grad_internal
closure inside .hzr_optim_weibull() accepted
weights as a formal but did not apply it to the score
vector. The optimizer still converged via line search on the (weighted)
log-likelihood, but the gradient direction was wrong and the final
gradient norm did not go to zero. Both gradient paths now weight the
event indicator and cumulative hazard building blocks. Fits with integer
weights reproduce the equivalent row-duplicated fit to optimizer
tolerance.weights is now only accepted for
dist = "weibull" and dist = "multiphase". The
0.9.4 NEWS claimed weights were threaded through all
distribution-specific likelihoods; in fact the exponential,
log-logistic, and log-normal single-distribution paths accepted the
formal but never applied it, so the fit was silently unweighted.
hazard() now raises an explicit error when
weights is supplied with one of those distributions rather
than returning an unweighted fit. Full support for the remaining
single-dist paths is tracked in
inst/dev/DEVELOPMENT-PLAN.md Phase 4e..hzr_conserve_events() receives the
weighted event count as its target but sums per-phase cumulative hazards
across rows without applying weights, so Turner’s adjustment
comes out on a mismatched scale. The multiphase optimizer now detects
non-unit weights and skips the CoE dimension reduction, falling through
to the (correctly weighted) full-dimensional path. Fits are still
correct; they just don’t benefit from the one-parameter analytical
closed-form solve. Weighted CoE wire-up is tracked alongside the other
weights completion work in inst/dev/DEVELOPMENT-PLAN.md
Phase 4e.Surv(start, stop, event) with
start > 0 is no longer accepted by
hazard(). The 0.9.4 NEWS claimed each epoch contributed
H(stop) - H(start) to the likelihood, but downstream
likelihoods only read time_lower for interval-censored rows
(status == 2); counting-process rows (status
in {0, 1}) were silently scored with H(stop)
alone, so any fit with nonzero entry times was silently wrong.
hazard() now raises an explicit error. The trivial case
Surv(0, t, d) – equivalent to Surv(t, d) –
continues to work. Full wire-up of H(stop) - H(start) for
all distribution paths is tracked in
inst/dev/DEVELOPMENT-PLAN.md Phase 4f.weights argument
in hazard() applies Fisher weighting to the log-likelihood
for dist = "weibull" and dist = "multiphase".
Each observation’s contribution is multiplied by its weight, enabling
severity-weighted event analyses. Implements the SAS WEIGHT
statement. The original 0.9.4 entry claimed coverage of all
distribution paths; the 0.9.5 patch corrected the claim and fixed a
gradient wire-up bug in the Weibull path.Surv(start, stop, event) start-stop notation is parsed.
The original 0.9.4 entry claimed each epoch contributed
H(stop) - H(start) to the likelihood, but the downstream
likelihoods never applied the lower bound for counting-process rows; the
0.9.5 patch narrowed the feature to the trivial start = 0
case and added an explicit error for nonzero starts.hzr_deciles() — Decile-of-risk calibration function
comparing observed vs. expected event counts across risk groups with
chi-square GOF testing. Implements the SAS
deciles.hazard.sas macro workflow.hzr_gof() — Goodness-of-fit function comparing
parametric predictions against nonparametric (Kaplan-Meier) estimates
with observed vs. expected event counting. Implements the SAS
hazplot.sas macro workflow.hzr_kaplan() — Kaplan-Meier survival estimator with
logit-transformed confidence limits that respect the [0, 1] boundary,
interval hazard rate, density, and restricted mean survival time (life
integral). Implements the SAS kaplan.sas macro output
structure.hzr_calibrate() — Variable calibration function for
assessing functional form before model entry. Groups a continuous
covariate into quantile bins and applies logit, Gompertz, or Cox link
transforms. Supports stratification via the by parameter.
Implements the SAS logit.sas and logitgr.sas
macros.hzr_nelson() — Wayne Nelson cumulative hazard estimator
with lognormal confidence limits. Supports weighted events for
severity-adjusted repeated event analyses. Implements the SAS
nelsonl.sas macro.hzr_bootstrap() — Bootstrap resampling for hazard model
coefficients with bagging support (fractional sampling). Returns
per-replicate estimates and summary statistics (mean, SD, percentile
CI). Implements the SAS bootstrap.hazard.sas macro
workflow.hzr_competing_risks() — Competing risks cumulative
incidence using the Aalen-Johansen estimator with Greenwood variance.
Handles any number of competing event types. Implements the SAS
markov.sas macro.control = list(conserve = FALSE). Implements the core
algorithm from C HAZARD setcoe.c /
consrv.c.set.seed(NULL) that was actively
breaking determinism).vignette: key
across all 8 files.fit parameter documentation corrected to state default
is FALSE.hzr_phase("g3", ...)) now
fully integrated into the multiphase optimizer, Hessian, and prediction
pipeline.fixed = "shapes" parameter in hzr_phase()
allows fixing shape parameters during estimation (matching C/SAS HAZARD
workflow of estimating only log-mu scale parameters).summary.hazard() now correctly reports standard errors
when some parameters are fixed. Previously, anyNA(vcov)
rejected the entire variance-covariance matrix when fixed parameters had
NA entries.print.summary.hazard() coefficient table now shows the
correct label for G3 phases (was printing empty parentheses).print.summary.hazard() phase listing now uses the phase
name in CDF labels (e.g., “cdf (late risk)”) instead of hardcoded “early
risk”..) in CSV datasets are now
handled via na.strings = c("NA", ".") in
data-raw/make_data.R, preventing numeric columns from being
read as character.roxygen2::load_pkgload for
lazy data compatibility..lintr configuration.peaceiris/actions-gh-pages@v4.use-public-rspm: true to all CI workflows.lintr to Suggests.dist = "multiphase" with hzr_phase()
specification.hzr_decompos() parametric family implementing the
three-parameter temporal decomposition of Blackstone, Naftel, and Turner
(1986).avc,
cabgkul, omc, tga,
valves.hazard() API with predict(),
summary(), coef(), vcov() S3
methods.hzr_log1pexp,
hzr_log1mexp, hzr_clamp_prob).