Compute quantitative predictive quality metrics that summarize agreement
between observed and simulated data in a VPC. QPC statistics numerically encode
features typically assessed visually (coverage, deviation, trend, and sharpness),
and include a composite qpc_score (lower is better) suitable for automated
model comparison and optimization.
Usage
qpcstats(
  o,
  alpha = 0.05,
  w = c(med_cov = 0.35, tail_cov = 0.2, mae = 0.15, drift = 0.1, sharp = 0.1,
    interval = 0.1),
  sharp_ref = NULL,
  interval_ref = NULL,
  ...
)
Arguments
- o
  A tidyvpcobj produced by vpcstats (typically using binless VPC summaries).
- alpha
  Numeric. Miscoverage level for interval scoring (the default 0.05 corresponds to a 95% prediction interval).
- w
  Named numeric vector of weights used to combine component penalties into qpc_score. Names must include: med_cov, tail_cov, mae, drift, sharp, interval. Lower qpc_score is better.
- sharp_ref
  Numeric or NULL. Reference value used to scale the sharpness (interval width) penalty. If NULL (default), a self-normalizing bounded transform is used so that a single VPC can be scored without external calibration. Set sharp_ref when you need scores to be comparable across a population of models (e.g., Darwin searches, benchmarking across multiple datasets/runs).
- interval_ref
  Numeric or NULL. Reference value used to scale the interval-score penalty. If NULL (default), a self-normalizing bounded transform is used. Set interval_ref for cross-model or cross-run comparability (e.g., Darwin optimization or evaluation studies).
- ...
  Additional arguments (reserved for future extensions).
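As a minimal sketch of how the alpha and w arguments might be prepared before calling qpcstats, the snippet below builds a custom weight vector and checks it against the required names listed above. The specific weights are hypothetical, and rescaling them to sum to 1 is an assumption made here to mirror the defaults (which sum to 1); the documentation does not state that this is required.

```r
# Required weight names, as listed in the description of `w`.
required <- c("med_cov", "tail_cov", "mae", "drift", "sharp", "interval")

# Hypothetical re-weighting that emphasizes tail coverage over median coverage.
w_custom <- c(med_cov = 0.25, tail_cov = 0.30, mae = 0.15,
              drift = 0.10, sharp = 0.10, interval = 0.10)

# Fail early if a required name is missing or a weight is negative.
stopifnot(all(required %in% names(w_custom)), all(w_custom >= 0))

# Assumption: rescale so the weights sum to 1, mirroring the defaults.
w_custom <- w_custom / sum(w_custom)

# alpha is the miscoverage level, so the nominal interval level is 1 - alpha.
alpha <- 0.10
level <- 1 - alpha  # nominal 90% prediction interval

# With a tidyvpcobj `vpc` returned by vpcstats(), the call would then be:
# vpc <- qpcstats(vpc, alpha = alpha, w = w_custom)
```

Validating the names up front keeps a typo in the weight vector from surfacing later inside a long model search.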
Value
Returns the input tidyvpcobj with an additional qpc.stats
data.table containing the QPC summary metrics and qpc_score.
Details
When should I set sharp_ref / interval_ref?
- Single-model / single-VPC scoring (default): leave both as NULL. This produces stable, bounded penalties without requiring any prior run for calibration.
- Population scoring / optimization: provide references when comparing many models (e.g., Darwin search) where you want the sharpness and interval-score penalties to be on a consistent scale across models/runs. A common choice is to compute sharp_ref and interval_ref from a representative run (e.g., median or 75th percentile values across all evaluated models).
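The choice of reference values described above could be sketched as follows. The two input vectors are hypothetical placeholders for raw sharpness and interval-score values gathered from previously evaluated models; how those raw values are extracted from qpc.stats is not specified here, so the numbers are illustrative only.

```r
# Hypothetical raw values collected from previously evaluated models.
# These numbers are illustrative only, not output from a real run.
raw_sharpness <- c(0.10, 0.12, 0.15, 0.18, 0.22)
raw_interval  <- c(1.8, 2.1, 2.5, 3.0, 4.2)

# Use the median as the sharpness reference and the 75th percentile
# as the interval-score reference (either summary is a possible choice).
sharp_ref    <- median(raw_sharpness)
interval_ref <- unname(quantile(raw_interval, probs = 0.75))

# With a tidyvpcobj `vpc` returned by vpcstats(), the call would then be:
# vpc <- qpcstats(vpc, sharp_ref = sharp_ref, interval_ref = interval_ref)
```

Whichever summary is chosen, the same pair of reference values should be reused for every model in the comparison so that the penalties stay on one scale.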
Examples
if (FALSE) { # \dontrun{
  library(tidyvpc)
  library(magrittr)  # provides the %>% pipe

  # Build a binless, prediction-corrected VPC and compute its summaries
  vpc <- observed(obs_data, y = DV, x = TIME) %>%
    simulated(sim_data, y = DV) %>%
    binless(optimize = TRUE) %>%
    predcorrect(pred = PRED) %>%
    vpcstats()

  # Default: single-model scoring (no calibration required)
  vpc <- qpcstats(vpc)

  # Population scoring (e.g., Darwin run): anchor penalties for comparability
  vpc <- qpcstats(
    vpc,
    sharp_ref = 0.15,
    interval_ref = 2.5
  )

  # Inspect the QPC summary metrics and qpc_score
  vpc$qpc.stats
} # }