How confidence is computed

Every score carries both a scalar confidence (0–1) and a rich confidenceDetail block. The scalar is a deterministic fold of the detail — read the detail when you want to show users why a verdict is trustworthy.

The scalar

confidence is a number between 0 and 1. Closer to 1 means the engine has high agreement with itself: tight ensemble spread, a mature profile, a calibrated sub-spot, low drift. Below 0.5 means the score is workable but the underlying signal is sparse or uncertain; treat it as a directional hint, not a verdict you'd bet a session on.
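As a minimal sketch of how a client might act on the scalar, the helper below separates scores that can be shown as a verdict from those that should be shown as a directional hint. The function name and the two labels are hypothetical, and treating exactly 0.5 as a verdict is an assumption; the text above only says that values below 0.5 are hints.

```typescript
// Hypothetical helper: the name and labels are illustrative, not part of the API.
type ConfidenceReading = "verdict" | "directional-hint";

function readConfidence(confidence: number): ConfidenceReading {
  // The scalar is documented as a number between 0 and 1; reject anything else.
  if (isNaN(confidence) || confidence < 0 || confidence > 1) {
    throw new RangeError(`confidence must be within 0..1, got ${confidence}`);
  }
  // Below 0.5 the underlying signal is sparse or uncertain.
  // Assumption: exactly 0.5 is treated as a verdict.
  return confidence < 0.5 ? "directional-hint" : "verdict";
}
```

A UI could use the result to choose between a firm label and a softer "likely" phrasing.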

The detail block

confidenceDetail is a discriminated union by mode: forecast, historical or climate. The fields you get depend on which scoring surface produced the response.

{
 "mode": "forecast",
 "horizonH": 36, // forecast lead time in hours
 "ensembleSpread": 0.18,
 "profileMaturity": "calibrated",
 "hierarchical_calibration": 0.95,
 "drift_flag": "none"
}
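On the client side, the union can be modelled roughly as follows. This is a sketch under stated assumptions: the forecast fields mirror the example payload above, while the historical and climate variants are typed only by their mode because their fields are not listed here. The type and function names are hypothetical.

```typescript
type DriftFlag = "none" | "watch" | "warning" | "critical";
type ProfileMaturity = "provisional" | "reviewed" | "calibrated";

// Fields copied from the example payload above.
interface ForecastDetail {
  mode: "forecast";
  horizonH: number; // forecast lead time in hours
  ensembleSpread: number;
  profileMaturity: ProfileMaturity;
  hierarchical_calibration: number;
  drift_flag: DriftFlag;
}

// Assumption: only the discriminant is typed; other fields stay opaque.
interface HistoricalDetail { mode: "historical"; [field: string]: unknown; }
interface ClimateDetail { mode: "climate"; [field: string]: unknown; }

type ConfidenceDetail = ForecastDetail | HistoricalDetail | ClimateDetail;

// Switching on `mode` narrows the type, so forecast-only fields are
// accessible only inside the "forecast" branch.
function describeDetail(detail: ConfidenceDetail): string {
  switch (detail.mode) {
    case "forecast":
      return `forecast, ${detail.horizonH} h lead, spread ${detail.ensembleSpread}`;
    case "historical":
      return "historical";
    case "climate":
      return "climate";
  }
}
```

Narrowing on mode keeps rendering code from reading forecast-only fields on a historical or climate response.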

Forecast mode

Returned by /v1/score, /v1/score/series, /v1/score/multi. Factors:

  • Horizon — confidence decays with forecast lead time. A score for "in 6 hours" is more confident than "in 7 days".
  • Ensemble spread — when ensemble scoring is on, a narrow distribution means high confidence; a wide distribution signals a chaotic weather state and low confidence.
  • Profile maturity — provisional (0.60), reviewed (0.80), calibrated (0.95). A fresh profile with no operator outcomes scores lower confidence than one calibrated against hundreds of paired outcomes.
  • Hierarchical calibration — how locally the spot resolved. A sub-spot with n ≥ 100 outcomes scores 1.0; falling back to the cluster or regional level scores 0.85–0.95.
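The factors above can be turned into user-facing notes along these lines. This is a sketch rather than the engine's own logic: the maturity scores come from the list above, but the function name, the wording of the notes, and the rule that any hierarchical_calibration below 1.0 means a coarser level was used are assumptions.

```typescript
// Scores per maturity level, as listed above.
const MATURITY_SCORE = { provisional: 0.6, reviewed: 0.8, calibrated: 0.95 } as const;
type Maturity = keyof typeof MATURITY_SCORE;

interface ForecastFactors {
  horizonH: number;
  ensembleSpread: number;
  profileMaturity: Maturity;
  hierarchical_calibration: number;
}

// Hypothetical helper: builds one explanatory note per factor.
function explainForecast(f: ForecastFactors): string[] {
  const notes: string[] = [];
  notes.push(`Lead time: ${f.horizonH} h (confidence decays as this grows)`);
  notes.push(`Ensemble spread: ${f.ensembleSpread} (narrower is more confident)`);
  notes.push(`Profile maturity: ${f.profileMaturity} (${MATURITY_SCORE[f.profileMaturity]})`);
  // Assumption: anything below 1.0 means a coarser level than sub-spot was used.
  notes.push(
    f.hierarchical_calibration >= 1
      ? "Calibration resolved at sub-spot level"
      : "Calibration resolved at a coarser level (cluster or regional)"
  );
  return notes;
}
```

The notes deliberately report horizon and spread as raw values, since no thresholds for them are given here.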

Historical mode

Returned by /v1/score/historical. Confidence reflects sample size and reanalysis coverage: older years (pre-1979 for marine, pre-1940 for atmospheric) carry lower confidence even when the score distribution looks tight.
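A client that wants to warn users before they request an early year could encode the two cutoffs as below. This is a sketch: the cutoff years come from the paragraph above, while the type, constant and function names are hypothetical and nothing here is returned by the API itself.

```typescript
type ReanalysisDomain = "marine" | "atmospheric";

// First year of full reanalysis coverage per domain, per the text above.
const COVERAGE_START: Record<ReanalysisDomain, number> = {
  marine: 1979,
  atmospheric: 1940,
};

// True when the requested year predates full coverage for that domain,
// i.e. when lower historical confidence should be expected.
function hasReducedCoverage(year: number, domain: ReanalysisDomain): boolean {
  return year < COVERAGE_START[domain];
}
```

For example, hasReducedCoverage(1975, "marine") is true, while the same year for "atmospheric" is false.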

Climate mode

Returned by /v1/projections. Per-decade entries carry their own confidence detail, folded from the CMIP6 ensemble spread and the posterior of the sub-spot bias correction. Long-horizon projections are intrinsically less confident than near-term forecasts; that is encoded explicitly here rather than hidden in the scalar.
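Because each decade carries its own confidence, a client can mark the decades where a projection becomes a directional hint. The sketch below assumes a hypothetical entry shape with decade and confidence fields; the actual response fields are not listed here. The 0.5 default reuses the threshold given for the scalar.

```typescript
// Hypothetical entry shape: field names are assumptions, not the documented response.
interface DecadeEntry {
  decade: string;     // e.g. "2040s"
  confidence: number; // scalar confidence for that decade, 0..1
}

// Returns the decades whose confidence falls below the threshold, in input order.
function lowConfidenceDecades(entries: DecadeEntry[], threshold = 0.5): string[] {
  const flagged: string[] = [];
  for (const entry of entries) {
    if (entry.confidence < threshold) flagged.push(entry.decade);
  }
  return flagged;
}
```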

drift_flag

When the drift monitor has flagged a forecast-skill regression for the cell your request lands in, confidenceDetail.drift_flag is set to "watch", "warning" or "critical"; otherwise it is "none", as in the example payload above. Show a UI hint and consider rolling back to a previous calibrated curve, since a raised flag means the engine has detected that its own skill is sliding.
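A UI hint for the flag might be derived as in the sketch below. The four flag values come from this page; the function name and the hint texts are hypothetical, and the assumption that the values escalate from "watch" to "critical" is inferred from their names.

```typescript
type DriftFlag = "none" | "watch" | "warning" | "critical";

// Hypothetical mapping from flag to a user-facing hint; null means show nothing.
function driftHint(flag: DriftFlag): string | null {
  switch (flag) {
    case "none":
      return null;
    case "watch":
      return "Forecast skill for this area is under observation.";
    case "warning":
      return "Forecast skill for this area has regressed; treat scores with care.";
    case "critical":
      return "Forecast skill for this area has regressed sharply; scores may be unreliable.";
  }
}
```

Returning null for "none" lets the caller skip rendering the hint without a separate check on the flag.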