A formula the field doesn't have
An observation is a triple: rider r, condition (metric value x at sub-spot s), and binary outcome y (good session or not). Classic Item Response Theory (IRT) models discrete test items; we adapt it to a continuous physical condition space, where wind speed is an axis, not a question.
The two-parameter latent factorisation, which generalises today's single suitability curve, is:
P(y=1 | r, x, s) = σ(a · (θ_r − δ(x, s)))
θ_r, latent skill: one scalar per pseudonym, with location and scale identified by an N(0,1) population distribution.
δ(x, s), difficulty function: the latent difficulty of the condition, anchored to the physics expert curve as its prior. The novel object.
a, discrimination: a scalar for how sharply the condition separates skill levels.
Property of the formula: integrating θ out under the population skill distribution recovers exactly today's single suitability curve. The new model is strictly more general than the calibrators that came before — it adds a dimension, it does not fork the engine.
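The marginalisation property can be checked numerically. The sketch below is illustrative only: it reads σ as the logistic function, and `toy_delta` is a hypothetical difficulty curve invented for the example, not the physics-anchored δ. It integrates θ out under N(0,1) with Gauss-Hermite quadrature to obtain a single population-level curve.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def p_good(theta, delta, a):
    """P(y=1 | r, x, s) = sigmoid(a * (theta_r - delta(x, s)))."""
    return sigmoid(a * (theta - delta))


def toy_delta(x):
    """Hypothetical difficulty curve: easiest at 18 knots, harder either side."""
    return ((x - 18.0) / 6.0) ** 2 - 1.0


def marginal_curve(x, a, n_nodes=41):
    """Integrate theta out under N(0, 1) using Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(n_nodes)      # weight function exp(-t^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)  # rescale to the N(0, 1) density
    delta = toy_delta(np.asarray(x, dtype=float))
    # probs[k, j]: skill node k against condition j
    probs = p_good(nodes[:, None], delta[None, :], a)
    return weights @ probs


wind = np.linspace(5.0, 35.0, 7)       # 5, 10, ..., 35 knots
curve = marginal_curve(wind, a=2.0)    # one population-level curve
print(np.round(curve, 3))
```

Where δ equals the population mean skill (0 under the N(0,1) convention), the marginal probability is 0.5 by symmetry, which is a convenient sanity check on any implementation.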
The difficulty atlas — visualised
Existing observation networks — CMEMS marine, EUMETNET atmospheric, EUCLID lightning, EEA air quality, national avalanche services — cover the meteorological side. What none of them hold is what δ measures: the intrinsic difficulty of a condition, separated from the population skill mix that otherwise muddles it.
Two spots with identical suitability curves can have radically different intrinsic difficulty, masked by different clienteles. Until now, you could not tell them apart. The atlas is the first time outdoor activity suitability is treated as a formal, measurable construct.
Identified, reproducible, deterministic, private
θ_r and δ(x, s) separate only when the rider × condition incidence graph is connected. Below the identification threshold, the model auto-disables and the single-curve fallback takes over, so nothing is over-claimed.
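A connectivity check of this kind can be sketched with a union-find over the bipartite graph. This is a minimal illustration, not the production rule: it assumes continuous conditions have already been discretised into bins, and the bin labels and the function name `is_connected` are hypothetical.

```python
def is_connected(pairs):
    """Return True if the bipartite rider x condition-bin graph is one component.

    `pairs` is an iterable of (rider_id, condition_bin) observations.
    An empty cohort is treated as not connected.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for rider, cond in pairs:
        # Tag nodes so a rider id can never collide with a bin label.
        root_r = find(("rider", rider))
        root_c = find(("cond", cond))
        if root_r != root_c:
            parent[root_r] = root_c

    roots = {find(node) for node in list(parent)}
    return len(roots) == 1


# r1 and r2 share the 15-20 kn bin, so skill and difficulty can be compared.
linked = [("r1", "10-15kn"), ("r1", "15-20kn"), ("r2", "15-20kn")]
# No shared bin: the two riders' scales cannot be tied together.
split = [("r1", "10-15kn"), ("r2", "20-25kn")]

print(is_connected(linked), is_connected(split))
```

In the disconnected case each component has its own arbitrary offset between θ and δ, which is why a fallback to the single curve is the safe behaviour.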
Every fit is pinned to a SHA-256 cohort hash. Append-only history: a cited posterior can be re-derived from the cohort the hash references — without our cooperation.
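One way to derive such a hash is to canonicalise the cohort before digesting it, so the result does not depend on record order. The sketch below is an assumption about how this could be done; the record fields (`pseudonym`, `sub_spot`, `wind_knots`, `y`) and the function name `cohort_hash` are hypothetical.

```python
import hashlib
import json


def cohort_hash(records):
    """Return a SHA-256 hex digest that is stable under record reordering."""
    # Canonical form: sort keys inside each record, then sort the serialised
    # records, so insertion order never changes the digest.
    canonical = sorted(
        json.dumps(rec, sort_keys=True, separators=(",", ":")) for rec in records
    )
    payload = "\n".join(canonical).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


records = [
    {"pseudonym": "p-001", "sub_spot": "north", "wind_knots": 17.5, "y": 1},
    {"pseudonym": "p-002", "sub_spot": "north", "wind_knots": 24.0, "y": 0},
]

digest = cohort_hash(records)
print(digest)
```

Anyone holding the same records can recompute the digest and confirm that a cited fit refers to exactly that cohort.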
Marginal Maximum Likelihood with Gauss-Hermite quadrature — same seed, same result. Full Bayes is documented as the future upgrade for richer uncertainty bands.
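The objective behind Marginal Maximum Likelihood can be sketched as follows. This is an illustrative implementation under stated assumptions: σ is taken to be the logistic function, difficulty values are passed in directly rather than produced by a fitted δ, and the function name and argument layout are hypothetical. The quadrature nodes are fixed, so the objective itself involves no randomness.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def marginal_log_likelihood(y, delta, rider_idx, a, n_nodes=21):
    """Sum over riders of the log of the theta-integrated likelihood.

    y, delta, rider_idx are 1-D arrays with one entry per observation.
    theta is integrated out under N(0, 1) by Gauss-Hermite quadrature.
    """
    nodes, weights = hermegauss(n_nodes)
    log_w = np.log(weights / np.sqrt(2.0 * np.pi))
    # logits[k, i]: quadrature node k against observation i
    logits = a * (nodes[:, None] - delta[None, :])
    # log sigmoid(z) = -logaddexp(0, -z); log(1 - sigmoid(z)) = -logaddexp(0, z)
    log_p = -np.where(
        y[None, :] == 1, np.logaddexp(0.0, -logits), np.logaddexp(0.0, logits)
    )
    total = 0.0
    for r in np.unique(rider_idx):
        # Product over one rider's observations, then a weighted sum over nodes.
        per_node = log_p[:, rider_idx == r].sum(axis=1) + log_w
        total += np.logaddexp.reduce(per_node)
    return float(total)


y = np.array([1, 0, 1, 1])
delta = np.array([-0.5, 1.2, 0.0, 0.3])
rider_idx = np.array([0, 0, 1, 1])
mll = marginal_log_likelihood(y, delta, rider_idx, a=2.0)
print(mll)
```

A fit would maximise this quantity over a and the parameters of δ; any seed then only governs steps such as initialisation or data splitting.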
Per-pseudonym posteriors are Tier-1 data → hard delete on opt-out, across every cohort hash. The difficulty atlas is anonymous (a property of the spot) — survives all deletion requests.
Five operational consequences
Funding posture
Horizon Europe, Interreg and CMEMS methods calls become applicable in a way they weren't before. The synthetic-recovery proof makes the methodology defensible with zero proprietary data — submittable to arXiv this quarter.
Academic partnerships
The outreach register changes tone: instead of "we have a B2B API, want to use the data?", we approach with "we have a new methodology, continuous-item IRT adapted to weather, and a publishable proof. Want to co-author?"
Investment narrative
Goable moves out of "deeptech ML for outdoor sports" (crowded) and into "an engine that synthesises peer-reviewed physics AND produces measurements physics doesn't" (uncopyable). The moat shifts from feature count to what only Goable can publish.
A new product surface
Skill-conditioned scoring + the difficulty endpoint ship as Pro+ features. Surfaces every operator can sell against — and a rating no competitor offers, anchored to a quantity no one else even measures.
Underwriting credibility
Parametric leisure-weather underwriting needs to price intrinsic spot risk, not the muddled suitability of clientele. The difficulty atlas is the credibility primitive for the next horizon.
Synthetic recovery — defensible with zero proprietary data
The methodology stands on its own through a synthetic recovery proof: we generate a cohort with KNOWN θ (per rider) and KNOWN δ(x), fit the model, and confirm it recovers ground truth within tolerance. Single command, runs in under a minute on a laptop, no real data touched — the artifact academic reviewers and prospects inspect.
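The data-generating half of such a proof can be sketched as below. It is an illustration, not the demo command's actual code: the cohort size and a_true follow the reference cohort, while the wind range, the quadratic difficulty curve centred on 18 knots, and the function name `make_cohort` are assumptions made for the example. The fitting step is omitted.

```python
import numpy as np


def make_cohort(n_riders=80, n_per_rider=30, a_true=2.0, seed=0):
    """Simulate outcomes from known theta and a known difficulty curve."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n_riders)                    # known skill per rider
    rider_idx = np.repeat(np.arange(n_riders), n_per_rider)
    wind = rng.uniform(5.0, 35.0, size=rider_idx.size)       # hypothetical range
    delta = ((wind - 18.0) / 6.0) ** 2 - 1.0                 # hypothetical true delta
    p = 1.0 / (1.0 + np.exp(-a_true * (theta[rider_idx] - delta)))
    y = (rng.random(rider_idx.size) < p).astype(int)         # Bernoulli draws
    return theta, rider_idx, wind, y


theta, rider_idx, wind, y = make_cohort()
print(y.size, y.mean())
```

Because the true θ and δ are stored alongside the outcomes, any fitted model can be scored directly against ground truth.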
On the reference cohort (80 riders × 30 paired outcomes, a_true=2.0) the fit recovers θ at correlation 0.96 against truth, locates the difficulty minimum within ±3 knots of the true 18-knot optimum, and achieves a held-out Brier Skill Score of +0.33 against the single-curve baseline, so skill-conditioning earns its complexity.
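For readers unfamiliar with the metric, the Brier Skill Score compares the mean squared error of predicted probabilities with that of a reference forecast. The sketch below uses the standard definition; the toy numbers are invented for illustration and are unrelated to the reference cohort.

```python
import numpy as np


def brier_score(p, y):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.mean((p - y) ** 2))


def brier_skill_score(p_model, p_baseline, y):
    """BSS = 1 - BS_model / BS_baseline; positive means the model beats the baseline."""
    return 1.0 - brier_score(p_model, y) / brier_score(p_baseline, y)


y_toy = [1, 0, 1, 0]
baseline = [0.5, 0.5, 0.5, 0.5]   # uninformative reference forecast
model = [0.8, 0.2, 0.7, 0.3]      # sharper, correctly ordered forecast
bss = brier_skill_score(model, baseline, y_toy)
print(round(bss, 2))  # 0.74
```

A score of 0 means no improvement over the baseline, 1 is a perfect forecast, and negative values mean the model is worse than the baseline.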
$ uv run goable-calibration latent-demo --out /tmp/demo
=== Inverse Suitability — synthetic two-factor cohort ===
{
"theta_recovery_corr": 0.96,
"difficulty_min_knots": 16.7,
"discrimination_a": 2.16,
"marginal_matches": true,
"held_out_bss": 0.33,
"identification_ok": true,
"n_train": 2400
}
The arXiv direction
"Inverse Suitability: Identifying Condition Difficulty and Rider Skill from Behavioural Outcomes via Continuous-Item Response Theory."
Target communities: psychophysics / IRT methodologists (education measurement transferred to a new domain), environmental decision science, and the marine + alpine meteorology groups already in Goable's research outreach register.