No provider is best everywhere
Any single weather provider has a skill surface that peaks somewhere and tails off everywhere else. Stormglass crushes Mediterranean coastal marine but tails off inland; Open-Meteo excels at atmospheric variables in temperate latitudes but struggles with tropical convection; CMEMS is the EU marine ground truth but offers limited geographic coverage. The naive consensus (arithmetic mean of providers) inherits everyone's weakness equally — which means it's mediocre everywhere.
Four sources, mostly independent
Strong on Mediterranean + Atlantic coastal cells; weaker on inland sailing lakes and on wind direction at very short horizons (where it relies on a derived model).
Strong on land-based atmospheric variables (temp, wind 10m, precipitation, pressure). Free + always-on, but provider-derived UV and marine variables aren't its strength.
EU-funded marine ground truth. Strong on sea state + currents in EU coastal waters; lag of ~24h on real-time analyses; geographic gaps outside the European bounding box.
Strong on North-American + global atmospheric forcings. Particularly good at synoptic-scale convection (paragliding XC); weaker on European coastal microclimate.
Inverse-variance pooling, per cell
Each provider's per-cell error variance is estimated against ERA5 (ECMWF's global reanalysis, used here as the ground truth) over a rolling 90-day window. Within a cell, providers whose σ²ₚ sits below the cohort median get their relative weights boosted; providers with high variance get attenuated. The blend itself is the standard inverse-variance pool:
forecast_consensus(c) = Σₚ (wₚ(c) · forecast_p(c))
where wₚ(c) = (1/σ²ₚ(c)) / Σⱼ (1/σ²ⱼ(c))
σ²ₚ(c) ← rolling 90d MSE vs ERA5 for provider p, cell c
c = (activity-family, region, variable, horizon_h)
The output is the minimum-variance linear blend under the assumption that provider errors are zero-mean and approximately independent. When the assumption breaks (e.g. two providers share the same underlying NWP model), we deflate their joint weight. The full provider-skill estimate matrix is regenerated weekly and made available to commercial-tier consumers as a downloadable artefact.
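As a minimal sketch of the pool above for a single cell, assuming the per-provider σ² values are already estimated: the function names, the provider labels and the numbers are hypothetical, and restricting the pool to providers that actually returned a forecast is an assumption, not something the text specifies.

```python
from typing import Dict


def inverse_variance_weights(sigma2: Dict[str, float]) -> Dict[str, float]:
    """w_p = (1/sigma2_p) / sum_j (1/sigma2_j) for one cell."""
    if not sigma2:
        raise ValueError("need at least one provider")
    if any(v <= 0 for v in sigma2.values()):
        raise ValueError("error variances must be positive")
    precision = {p: 1.0 / v for p, v in sigma2.items()}
    total = sum(precision.values())
    return {p: prec / total for p, prec in precision.items()}


def consensus(forecasts: Dict[str, float], sigma2: Dict[str, float]) -> float:
    """Weighted blend over the providers that returned a forecast.

    Assumption: weights are renormalised over the available providers.
    """
    available = {p: sigma2[p] for p in forecasts}
    weights = inverse_variance_weights(available)
    return sum(weights[p] * forecasts[p] for p in forecasts)


# Hypothetical cell: two providers, made-up variances and forecasts.
sigma2 = {"provider_a": 0.25, "provider_b": 1.0}
forecast = {"provider_a": 1.8, "provider_b": 2.3}

weights = inverse_variance_weights(sigma2)  # provider_a ~0.8, provider_b ~0.2
blend = consensus(forecast, sigma2)         # ~1.9
```

With independent errors the pooled variance is 1 / Σ(1/σ²), here 1/5 = 0.2, which is below the 0.25 of the best single provider. That is the practical payoff over the arithmetic mean.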
Why four dimensions, not one
The skill estimate is keyed on a four-dimensional cell, not on provider × spot. Cell granularity matters: it's the difference between "Stormglass is good in Spain" and "Stormglass is good at wave_period in Western Mediterranean coastal cells at horizons up to 48h". The actuary's loadings need the second.
activity-family: Stormglass for marine, Open-Meteo for atmospheric, CMEMS for ocean reanalysis. Each provider has a structural strength region.
region: European coastal vs North American; tropical vs alpine; coastal vs offshore. The same provider's skill varies 2-3× across regions.
variable: Wind speed vs wind direction vs wave height vs precipitation. One provider may nail wind but miss on precipitation.
horizon_h: 0-24h vs 24-72h vs 72-168h. Different physics dominate at different horizons; a nowcasting-capable provider beats a synoptic one at h=0-6h.
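One way to picture the four-dimensional key is as a small immutable record used to index a weight table. This is a sketch only: the class name, field types, the half-open bucket boundaries and the example labels and weights are all assumptions. Only the four dimension names and the three horizon ranges come from the text.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Horizon ranges from the text: 0-24h, 24-72h, 72-168h.
# Assumption: buckets are half-open [start, end), so 24h lands in (24, 72).
HORIZON_BUCKETS_H = ((0, 24), (24, 72), (72, 168))


def horizon_bucket(lead_time_h: float) -> Tuple[int, int]:
    """Map a forecast lead time in hours to its horizon bucket."""
    for start, end in HORIZON_BUCKETS_H:
        if start <= lead_time_h < end:
            return (start, end)
    raise ValueError(f"lead time {lead_time_h}h is outside the bucketed range")


@dataclass(frozen=True)  # frozen makes instances hashable, so usable as dict keys
class Cell:
    activity_family: str
    region: str
    variable: str
    horizon_h: Tuple[int, int]


# Hypothetical cell and weights, for illustration only.
cell = Cell("marine", "western_mediterranean_coastal", "wave_period", horizon_bucket(36))
skill: Dict[Cell, Dict[str, float]] = {cell: {"provider_a": 0.6, "provider_b": 0.4}}

# Lookup by value: an equal Cell built elsewhere finds the same entry.
same = Cell("marine", "western_mediterranean_coastal", "wave_period", (24, 72))
provider_weights = skill[same]
```

Keying on the full tuple rather than on provider × spot means a weight never leaks from one variable or horizon to another.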
The weekly refit
Every Sunday at 03:00 UTC the refit job pulls the prior week's forecasts + ERA5 reanalyses, computes per-cell MSE, refits the weight matrix, and hot-reloads the consensus provider at the API edge (no restart). Cells with fewer than 30 paired forecast-reanalysis samples retain their previous weights rather than picking up noise. Every refit run is logged in provider_skill_estimates with a sample count + a SHA-256 of the input cohort so a future audit can replay any historical weighting decision.
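The per-cell part of that job might look roughly like the sketch below. It is not the production refit: the function names, the record layout, the variance floor, counting samples as the minimum across providers, and hashing a canonical JSON encoding of the cohort are all assumptions. The 30-sample threshold, the MSE-based weights and the SHA-256 audit hash come from the text.

```python
import hashlib
import json
from typing import Dict, List, Tuple

MIN_SAMPLES = 30        # threshold stated in the text
VARIANCE_FLOOR = 1e-12  # assumption: guards against division by zero

Pairs = List[Tuple[float, float]]  # (forecast, ERA5 reference) pairs


def cohort_sha256(pairs_by_provider: Dict[str, Pairs]) -> str:
    """SHA-256 over a canonical JSON encoding of the input cohort."""
    payload = json.dumps(pairs_by_provider, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def refit_cell(pairs_by_provider: Dict[str, Pairs],
               previous_weights: Dict[str, float]) -> dict:
    """Refit one cell's weights, or keep the old ones if the sample is thin."""
    # Assumption: the cell's sample count is the smallest provider sample.
    sample_count = min(len(pairs) for pairs in pairs_by_provider.values())
    record = {
        "sample_count": sample_count,
        "cohort_sha256": cohort_sha256(pairs_by_provider),
    }
    if sample_count < MIN_SAMPLES:
        record.update(weights=dict(previous_weights), refit=False)
        return record
    precision = {}
    for provider, pairs in pairs_by_provider.items():
        mse = sum((f - r) ** 2 for f, r in pairs) / len(pairs)
        precision[provider] = 1.0 / max(mse, VARIANCE_FLOOR)
    total = sum(precision.values())
    record.update(weights={p: v / total for p, v in precision.items()}, refit=True)
    return record


# Hypothetical cohorts: one too thin to refit, one large enough.
previous = {"provider_a": 0.5, "provider_b": 0.5}
thin = {"provider_a": [(1.0, 1.2)] * 10, "provider_b": [(1.0, 1.5)] * 10}
thick = {"provider_a": [(1.0, 1.5)] * 40, "provider_b": [(1.0, 2.0)] * 40}

kept = refit_cell(thin, previous)      # keeps the previous weights
refitted = refit_cell(thick, previous) # MSE 0.25 vs 1.0 gives ~0.8 / ~0.2
```

Because the hash covers the exact input pairs, replaying a historical decision reduces to re-running the refit on a cohort whose hash matches the logged one.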
Where you can read the weights
The per-cell skill weights are commercial-tier — they're the competitive moat we don't open-source. If you're a parametric MGA defending a premium loading or an academic working on weather-model ensembling, the partnerships desk routes you to a DPA-gated export.