Improving estimates of population status and trend with superensemble models

Published in Fish and Fisheries, 2017

Recommended citation: Anderson, SC, Cooper, AB, Jensen, OP, Minto, C, Thorson, JT, Walsh, JC, Afflerbach, J, Dickey-Collas, M, Kleisner, KM, Longo, C, Osio, GC, Ovando, D, Mosqueira, I, Rosenberg, AA, Selig, ER. (2017). "Improving estimates of population status and trend with superensemble models." Fish and Fisheries. 2017; 1-10


Fishery managers must often reconcile conflicting estimates of population status and trend. Superensemble models, commonly used in climate and weather forecasting, may provide an effective solution. This approach uses predictions from multiple models as covariates in an additional “superensemble” model fitted to known data. We evaluated the potential for ensemble averages and superensemble models (ensemble methods) to improve estimates of population status and trend for fisheries. We fit four widely applicable data-limited models that estimate stock biomass relative to equilibrium biomass at maximum sustainable yield (B/BMSY). We combined these estimates of recent fishery status and trends in B/BMSY with four ensemble methods: an ensemble average and three superensembles (a linear model, a random forest and a boosted regression tree). We trained our superensembles on 5,760 simulated stocks and tested them with cross-validation and against a global database of 249 stock assessments. Ensemble methods substantially improved estimates of population status and trend. Random forest and boosted regression trees performed the best at estimating population status: inaccuracy (median absolute proportional error) decreased from 0.42–0.56 to 0.32–0.33, rank-order correlation between predicted and true status improved from 0.02–0.32 to 0.44–0.48, and bias (median proportional error) declined from between −0.22 and 0.31 to between −0.12 and 0.03. We found similar improvements when predicting trend and when applying the simulation-trained superensembles to catch data for global fish stocks. Superensembles can optimally leverage multiple model predictions; however, they must be tested, formed from a diverse set of accurate models and built on a data set representative of the populations to which they are applied.
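
To make the superensemble idea concrete, the following is a minimal sketch in Python and not the authors' code or data. It assumes synthetic "true" B/BMSY values and four hypothetical individual models with invented biases and noise levels, fits a linear superensemble on the log scale using the individual predictions as covariates, and compares it with a plain ensemble average using the three performance measures named above. All variable names, sample sizes and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "true" status (B/BMSY) for hypothetical simulated stocks.
n_stocks = 2000
true_status = rng.lognormal(mean=0.0, sigma=0.5, size=n_stocks)

# Four hypothetical individual models, each with an invented
# multiplicative bias and noise level (assumptions for illustration only).
bias = np.array([0.7, 1.3, 1.0, 1.5])
noise_sd = np.array([0.6, 0.5, 0.8, 0.4])
noise = rng.lognormal(mean=0.0, sigma=noise_sd, size=(n_stocks, 4))
model_preds = true_status[:, None] * bias * noise

# Hold out half of the stocks so the superensemble is tested out of sample.
train = np.arange(n_stocks) < n_stocks // 2
test = ~train

def design(preds):
    """Design matrix: intercept plus log of each individual model's prediction."""
    return np.column_stack([np.ones(len(preds)), np.log(preds)])

# Linear superensemble: regress log(true status) on the log predictions.
coef, *_ = np.linalg.lstsq(
    design(model_preds[train]), np.log(true_status[train]), rcond=None
)
super_pred = np.exp(design(model_preds[test]) @ coef)

# Ensemble average: unweighted mean of the individual predictions.
avg_pred = model_preds[test].mean(axis=1)

def inaccuracy(pred, true):
    """Median absolute proportional error."""
    return np.median(np.abs((pred - true) / true))

def proportional_bias(pred, true):
    """Median proportional error."""
    return np.median((pred - true) / true)

def rank_correlation(pred, true):
    """Spearman rank-order correlation (values are continuous, so no ties)."""
    rank = lambda x: np.argsort(np.argsort(x))
    return np.corrcoef(rank(pred), rank(true))[0, 1]

truth = true_status[test]
mape_avg, mape_super = inaccuracy(avg_pred, truth), inaccuracy(super_pred, truth)
bias_avg, bias_super = proportional_bias(avg_pred, truth), proportional_bias(super_pred, truth)
rho_avg, rho_super = rank_correlation(avg_pred, truth), rank_correlation(super_pred, truth)

for name, m, b, r in [("ensemble average", mape_avg, bias_avg, rho_avg),
                      ("linear superensemble", mape_super, bias_super, rho_super)]:
    print(f"{name}: inaccuracy={m:.2f}, bias={b:.2f}, rank correlation={r:.2f}")
```

In this toy setting the superensemble can learn and correct each model's systematic bias because it is fitted to known status, whereas the ensemble average simply inherits the mean bias of its members. A random forest or boosted regression tree could be substituted for the linear fit; either way, the caveat from the abstract applies: the training data must be representative of the populations to which the superensemble is applied.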
