Optimal model selection for Maxent: a case of freshwater species
distribution modelling in Bhutan, a data poor country
Abstract
Maxent is a commonly used species distribution modelling (SDM) program
owing to its better performance than other SDM programs. However, model
complexity and the selection of optimal models are two important concerns
for Maxent users. To help advance the field, we built 44 sets of models by
combining 11 regularization multipliers and four feature classes for 10
fish and 28 odonate species of Bhutan with few occurrence records. We
then selected optimal models using four sequential model selection
approaches: two ORTEST approaches, which applied the threshold-dependent
test omission rate (OR) followed by the area under the receiver operating
characteristic curve for test data (AUCTEST), and two AUCDIFF approaches,
which applied OR, followed by the difference between training AUC and
AUCTEST (AUCDIFF), and then AUCTEST. We then
screened for an ecologically plausible binary (suitable/unsuitable) model
for each species among the optimal models selected by the sequential
approaches, or among the remaining models, using expert knowledge (EXP
approach). We then compared different model features and the predicted
binary habitat of the optimal models selected by the five approaches.
Models selected by the ORTEST approaches matched those selected by the
EXP approach more closely, even though the ORTEST approaches selected
more complex models than the AUCDIFF approaches. Further, relative to the
models chosen by the EXP approach, models selected through the AUCDIFF
approaches overpredicted the habitat more often than models selected
through the ORTEST approaches. We recommend the use of
ORTEST approaches for model selection, either as the first line of model
screening or on their own, when less restrictive thresholds are used to
produce binary habitat maps, as we did here.
First, this would reduce the time required for experts to screen multiple
models for ecological plausibility when many species are studied.
Second, when used alone, ORTEST approaches can avoid selecting models
that either underpredict or overpredict the suitable habitat.
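
The sequential selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The regularization multiplier values, the feature class labels, the dictionary keys, and the toy metric values are all hypothetical; the abstract gives only the counts (11 multipliers, four feature classes). The sketch also assumes that a lower OR, a lower AUCDIFF, and a higher AUCTEST are preferred, and that each later criterion is used only to break exact ties on the earlier ones.

```python
from itertools import product

# Hypothetical settings: the source states only that 11 regularization
# multipliers and four feature classes were combined (11 x 4 = 44).
REG_MULTIPLIERS = [0.5 * i for i in range(1, 12)]   # assumed values 0.5 .. 5.5
FEATURE_CLASSES = ["L", "LQ", "H", "LQH"]            # assumed labels


def candidate_settings():
    """Return every (multiplier, feature class) combination."""
    return list(product(REG_MULTIPLIERS, FEATURE_CLASSES))


def select_or_test(models):
    """OR, then AUC_TEST: lowest test omission rate, ties broken by highest test AUC."""
    return min(models, key=lambda m: (m["or_test"], -m["auc_test"]))


def select_auc_diff(models):
    """OR, then AUC_DIFF, then AUC_TEST.

    AUC_DIFF is training AUC minus test AUC; smaller is assumed better.
    """
    return min(
        models,
        key=lambda m: (
            m["or_test"],
            m["auc_train"] - m["auc_test"],
            -m["auc_test"],
        ),
    )


# Toy evaluation results for one species (invented numbers, illustration only).
models = [
    {"rm": 1.0, "fc": "L", "or_test": 0.0, "auc_train": 0.95, "auc_test": 0.90},
    {"rm": 2.0, "fc": "LQ", "or_test": 0.0, "auc_train": 0.90, "auc_test": 0.88},
    {"rm": 3.0, "fc": "H", "or_test": 0.1, "auc_train": 0.99, "auc_test": 0.97},
]

best_or = select_or_test(models)
best_diff = select_auc_diff(models)
print(best_or["rm"], best_or["fc"])
print(best_diff["rm"], best_diff["fc"])
```

On this toy input the two rules pick different models even though both start from OR: the first keeps the tied model with the higher test AUC, while the second keeps the one with the smaller gap between training and test AUC.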