Modeling methods
Three modeling methods were used: boosted regression trees (BRT), generalized linear models of the binomial family (GLM), and maximum entropy (Maxent) (Hijmans & Graham, 2006; Elith, Leathwick, & Hastie, 2008; Franklin, 2010). Models were fitted to field survey presence records and 200 randomly generated pseudoabsence (background) points (Franklin, 1995; Franklin, 2010; Elith, Kearney, & Phillips, 2010; Elith & Franklin, 2013; Phillips & Elith, 2013; Guillera-Arroita et al., 2015).
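As an illustration only, random generation of background points might be sketched as follows. The rectangular extent, its coordinates, the seed, and the function name `sample_background_points` are assumptions introduced here, not details of the study; an actual workflow would typically restrict sampling to the study area rather than a simple rectangle.

```python
import random

def sample_background_points(n, xmin, xmax, ymin, ymax, seed=0):
    """Draw n pseudoabsence points uniformly at random inside a rectangle."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

# Hypothetical extent in projected map units (assumed, not from the study).
background = sample_background_points(200, 0.0, 1000.0, 0.0, 500.0)
```

Each background point would then be labeled as an absence (0) and combined with the presence records (1) before model fitting.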
BRT is an iterative machine learning method in which the deviance residuals from the preceding decision tree serve as the response data for the next tree (a process called “boosting”); tree building continues until further iterations no longer reduce the residual deviance (De’ath, 2007; Franklin, 2010). Decision trees, also known as classification and regression trees, are the underlying algorithm of BRT. They perform well with both continuous and categorical variables and, unlike GLM, are robust to a lack of independence among predictors (De’ath, 2007; Elith & Leathwick, 2009; Elith & Leathwick, 2017; Albuquerque et al., 2018).
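The boosting loop can be sketched in a few lines. This is a simplified stand-in, not the BRT implementation used in the study: it uses single-split trees (stumps) on one predictor and squared-error residuals in place of binomial deviance residuals. The toy data, the learning rate, and the stopping tolerance are all assumptions.

```python
def fit_stump(x, residuals):
    """Find the single split on x that best fits the residuals (least squares)."""
    best = None
    for t in sorted(set(x))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def boost(x, y, learning_rate=0.1, max_trees=500, tol=1e-6):
    """Add stumps fitted to the current residuals until the loss stops decreasing."""
    pred = [sum(y) / len(y)] * len(y)  # start from the mean response
    loss = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    trees, history = [], [loss]
    for _ in range(max_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, lm, rm = fit_stump(x, residuals)
        new_pred = [pi + learning_rate * (lm if xi <= t else rm)
                    for xi, pi in zip(x, pred)]
        new_loss = sum((yi - pi) ** 2 for yi, pi in zip(y, new_pred))
        if loss - new_loss < tol:  # residual loss no longer decreasing
            break
        pred, loss = new_pred, new_loss
        trees.append((t, lm, rm))
        history.append(loss)
    return trees, history

# Toy presence (1) / absence (0) data along one hypothetical predictor.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]
trees, history = boost(x, y)
```

The `history` list records the training loss after each accepted tree, which makes the stopping behavior easy to inspect.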
GLM is a well-known regression method in which the coefficients, which quantify the contribution of each variable to the prediction of the “state” of a dependent variable, are estimated by maximum likelihood; here the dependent variable is the binary outcome of presence/absence (Nelder & Wedderburn, 1972; Guisan, Edwards, & Hastie, 2002).
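For a concrete sense of maximum likelihood fitting in the binomial case, the sketch below fits a single-predictor model with a logit link by Newton-Raphson. The logit link, the toy data, and the function name `fit_binomial_glm` are assumptions made for illustration; the study's own models would have used several predictors and standard statistical software.

```python
import math

def fit_binomial_glm(x, y, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson on the binomial log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector: gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix entries, weighted by the binomial variance p(1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: coefficients += inverse(information) * score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy presence (1) / absence (0) data; classes overlap, so the estimate is finite.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_binomial_glm(x, y)
```

A positive `b1` indicates that the modeled probability of presence increases with the predictor. Note that this simple solver assumes the two classes are not perfectly separable.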
Maximum entropy (Maxent) is a machine learning method that employs multinomial logistic regression to estimate the probability distribution of a species according to the “maximum entropy” principle, i.e., the most uniform distribution possible given the constraints imposed by the predictor variables (Phillips, Anderson, & Schapire, 2006; Elith et al., 2011; Phillips, Anderson, Dudík, Schapire, & Blair, 2017).
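The maximum entropy principle can be illustrated with a deliberately small example. This is not the Maxent software's algorithm, which uses many feature types and regularization. It assumes a single predictor scaled to [0, 1] over ten hypothetical background cells and finds the distribution of the form p(i) proportional to exp(lam * feature[i]) whose mean predictor value matches the mean at the presence sites. All values and names are invented for illustration.

```python
import math

def maxent_distribution(feature, target_mean, iterations=200):
    """Bisect on lam so that the mean feature under p(i) ~ exp(lam * f_i) hits the target."""
    def gibbs(lam):
        weights = [math.exp(lam * f) for f in feature]
        z = sum(weights)
        return [w / z for w in weights]

    lo, hi = -50.0, 50.0  # search bracket; assumes the target lies inside the feature range
    for _ in range(iterations):
        lam = (lo + hi) / 2.0
        p = gibbs(lam)
        mean = sum(pi * f for pi, f in zip(p, feature))
        if mean < target_mean:  # the mean increases with lam, so move right
            lo = lam
        else:
            hi = lam
    lam = (lo + hi) / 2.0
    return lam, gibbs(lam)

# Hypothetical predictor values at ten background cells and four presence sites.
background_feature = [i / 10 for i in range(10)]
presence_feature = [0.6, 0.7, 0.8, 0.9]
target = sum(presence_feature) / len(presence_feature)
lam, p = maxent_distribution(background_feature, target)
```

When the target equals the plain background mean, `lam` is approximately zero and the result is the uniform distribution, which is the sense in which the solution is the most uniform one consistent with the constraint.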