FIGURE 2  Independent environmental variables used to model the potential geographical distribution of each species in Northwest Yunnan

2.3.2 Construction of the GWR model

Fotheringham et al. (1996) proposed the GWR model, building on the idea of local smoothing and on earlier work in local regression analysis and varying-parameter research. GWR extends traditional global linear regression (e.g., OLS) by incorporating geographical location (i.e., spatial factors) into the regression parameters and by weighting adjacent points according to their spatial proximity (Han et al., 2016). As a local regression model, GWR can explain the spatially non-stationary relationship between response variables and explanatory variables by decomposing global parameters into local parameters (Tripathi et al., 2019a). The regression equation can be written as (Han et al., 2016; Tripathi et al., 2019b):
\begin{equation} y_{i} = \beta_{i0}(u_{i}, v_{i}) + \sum_{k=1}^{p} \beta_{ik}(u_{i}, v_{i})\, x_{ik} + \varepsilon_{i} \nonumber \end{equation}
where \(k = 1, \ldots, p\) indexes the \(p\) explanatory variables, \(x_{ik}\) is the value of the \(k\)-th explanatory variable at position \(i\), and \(\varepsilon_{i}\) denotes the random error term at position \(i\). In addition, \((u_{i}, v_{i})\) represents the geographic coordinates (spatial location) of observation \(i\), \(\beta_{i0}(u_{i}, v_{i})\) is the intercept at position \(i\), and \(\beta_{ik}(u_{i}, v_{i})\) denotes the local regression coefficient of the \(k\)-th explanatory variable at position \(i\). When \(\beta_{1k} = \beta_{2k} = \ldots = \beta_{nk}\), the GWR model reduces to an ordinary linear regression model. In this study, the potential species richness within each grid cell was used as the dependent variable and the environmental factors were used as independent variables, in order to investigate how well different categories of environmental parameters explain the potential geographical distribution patterns of species.
According to Tobler's first law (TFL) of geography (Tobler, 1970), the GWR model assigns weights on the principle that the closer an observation is, the higher its weight, and the farther away it is, the lower its weight (Fotheringham et al., 2002). The weight can therefore be calculated with a function that decreases monotonically with spatial distance and takes values in [0, 1]. Such a function is called a kernel function (Lu et al., 2020). GWR usually employs a Gaussian function as the weight function, in which the bandwidth is the parameter that describes the relationship between weight and distance and is an important control parameter in weight calculation (Gao et al., 2019). The function is expressed as (Wang et al., 2020):
\(\omega_{ij} = \exp\left(-\frac{d_{ij}^{2}}{b^{2}}\right)\)
where \(\omega_{ij}\) denotes the distance weight between observation locations \(i\) and \(j\), \(d_{ij}\) is the Euclidean distance between locations \(i\) and \(j\), and \(b\) represents the bandwidth. When the distance between locations \(i\) and \(j\) is 0, \(\omega_{ij}\) is equal to 1; the weight then decreases as the distance increases, falling to \(e^{-1} \approx 0.37\) at \(d_{ij} = b\) and approaching 0 at distances much larger than \(b\).
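To make the weighting scheme concrete, the following is a minimal Python sketch of how a Gaussian kernel can feed a local weighted least squares fit at a single location. It is an illustration only and not the code used in this study: the function names (`gaussian_weights`, `local_coefficients`), the synthetic data, and the bandwidth value are all assumptions introduced here.

```python
import numpy as np

def gaussian_weights(coords, i, bandwidth):
    """Gaussian kernel weights w_ij = exp(-d_ij^2 / b^2) for focal point i."""
    d = np.linalg.norm(coords - coords[i], axis=1)  # Euclidean distances d_ij
    return np.exp(-(d ** 2) / bandwidth ** 2)

def local_coefficients(coords, X, y, i, bandwidth):
    """Weighted least squares estimate at location i (intercept first)."""
    w = gaussian_weights(coords, i, bandwidth)
    Xd = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    XtW = Xd.T * w                              # X^T W without building diag(w)
    return np.linalg.solve(XtW @ Xd, XtW @ y)   # solves (X^T W X) beta = X^T W y

# Hypothetical synthetic data: 50 points, 2 explanatory variables,
# spatially constant coefficients and no noise.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(50, 2))
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1]

beta_0 = local_coefficients(coords, X, y, i=0, bandwidth=3.0)
```

Because the synthetic coefficients do not vary over space, the local estimate at any location should match the global values (1.0, 2.0, -0.5), which corresponds to the case in which GWR reduces to ordinary linear regression.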

2.4 Model evaluation

2.4.1 Evaluation of the MaxEnt model

We adopted the AUC value to evaluate the fitting accuracy of the MaxEnt model. Model performance was rated as failed for an AUC value of 0.50-0.60, poor for 0.60-0.70, fair for 0.70-0.80, good for 0.80-0.90, and excellent for 0.90-1.00 (Phillips et al., 2006; Zhang et al., 2019). In addition, the suitability maps were calculated from the logistic output of MaxEnt, and the resulting habitat suitability index (HSI) values ranged over [0, 1]. Following numerous previous studies and expert judgment, we reclassified the HSI values into four grades using Natural Breaks in ArcGIS 10.4: 0-0.20 (low), 0.20-0.40 (medium), 0.40-0.60 (high), and 0.60-1.00 (optimal) (Convertino et al., 2014; Yi et al., 2017). To estimate the suitable potential geographical distribution area of each species conservatively, we considered grid cells with an HSI value of at least 0.40 to be suitable potential distribution area.
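The grading rules above can be sketched as a pair of lookup functions. This is an illustrative Python sketch, not part of the original workflow (which used ArcGIS). The function names are hypothetical, and because the stated ranges share their end points, the sketch assumes that each class includes its lower bound, which is consistent with treating an HSI value of 0.40 as suitable.

```python
from bisect import bisect_right

# Class breaks taken from the text; lower bounds are assumed inclusive.
AUC_BREAKS = [0.60, 0.70, 0.80, 0.90]
AUC_LABELS = ["failed", "poor", "fair", "good", "excellent"]
HSI_BREAKS = [0.20, 0.40, 0.60]
HSI_LABELS = ["low", "medium", "high", "optimal"]

def auc_grade(auc):
    """Map an AUC value in [0.50, 1.00] to its performance grade."""
    if not 0.50 <= auc <= 1.00:
        raise ValueError("AUC outside the graded range [0.50, 1.00]")
    return AUC_LABELS[bisect_right(AUC_BREAKS, auc)]

def hsi_grade(hsi):
    """Map an HSI value in [0, 1] to its suitability grade."""
    if not 0.0 <= hsi <= 1.0:
        raise ValueError("HSI outside [0, 1]")
    return HSI_LABELS[bisect_right(HSI_BREAKS, hsi)]

def is_suitable(hsi):
    """Grid cells with HSI >= 0.40 count as suitable distribution area."""
    return hsi >= 0.40
```

For example, under these assumptions `auc_grade(0.85)` falls in the "good" class and `hsi_grade(0.40)` falls in the "high" class.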

2.4.2 Evaluation of the GWR model

In this study, we used the R package ‘spgwr’ to select the bandwidth with a Gaussian weighting function and employed the Akaike Information Criterion (AIC) to identify the optimal bandwidth. Regression residuals indicate the goodness of fit of a model; they are summarized by the residual sum of squares (RSS) and the residual standard deviation (Sigma), both of which should be as small as possible. In addition, \(R^{2}\) and the AIC value also reflect goodness of fit: the higher the \(R^{2}\) and the lower the AIC value, the better the model fits (Li et al., 2017; Liu et al., 2019). When the difference in AIC value between two models (∆AIC) is greater than three, the model with the smaller AIC value fits better (Han et al., 2016; Xue et al., 2020).
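The diagnostics described above can be illustrated with a short Python sketch. It is not the ‘spgwr’ implementation: the function names are hypothetical, and the residual standard deviation is computed here as sqrt(RSS / (n - n_params)), where `n_params` is an assumed stand-in for the (effective) number of model parameters. AIC values are taken as given inputs rather than computed.

```python
import numpy as np

def fit_diagnostics(y, y_hat, n_params):
    """Return RSS, residual standard deviation (sigma) and R^2."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    rss = float(np.sum((y - y_hat) ** 2))         # residual sum of squares
    tss = float(np.sum((y - y.mean()) ** 2))      # total sum of squares
    sigma = float(np.sqrt(rss / (len(y) - n_params)))
    return {"rss": rss, "sigma": sigma, "r2": 1.0 - rss / tss}

def preferred_by_aic(aic_a, aic_b, threshold=3.0):
    """Return 'a' or 'b' when delta AIC exceeds the threshold, else None."""
    if abs(aic_a - aic_b) <= threshold:
        return None                               # difference too small to rank
    return "a" if aic_a < aic_b else "b"

# Hypothetical observed and fitted values.
diag = fit_diagnostics([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8], n_params=2)
```

With the ∆AIC rule stated above, `preferred_by_aic(100.0, 105.0)` selects model "a", whereas a difference of two units returns `None` because neither model is clearly better.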