FIGURE 1 Map of the study area. (a) The location of Northwest
Yunnan in China; (b) The topographic map of Northwest Yunnan and the
distribution of national nature reserves (NNRs) and provincial nature
reserves (PNRs) in this region
Building on our previous research, we identified a total of 114 key
higher plant species in Northwest Yunnan (Ye et al., 2020a). The
information used to construct the dataset (e.g., taxonomic level, threat
level, and geo-referenced records) was obtained from field surveys
conducted over nearly a decade and from the main virtual herbaria in
China (for more details, see Ye et al., 2020a). In this study, we
selected 25 species (comprising 314 distribution records) from the 114
key higher plant species (comprising 941 geo-referenced records). The
selection was based on a combination of the following criteria:
(1) occurrence records: to improve the accuracy of the MaxEnt model
predictions as much as possible, each species should have no fewer than
four distribution records; (2) simulation accuracy: the MaxEnt model
should achieve good or excellent simulation accuracy for each included
species (for more details, see section 2.4.1).
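The first selection criterion can be illustrated with a minimal sketch.
The record layout (species, longitude, latitude) and the function name
are assumptions made for illustration only and do not reproduce the
original workflow; the second criterion (simulation accuracy) depends on
the evaluation in section 2.4.1 and is not covered here.

```python
from collections import Counter

# Threshold stated in the text: at least four distribution records per species.
MIN_RECORDS = 4


def species_with_enough_records(records, min_records=MIN_RECORDS):
    """Return the sorted names of species with at least `min_records` records.

    `records` is an iterable of (species, longitude, latitude) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    counts = Counter(species for species, _lon, _lat in records)
    return sorted(name for name, n in counts.items() if n >= min_records)
```

Species that pass this filter would then still need to meet the accuracy
criterion before being retained.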
2.2 Environmental variables
Based on previous studies (Nieto et al., 2015; Ștefănescu et al., 2017;
Zhang et al., 2019; Liu et al., 2019), we initially selected 24
environmental variables that may affect species distribution to model
the current potential geographical distribution patterns (Table 1).
First, we divided these variables into five groups according to their
categories. Next, the 24 environmental variables were resampled and
reprojected to an equal-area grid system with the same spatial
resolution (0.05° × 0.05°) as the species richness data (Wang et al.,
2018). We then used ArcGIS 10.4 (Esri; Redlands, California, USA) to
extract the raster data of the environmental variables.
To avoid multicollinearity among the environmental variables, which
might result in model over-fitting, we calculated the Pearson
correlation coefficient between each pair of variables using R 3.5.2
(https://www.r-project.org/). After performing the multicollinearity
test, we retained 11 independent environmental variables (r < 0.8) to
model the potential geographical distribution of each species (Zhang et
al., 2019; Mukherjee et al., 2020) (Figure 2).
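A minimal sketch of such a correlation screen is given below. The
analysis in this study was carried out in R; the Python code is only an
illustration. Two points are assumptions rather than statements from the
text: the absolute value of r is compared with the 0.8 threshold, and
when two variables are highly correlated the one listed first is kept
(the text does not say how the retained variable of a pair was chosen).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_collinear(variables, threshold=0.8):
    """Greedily keep variables whose |r| with every kept variable is < threshold.

    `variables` maps a variable name to its list of values sampled at the
    same locations. Keeping the first-listed variable of a correlated pair
    is an assumption of this sketch, not the authors' documented rule.
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

Because the filter is greedy, the order in which the variables are
supplied determines which member of a correlated pair survives; in
practice that order would be set by ecological relevance.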
2.3 Model construction
We constructed two models in this research: one was for predicting the
potential geographical distribution area of species; the other was for
analyzing the main environmental factors influencing the potential
distribution of species.
2.3.1 Construction of the MaxEnt model
Phillips et al. (2004) developed the MaxEnt model based on the theory of
maximum entropy, and it has since been applied to simulating species
distributions (Phillips et al., 2006; Zhang et al., 2011). The model
combines species occurrence localities with environmental parameters to
predict habitat suitability and thereby infer the possible distribution
of the species in the study area (Phillips et al., 2006; Phillips &
Dudík, 2008; Zhang et al., 2019).
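For orientation, the maximum-entropy principle can be sketched as
follows, in the general form described by Phillips et al. (2006). The
notation is introduced here for illustration and is not taken from this
study: X is the set of grid cells in the study area, f_j are the
environmental features, x_1, ..., x_m are the occurrence localities,
lambda_j are the feature weights, and beta_j are the regularization
parameters.

```latex
\begin{align}
q_{\lambda}(x) &= \frac{\exp\Bigl(\sum_{j} \lambda_j f_j(x)\Bigr)}{Z_{\lambda}},
\qquad
Z_{\lambda} = \sum_{x' \in X} \exp\Bigl(\sum_{j} \lambda_j f_j(x')\Bigr), \\
\hat{\lambda} &= \arg\min_{\lambda}
\Bigl[ -\frac{1}{m} \sum_{i=1}^{m} \ln q_{\lambda}(x_i)
+ \sum_{j} \beta_j \lvert \lambda_j \rvert \Bigr].
\end{align}
```

The first line gives the estimated distribution over grid cells, and the
second line is the regularized log loss that is minimized to fit the
weights; larger beta_j values yield smoother, less closely fitted
distributions.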
In this study, the latitude and longitude of the species distribution
sites and the independent environmental factors in Northwest Yunnan were
imported together into MaxEnt (version 3.3.3k) to construct the
correlation function between species and the environment. The prediction
results of the MaxEnt model generally depend on several parameter
settings, such as the maximum number of background points (BC), the
regularization multiplier (RM), and the feature combination (FC) (Zhu et
al., 2018). MaxEnt was run under the following modeling rules: (1) for
species with < 10 distribution records, linear features were applied;
(2) for species with 10-14 records, quadratic features were used; and
(3) for species with > 15 records, hinge features were employed (Zhang
et al., 2012; Zhang et al., 2017). We varied RM over [0.5, 3] with a
step size of 0.5 and BC over [5000, 15000] with a step size of 5000. We
then constructed the MaxEnt models using the linear, quadratic, and
hinge features, respectively. In addition, 75% of the species
distribution locations were randomly selected as training data to build
the model, and the remaining 25% were used as testing data for model
validation (Guan et al., 2018; Zhang et al., 2019). The maximum number
of iterations was set to 500, and the number of replicate runs was set
to 10 or 20.
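The settings described above can be summarized in a short sketch. It
only enumerates the configuration and does not call the MaxEnt software;
all function and constant names are hypothetical. The text does not
specify the feature type for species with exactly 15 records, so
treating 15 as hinge is an assumption of this sketch, as is the fixed
random seed.

```python
import random
from itertools import product

# RM from 0.5 to 3 in steps of 0.5; BC from 5000 to 15000 in steps of 5000.
RM_VALUES = [0.5 * k for k in range(1, 7)]
BC_VALUES = list(range(5000, 15001, 5000))


def feature_class(n_records):
    """Feature type by record count: <10 linear, 10-14 quadratic, else hinge.

    Assumption: exactly 15 records is treated as hinge (the text is silent).
    """
    if n_records < 10:
        return "linear"
    if n_records < 15:
        return "quadratic"
    return "hinge"


def parameter_grid():
    """All (RM, BC) combinations explored for each species."""
    return list(product(RM_VALUES, BC_VALUES))


def split_records(records, train_fraction=0.75, seed=0):
    """Randomly split records into training and testing subsets (75%/25%).

    The seed is an assumption added so that the example is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

Under these ranges the grid contains 6 RM values and 3 BC values, that
is, 18 parameter combinations per species, each of which would be
evaluated over the replicate runs.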