2.2 | Ecological niche models
For potential distributions of two
langurs, we used Genotype–Environment Association (GEA) model selecting
the important variables. The GEA method can identify adaptive loci by
constructing the relationship between genetic and environmental
data
(Forester et al., 2018). To
accomplish this, we used SNP data from Liu et al. (2020). This data set
contained eight François’ langur, four from Guangxi Province and four
from Guizhou Province, and eight White-headed langurs from Guangxi
province. To identify the relationship between each locus and the
bioclimate variables, we run a GEA model using Random Forest (RF), a
machine-learning algorithm that creates multiple decision trees.
SNPs were ordinated by principal
components (PCs) in R on the scaled data and the resulting PCs
were used in a parallel analysis (Forester et al., 2018; Horn, 1965).
Then, a varimax rotation was applied to the PC axes maximizing the
correlation between the axes and the SNPs. The resulting components and
environmental variables were used as dependent and predictors variables,
respectively. We retained the components those were significantly
correlated with the constrained axis. In the end, outliers that
correlated with these retained SNP components were eliminated. The
variables selected by GEA model are showed in Figure 3.
We
also considered actual ecological needs of these species. Given that
non-human primates are sensitive to climate change, especially to
changes in temperature and precipitation (Kosheleff & Anderson, 2009),
temperature and precipitation were included in the model. Highly
correlated variables were removed by permutation scores.
In addition, we checked for
collinearity among these bioclimatic variables by calculating the
pairwise Pearson’s correlation coefficient (r ), and retained only
one predictor when two or more predictors were highly correlated (i.e.
|r | >0.70) (Dormann et al., 2013)
(Supporting Information Figure S1). The variables used in our analyses
are shown in Table S1.
Then
we used Maxent model to predict the distribution of the two langurs
under different periods. Maxent represents a presence-only machine
learning algorithm that has been widely used to predict climatic
suitability for target species (Elith et al., 2006; Phillips et al.,
2006). For each species, we generated background points in the study
area, and randomly selected 10,000 points as the background points for
the model operation (Elith et al.,
2011). We used the ENMeval
R-package to build models with different feature classes and different
regularization multipliers (Kass et al., 2021). The feature classes are
combinations of linear, quadratic, hinge and threshold terms. The
regularization multiplier ranges from 0.5 to 4 with an interval of 0.5.
We used the ”blocks” option by
dividing the study area into four blocks. The presence of the same
amount per block comprised the environment model calibration data using
three blocks. A fourth prediction block was added to validate the model.
We specified spatial block partitions with different directions for the
dataset of each species: four latitudinal blocks were created for
François’ langur, and four longitudinal blocks were for White-headed
langur. We repeated this process until each of the four blocks was used
as validation data (Silva et al., 2020). We then selected the best
settings from the candidate models based on the application of
sequential criteria on performance indicators (Kass et al., 2020;
Radosavljevic & Anderson, 2014; Zhang et al., 2021). Finally, we
calibrated the model using the multivariate environmental similarity
surface (MESS) values and built MESS maps (Elith et al., 2010)
(Supporting Information Figure S2 & Figure S3).
We chose the cloglog output to obtain predictions, because the cloglog
output provides stronger predictions than the logistic format,
especially at higher values (Phillips et al., 2017). We used the 10%
training presence threshold to convert continuous predictions into
binary (bounded by 0 and 1). This is often an effective and conservative
method for identifying suitable and unsuitable regions (Kramer-Schadt et
al., 2013; Nazeri et al., 2012; Radosavljevic & Anderson, 2014). The
identification of climate shelter, vulnerable regions and new increased
suitable regions are conducive to the conservation and management of
species (Guisan et al., 2013). We defined these regions as that:
1) climate shelter: the unchanged suitable regions from LIG to the
current period;
2) climate change vulnerable regions: current suitable regions
translated into unsuitable regions under future (2050) climate change
scenario;
3) human activities vulnerable regions: the current suitable regions
translated into unsuitable regions under the influence of human
activities.
2.3| Estimation of
effective population size
The Multiple Sequentially Markovian Coalescent algorithm (MSMC)
(Schiffels & Durbin, 2014) was used to estimate effective population
sizes through time using four individuals from each of the two species,
with all of the data phased (Anchor the specific allele sequence on each
copy of homologous chromosomes, the process called ‘phase’ here) using
BEAGLE 4.1 with default settings (Browning & Browning, 2007). Although
MSMC can handle unphased sites to some extent, we used only phased sites
in order to achieve more accurate results. MSMC requires a high depth of
coverage to ensure that heterozygous positions are called (Schiffels &
Durbin, 2014), and therefore we used four individuals for François’
langur (~27x) and four individuals for White-headed
langur (~30x) with high coverage in all further analyses
(Supporting Information Table S2). We also generated a mask for
mapability using the SNPable toolkit
(http://lh3lh3.users.sourceforge.net/snpable.shtml;
with k=35 and r=0.5) to exclude non-unique regions in the langur genome.
We conducted all analyses using a mutation rate of
1.0×10−8 per generation per base pair, a generation
time of 10 years (Liu et al., 2020), and the default parameters from the
MSMC implementation
(http://www.github.com/stschiff/msmc).
We conducted the MSMC analysis following the instructions in the short
guide to MSMC (https://github.com/stschiff/msmc/blob/master/guide.md).