2.2 | Ecological niche models
For potential distributions of two langurs, we used Genotype–Environment Association (GEA) model selecting the important variables. The GEA method can identify adaptive loci by constructing the relationship between genetic and environmental data (Forester et al., 2018). To accomplish this, we used SNP data from Liu et al. (2020). This data set contained eight François’ langur, four from Guangxi Province and four from Guizhou Province, and eight White-headed langurs from Guangxi province. To identify the relationship between each locus and the bioclimate variables, we run a GEA model using Random Forest (RF), a machine-learning algorithm that creates multiple decision trees. SNPs were ordinated by principal components (PCs) in R on the scaled data and the resulting PCs were used in a parallel analysis (Forester et al., 2018; Horn, 1965). Then, a varimax rotation was applied to the PC axes maximizing the correlation between the axes and the SNPs. The resulting components and environmental variables were used as dependent and predictors variables, respectively. We retained the components those were significantly correlated with the constrained axis. In the end, outliers that correlated with these retained SNP components were eliminated. The variables selected by GEA model are showed in Figure 3.
We also considered actual ecological needs of these species. Given that non-human primates are sensitive to climate change, especially to changes in temperature and precipitation (Kosheleff & Anderson, 2009), temperature and precipitation were included in the model. Highly correlated variables were removed by permutation scores. In addition, we checked for collinearity among these bioclimatic variables by calculating the pairwise Pearson’s correlation coefficient (r ), and retained only one predictor when two or more predictors were highly correlated (i.e. |r | >0.70) (Dormann et al., 2013) (Supporting Information Figure S1). The variables used in our analyses are shown in Table S1.
Then we used Maxent model to predict the distribution of the two langurs under different periods. Maxent represents a presence-only machine learning algorithm that has been widely used to predict climatic suitability for target species (Elith et al., 2006; Phillips et al., 2006). For each species, we generated background points in the study area, and randomly selected 10,000 points as the background points for the model operation (Elith et al., 2011). We used the ENMeval R-package to build models with different feature classes and different regularization multipliers (Kass et al., 2021). The feature classes are combinations of linear, quadratic, hinge and threshold terms. The regularization multiplier ranges from 0.5 to 4 with an interval of 0.5. We used the ”blocks” option by dividing the study area into four blocks. The presence of the same amount per block comprised the environment model calibration data using three blocks. A fourth prediction block was added to validate the model. We specified spatial block partitions with different directions for the dataset of each species: four latitudinal blocks were created for François’ langur, and four longitudinal blocks were for White-headed langur. We repeated this process until each of the four blocks was used as validation data (Silva et al., 2020). We then selected the best settings from the candidate models based on the application of sequential criteria on performance indicators (Kass et al., 2020; Radosavljevic & Anderson, 2014; Zhang et al., 2021). Finally, we calibrated the model using the multivariate environmental similarity surface (MESS) values and built MESS maps (Elith et al., 2010) (Supporting Information Figure S2 & Figure S3).
We chose the cloglog output to obtain predictions, because the cloglog output provides stronger predictions than the logistic format, especially at higher values (Phillips et al., 2017). We used the 10% training presence threshold to convert continuous predictions into binary (bounded by 0 and 1). This is often an effective and conservative method for identifying suitable and unsuitable regions (Kramer-Schadt et al., 2013; Nazeri et al., 2012; Radosavljevic & Anderson, 2014). The identification of climate shelter, vulnerable regions and new increased suitable regions are conducive to the conservation and management of species (Guisan et al., 2013). We defined these regions as that:
1) climate shelter: the unchanged suitable regions from LIG to the current period;
2) climate change vulnerable regions: current suitable regions translated into unsuitable regions under future (2050) climate change scenario;
3) human activities vulnerable regions: the current suitable regions translated into unsuitable regions under the influence of human activities.
2.3| Estimation of effective population size
The Multiple Sequentially Markovian Coalescent algorithm (MSMC) (Schiffels & Durbin, 2014) was used to estimate effective population sizes through time using four individuals from each of the two species, with all of the data phased (Anchor the specific allele sequence on each copy of homologous chromosomes, the process called ‘phase’ here) using BEAGLE 4.1 with default settings (Browning & Browning, 2007). Although MSMC can handle unphased sites to some extent, we used only phased sites in order to achieve more accurate results. MSMC requires a high depth of coverage to ensure that heterozygous positions are called (Schiffels & Durbin, 2014), and therefore we used four individuals for François’ langur (~27x) and four individuals for White-headed langur (~30x) with high coverage in all further analyses (Supporting Information Table S2). We also generated a mask for mapability using the SNPable toolkit (http://lh3lh3.users.sourceforge.net/snpable.shtml; with k=35 and r=0.5) to exclude non-unique regions in the langur genome. We conducted all analyses using a mutation rate of 1.0×10−8 per generation per base pair, a generation time of 10 years (Liu et al., 2020), and the default parameters from the MSMC implementation (http://www.github.com/stschiff/msmc). We conducted the MSMC analysis following the instructions in the short guide to MSMC (https://github.com/stschiff/msmc/blob/master/guide.md).