2.4 Distribution modeling of P. villosa
To predict how Quaternary climatic oscillations may have changed the geographic distribution of P. villosa, we used ecological niche modelling (ENM) to estimate its potential distribution at the Last Inter-Glacial (LIG, ~120,000-140,000 years before present), the Last Glacial Maximum (LGM, ~21,000 years before present), the present, and two future periods (2050s and 2070s). In addition to the distribution records from our field surveys, we collected GPS data for P. villosa from the Chinese Virtual Herbarium (CVH, http://www.cvh.ac.cn), the Global Biodiversity Information Facility (http://www.gbif.org), the China National Specimen Information Infrastructure (http://www.nsii.org.cn) and the Specimen Resources Sharing Platform for Education (http://mnh.scu.edu.cn/main.aspx). After removing duplicate and ambiguous records, we used 155 localities in total to generate spatial distribution models for P. villosa (Table S2). To obtain high-resolution predictions and identify the critical factors influencing the species’ distribution, we obtained 19 bioclimatic variables and three geographic factors (altitude, slope and aspect) at 2.5 arc-min resolution from the WorldClim database (Hijmans, Cameron, Parra, Jones, & Jarvis, 2005, www.worldclim.org). The future climate data covered two representative concentration pathway emission scenarios (RCP8.5 and RCP2.6) under the CCSM4 model (Van et al., 2011). We excluded highly correlated variables according to Spearman’s correlation test (Peterson & Nakazawa, 2008). Specifically, we selected variables with a relative contribution score ≥ 0.8 or a correlation of < 0.75 with the other variables. Based on the outcome of the Spearman’s test, we retained the eleven variables with the lowest correlations to build a maximum entropy model of the habitat of P. villosa.
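The correlation screening described above can be illustrated with a minimal sketch. It is not the procedure actually run for this study: the variable values are invented, the greedy keep-or-skip order (`priority`) is an assumption about how one variable is chosen from a correlated pair, and only the 0.75 threshold comes from the text.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_correlated(table, priority, threshold=0.75):
    """Walk variables in priority order (an assumed ordering) and keep a
    variable only if |rho| < threshold against every variable already kept."""
    kept = []
    for name in priority:
        if all(abs(spearman(table[name], table[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values of three variables sampled at six localities.
table = {
    "bio1": [1, 2, 3, 4, 5, 6],
    "bio2": [2, 4, 5, 8, 10, 13],   # increases monotonically with bio1
    "bio12": [3, 1, 6, 2, 5, 4],
}
kept = drop_correlated(table, priority=["bio1", "bio2", "bio12"])
print(kept)
```

In this toy example bio2 is dropped because it is perfectly rank-correlated with bio1, while bio12 is retained.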
Subsequently, we generated the model of the potential distribution of P. villosa in environmental space in MaxEnt 3.3.3k (Phillips, Anderson, & Schapire, 2006; Phillips & Dudik, 2008). Within MaxEnt, we randomly selected 75% of localities for training and 25% for testing, and repeated this procedure 500 times independently to ensure reliable results. We evaluated model performance using the area under the curve (AUC) of the receiver operating characteristic (ROC). AUC ranges from 0 to 1, where 0.5 indicates a prediction no better than random and 1 an exact match; values above 0.9 indicate good model performance (Swets, 1988). Additionally, we projected the geographic ranges predicted by the ENMs using ArcGIS 10.2. In particular, we classified habitat into four classes according to predicted suitability (P): highly suitable habitat (0.5 ≤ P ≤ 1.0), moderately suitable habitat (0.3 ≤ P < 0.5), poorly suitable habitat (0.1 ≤ P < 0.3) and unsuitable habitat (0.0 ≤ P < 0.1).
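As a rough illustration of the evaluation steps above, the following sketch shows a 75%/25% random split, a rank-based AUC, and the four suitability classes. It does not reproduce MaxEnt's internals: the function names, the seed, and the example scores are hypothetical, and the AUC here simply compares scores at presence points with scores at background points.

```python
import random


def split_localities(localities, train_fraction=0.75, seed=0):
    """Randomly split localities into training and testing subsets."""
    rng = random.Random(seed)
    shuffled = list(localities)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def auc(presence_scores, background_scores):
    """Probability that a random presence point scores higher than a random
    background point; ties count as half."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def classify_suitability(p):
    """Map a predicted suitability P in [0, 1] to the four classes in the text."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("suitability must lie in [0, 1]")
    if p >= 0.5:
        return "highly suitable"
    if p >= 0.3:
        return "moderately suitable"
    if p >= 0.1:
        return "poorly suitable"
    return "unsuitable"


# 155 placeholder locality identifiers, matching the number of records used.
train, test = split_localities(range(155))
print(len(train), len(test))

# Invented scores, for illustration only.
example_auc = auc([0.9, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(round(example_auc, 4), classify_suitability(0.42))
```

Repeating the split with a different seed for each run would mimic the independent replicates; averaging the test AUC across runs is one common way to summarize them.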
To measure niche similarity between population groups, we calculated Schoener’s D (Schoener, 1968) and the standardized Hellinger distance (calculated as I) in ENMTools 1.3 (Warren, Glor, & Turelli, 2008, 2010). For the identity test, we obtained the null distribution of each statistic from 1000 pseudoreplicates generated by randomly sampling from the pooled data points of each pair of clusters while retaining the original cluster sizes. We then compared the observed values of D and I with these null distributions, and we drew histograms of the frequency distributions using R 2.13 (http://www.r-project.org/).
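The two similarity statistics and the resampling step of the identity test can be sketched as follows. This is an illustrative reimplementation rather than ENMTools code. It assumes each niche model is given as a list of suitability scores over the same grid cells, and it uses the form I = 1 - H²/2, where H is the Hellinger distance between the normalized grids. Refitting a niche model to each pseudoreplicate is outside the sketch, and all function names and example values are hypothetical.

```python
import math
import random


def normalise(grid):
    """Scale suitability scores so that they sum to 1 over the grid."""
    total = sum(grid)
    if total <= 0:
        raise ValueError("grid must contain positive suitability")
    return [v / total for v in grid]


def schoener_d(x, y):
    """Schoener's D: 1 minus half the summed absolute differences."""
    px, py = normalise(x), normalise(y)
    return 1 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))


def hellinger_i(x, y):
    """I statistic, assumed here to be 1 - H^2 / 2."""
    px, py = normalise(x), normalise(y)
    h2 = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(px, py))
    return 1 - 0.5 * h2


def pseudoreplicate(group_a, group_b, rng):
    """Pool the occurrences of two clusters and redraw two groups at random,
    keeping the original cluster sizes."""
    pooled = list(group_a) + list(group_b)
    rng.shuffle(pooled)
    return pooled[:len(group_a)], pooled[len(group_a):]


def fraction_at_or_below(observed, null_values):
    """Share of null values no larger than the observed similarity."""
    return sum(v <= observed for v in null_values) / len(null_values)


# Invented suitability grids for two clusters over three cells.
d = schoener_d([0.2, 0.4, 0.4], [0.4, 0.4, 0.2])
i = hellinger_i([0.2, 0.4, 0.4], [0.4, 0.4, 0.2])
print(round(d, 3), round(i, 3))

# One pseudoreplicate from two hypothetical clusters of occurrence IDs.
new_a, new_b = pseudoreplicate(["a1", "a2", "a3"], ["b1", "b2"], random.Random(1))
print(len(new_a), len(new_b))
```

In a full identity test, a model would be fitted to each of the 1000 pseudoreplicate pairs, D and I computed for every pair, and the observed values compared against those null values, for example with `fraction_at_or_below`.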