Genetic diversity and effective population size
We removed individuals identified as first-degree relatives (parent-offspring or full siblings), i.e. those with kinship coefficients ≥ 0.18, according to KING v2.1.4 (Manichaikul et al., 2010). Using the unlinked SNPs with 10% missing data, we first calculated pairwise kinship values and identified putative family groups with KING, and then ran a PC-AiR analysis with GENESIS v2.2.2 (Conomos, Reiner, Weir & Thornton, 2016) in R (R Core Team, 2017; this citation applies to all subsequent use of R) to identify an “unrelated” subset of individuals. We used GenAlEx v6.503 to estimate the power of the SNP dataset to differentiate individuals by calculating PIDsib, the probability of two individuals having identical genotypes assuming siblings are present in the data (Waits, Luikart, & Taberlet, 2001).
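As an illustration of the PIDsib statistic described above, the following is a minimal Python sketch of the per-locus formula given by Waits et al. (2001), multiplied across loci under the assumption that loci are independent. It is not the GenAlEx implementation; the function names and the example allele frequencies are hypothetical.

```python
from math import prod


def pid_sib_locus(allele_freqs):
    """Per-locus PIDsib from allele frequencies (formula of Waits et al., 2001)."""
    s2 = sum(p ** 2 for p in allele_freqs)  # sum of squared allele frequencies
    s4 = sum(p ** 4 for p in allele_freqs)  # sum of fourth powers
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def pid_sib_multilocus(loci_freqs):
    """Multi-locus PIDsib, assuming independent (unlinked) loci."""
    return prod(pid_sib_locus(freqs) for freqs in loci_freqs)


# Hypothetical data: 20 biallelic SNPs, each with allele frequencies 0.5 / 0.5.
example_loci = [(0.5, 0.5)] * 20
print(pid_sib_multilocus(example_loci))
```

A single biallelic SNP with equal allele frequencies gives a per-locus value of 0.59375, so many loci are needed before the multi-locus probability becomes small.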
To estimate genetic diversity, we calculated average expected and observed heterozygosity (He and Ho) and the inbreeding coefficient (FIS) with VCFtools v0.1.16 using our SNPs with no missing data. We also concatenated FASTA alignments of UCE sequence pseudo-haplotypes end-to-end for all individuals in the unrelated dataset and used the maximum composite likelihood method to calculate nucleotide diversity (\(\pi\)) in MEGA v7.0.26 (Kumar, Stecher, & Tamura, 2016).
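The heterozygosity statistics above can be sketched in Python as follows. This is an illustrative calculation only, not the VCFtools estimator: it assumes biallelic SNPs coded as alternate-allele counts (0, 1 or 2) with no missing data, computes He as 2pq without a sample-size correction, and takes FIS as 1 − Ho/He from the locus-averaged values. The function name and the toy genotypes are hypothetical.

```python
def diversity_stats(genotypes):
    """Return (Ho, He, FIS) averaged over loci.

    genotypes: list of individuals, each a list of alternate-allele
    counts (0, 1 or 2) per SNP, with no missing data.
    """
    n_ind = len(genotypes)
    n_loci = len(genotypes[0])
    ho_sum = 0.0
    he_sum = 0.0
    for j in range(n_loci):
        column = [ind[j] for ind in genotypes]
        p = sum(column) / (2 * n_ind)  # alternate-allele frequency
        ho_sum += sum(1 for g in column if g == 1) / n_ind  # observed heterozygotes
        he_sum += 2 * p * (1 - p)  # expected under Hardy-Weinberg
    ho = ho_sum / n_loci
    he = he_sum / n_loci
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis


# Hypothetical toy data: 4 individuals genotyped at 2 SNPs.
toy = [[0, 1], [1, 1], [2, 1], [1, 1]]
print(diversity_stats(toy))
```

Negative FIS values, as in the toy data, indicate an excess of heterozygotes relative to Hardy-Weinberg expectations.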
We estimated effective population sizes (Ne) using the linkage disequilibrium model with random mating (Waples & Do, 2008) implemented in NeEstimator v2.1 (Do et al., 2014). We report Ne estimates obtained with the ‘Lowest Allele Frequency Used’ set to 5%, together with 95% confidence intervals generated by the ‘Parametric method’, for unrelated individuals, separately for each population cluster identified by STRUCTURE.
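To show the logic behind the linkage disequilibrium approach, the sketch below uses the classic first-order approximation for unlinked loci under random mating, E(r²) ≈ 1/(3Ne) + 1/S, where S is the number of sampled individuals. This is a deliberate simplification: it omits the empirical bias corrections of Waples & Do (2008) that NeEstimator applies, and it does not compute confidence intervals. The function name and input values are hypothetical.

```python
def ne_from_ld(mean_r2, sample_size):
    """Approximate Ne from the mean squared correlation between unlinked loci.

    Uses E(r^2) ~ 1/(3*Ne) + 1/S (random mating, unlinked loci).
    Simplified illustration; no bias correction is applied.
    """
    r2_drift = mean_r2 - 1.0 / sample_size  # remove the sampling contribution
    if r2_drift <= 0:
        # No drift signal beyond sampling noise: estimate is effectively infinite.
        return float("inf")
    return 1.0 / (3.0 * r2_drift)


# Hypothetical values: mean r^2 of 0.03 across locus pairs, 50 individuals sampled.
print(ne_from_ld(0.03, 50))
```

The subtraction of 1/S shows why small samples inflate r² and why the estimate becomes unbounded when the observed disequilibrium is no larger than expected from sampling alone.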