2.4.1. Structure and principal component analyses
ADMIXTURE v1.3.0 (Alexander & Lange, 2011) was run using the set of
5,702 SNPs and all samples to evaluate the genetic structure among
populations. ADMIXTURE was run using the
“unsupervised model” (without
using sampling locality as a prior). With this model setting, ADMIXTURE
evaluates Q- values to ancestral genetic clusters (K) in different
runs, ranging from one (K=1) to the number of sampled populations + 1
(in our case K=21), and gives an optimal K as the one achieving the
lowest cross-validation (cv) error. Furthermore, ADMIXTURE was also run
using the subset of 381 SNPs
(“unsupervised model”) and all
populations to compare the genetic structure among populations suggested
by the set of species-specific SNPs to the structure suggested by the
complete set of SNPs. ADMIXTURE was run using K values (1 to 15), the
reason for this was the reduction in the number of localities due to the
use of all allopatric samples as a single fixed population for both
species.
To visualize patterns of genetic differentiation among individuals and
populations, we used principal component analyses (PCA) in the R package
SNPRelate v1.6.4, function snpgdsPCA( ) (Zheng et al., 2012),
using the set of 5,702 SNPs and all samples. This was used to determine
whether individuals from the allopatric regions and the two hybrid
regions correspond to different genetic clusters.
2.4.2 Genetic diversity
and Hardy Weinberg equilibrium
To test whether potential hybridization and introgression resulted in
increased genetic diversity in the Spanish hybrid regions, we compared
different genetic diversity estimates for I. elegans and I.
graellsii from the allopatric and the sympatric distribution in each
locality, for all individuals before excluding the hybrids
(F1 and F2 hybrids) and after excluding
hybrids. We calculated number of alleles (A), allelic richness (Ar),
observed heterozygosity (HO), expected heterozygosity
(HE), and nucleotide diversity (π) using the 5,702 SNPs
set. Genetic diversity was measured at both population and regional
levels. Na, and Ar were estimated using the HIERFSTAT package
v0.04-22 (Goudet, 2005) as implemented in R. Allelic richness was
rarefacted for a minimum of eight alleles (or four diploid samples),
which was the lowest sample size used for Hardy-Weinberg equilibrium
testing. Meanwhile Ho and HE were calculated using
PLINK v1.90b6.12 (Purcell et al., 2007), and π and the
percentage of SNPs with missing data (NA) with VCFtools. Kruskal-Wallis
tests and posthoc paired Wilcoxon tests were used to compare the levels
of each diversity estimate between regions (north-west and north-central
hybrid regions, and allopatry).
Population and regional tests for Hardy-Weinberg disequilibrium per
locus were estimated using the 5,702 SNPs using PLINKs mid-p modifier
(Graffelman & Moreno, 2013). Then, the ratios of SNPs at HW
disequilibrium (p < 0.05) to the total number of
genotyped SNPs, as well as average p -values per population were
calculated.