3.5.1 Structure and principal component analyses
Genetic structure was analysed with ADMIXTURE using the full set of 5,702 SNPs, both unsupervised and supervised (i.e., using the allopatric samples as ancestry references) (Supplementary Figure S4). The unsupervised analyses indicated that the most likely number of ancestral populations (i.e., the K with the lowest cross-validation (CV) error) was K=2 (Supplementary Figure S5), although the CV errors for K=2 and K=3 were similar. For K=2, the two genetic clusters corresponded closely to I. graellsii and I. elegans, respectively; for K=3, a third cluster was found among I. elegans populations from the north-west hybrid region. For both K=2 and K=3, many samples showed admixed ancestry (Supplementary Figure S4).
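As an illustration of the K-selection criterion described above, the following minimal sketch picks the K with the lowest CV error from ADMIXTURE log output and flags any K whose error is nearly as low. It assumes the logs contain lines of the form "CV error (K=2): 0.41", as printed when ADMIXTURE is run with cross-validation enabled. The function names, the tolerance, and the error values are hypothetical and are not the values obtained in this study.

```python
import re

# Matches lines such as "CV error (K=2): 0.41230" in ADMIXTURE log output.
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def parse_cv_errors(log_text):
    """Return {K: cv_error} for every CV line found in the log text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Return the K with the lowest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)


def near_ties(cv_errors, tol):
    """Return all K whose CV error lies within `tol` of the minimum."""
    lowest = min(cv_errors.values())
    return sorted(k for k, err in cv_errors.items() if err - lowest <= tol)


# Hypothetical values for illustration only, not the study's results.
example_log = """\
CV error (K=1): 0.60
CV error (K=2): 0.41
CV error (K=3): 0.42
CV error (K=4): 0.47
"""

errors = parse_cv_errors(example_log)
chosen = best_k(errors)               # K with the lowest CV error
candidates = near_ties(errors, 0.02)  # K values worth inspecting together
```

Reporting the near-tied values alongside the minimum mirrors the caution expressed above: when two values of K have similar CV errors, the ancestry assignments for both are worth examining.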
PCA allowed us to cluster I. elegans, I. graellsii and hybrids from allopatry and from the three hybrid regions. The first axis explained 39% of the total variation and clearly separated I. elegans and I. graellsii individuals from allopatric localities, while the second axis explained 2% of the total variation and separated some of the individuals from the north-central and north-west hybrid regions (Supplementary Figure S6). Consistent with the ADMIXTURE results, many individuals from the three hybrid regions appeared in the same PCA quadrants as the pure species from the allopatric distribution, while hybrids occupied intermediate positions along the first axis.
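The percentage of variation attributed to each PCA axis is the ratio of that axis's eigenvalue to the sum of all eigenvalues. The sketch below shows this calculation under the assumption that the full set of eigenvalues is available as a list of numbers; if only the leading eigenvalues are reported, the resulting proportions are overestimates. The function name and the toy eigenvalues are hypothetical and unrelated to the data analysed here.

```python
def explained_variance(eigenvalues):
    """Return the proportion of total variance explained by each axis."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    return [value / total for value in eigenvalues]


# Toy eigenvalues for illustration only, not taken from the study.
toy_eigenvalues = [6.0, 3.0, 1.0]
proportions = explained_variance(toy_eigenvalues)

# Express each axis as a percentage of the total variation.
percentages = [round(100 * p, 1) for p in proportions]
```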