SNP filtering
The SNP data and associated metadata received from DArT Pty Ltd were read into a genlight object, as implemented in the R package adegenet (Jombart, 2008), to facilitate subsequent processing with the R package dartR (Gruber et al., 2018). We created two datasets by applying different filters to the initial 19,903 polymorphic SNP loci: one for the phylogenetic analysis (‘phylo’ dataset) and one for the PCoA and fixed difference analyses (‘PCoA’ dataset). For the phylo dataset, we first removed any obviously introgressed individuals within the MDB (identified using the PCoA dataset), as reticulation events are not compatible with bifurcating trees. Loci were then retained only if their repeatability was greater than 0.99 and their call rate was above 0.6. The PCoA dataset included all individuals and was first filtered to retain loci with repeatability greater than 0.99. The second filtering step removed all secondary loci (additional loci found within the same sequenced fragment), retaining from each fragment the locus with the highest polymorphism information content (PIC) value. Finally, only loci with a call rate above 0.9 were retained. These additional filtering steps were applied to the PCoA dataset because the two analyses based on it (ordination and the calculation of fixed differences) are sensitive to excessive missing values and/or tightly linked loci. The data remaining after these primary filtering steps are regarded as highly reliable for both datasets. The PCoA dataset was also used for each of the additional (stepwise) PCoA analyses, each of which compared a subset of individuals; for each subset, any loci that had become monomorphic were removed.
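The filtering order described above can be illustrated with a minimal Python sketch. It is not the dartR code used in this study. The `Locus` class, its field names, and all function names are hypothetical and exist only for illustration. The sketch assumes genotypes coded as 0, 1 or 2 with `None` for a missing call, that ties in PIC are broken by keeping the first locus encountered, and that call rate in the phylo dataset is computed after the introgressed individuals are removed; none of these details is specified in the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Locus:
    """One SNP locus (hypothetical structure, not a dartR/adegenet type)."""
    locus_id: str
    fragment_id: str                 # sequenced fragment the SNP sits on
    repeatability: float             # proportion of consistent replicate calls
    pic: float                       # polymorphism information content
    genotypes: List[Optional[int]]   # 0/1/2 per individual, None if missing


def call_rate(locus: Locus) -> float:
    """Proportion of individuals with a non-missing genotype."""
    n = len(locus.genotypes)
    return sum(g is not None for g in locus.genotypes) / n if n else 0.0


def filter_repeatability(loci: Sequence[Locus], threshold: float) -> List[Locus]:
    """Keep loci whose repeatability is strictly greater than the threshold."""
    return [l for l in loci if l.repeatability > threshold]


def filter_secondaries(loci: Sequence[Locus]) -> List[Locus]:
    """Keep one locus per fragment: the one with the highest PIC."""
    best = {}
    for l in loci:
        current = best.get(l.fragment_id)
        if current is None or l.pic > current.pic:  # first locus wins ties
            best[l.fragment_id] = l
    kept = {id(l) for l in best.values()}
    return [l for l in loci if id(l) in kept]       # preserve input order


def filter_call_rate(loci: Sequence[Locus], threshold: float) -> List[Locus]:
    """Keep loci whose call rate is strictly greater than the threshold."""
    return [l for l in loci if call_rate(l) > threshold]


def subset_individuals(loci: Sequence[Locus], keep: Sequence[int]) -> List[Locus]:
    """Restrict every locus to the individuals at the given indices."""
    return [
        Locus(l.locus_id, l.fragment_id, l.repeatability, l.pic,
              [l.genotypes[i] for i in keep])
        for l in loci
    ]


def drop_monomorphic(loci: Sequence[Locus]) -> List[Locus]:
    """Drop loci with no calls, or with only one homozygous state called."""
    def polymorphic(l: Locus) -> bool:
        called = {g for g in l.genotypes if g is not None}
        return bool(called) and called != {0} and called != {2}
    return [l for l in loci if polymorphic(l)]


def pcoa_filter(loci: Sequence[Locus]) -> List[Locus]:
    """PCoA dataset: repeatability > 0.99, drop secondaries, call rate > 0.9."""
    loci = filter_repeatability(loci, 0.99)
    loci = filter_secondaries(loci)
    return filter_call_rate(loci, 0.9)


def phylo_filter(loci: Sequence[Locus], keep: Sequence[int]) -> List[Locus]:
    """Phylo dataset: drop individuals, then repeatability > 0.99, call rate > 0.6."""
    loci = subset_individuals(loci, keep)
    loci = filter_repeatability(loci, 0.99)
    return filter_call_rate(loci, 0.6)
```

For a stepwise PCoA on a subset of individuals, the sketch would be applied as `drop_monomorphic(subset_individuals(pcoa_filter(loci), keep))`, where `keep` lists the indices of the individuals being compared.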