Grasshopper data
Our de novo assembly of the P. pedestris mitochondrial genome was 16,008 bp in length. It contained 37 genes: two rRNAs, 13 protein-coding genes, and 22 tRNAs. The highly AT-rich D-loop region contained a direct repeat of 383 bp. Genes made up 14,664 bp of the assembly, corresponding to 91.6% of its length. There were 11,176 sites in coding regions, 1,416 of which were four-fold degenerate. Overall, the P. pedestris mitochondrial genome is similar in size and structure to those of other animals (Boore, 1999).
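The summary figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python using only the numbers quoted in the text; the variable names are illustrative and are not taken from the authors' pipeline.

```python
# Figures quoted in the text (lengths in bp, counts in sites or genes).
ASSEMBLY_LENGTH_BP = 16_008
GENIC_BP = 14_664
CODING_SITES = 11_176
FOURFOLD_SITES = 1_416
GENE_COUNTS = {"rRNA": 2, "protein_coding": 13, "tRNA": 22}

# Total gene count should match the 37 genes reported.
total_genes = sum(GENE_COUNTS.values())

# Share of the assembly occupied by genes.
genic_fraction = GENIC_BP / ASSEMBLY_LENGTH_BP

# Share of coding sites that are four-fold degenerate
# (this ratio is derived here and is not stated in the text).
fourfold_fraction = FOURFOLD_SITES / CODING_SITES

print(f"genes: {total_genes}")
print(f"genic fraction: {genic_fraction:.1%}")
print(f"four-fold degenerate share of coding sites: {fourfold_fraction:.1%}")
```

The genic fraction reproduces the reported 91.6%; the four-fold degenerate share is a derived quantity and is shown only for orientation.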
The sequence data obtained for each individual corresponded to coverages of 0.056x to 0.34x (median 0.069x). The mapping rates to the mitochondrial reference varied considerably between samples, ranging from 0.08% to 2.6%, reflecting varying ratios of nuclear DNA to (true) mitochondrial DNA among samples. A PCA revealed that the samples clustered into two groups (mitotypes), as we expected from the known chromosomal polymorphism in the species. After excluding individuals with low mapping depth, where genotype calls might have been confounded with NUMT variants, we identified 111 sites with fixed allelic differences between the two groups, corresponding to a divergence of 0.69% (Figure 3). Applying Brower’s (1994) rate of 2.3% pairwise divergence per million years gave a crude estimate of 300,000 years for the divergence between the two populations.
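The divergence and dating figures follow from simple arithmetic, sketched below in Python. The sketch assumes that the 0.69% divergence is the 111 fixed differences divided by the full assembly length of 16,008 bp (the text does not state the denominator explicitly), and the function names are hypothetical.

```python
# Figures quoted in the text.
FIXED_DIFFERENCES = 111
ASSEMBLY_LENGTH_BP = 16_008   # assumed denominator for the divergence
PAIRWISE_RATE_PER_MY = 0.023  # Brower (1994): 2.3% pairwise divergence per million years


def pairwise_divergence(n_diff: int, n_sites: int) -> float:
    """Proportion of sites that differ between the two groups."""
    return n_diff / n_sites


def divergence_time_years(divergence: float, rate_per_my: float) -> float:
    """Time since the split, given a pairwise rate per million years."""
    return divergence / rate_per_my * 1_000_000


divergence = pairwise_divergence(FIXED_DIFFERENCES, ASSEMBLY_LENGTH_BP)
time_years = divergence_time_years(divergence, PAIRWISE_RATE_PER_MY)

print(f"divergence: {divergence:.2%}")
print(f"divergence time: about {time_years:,.0f} years")
```

Because Brower's rate is already a pairwise rate, the divergence is divided by it directly rather than by twice a per-lineage rate. Under these assumptions the result rounds to the crude 300,000-year estimate given above.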
Our estimates for the nuclear proportion of NUMTs were 0.056% (95% CI: 0.048%-0.065%; intercept estimate) and 0.077% (mapping depth estimate). Because there were 111 SNPs with fixed differences between the northern and southern groups, we could additionally obtain a diverged sites estimate. Figure 4 shows the allele frequency scores for the 111 loci, resulting in an overall estimate of 0.055% (SE 0.0012%).