The relative amounts of extranuclear and insert DNA
Across all three species and all individuals analysed, less than 3% of the data could be aligned to the appropriate mitochondrial reference, indicating that most of each (cleaned) data set comprised nuclear DNA. Variant calling against the mitochondrial reference within each sample revealed allele frequencies that varied widely among individuals, consistent with different ratios of NUMTs to mitochondrial DNA. The allele frequencies at specific sites were correlated among samples and covaried with the proportion of the data that could be aligned to the mitochondrial genome (diagonal lines in Figure 2). This linear relationship is predicted by equation [1]. Because most of the variation was due to differences in the proportion of mitochondrial DNA among samples (rather than differences in allele frequencies in the NUMTs), samples with more mitochondrial DNA had lower NUMT allele frequencies and vice versa.
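The inverse relationship between mitochondrial content and NUMT allele frequency can be illustrated with a simple mixing model. This is a minimal sketch under stated assumptions, not a reproduction of equation [1]. It assumes that the reads covering a site come from either true mitochondrial DNA or a NUMT copy, and that the NUMT contributes a fixed fraction of the total data in every individual. The function name, the variable names and all numerical values are hypothetical. Under these assumptions the NUMT allele frequency is the NUMT fraction divided by the mapped fraction, which gives a straight line of slope -1 when both axes are logarithmic.

```python
import math


def numt_allele_frequency(numt_fraction, mapped_fraction):
    """Expected frequency of a NUMT-derived allele among reads mapped to a site.

    numt_fraction   -- fraction of all reads derived from the NUMT copy
                       (assumed constant across individuals; hypothetical)
    mapped_fraction -- fraction of all reads aligning to the mitochondrial
                       reference at comparable depth (NUMT plus true mtDNA)
    """
    if not 0 < numt_fraction <= mapped_fraction <= 1:
        raise ValueError("need 0 < numt_fraction <= mapped_fraction <= 1")
    return numt_fraction / mapped_fraction


# Hypothetical values: one NUMT allele, three individuals that differ only
# in how much of their data maps to the mitochondrial genome (all below 3%).
numt_fraction = 1e-5
mapped_fractions = [0.005, 0.01, 0.02]

freqs = [numt_allele_frequency(numt_fraction, p) for p in mapped_fractions]

# Slope of log(frequency) against log(mapped fraction) between the extremes.
slope = (math.log(freqs[-1]) - math.log(freqs[0])) / (
    math.log(mapped_fractions[-1]) - math.log(mapped_fractions[0])
)

for p, f in zip(mapped_fractions, freqs):
    print(f"mapped fraction {p:.3f} -> NUMT allele frequency {f:.5f}")
print(f"log-log slope: {slope:.2f}")
```

In this toy model, doubling the mapped fraction halves the NUMT allele frequency, which is the pattern described above: samples with more mitochondrial DNA show lower NUMT allele frequencies.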