The relative amounts of extranuclear and insert DNA
Across all three species and individuals analysed, the proportion of the
data that could be aligned successfully to the appropriate mitochondrial
reference was less than 3%, indicating that the largest part of each
(cleaned) data set comprised nuclear DNA. Mitochondrial variant calling
within each sample revealed allele frequencies that differed widely
among individuals, consistent with different ratios of NUMTs to
mitochondrial DNA. The allele frequencies at specific sites
were correlated among samples and covaried with the proportion of the
data that could be aligned to the mitochondrial genome (diagonal lines
in Figure 2). This linear relationship is predicted by equation [1]:
because most of the variation was due to differences in the proportion
of mitochondrial DNA in different samples (rather than differences in
allele frequencies in the NUMTs), samples with more mitochondrial DNA
had lower NUMT allele frequencies and vice versa.
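
The inverse dependence described above can be illustrated with a minimal sketch. It assumes a simple two-component mixture in which reads aligned to the mitochondrial reference derive either from NUMTs or from true mitochondrial DNA; this is an illustrative assumption and is not necessarily the form of equation [1]. The function name, the parameter names and the depth values are hypothetical and are not taken from the data.

```python
def observed_numt_allele_frequency(numt_depth, mito_depth, numt_allele_fraction=1.0):
    """Frequency of a NUMT-derived allele among reads aligned to the
    mitochondrial reference, under an assumed two-component mixture.

    numt_depth           -- read depth contributed by NUMTs at the site (hypothetical)
    mito_depth           -- read depth contributed by true mitochondrial DNA (hypothetical)
    numt_allele_fraction -- fraction of NUMT reads carrying the allele (assumed)
    """
    total_depth = numt_depth + mito_depth
    if total_depth <= 0:
        raise ValueError("total depth must be positive")
    return numt_allele_fraction * numt_depth / total_depth


# Hold the NUMT contribution fixed and vary only the mitochondrial depth,
# mimicking samples that differ in their proportion of mitochondrial DNA.
numt_depth = 30.0
for mito_depth in (300.0, 3000.0, 30000.0):
    freq = observed_numt_allele_frequency(numt_depth, mito_depth)
    print(f"mito depth {mito_depth:>8.0f}  NUMT allele frequency {freq:.5f}")
```

Under these assumptions, when mitochondrial depth greatly exceeds NUMT depth the observed frequency is approximately inversely proportional to mitochondrial depth, so samples fall along a straight line when both quantities are plotted on logarithmic axes.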