Figures
Figure 1. A schematic showing the proportions of reads
in four different categories. In samples 1, 2 and 3, the first two
categories of read cannot be distinguished by mapping because
(organellar) mitochondrial reads and NUMT reads carry the same SNP
allele (as an example, the A allele). Only the n reads carrying
an alternative allele (T, G or C) can be classified as of NUMT origin.
Sample 4 has a different allele (T, green) in its organellar mitochondrial
genome; hence, in this case, the k orange reads carrying the
alternative alleles (A, G or C) can be classified as NUMTs. We wish to
estimate the proportion of the nuclear DNA which consists of NUMTs, v/N,
but this ratio cannot be observed directly in any one of
these samples. Fraction m denotes all reads with sequence
similarity to the mitochondrial genome.
Figure 2. Change in frequency of NUMT alleles with the
mitochondrial mapping ratio (un-mapped / mapped). The raw data are
plotted as circles. The vertical stacks of points represent frequencies
of different loci from the same sample, each locus being given a
different colour (from red to violet according to the global average
allele frequency). The lines with the corresponding colour show the best
fit to equation [1]. Both axes are logarithmic. The intercept on the
log scale will correspond to x=1 (half the reads mapping to the
mitochondrial genome, log(1) = 0). The two solid lines correspond to the
minimum proportion of reads mapping to the mitochondrial genome across
all samples (vertical) and to the intercept of the SNPs with the highest
fitted allele frequency (horizontal). The dashed line shows the fitted
relationship for the locus with the highest estimated allele frequency.
The values of the horizontal solid and dashed lines at x=1 are both
estimates of the proportion of the genome composed of NUMTs (intercept
estimate).
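The intercept reading described for Figure 2 can be illustrated with a minimal sketch. Equation [1] is not reproduced in this section, so the sketch assumes a plain straight line on log-log axes as a stand-in; the function names and the data values are hypothetical and are not taken from the study.

```python
import math

def fit_loglog(xs, ys):
    """Least-squares straight line through (log10 x, log10 y).

    Returns (slope, intercept) on the log10 scale. This is an assumed
    stand-in for equation [1], which is not shown in this section.
    """
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def value_at_x1(intercept):
    """Back-transform the log10 intercept, i.e. the fitted value at x = 1."""
    return 10 ** intercept

# Hypothetical data: allele frequency proportional to the mapping ratio.
ratios = [0.01, 0.05, 0.2, 0.5]        # un-mapped / mapped (made up)
freqs = [0.002 * r for r in ratios]    # NUMT allele frequencies (made up)

slope, intercept = fit_loglog(ratios, freqs)
estimate = value_at_x1(intercept)      # fitted frequency where log10(x) = 0
```

Because log(1) = 0, the fitted line evaluated at x = 1 is simply the back-transformed intercept, which is why the caption refers to it as the intercept estimate.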
Figure 3. Sequence relationships and spatial
distribution of grasshopper mitotypes. (a) A dendrogram based on (true)
mitochondrial sequence polymorphism between samples of the grasshopper
P. pedestris. The scale bar indicates the number of substitutions
per mitochondrial genome. The symbols match the legend in (b). (b) The
locations of the sampling sites and the distribution of two diverged
mitotypes.
Figure 4. Insert quantification in diverged populations.
Within-individual allele frequencies are shown for sites with fixed
mitochondrial differences between two populations of the grasshopper
P. pedestris (top, 111 sites) and the parrot P. varius (bottom, 178
sites). Both histograms share the same x axis. Thick vertical
lines indicate the means, dashed lines the associated standard errors.