2.8 Introgression
Gene flow between blue and fin whale genomes was investigated in
present-day and historical samples, and gene flow between blue whales
and humpback and sei whales was also explored. According to the species
tree of Árnason et al. (2018), sei whales are the species most closely
related to blue whales, and humpback whales are the species most closely
related to fin whales. Gene flow between these large baleen whales
was examined using D-statistics (Green et al., 2010) in the present-day
and historical whale samples to assess whether the frequency of
hybridization and introgression has changed through time.
The D-statistic, or ABBA/BABA test, was used to study introgression
between blue and fin whales. A four-taxon phylogeny of (((Antarctic
Blue, NA Blue), Fin), Minke) was analyzed for present-day and historical
samples. D-statistics were estimated for each of the NA blue whales
relative to an Antarctic blue whale, as they belong to distinct genetic
clusters identified by the PCA and TREEMIX analyses. Additionally,
blue-fin hybrids have been reported mostly from the NA and the North
Pacific (Pampoulie et al., 2020) and not from Antarctica, making the
Antarctic sample a better reference for introgression analysis in
blue whales. The minke whale (SRR896642, Yim et al. 2014) was the
outgroup. The whole genome sequences aligned to the masked autosomal NA
blue whale assembly were used in this analysis. The ABBA/BABA tests,
where “A” is the ancestral allele and “B” is the derived allele,
were performed in ANGSD. Sites were filtered for a quality score of
>20 and a mapping quality of >30. The SNPs from the
historical samples were further filtered with the -rmTrans parameter,
which excludes transitions and thereby removes sites potentially
affected by cytosine deamination. A jackknife procedure was used
to estimate standard errors. Similarly, to study blue-humpback and
blue-sei whale introgression, the analyses were conducted for
(((Antarctic Blue, NA Blue), Humpback), Minke) and (((Antarctic Blue,
NA Blue), Sei), Minke), using the humpback whale (Tollis et al., 2019)
and sei whale (SRR5665645) genomes.
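The D-statistic and its jackknife standard error described above can be sketched as follows. This is a minimal illustration, not the ANGSD implementation: the per-block ABBA and BABA counts are hypothetical inputs, the function names are invented for this example, and the equal-weight delete-one block jackknife is a simplifying assumption. With the numerator written as ABBA minus BABA for (((Antarctic Blue, NA Blue), Fin), Minke), a positive D indicates excess allele sharing between the NA blue whale and the fin whale.

```python
from math import sqrt


def d_statistic(abba, baba):
    """D = (sum ABBA - sum BABA) / (sum ABBA + sum BABA) over genomic blocks."""
    total_abba = sum(abba)
    total_baba = sum(baba)
    return (total_abba - total_baba) / (total_abba + total_baba)


def block_jackknife(abba, baba):
    """Return (D, standard error, Z-score) from per-block pattern counts.

    Assumption: an equal-weight delete-one block jackknife; blocks of
    unequal size would call for a weighted version.
    """
    n = len(abba)
    d_full = d_statistic(abba, baba)

    # Recompute D with each block left out in turn.
    leave_one_out = [
        d_statistic(abba[:i] + abba[i + 1:], baba[:i] + baba[i + 1:])
        for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    se = sqrt(variance)

    # Z is undefined when all leave-one-out estimates coincide.
    z = d_full / se if se > 0 else float("nan")
    return d_full, se, z


# Hypothetical counts for four genomic blocks (illustrative only).
abba_counts = [10, 12, 8, 11]
baba_counts = [5, 6, 4, 7]
d, se, z = block_jackknife(abba_counts, baba_counts)
print(f"D = {d:.4f}, SE = {se:.4f}, Z = {z:.2f}")
```

Swapping the two count lists flips the sign of D, which corresponds to exchanging the positions of the two blue whale samples in the tree.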
To detect the direction of gene flow and quantify introgression, the
Dfoil statistic (Pease & Hahn, 2014) was employed. Dfoil is a
five-taxon test and the phylogenetic relationship tested here was
(((Sei, NA Blue), (Fin, Humpback)), Minke). The Dfoil analyses included
VCF files from whole genome sequences of blue, sei (SRR5665645),
fin, humpback (Tollis et al., 2019) and minke (SRR896642) whales, aligned
to the masked autosomal NA blue whale assembly and processed as
described above. The SNPs were filtered for missingness (>0.50),
spacing (>10 bases apart), quality score (>30), mapping quality
(>30), coverage depth (between 3X and 130X) and MAF
(>0.1) using VCFTOOLS and PLINK (Purcell et al., 2007). The
fin whale used in the introgression analysis was also tested against
another known fin whale to verify its genetic identity; the result was
consistent with the PCA (Fig. 2B), in which it clustered with the
other seven NA fin whales.
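Some of the SNP filters listed above can be illustrated with a short sketch. It is not the VCFTOOLS or PLINK implementation: the record layout (dictionaries with `chrom`, `pos`, `depth` and `maf` keys) and the function name are hypothetical, depth is assumed to be a single per-site value, and the spacing filter is assumed to be a greedy pass that keeps a SNP only if it lies more than 10 bases from the previously kept SNP on the same chromosome. The missingness and quality filters are left out because the text does not specify how they were applied.

```python
def filter_snps(snps, min_depth=3, max_depth=130, min_maf=0.1, min_gap=10):
    """Apply depth, MAF and spacing filters to position-sorted SNP records.

    Assumption: `snps` is a list of dicts sorted by chromosome and position,
    each with 'chrom', 'pos', 'depth' and 'maf' keys (hypothetical layout).
    """
    kept = []
    last_kept_pos = {}  # last retained position per chromosome
    for snp in snps:
        # Coverage depth between 3X and 130X (inclusive bounds assumed).
        if not (min_depth <= snp["depth"] <= max_depth):
            continue
        # Minor allele frequency must exceed 0.1.
        if snp["maf"] <= min_maf:
            continue
        # Retained SNPs must be more than 10 bases apart.
        previous = last_kept_pos.get(snp["chrom"])
        if previous is not None and snp["pos"] - previous <= min_gap:
            continue
        kept.append(snp)
        last_kept_pos[snp["chrom"]] = snp["pos"]
    return kept


# Hypothetical records (illustrative only).
example = [
    {"chrom": "chr1", "pos": 100, "depth": 20, "maf": 0.30},
    {"chrom": "chr1", "pos": 105, "depth": 20, "maf": 0.30},  # too close
    {"chrom": "chr1", "pos": 111, "depth": 20, "maf": 0.30},
    {"chrom": "chr1", "pos": 120, "depth": 2, "maf": 0.30},   # depth too low
    {"chrom": "chr1", "pos": 130, "depth": 50, "maf": 0.05},  # MAF too low
    {"chrom": "chr2", "pos": 103, "depth": 40, "maf": 0.25},
]
retained = [(s["chrom"], s["pos"]) for s in filter_snps(example)]
print(retained)
```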
Dfoil analyses with a phylogeny of (((NA Blue 1, NA Blue 2), (Fin,
Humpback)), Minke) were also performed. The historical whale samples and
a low-coverage sample were not included in the Dfoil analyses because of
their shallow genome coverage.