2.8 Introgression
Gene flow between blue and fin whales was investigated using present-day and historical samples, and gene flow between blue whales and humpback and sei whales was also explored. Based on the species tree of Árnason et al. (2018), sei whales are the species most closely related to blue whales, and humpback whales are the species most closely related to fin whales. Gene flow among these large baleen whales was examined using D-statistics (Green et al., 2010) in the present-day and historical samples to assess whether the frequency of hybridization and introgression has changed through time.
The D-statistic, or ABBA/BABA test, was used to study introgression between blue and fin whales. A four-taxon phylogeny of (((Antarctic Blue, NA Blue), Fin), Minke) was analyzed for present-day and historical samples. D-statistics were estimated for each NA blue whale relative to an Antarctic blue whale, because the two groups form distinct genetic clusters in the PCA and TREEMIX analyses. Additionally, blue-fin hybrids have been reported mostly from the NA and the North Pacific (Pampoulie et al., 2020) and not from Antarctica, which makes the Antarctic sample the better reference for introgression analysis in blue whales. The minke whale (SRR896642, Yim et al., 2014) was the outgroup. The whole genome sequences aligned to the masked autosomal NA blue whale assembly were used in this analysis. The ABBA/BABA tests, where “A” is the ancestral allele and “B” is the derived allele, were performed in ANGSD. Sites were filtered for a quality score >20 and a mapping quality >30. Sites from the historical samples were further filtered with the -rmTrans parameter, which excludes transitions, to limit the effect of post-mortem cytosine deamination. Standard errors were estimated with the jackknife procedure. Similarly, to study introgression between blue and humpback whales (Tollis et al., 2019) and between blue and sei whales (SRR5665645), the analyses were conducted for (((Antarctic Blue, NA Blue), Humpback), Minke) and (((Antarctic Blue, NA Blue), Sei), Minke).
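As an illustration of the quantity being tested, the sketch below computes D = (nABBA - nBABA) / (nABBA + nBABA) from site-pattern counts and attaches a delete-one block jackknife standard error. It is not the ANGSD implementation: the function names and the per-block counts are hypothetical, and the jackknife is the simple equal-weight form, whereas a weighted version is normally preferred when blocks differ in the number of informative sites. With the tree (((Antarctic Blue, NA Blue), Fin), Minke) and this sign convention, a positive D would indicate excess derived-allele sharing between the NA blue whale and the fin whale.

```python
import math


def d_statistic(abba, baba):
    """Return D = (ABBA - BABA) / (ABBA + BABA) for two site-pattern counts."""
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / total


def block_jackknife_d(blocks):
    """Estimate D, its standard error, and a Z-score from per-block counts.

    blocks: list of (abba, baba) tuples, one per genomic block
            (hypothetical input; block size is the analyst's choice).
    Uses an equal-weight delete-one jackknife, a simplifying assumption.
    """
    n = len(blocks)
    if n < 2:
        raise ValueError("at least two blocks are needed")
    total_abba = sum(a for a, _ in blocks)
    total_baba = sum(b for _, b in blocks)
    d_full = d_statistic(total_abba, total_baba)

    # Recompute D with each block left out in turn.
    leave_one_out = [
        d_statistic(total_abba - a, total_baba - b) for a, b in blocks
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    se = math.sqrt(variance)
    z = d_full / se if se > 0 else float("nan")
    return d_full, se, z


# Made-up counts for four blocks, for illustration only.
example_blocks = [(12, 8), (10, 10), (15, 9), (11, 7)]
d, se, z = block_jackknife_d(example_blocks)
print(f"D = {d:.4f}, SE = {se:.4f}, Z = {z:.2f}")
```

An absolute Z-score above about 3 is a commonly used threshold for treating D as significantly different from zero.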
To detect the direction of gene flow and quantify introgression, the Dfoil statistic (Pease & Hahn, 2014) was employed. Dfoil is a five-taxon test, and the phylogenetic relationship tested here was (((Sei, NA Blue), (Fin, Humpback)), Minke). The Dfoil analyses used VCF files from whole genome sequences of blue, sei (SRR5665645), fin, humpback (Tollis et al., 2019) and minke (SRR896642) whales aligned to the masked autosomal NA blue whale assembly and processed as described above. SNPs were filtered using VCFTOOLS and PLINK (Purcell et al., 2007) with the following thresholds: missing >0.50, SNPs >10 bases apart, quality score >30, mapping quality >30, coverage depth between 3X and 130X, and MAF >0.1. The fin whale used in the introgression analysis was also tested against another known fin whale to verify its genetic identity; the result was consistent with the PCA (Fig. 2B), in which it clustered with the other seven NA fin whales.
Dfoil analyses were also run with the phylogeny (((NA Blue 1, NA Blue 2), (Fin, Humpback)), Minke). The historical whale samples and one low-coverage sample were excluded from the Dfoil analyses because of their shallow genome coverage.
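For orientation, the four component statistics of Dfoil (DFO, DIL, DFI, DOL) can be sketched from five-taxon site-pattern counts as below. This is not the dfoil program: the pattern groupings follow the definitions in Pease & Hahn (2014) as transcribed here and should be checked against that publication, and the function name and input dictionary are hypothetical. Patterns are written in the order P1, P2, P3, P4, outgroup, which for the first phylogeny above would correspond to Sei, NA Blue, Fin, Humpback, Minke.

```python
# Left and right pattern sets for each statistic, transcribed from
# Pease & Hahn (2014); verify against the paper before relying on them.
DFOIL_TERMS = {
    "DFO": (("BABAA", "BBBAA", "ABABA", "AAABA"),
            ("BAABA", "BBABA", "ABBAA", "AABAA")),
    "DIL": (("ABBAA", "BBBAA", "BAABA", "AAABA"),
            ("ABABA", "BBABA", "BABAA", "AABAA")),
    "DFI": (("BABAA", "BABBA", "ABABA", "ABAAA"),
            ("ABBAA", "ABBBA", "BAABA", "BAAAA")),
    "DOL": (("BAABA", "BABBA", "ABBAA", "ABAAA"),
            ("ABABA", "ABBBA", "BABAA", "BAAAA")),
}


def dfoil_statistics(counts):
    """Return the four Dfoil component statistics from pattern counts.

    counts: dict mapping a five-letter site pattern (e.g. "BABAA") to the
            number of sites showing it; absent patterns count as zero.
    Each statistic is (left - right) / (left + right); NaN if no sites.
    """
    results = {}
    for name, (left_patterns, right_patterns) in DFOIL_TERMS.items():
        left = sum(counts.get(p, 0) for p in left_patterns)
        right = sum(counts.get(p, 0) for p in right_patterns)
        total = left + right
        results[name] = (left - right) / total if total else float("nan")
    return results


# Made-up counts, for illustration only.
example_counts = {"BABAA": 40, "ABBAA": 25, "BAABA": 22, "ABABA": 24}
for stat, value in dfoil_statistics(example_counts).items():
    print(stat, round(value, 3))
```

In the published method, the combination of signs across the four statistics, each tested for significance, is what is used to infer the donor and recipient lineages.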