2.5.1 Principal component analysis
PCA was performed using LASER v2.04 (Wang et al., 2015), which uses
projection Procrustes analysis to place samples with low depth of
coverage in the context of a reference PCA space constructed
using genotypes of a set of reference individuals with higher coverage
depth. The first PCA included blue, fin and sei whale samples:
18 NA blue whales (four present-day and six historical samples
from NWA, and eight present-day samples from NEA), one historical sample
from an Antarctic blue whale (Table 1), seven present-day fin whales
from NA and two sei whales (SRR5665645 and SRR5665646). Historical
samples NWa-R4, NWa3, NWa4, NWa5, NWa6 and NWa-CM1 were included in this
analysis. The second PCA visualized the genetic relationships
among blue whales and included 12 present-day blue whales sampled
from both sides of the NA, three historical samples from the NWA
(NWa-R4, NWa3 and NWa4) and the historical sample from Antarctica. The
trimmed sequences of the blue, fin and sei whales were aligned
to the autosomes of the assembled NA blue whale genome, variants were detected and
VCF files were generated. The biallelic SNPs for the PCA were
filtered to retain sites present in at least 50% of the samples,
>10 bases apart, with a quality score of >30,
mapping quality of >30, coverage depth between 3X and 130X
and MAF of >0.1. The sites were then filtered for linkage
disequilibrium by eliminating sites with a squared correlation coefficient
(r²) > 0.8 within a 1 kb window. After filtering,
4,136,458 and 2,620,383 sites were used for the first and second PCA,
respectively. The ancestry reference PCA space for the first
PCA was constructed using 11 present-day NA blue whales with
>20X coverage (Table 1), two present-day fin whales with
>20X coverage and two sei whales with ~10X
coverage. The reference PCA space for the second PCA was computed
from four NWA and six NEA present-day blue whales with high
coverage and one sei whale (SRR5665645).
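
The Procrustes step mentioned at the start of this section can likewise be illustrated with a minimal sketch. It assumes that the reference individuals have coordinates both in a sample-specific PCA space and in the reference PCA space, fits the scaling, orthogonal transformation and translation that best map the former onto the latter, and applies that transformation to the study sample. The function names and the use of NumPy are choices made for this example. It is not LASER's implementation, which involves further steps not described here.

```python
import numpy as np


def fit_procrustes(source, target):
    """Least-squares similarity transform mapping source onto target.

    source, target: (n_individuals, n_components) arrays holding the same
    individuals in two coordinate systems. Returns scale s, orthogonal
    matrix R and translation t such that s * source @ R + t approximates target.
    """
    mu_s = source.mean(axis=0)
    mu_t = target.mean(axis=0)
    S = source - mu_s                      # centre both configurations
    T = target - mu_t
    U, sigma, Vt = np.linalg.svd(S.T @ T)  # SVD of the cross-product matrix
    R = U @ Vt                             # optimal orthogonal transformation
    s = sigma.sum() / (S ** 2).sum()       # optimal scaling factor
    t = mu_t - s * mu_s @ R                # translation restoring the target centroid
    return s, R, t


def project(coords, s, R, t):
    """Map coordinates from the source space into the target space."""
    return s * coords @ R + t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated coordinates of 15 reference individuals in a sample-specific space
    ref_in_sample_space = rng.normal(size=(15, 2))
    # Simulated reference space: a rotated, scaled and shifted copy
    theta = 0.7
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    ref_in_reference_space = 2.5 * ref_in_sample_space @ rot + np.array([1.0, -3.0])

    s, R, t = fit_procrustes(ref_in_sample_space, ref_in_reference_space)
    study_sample = np.array([0.3, -0.2])   # study sample in the sample-specific space
    print(project(study_sample, s, R, t))  # its position in the reference space
```

Because R is only constrained to be orthogonal, the fitted transformation may include a reflection as well as a rotation.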