2.1 Genome assembly and annotation
Muscle tissue for the genome assembly was collected by the Royal Ontario
Museum (ROM), Toronto, ON, with the approval of the Minister of
Fisheries and Oceans, Canada (SARA permit ref: NLSAR-003-14) from a
female blue whale that died close to Newfoundland in 2014 (NW-M6, Fig.
1, Table 1). The Illumina and PacBio reads for the genome assembly were
generated at The Centre for Applied Genomics (TCAG), The Hospital for
Sick Children, Toronto, Canada. The genome was assembled using the
hybrid assembler MASURCA v3.2.8 (Koren et al., 2012; see Supplemental
Information for more details).
RNA for the transcriptome assembly was collected from a skin biopsy of a
blue whale sampled in the Svalbard Archipelago (79°N), Norway (Fig. 1).
The paired-end RNAseq data were generated on a HiSeq 2500 at TCAG. The
transcripts were assembled with TRINITY (Grabherr et al., 2011) and
TOPHAT (Trapnell et al., 2009) as elaborated in Supplemental
Information. The masked genome was annotated using the MAKER2 pipeline
(Holt & Yandell, 2011), with the blue whale transcriptome and NCBI
proteins for cow and all cetaceans as evidence, as explained in
Supplemental Information.
Predicted genes were functionally annotated from BLASTp (Altschul et
al., 1997) hits to UNIPROT (UniProt Consortium, 2015) using an E-value
threshold of <1e-6.
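
The E-value filter described above can be illustrated with a minimal sketch. It assumes the BLASTp results were written in the standard 12-column tabular format (BLAST+ `-outfmt 6`), in which the E-value and bit score are the last two columns. The function name `best_hits`, the example identifiers, and the choice to keep only the highest-scoring hit per gene are illustrative assumptions; the text specifies only the threshold.

```python
import csv
import io

EVALUE_CUTOFF = 1e-6  # threshold stated in the text


def best_hits(handle, cutoff=EVALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue, bitscore)} for hits below the cutoff.

    Assumes standard 12-column BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    Keeping one best hit per query is an assumption, not the authors' stated method.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # keep only hits with E-value < cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical example records (gene and protein identifiers are made up).
example = (
    "gene1\tsp|P1|A\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-50\t200\n"
    "gene1\tsp|P2|B\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-20\t120\n"
    "gene2\tsp|P3|C\t30.0\t50\t35\t0\t1\t50\t1\t50\t1e-3\t40\n"
)
hits = best_hits(io.StringIO(example))
for gene, (protein, evalue, bitscore) in hits.items():
    print(gene, protein, evalue, bitscore)
```

In this example, `gene2` receives no annotation because its only hit has an E-value above the threshold, while `gene1` is assigned the hit with the higher bit score.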