Divergence mapping
Two new Pacific lamprey genome assemblies were constructed from whole-genome sequence of the milt and blood of a male (representing the gametic and somatic genomes; GenBank accession #: PRJNA613923) and the blood of a female (GenBank accession #: XXXXX). A high-density linkage map (Smith et al. 2018) was used to validate and extend higher-order scaffolding of chromosomes (Supplemental Materials).
To characterize SNP densities and FST statistics, we used a set of 7,716 unique SNP loci from previously published RAD-seq datasets (Hess et al. 2013; Smith et al. 2018) that passed a set of population genetic quality control (QC) filters (Supplemental Materials). This set combined overlapping groups of SNPs from a previous dataset (Hess et al. 2013; SNPs N = 8,772) and a de novo linkage mapping dataset (Smith et al. 2018; SNPs N = 7,977). BOWTIE2 (Langmead and Salzberg 2012) was used to align these two datasets to the male reference assembly to define homologous loci. Of the 7,716 SNPs passing the QC filters, 4,046 loci were unique to Hess et al. (2013), 1,418 loci were unique to Smith et al. (2018), and 2,252 loci were shared across datasets. To characterize synteny, marker positions based on BOWTIE2 alignments were compared between the Pacific lamprey male and female genomes, and between the Pacific lamprey male genome and the sea lamprey male gametic genome (GenBank assembly accession: GCA_002833325.1).
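The partition of loci into dataset-specific and shared groups can be illustrated with a minimal sketch. It assumes each locus is keyed by its aligned reference position; the function name `partition_loci` and the toy loci are hypothetical and are not taken from the original pipeline. The final lines only check that the reported group counts sum to the stated total.

```python
def partition_loci(hess_loci, smith_loci):
    """Split loci into dataset-specific and shared groups by reference position."""
    hess, smith = set(hess_loci), set(smith_loci)
    return {
        "hess_only": hess - smith,
        "smith_only": smith - hess,
        "shared": hess & smith,
    }


# Toy loci keyed by (chromosome, position); these values are hypothetical.
toy_hess = [("chr01", 100), ("chr01", 250), ("chr02", 75)]
toy_smith = [("chr01", 250), ("chr04", 900)]
groups = partition_loci(toy_hess, toy_smith)

# Counts reported in the text for loci passing the QC filters.
reported = {"hess_only": 4046, "smith_only": 1418, "shared": 2252}
total_unique = sum(reported.values())
```

Keying loci by aligned position is one possible convention; any identifier that is stable across both datasets after alignment would serve the same purpose.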
We ran LOSITAN (Antao et al. 2008) on these 7,716 SNPs, genotyped for the same individuals as in Hess et al. (2013; i.e., 16 collections with >20 individuals each, totaling 482 individuals; Table S1). Parameter settings were 50,000 simulations, a confidence interval of 0.99, a false discovery rate of 0.1, a subsample size of 20, a simulated FST of 0.019, and an attempted FST of 0.021. Loci with a probability level above 0.995 were considered candidates for positive selection, and neutral loci were defined as those falling between the 10th and 90th quantiles of the FST distribution. Any remaining SNPs were conservatively considered undetermined (neither candidates nor neutral).
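The three-way classification can be sketched as follows. This is one reading of the rule and not the authors' script: it assumes the 10th and 90th quantile bounds are applied to the same per-locus probability used for the candidate threshold (the position of each locus within the simulated FST distribution). The function name `classify_locus` and its parameter names are hypothetical.

```python
def classify_locus(prob, candidate_cutoff=0.995, neutral_low=0.10, neutral_high=0.90):
    """Classify one locus from its outlier-test probability.

    Assumption: `prob` is the probability that a simulated FST is smaller
    than the observed FST for this locus.
    """
    if prob > candidate_cutoff:
        return "candidate"      # putatively under positive selection
    if neutral_low <= prob <= neutral_high:
        return "neutral"        # within the central part of the distribution
    return "undetermined"       # neither candidate nor neutral


# Hypothetical probabilities for four loci.
labels = [classify_locus(p) for p in (0.999, 0.50, 0.95, 0.02)]
```

Under this reading, loci in either tail that do not exceed the candidate cutoff, including those with unusually low FST, remain undetermined.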
Genes located within adaptive regions were identified using published sea lamprey gene annotations found in the regions homologous to the following Pacific lamprey male genome positions: 1) chromosome 01, positions 8939466…14772759 (sea lamprey scaf_00003: 6777250…13554086); 2) chromosome 02, positions 3351206…18794404 (sea lamprey scaf_00006: 1198871…13859281); 3) chromosome 04, positions 6408032…19202839 (sea lamprey scaf_00005: 2591251…16864119); and 4) chromosome 22, positions 617460…11364740 (sea lamprey scaf_00012: 1160196…12993068). We used the website Enrichr (https://amp.pharm.mssm.edu/Enrichr/) (Kuleshov et al. 2016) to gain insight into the potential function of these genes based on associated phenotypes in both mammals (i.e., MGI Mammalian Phenotype Level 4 2019) and fishes (FishEnrichr; Phenotype AutoRIF Predicted Z score).
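Selecting annotated genes from the homologous sea lamprey intervals can be sketched as a simple coordinate filter. The interval bounds are those listed above; the gene records, the function name `genes_in_regions`, and the choice to keep any gene that overlaps an interval (as opposed to requiring full containment) are assumptions made for illustration.

```python
# Sea lamprey scaffold intervals homologous to the adaptive regions (from the text).
SEA_LAMPREY_REGIONS = {
    "scaf_00003": (6777250, 13554086),
    "scaf_00006": (1198871, 13859281),
    "scaf_00005": (2591251, 16864119),
    "scaf_00012": (1160196, 12993068),
}


def genes_in_regions(genes, regions=SEA_LAMPREY_REGIONS):
    """Return names of genes whose span overlaps a listed interval.

    `genes` is an iterable of (name, scaffold, start, end) tuples.
    """
    hits = []
    for name, scaffold, start, end in genes:
        if scaffold not in regions:
            continue
        region_start, region_end = regions[scaffold]
        # Closed-interval overlap test (assumed criterion).
        if start <= region_end and end >= region_start:
            hits.append(name)
    return hits


# Hypothetical gene records used only to exercise the filter.
toy_genes = [
    ("geneA", "scaf_00003", 7000000, 7005000),   # inside the interval
    ("geneB", "scaf_00003", 100000, 105000),     # upstream of the interval
    ("geneC", "scaf_00012", 12990000, 13000000), # overlaps the interval end
    ("geneD", "scaf_00099", 500, 900),           # scaffold not listed
]
selected = genes_in_regions(toy_genes)
```

The resulting gene names would then be the input list for an enrichment tool such as Enrichr.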