Bioinformatics analysis
Raw sequence reads from the Illumina MiSeq were evaluated using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). The reads were demultiplexed using a custom Python script and then processed with metaBEAT v0.97.11 (https://github.com/HullUni-bioinformatics/metaBEAT), a custom pipeline that incorporates commonly used open-source software. Trimmomatic 0.32 (Bolger et al., 2014) was used for quality trimming and removal of locus primers from the raw reads. Average read quality was assessed in 5-bp sliding windows starting from the 3' end of the read, and reads were clipped until the average quality per window was above Phred 30. All reads shorter than the minimum length of 90 bp were discarded. Sequence pairs were subsequently merged into single high-quality reads using FLASH 1.2.11 (Magoč & Salzberg, 2011). Reads surviving quality filtering and trimming were screened for chimeric sequences against a custom, curated reference database using the uchime_ref function implemented in VSEARCH v1.1 (https://github.com/torognes/vsearch). The reference database was developed at the University of Hull (Hänfling et al., 2016) and supplemented with sequences for asp (GenBank accession numbers: MT163435, MT163450, MT163449) and maraena whitefish (Coregonus maraena) (GenBank accession numbers: MT163451, MT163458, MT163460) so that all fish species in the study catchment were represented. Sequences were clustered at 100% identity using VSEARCH v1.1. Clusters represented by fewer than three sequences were considered sequencing errors and were omitted from further analysis. Nonredundant sets of query sequences were then compared to the reference database using BLAST (Zhang et al., 2000). BLAST output was interpreted using a custom Python function that implements a lowest common ancestor approach for taxonomic assignment, similar to the strategy used by MEGAN 5.10.6 (Huson et al., 2007).
BLAST hits were considered only if they had a minimum identity of 99% and a minimum query coverage of 90%.
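The hit filtering and lowest common ancestor (LCA) step can be illustrated with a minimal Python sketch. This is not the metaBEAT implementation: the `Hit` record, its field names, and the helper functions are hypothetical, and lineages are assumed to be available as root-to-species tuples. Only the 99% identity and 90% query coverage thresholds come from the text. The sketch also takes the LCA across all hits that pass the filter, whereas MEGAN-style approaches may further restrict hits by score.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit (field names are illustrative)."""
    lineage: Tuple[str, ...]   # taxonomic path, ordered from root to species
    pct_identity: float        # percent identity of the alignment
    query_coverage: float      # percent of the query covered by the alignment


MIN_IDENTITY = 99.0   # threshold stated in the text
MIN_COVERAGE = 90.0   # threshold stated in the text


def filter_hits(hits: Sequence[Hit]) -> List[Hit]:
    """Keep only hits that meet both the identity and coverage thresholds."""
    return [
        h for h in hits
        if h.pct_identity >= MIN_IDENTITY and h.query_coverage >= MIN_COVERAGE
    ]


def lowest_common_ancestor(lineages: Sequence[Tuple[str, ...]]) -> Tuple[str, ...]:
    """Return the longest taxonomic prefix shared by all lineages."""
    shared = []
    for ranks in zip(*lineages):
        # Stop at the first rank where the lineages disagree.
        if any(rank != ranks[0] for rank in ranks):
            break
        shared.append(ranks[0])
    return tuple(shared)


def assign_taxon(hits: Sequence[Hit]) -> Optional[str]:
    """Assign a query to the deepest taxon shared by all retained hits."""
    kept = filter_hits(hits)
    if not kept:
        return None  # no hit passed the thresholds: query stays unassigned
    lca = lowest_common_ancestor([h.lineage for h in kept])
    return lca[-1] if lca else None


# Illustrative input only; the identity and coverage values are made up.
example_hits = [
    Hit(("Actinopterygii", "Salmoniformes", "Coregonus", "Coregonus maraena"), 99.5, 100.0),
    Hit(("Actinopterygii", "Salmoniformes", "Coregonus", "Coregonus lavaretus"), 99.2, 95.0),
    Hit(("Actinopterygii", "Cypriniformes", "Leuciscus", "Leuciscus aspius"), 97.0, 100.0),
]
print(assign_taxon(example_hits))
```

In this example the third hit is discarded by the identity threshold, and the two remaining hits disagree at species level, so the query is assigned to the genus they share rather than to either species.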