2.5 Pre-processing of reads, de novo assembly and abundance mapping
Raw sequence reads were processed to remove adapter and primer sequences, PCR duplicates, ribosomal RNA (rRNA) reads, host (Felis catus) reads and poor-quality terminal regions, as previously described (Brussel et al., 2020). Briefly, rRNA reads were removed using SortMeRNA, and host reads were identified and removed by mapping to the Felis catus genome (Brussel et al., 2020). The filtered metatranscriptomic reads (RNA) were de novo assembled using Trinity version 2.8.5, and the filtered metagenomic reads (cDNA and DNA) were de novo assembled using IDBA-UD version 1.1.2 (Brussel et al., 2020). The contigs were compared to the non-redundant protein database using Diamond version 2.0.4. Taxonomic classification of the filtered reads was performed using KMA version 1.3.9a (Clausen et al., 2018) and CCMetagen version 1.2.4 (Marcelino et al., 2020) by comparing the filtered paired-end reads to the NCBI nucleotide database, which contains all NCBI nucleotide sequences except environmental eukaryotic and prokaryotic, unclassified and artificial sequences. In CCMetagen, read depth was reported as reads per million (RPM), and the threshold function was disabled so that all taxonomic levels were reported (Marcelino et al., 2020). Read abundance was further calculated by mapping the filtered reads to the de novo assembled contigs from this dataset using Bowtie2 version 2.3.4.3. Geneious version 2020.2.5 was used to predict open reading frames (ORFs) and annotate genomes. To minimize the effect of index-hopping between libraries sequenced on the same lane, contigs were compared across libraries to identify identical sequences. For each such sequence, the read abundance in the library with the highest abundance was taken as the reference, and any library with a read abundance below 0.01% of that value was excluded for that sequence.
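The two numerical steps described above, RPM normalization and the 0.01% index-hopping cut-off, can be illustrated with a minimal Python sketch. This is not the code used in the study: the function names, the library labels and the read counts are hypothetical, RPM is assumed to mean mapped reads divided by the total reads in a library multiplied by one million, and "excluded" is assumed to mean that the sequence is dropped from the low-abundance library only.

```python
from fractions import Fraction

# 0.01% expressed as an exact fraction to avoid floating-point edge cases.
INDEX_HOPPING_THRESHOLD = Fraction(1, 10_000)


def reads_per_million(mapped_reads, total_reads):
    """Normalize a read count to reads per million (RPM).

    Assumption: RPM = mapped reads / total reads in the library * 1e6.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads * 1_000_000 / total_reads


def filter_index_hopping(counts, threshold=INDEX_HOPPING_THRESHOLD):
    """Apply the index-hopping cut-off to one shared sequence.

    counts maps a library name to the read abundance of a sequence found
    in several libraries on the same lane. Libraries whose abundance is
    below `threshold` times the highest abundance are dropped.
    """
    if not counts:
        return {}
    cutoff = max(counts.values()) * threshold
    return {lib: n for lib, n in counts.items() if n >= cutoff}


# Hypothetical read counts for one contig shared by four libraries.
example = {"libA": 1_000_000, "libB": 99, "libC": 100, "libD": 50_000}
kept = filter_index_hopping(example)
# libB falls below 0.01% of libA's count (cut-off = 100 reads) and is dropped.
```

Using an exact fraction for the threshold keeps a count that sits exactly on the cut-off, which matches the wording "below 0.01%".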