Figure 5. vAMPirus-generated dinoRNAV major capsid protein gene ASV and aminotype alpha (I, II) and beta (III, IV) diversity results from stony coral colonies (Acropora sp., Pocillopora species complex) and corallivorous (coral-eating) fish feces. Plots include three sample types: 1. Acropora biopsies (blue, triangle), 2. Corallivore feces (gray, x), and 3. Pocillopora biopsies (red, diamond). Letters beneath x-axis labels on richness box plots (I, II) indicate statistically different groups. ASV- and aminotype-based NMDS plots (III, IV) were generated with Bray-Curtis distances (stress values of 0.04 and 0.03, respectively).
Discussion
Targeted gene sequencing is increasingly being applied to explore spatiotemporal patterns of viral diversity (Adriaenssens & Cowan, 2014; Finke & Suttle, 2019; Frantzen & Holo, 2019; Grupstra et al., 2022; Gustavsen & Suttle, 2021; Howe-Kerr et al., 2022; Y. Li et al., 2018; Montalvo-Proaño et al., 2017; Prodinger et al., 2020; Short et al., 2010; Tong et al., 2016). The field of virology can now greatly benefit from the development of readily standardizable and reproducible pipelines for analyzing amplicon sequence datasets. Here, we present vAMPirus, a freely available, powerful, and flexible bioinformatics tool that streamlines the processing, analysis, and visualization of virus gene amplicon data. The availability of diverse bioinformatics approaches and tools within the vAMPirus program (e.g., ASV calling, clustering, translation, phylogenetic clustering) empowers users to adapt and set informed standards for their study system and easily share these standards with colleagues. With a user-friendly design and robust documentation, vAMPirus democratizes comprehensive virus amplicon sequencing analyses, making it a timely and valuable tool for virologists.
To inform virus amplicon data analyses, virologists have primarily relied on pipelines and tutorials geared towards bacterial or microeukaryote amplicon data (e.g., mothur (Schloss, 2020) and QIIME2 (Bolyen et al., 2019)). Although valuable insights have been gained using these resources, an accessible virus-focused amplicon analysis pipeline will advance the field by offering (1) automated pipelines that standardize approaches for viral amplicon analyses (e.g., ASV and aminotype calling); (2) non-cluster-based alternatives to partitioning virus gene sequences (e.g., MED and phylogrouping); and (3) virus-focused taxonomy databases. Virus amplicon analyses have traditionally applied de novo clustering of marker gene sequences into de novo OTUs based on a percent identity value (e.g., 97% nucleotide identity; Callahan et al., 2017). However, clustering virus amplicons into biologically accurate de novo OTUs is challenging because the optimal clustering percentage is often unknown. vAMPirus provides users with the opportunity to transition from traditional de novo OTUs in virus amplicon sequencing analyses to ASVs and aminotypes. We have illustrated here that ASV- and aminotype-based analyses generally recapitulate findings generated via de novo OTU-based analyses (Figures 3, 4; Supplemental Table S1), while enabling reproducibility and cross-study comparisons (Callahan et al., 2017). Running analyses of amino acid and nucleotide sequence data in tandem, which is possible in vAMPirus, can aid in resolving virus phylogenies and reveal non-synonymous mutations that indicate virus protein property variability within a community (DeFilippis & Villarreal, 2000). This synergistic approach has been effective in developing dinoRNAVs and their dinoflagellate hosts (family Symbiodiniaceae, endosymbionts of stony corals) as a nascent study system.
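The sensitivity of de novo OTUs to the chosen percent identity can be illustrated with a toy example. The sketch below is a deliberately simplified greedy centroid clustering of equal-length, already-aligned sequences; it is not the algorithm used by vAMPirus or by any particular clustering program, and the function names (`percent_identity`, `greedy_otus`) and the example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy example assumes pre-aligned, equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def greedy_otus(seqs: dict[str, str], threshold: float) -> dict[str, list[str]]:
    """Assign each sequence to the first centroid it matches at >= threshold,
    otherwise make it a new centroid (input order matters)."""
    otus: dict[str, list[str]] = {}
    for name, seq in seqs.items():
        for centroid in otus:
            if percent_identity(seq, seqs[centroid]) >= threshold:
                otus[centroid].append(name)
                break
        else:
            # No existing centroid was similar enough: start a new OTU.
            otus[name] = [name]
    return otus


# Hypothetical 10-nt sequences differing by one or two substitutions.
example = {
    "s1": "AAAAAAAAAA",
    "s2": "AAAAAAAAAT",
    "s3": "AAAAAAAATT",
}

for cutoff in (80, 90, 100):
    print(cutoff, greedy_otus(example, cutoff))
```

With these three sequences, the number of OTUs changes from one to three as the cutoff rises from 80% to 100%, which is the kind of threshold dependence that exact variants (ASVs, aminotypes) avoid.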
To characterize dinoRNAVs, studies have used the mcp gene, which has a high mutation rate and is hypothesized to be important in host cell attachment (Tomaru et al., 2004). vAMPirus aminotyping uncovered non-synonymous mutations in dinoRNAV mcp sequences, which may represent phenotypic differences that correlate with the distribution of host lineages across reefs (Grupstra et al., 2022; Howe-Kerr et al., 2022; this study). Aminotyping also effectively reduced noise from high mutation rates in ASV results, revealing temperature-driven increases in dinoRNAV infection productivity and community diversity across time and space (Grupstra et al., 2022, time only; Howe-Kerr et al., 2022). By making viral protein sequence analyses readily accessible in an amplicon sequence analysis workflow, vAMPirus helps reveal biological patterns in DNA and highly mutable RNA virus lineages by increasing the signal-to-noise ratio in results through the collapse of synonymous nucleotide mutations (Wernersson & Pedersen, 2003).
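The collapse of synonymous nucleotide variation can be sketched in a few lines. The example below is only an illustration under stated assumptions: it translates in reading frame 1 with the standard genetic code and groups nucleotide variants by identical protein sequence. It does not reproduce the vAMPirus aminotype-calling procedure, and the function names (`translate`, `collapse_to_aminotypes`) and the 9-nt example sequences are hypothetical.

```python
from collections import defaultdict

# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides: str) -> str:
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def collapse_to_aminotypes(asvs: dict[str, str]) -> dict[str, list[str]]:
    """Group nucleotide variants that encode the same protein sequence."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, seq in asvs.items():
        groups[translate(seq)].append(name)
    return dict(groups)


# Hypothetical variants: ASV2 differs from ASV1 by a synonymous change (GCT -> GCC),
# ASV3 by a non-synonymous change (GAA -> GAC).
asvs = {
    "ASV1": "ATGGCTGAA",
    "ASV2": "ATGGCCGAA",
    "ASV3": "ATGGCTGAC",
}
print(collapse_to_aminotypes(asvs))
```

Here three nucleotide variants reduce to two protein-level groups: the synonymous difference disappears, while the non-synonymous one is retained as a distinct unit.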
The increasing application of amplicon sequencing to the study of microbial diversity and dynamics has spurred efforts to improve the proficiency of tools that parse marker gene data. Such tools include the programs TreeCluster (Balaban et al., 2019) and oligotyping (Eren et al., 2015), which were developed as de novo clustering alternatives for partitioning genetic sequences into distinct units. In vAMPirus, these programs are used to assign ASVs and aminotypes to phylo- or MED groups based on user-set criteria (see Section 2.2.2). vAMPirus assigns ASV/aminotype sequences to groups, rather than using cluster representative sequences in analyses (as is the case for de novo OTUs and cASVs; Callahan et al., 2017), to maintain reproducibility and comparability of results while still permitting virus sequence classification into phylogenetically or ecologically distinct groups. These grouping approaches are instrumental for investigators because they can expose underlying patterns obscured by high sequence diversity (e.g., lactococcal phage phylogrouping results, Section 3, Figure 4). Phylogeny-based sequence clustering with TreeCluster has been applied to assess the diversity of microorganisms (and barley, Chen et al., 2022) and has been used to resolve virus transmission dynamics (HIV, Balaban et al., 2019; SARS-CoV-2, Plyusnin et al., 2022) and phylogenies (Ni et al., 2023). However, TreeCluster’s potential utility for virus amplicon analyses is, for the most part, untapped. The inclusion of TreeCluster in the vAMPirus pipeline also opens the door to epidemiological insights, such as virus genetic linkage, transmission dynamics, and subpopulation mixing, from viral datasets (Balaban et al., 2019; Bezemer et al., 2015; Eshleman et al., 2011; Hué et al., 2014). Similarly, the program oligotyping developed by Eren et al. (2015) has been applied extensively to investigate microorganism diversity from marker gene data (cited 332 times as of February 2, 2023, Web of Science). However, only one published study has applied Minimum Entropy Decomposition sequence clustering with oligotyping to virus amplicon data (Needham et al., 2017). The MED grouping with oligotyping option provided by vAMPirus is a powerful approach for deciphering virus community diversity because it enables the grouping of sequences based on potential physiologically and/or ecologically relevant similarities. For example, users can identify gene sequence positions with non-synonymous mutations via aminotyping and then specify these positions in MED grouping to partition sequences into units of similar protein phenotypes (e.g., host cell attachment; see Harvey et al., 2021). The option to incorporate cutting-edge bioinformatic approaches, such as phylogrouping and MED grouping, into analyses of virus amplicon data makes vAMPirus a highly useful “raw-reads-to-results” environmental virology workflow.
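The idea of grouping sequences by informative alignment positions can be sketched as follows. This is a minimal illustration, not the oligotyping or MED implementation (which, among other things, decomposes groups iteratively): it computes Shannon entropy for each alignment column and then partitions sequences by the residues found at user-specified positions. The function names (`column_entropy`, `group_by_positions`), the 0-based position convention, and the example alignment are assumptions made for this sketch.

```python
import math
from collections import Counter, defaultdict


def column_entropy(sequences: list[str], position: int) -> float:
    """Shannon entropy (bits) of the residues at one alignment column."""
    counts = Counter(seq[position] for seq in sequences)
    total = len(sequences)
    # Adding 0.0 normalizes a possible -0.0 for invariant columns.
    return -sum((c / total) * math.log2(c / total) for c in counts.values()) + 0.0


def group_by_positions(aligned: dict[str, str], positions: list[int]) -> dict[str, list[str]]:
    """Partition sequences by the residues they carry at the chosen columns."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, seq in aligned.items():
        key = "".join(seq[p] for p in positions)
        groups[key].append(name)
    return dict(groups)


# Hypothetical aligned aminotypes; columns 1 and 2 vary, columns 0 and 3 do not.
aligned = {
    "amino1": "MKTA",
    "amino2": "MKSA",
    "amino3": "MRTA",
    "amino4": "MRSA",
}

entropies = [column_entropy(list(aligned.values()), p) for p in range(4)]
print(entropies)
# Group only on column 1, e.g. a position thought to affect protein phenotype.
print(group_by_positions(aligned, [1]))
```

Choosing the positions from prior knowledge (for example, sites with non-synonymous changes) rather than from entropy alone corresponds to the user-specified grouping described above.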
vAMPirus is an easy-to-use, open-source, and flexible tool that streamlines and simplifies the process of analyzing viral amplicon data. vAMPirus is designed to be community-driven; new features and programs (e.g., built-in lineage specific configuration files or databases, new bioinformatic tools) can easily be implemented at the request of investigators or when advances in best practices are made. vAMPirus advances studies of viral community diversity by facilitating informed analyses of amplicon sequence data with its DataCheck and Analyze pipelines in a standardized and reproducible manner.
Acknowledgments
The authors would like to thank Jan F. Finke, Curtis A. Suttle, Julia A. Gustavsen, Cyril Frantzen, Helge Holo, Florian Prodinger, and Hiroyuki Ogata (and their respective co-authors) for providing raw virus amplicon sequence data for vAMPirus testing and for permission to reprint original figures. We thank Samantha R. Coy for feedback and insight during early stages of vAMPirus development, and Nikolaos Schizas for access to computational resources during vAMPirus development. This work represents a contribution of the Moorea Coral Reef (MCR) LTER Site (NSF OCE 16–37396). This research was funded by U.S. National Science Foundation Grant OCE 22-24354 (and earlier awards) to the Moorea Coral Reef LTER as well as a generous gift from the Gordon and Betty Moore Foundation. Research was completed under permits issued by the Territorial Government of French Polynesia (Délégation à la Recherche) and the Haut-Commissariat de la République en Polynésie Française (DTRT) (MCR LTER Protocole d’Accueil 2005–2022; Adrienne Correa Protocole d’Accueil 2013–2019), and we thank the Délégation à la Recherche and DTRT for their continued support. Start-up funds from Rice University, an NSF CAREER Award (OCE-2145472), and an Early-Career Research Fellowship (#2000009651) from the Gulf Research Program of the National Academies of Sciences to AMSC also contributed to this work. Additional funding was provided by Lewis and Clark Grants for Exploration to LHK and CGG, and Wagoner Foreign Study awards to LHK, CGG, and AJV. The Kirk W. Dotson Endowed Graduate Fellowship in Ecology and Evolutionary Biology also helped support AJV as this work was conducted. RERV acknowledges funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 764840 (ITN IGNITE).
References
Adriaenssens, E. M., & Cowan, D. A. (2014). Using signature genes as tools to assess environmental viral ecology and diversity. Applied and Environmental Microbiology, 80(15), 4470–4480. https://doi.org/10.1128/AEM.00878-14
Balaban, M., Moshiri, N., Mai, U., Jia, X., & Mirarab, S. (2019). TreeCluster: Clustering biological sequences using phylogenetic trees. PLOS ONE, 14(8), e0221068. https://doi.org/10.1371/journal.pone.0221068
Bezemer, D., Cori, A., Ratmann, O., Sighem, A. van, Hermanides, H. S., Dutilh, B. E., Gras, L., Faria, N. R., Hengel, R. van den, Duits, A. J., Reiss, P., Wolf, F. de, Fraser, C., & Cohort, A. observational. (2015). Dispersion of the HIV-1 Epidemic in Men Who Have Sex with Men in the Netherlands: A Combined Mathematical Model and Phylogenetic Analysis. PLOS Medicine, 12(11), e1001898. https://doi.org/10.1371/journal.pmed.1001898
Bigot, T., Temmam, S., Pérot, P., & Eloit, M. (2020). RVDB-prot, a reference viral protein database and its HMM profiles (8:530). F1000Research. https://doi.org/10.12688/f1000research.18776.2
Bolyen, E., Rideout, J. R., Dillon, M. R., Bokulich, N. A., Abnet, C. C., Al-Ghalith, G. A., Alexander, H., Alm, E. J., Arumugam, M., Asnicar, F., Bai, Y., Bisanz, J. E., Bittinger, K., Brejnrod, A., Brislawn, C. J., Brown, C. T., Callahan, B. J., Caraballo-Rodríguez, A. M., Chase, J., … Caporaso, J. G. (2019). Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology, 37(8), 852–857. https://doi.org/10.1038/s41587-019-0209-9
Braga, L. P. P., Spor, A., Kot, W., Breuil, M.-C., Hansen, L. H., Setubal, J. C., & Philippot, L. (2020). Impact of phages on soil bacterial communities and nitrogen availability under different assembly scenarios. Microbiome, 8(1), 52. https://doi.org/10.1186/s40168-020-00822-z
Breitbart, M., Bonnain, C., Malki, K., & Sawaya, N. A. (2018). Phage puppet masters of the marine microbial realm. Nature Microbiology, 3(7), 754–766.
Brister, J. R., Ako-Adjei, D., Bao, Y., & Blinkova, O. (2015). NCBI viral genomes resource. Nucleic Acids Research, 43(Database issue), D571-577. https://doi.org/10.1093/nar/gku1207
Buchfink, B., Xie, C., & Huson, D. H. (2015). Fast and sensitive protein alignment using DIAMOND. Nature Methods, 12(1), Article 1. https://doi.org/10.1038/nmeth.3176
Callahan, B. J., McMurdie, P. J., & Holmes, S. P. (2017). Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. The ISME Journal, 11(12), Article 12. https://doi.org/10.1038/ismej.2017.119
Capella-Gutiérrez, S., Silla-Martínez, J. M., & Gabaldón, T. (2009). trimAl: A tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics, 25(15), 1972–1973. https://doi.org/10.1093/bioinformatics/btp348
Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 34(17), i884–i890. https://doi.org/10.1093/bioinformatics/bty560
Chen, Y.-Y., Schreiber, M., Bayer, M. M., Dawson, I. K., Hedley, P. E., Lei, L., Akhunova, A., Liu, C., Smith, K. P., Fay, J. C., Muehlbauer, G. J., Steffenson, B. J., Morrell, P. L., Waugh, R., & Russell, J. R. (2022). The evolutionary patterns of barley pericentromeric chromosome regions, as shaped by linkage disequilibrium and domestication. The Plant Journal, 111(6), 1580–1594. https://doi.org/10.1111/tpj.15908
Correa, A. M. S., Howard-Varona, C., Coy, S. R., Buchan, A., Sullivan, M. B., & Weitz, J. S. (2021). Revisiting the rules of life for viruses of microorganisms. Nature Reviews Microbiology, 19(8), 501–513. https://doi.org/10.1038/s41579-021-00530-x
Correa, A. M. S., Welsh, R. M., & Vega Thurber, R. L. (2013). Unique nucleocytoplasmic dsDNA and +ssRNA viruses are associated with the dinoflagellate endosymbionts of corals. The ISME Journal, 7(1), Article 1. https://doi.org/10.1038/ismej.2012.75
Darriba, D., Posada, D., Kozlov, A. M., Stamatakis, A., Morel, B., & Flouri, T. (2020). ModelTest-NG: A New and Scalable Tool for the Selection of DNA and Protein Evolutionary Models. Molecular Biology and Evolution, 37(1), 291–294. https://doi.org/10.1093/molbev/msz189
DeFilippis, V. R., & Villarreal, L. P. (2000). An Introduction to the Evolutionary Ecology of Viruses. Viral Ecology, 125–208. https://doi.org/10.1016/B978-012362675-2/50005-7
Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), Article 4. https://doi.org/10.1038/nbt.3820
Domingo, E., & Perales, C. (2019). Viral quasispecies. PLOS Genetics, 15(10), e1008271. https://doi.org/10.1371/journal.pgen.1008271
Edgar, R. C. (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26(19), 2460–2461. https://doi.org/10.1093/bioinformatics/btq461
Edgar, R. C. (2016a). UCHIME2: Improved chimera prediction for amplicon sequencing (p. 074252). bioRxiv. https://doi.org/10.1101/074252
Edgar, R. C. (2016b). UNOISE2: Improved error-correction for Illumina 16S and ITS amplicon sequencing (p. 081257). bioRxiv. https://doi.org/10.1101/081257
Edgar, R. C. (2021). MUSCLE v5 enables improved estimates of phylogenetic tree confidence by ensemble bootstrapping (p. 2021.06.20.449169). bioRxiv. https://doi.org/10.1101/2021.06.20.449169
Eren, A. M., Morrison, H. G., Lescault, P. J., Reveillaud, J., Vineis, J. H., & Sogin, M. L. (2015). Minimum entropy decomposition: Unsupervised oligotyping for sensitive partitioning of high-throughput marker gene sequences. The ISME Journal, 9(4), Article 4. https://doi.org/10.1038/ismej.2014.195
Eshleman, S. H., Hudelson, S. E., Redd, A. D., Wang, L., Debes, R., Chen, Y. Q., Martens, C. A., Ricklefs, S. M., Selig, E. J., Porcella, S. F., Munshaw, S., Ray, S. C., Piwowar-Manning, E., McCauley, M., Hosseinipour, M. C., Kumwenda, J., Hakim, J. G., Chariyalertsak, S., de Bruyn, G., … Hughes, J. P. (2011). Analysis of Genetic Linkage of HIV From Couples Enrolled in the HIV Prevention Trials Network 052 Trial. The Journal of Infectious Diseases, 204(12), 1918–1926. https://doi.org/10.1093/infdis/jir651
Finke, J. F., & Suttle, C. A. (2019). The Environment and Cyanophage Diversity: Insights From Environmental Sequencing of DNA Polymerase. Frontiers in Microbiology, 10. https://www.frontiersin.org/article/10.3389/fmicb.2019.00167
Frantzen, C. A., & Holo, H. (2019). Unprecedented Diversity of Lactococcal Group 936 Bacteriophages Revealed by Amplicon Sequencing of the Portal Protein Gene. Viruses, 11(5), Article 5. https://doi.org/10.3390/v11050443
Fu, L., Niu, B., Zhu, Z., Wu, S., & Li, W. (2012). CD-HIT: Accelerated for clustering the next-generation sequencing data. Bioinformatics, 28(23), 3150–3152. https://doi.org/10.1093/bioinformatics/bts565
Grupstra, C. G. B., Lemoine, N. P., Cook, C., & Correa, A. M. S. (2022). Thank you for biting: Dispersal of beneficial microbiota through “antagonistic” interactions. Trends in Microbiology, 30(10), 930–939. https://doi.org/10.1016/j.tim.2022.03.006
Grupstra, C. G. B., Rabbitt, K. M., Howe-Kerr, L. I., & Correa, A. M. S. (2021). Fish predation on corals promotes the dispersal of coral symbionts. Animal Microbiome, 3(1), 25. https://doi.org/10.1186/s42523-021-00086-4
Grupstra, C. G., Howe-Kerr, L. I., Veglia, A. J., Bryant, R. L., Coy, S. R., Blackwelder, P. L., & Correa, A. (2022). Thermal stress triggers productive viral infection of a key coral reef symbiont. The ISME Journal, 1–12.
Gustavsen, J. A., & Suttle, C. A. (2021). Role of Phylogenetic Structure in the Dynamics of Coastal Viral Assemblages. Applied and Environmental Microbiology, 87(11), e02704-20. https://doi.org/10.1128/AEM.02704-20
Harvey, W. T., Carabelli, A. M., Jackson, B., Gupta, R. K., Thomson, E. C., Harrison, E. M., Ludden, C., Reeve, R., Rambaut, A., Peacock, S. J., & Robertson, D. L. (2021). SARS-CoV-2 variants, spike mutations and immune escape. Nature Reviews Microbiology, 19(7), Article 7. https://doi.org/10.1038/s41579-021-00573-0
Howe-Kerr, L. I. (2022). Viruses of a key coral symbiont exhibit temperature-driven productivity across a reefscape. https://doi.org/10.21203/rs.3.rs-1899377/v1
Hué, S., Brown, A. E., Ragonnet-Cronin, M., Lycett, S. J., Dunn, D. T., Fearnhill, E., Dolling, D. I., Pozniak, A., Pillay, D., Delpech, V. C., & Leigh Brown, A. J. (2014). Phylogenetic analyses reveal HIV-1 infections between men misclassified as heterosexual transmissions. AIDS, 28(13), 1967. https://doi.org/10.1097/QAD.0000000000000383
Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A., & Jermiin, L. S. (2017). ModelFinder: Fast model selection for accurate phylogenetic estimates. Nature Methods, 14(6), Article 6. https://doi.org/10.1038/nmeth.4285
Labadie, T., Batéjat, C., Leclercq, I., & Manuguerra, J.-C. (2020). Historical Discoveries on Viruses in the Environment and Their Impact on Public Health. Intervirology, 63(1–6), 17–32. https://doi.org/10.1159/000511575
Li, W., & Godzik, A. (2006). Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics, 22(13), 1658–1659. https://doi.org/10.1093/bioinformatics/btl158
Li, Y., Hingamp, P., Watai, H., Endo, H., Yoshida, T., & Ogata, H. (2018). Degenerate PCR Primers to Reveal the Diversity of Giant Viruses in Coastal Waters. Viruses, 10(9), 496. https://doi.org/10.3390/v10090496
Metcalf, T. G., Melnick, J. L., & Estes, M. K. (1995). ENVIRONMENTAL VIROLOGY: From Detection of Virus in Sewage and Water by Isolation to Identification by Molecular Biology—A Trip of Over 50 Years. Annual Review of Microbiology, 49(1), 461–487. https://doi.org/10.1146/annurev.mi.49.100195.002333
Minh, B. Q., Schmidt, H. A., Chernomor, O., Schrempf, D., Woodhams, M. D., von Haeseler, A., & Lanfear, R. (2020). IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Molecular Biology and Evolution, 37(5), 1530–1534. https://doi.org/10.1093/molbev/msaa015
Montalvo-Proaño, J., Buerger, P., Weynberg, K. D., & van Oppen, M. J. H. (2017). A PCR-Based Assay Targeting the Major Capsid Protein Gene of a Dinorna-Like ssRNA Virus That Infects Coral Photosymbionts. Frontiers in Microbiology, 8. https://www.frontiersin.org/articles/10.3389/fmicb.2017.01665
Needham, D. M., Sachdeva, R., & Fuhrman, J. A. (2017). Ecological dynamics and co-occurrence among marine phytoplankton, bacteria and myoviruses shows microdiversity matters. The ISME Journal, 11(7), Article 7. https://doi.org/10.1038/ismej.2017.29
Ni, X.-B., Cui, X.-M., Liu, J.-Y., Ye, R.-Z., Wu, Y.-Q., Jiang, J.-F., Sun, Y., Wang, Q., Shum, M. H.-H., Chang, Q.-C., Zhao, L., Han, X.-H., Ma, K., Shen, S.-J., Zhang, M.-Z., Guo, W.-B., Zhu, J.-G., Zhan, L., Li, L.-J., … Cao, W.-C. (2023). Metavirome of 31 tick species provides a compendium of 1,801 RNA virus genomes. Nature Microbiology, 8(1), Article 1. https://doi.org/10.1038/s41564-022-01275-w
O’Leary, N. A., Wright, M. W., Brister, J. R., Ciufo, S., Haddad, D., McVeigh, R., Rajput, B., Robbertse, B., Smith-White, B., Ako-Adjei, D., Astashyn, A., Badretdin, A., Bao, Y., Blinkova, O., Brover, V., Chetvernin, V., Choi, J., Cox, E., Ermolaeva, O., … Pruitt, K. D. (2016). Reference sequence (RefSeq) database at NCBI: Current status, taxonomic expansion, and functional annotation. Nucleic Acids Research, 44(D1), D733-745. https://doi.org/10.1093/nar/gkv1189
Paez-Espino, D., Chen, I.-M. A., Palaniappan, K., Ratner, A., Chu, K., Szeto, E., Pillay, M., Huang, J., Markowitz, V. M., Nielsen, T., Huntemann, M., K. Reddy, T. B., Pavlopoulos, G. A., Sullivan, M. B., Campbell, B. J., Chen, F., McMahon, K., Hallam, S. J., Denef, V., … Kyrpides, N. C. (2017). IMG/VR: A database of cultured and uncultured DNA Viruses and retroviruses. Nucleic Acids Research, 45(D1), gkw1030. https://doi.org/10.1093/nar/gkw1030
Plyusnin, I., Truong Nguyen, P. T., Sironen, T., Vapalahti, O., Smura, T., & Kant, R. (2022). ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies. BMC Bioinformatics, 23(1), 196. https://doi.org/10.1186/s12859-022-04709-8
Prodinger, F., Endo, H., Gotoh, Y., Li, Y., Morimoto, D., Omae, K., Tominaga, K., Blanc-Mathieu, R., Takano, Y., Hayashi, T., Nagasaki, K., Yoshida, T., & Ogata, H. (2020). An Optimized Metabarcoding Method for Mimiviridae. Microorganisms, 8(4), Article 4. https://doi.org/10.3390/microorganisms8040506
Rognes, T., Flouri, T., Nichols, B., Quince, C., & Mahé, F. (2016). VSEARCH: A versatile open source tool for metagenomics. PeerJ, 4, e2584. https://doi.org/10.7717/peerj.2584
Schloss, P. D. (2020). Reintroducing mothur: 10 Years Later. Applied and Environmental Microbiology, 86(2). https://doi.org/10.1128/AEM.02343-19
Schoch, C. L., Ciufo, S., Domrachev, M., Hotton, C. L., Kannan, S., Khovanskaya, R., Leipe, D., Mcveigh, R., O’Neill, K., Robbertse, B., Sharma, S., Soussov, V., Sullivan, J. P., Sun, L., Turner, S., & Karsch-Mizrachi, I. (2020). NCBI Taxonomy: A comprehensive update on curation, resources and tools. Database: The Journal of Biological Databases and Curation, 2020, baaa062. https://doi.org/10.1093/database/baaa062
Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379–423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Short, S. M., Chen, F., & Wilhelm, S. (2010). The construction and analysis of marker gene libraries. Manual of Aquatic Viral Ecology, 82–91.
Suttle, C. A. (2007). Marine viruses—Major players in the global ecosystem. Nature Reviews Microbiology, 5(10), 801–812. https://doi.org/10.1038/nrmicro1750
Thurber, R. V., Payet, J. P., Thurber, A. R., & Correa, A. M. S. (2017). Virus–host interactions and their roles in coral reef health and disease. Nature Reviews Microbiology, 15(4), 205–216. https://doi.org/10.1038/nrmicro.2016.176
Tomaru, Y., Katanozaka, N., Nishida, K., Shirai, Y., Tarutani, K., Yamaguchi, M., & Nagasaki, K. (2004). Isolation and characterization of two distinct types of HcRNAV, a single-stranded RNA virus infecting the bivalve-killing microalga Heterocapsa circularisquama. Aquatic Microbial Ecology, 34(3), 207–218. https://doi.org/10.3354/ame034207
Tong, Y., Liu, B., Liu, H., Zheng, H., Gu, J., Liu, H., Lin, M., Ding, Y., Song, C., & Li, Y. (2016). New universal primers for genotyping and resistance detection of low HBV DNA levels. Medicine, 95(33), e4618. https://doi.org/10.1097/MD.0000000000004618
Uyaguari-Diaz, M. I., Chan, M., Chaban, B. L., Croxen, M. A., Finke, J. F., Hill, J. E., Peabody, M. A., Van Rossum, T., Suttle, C. A., Brinkman, F. S. L., Isaac-Renton, J., Prystajecky, N. A., & Tang, P. (2016). A comprehensive method for amplicon-based and metagenomic characterization of viruses, bacteria, and eukaryotes in freshwater samples. Microbiome, 4(1), 20. https://doi.org/10.1186/s40168-016-0166-1
Veglia, A. J., Bistolas, K. S. I., Voolstra, C. R., Hume, B. C. C., Planes, S., Allemand, D., Boissin, E., Wincker, P., Poulain, J., Moulin, C., Bourdin, G., Iwankow, G., Romac, S., Agostini, S., Banaigs, B., Boss, E., Bowler, C., Vargas, C. de, Douville, E., … Thurber, R. L. V. (2022). Endogenous viral elements reveal associations between a non-retroviral RNA virus and symbiotic dinoflagellate genomes (p. 2022.04.11.487905). bioRxiv. https://doi.org/10.1101/2022.04.11.487905
Wernersson, R. (2006). Virtual Ribosome—A comprehensive DNA translation tool with support for integration of sequence feature annotation. Nucleic Acids Research, 34(suppl_2), W385–W388. https://doi.org/10.1093/nar/gkl252
Wernersson, R., & Pedersen, A. G. (2003). RevTrans: Multiple alignment of coding DNA from aligned amino acid sequences. Nucleic Acids Research, 31(13), 3537–3539.
Xie, Y., Allaire, J. J., & Grolemund, G. (2018). R Markdown: The Definitive Guide. Chapman and Hall/CRC, Boca Raton, Florida. https://bookdown.org/yihui/rmarkdown/
Zayed, A. A., Wainaina, J. M., Dominguez-Huerta, G., Pelletier, E., Guo, J., Mohssen, M., Tian, F., Pratama, A. A., Bolduc, B., Zablocki, O., Cronin, D., Solden, L., Delage, E., Alberti, A., Aury, J.-M., Carradec, Q., da Silva, C., Labadie, K., Poulain, J., … Sullivan, M. B. (2022). Cryptic and abundant marine viruses at the evolutionary origins of Earth’s RNA virome. Science, 376(6589), 156–162. https://doi.org/10.1126/science.abm5847
Conflict of Interest
The authors declare that they have no financial conflict of interest with the content of this article.
Authors’ Contributions
AJV, CGG, and LHK conceived of the program with support from AMSC; CGG and LHK contributed R code used in the vAMPirus reports; RERV contributed R code and helped execute vAMPirus incorporation into Nextflow; CGG, LHK, and AMSC processed samples and generated the RNA virus dataset; AJV designed the pipelines with input from CGG and LHK; AJV wrote bash and R code used in the program, analyzed data, and wrote the initial draft of the manuscript, with contributions by all authors.
Data availability
Source code, scripts, and help documentation are available online at github.com/aveglia/vAMPirus. RNA virus sequencing libraries are available on NCBI SRA associated with the BioProject PRJNA923642 as well as in the vAMPirus Analysis Repository (doi.org/10.5281/zenodo.7574173). All non-read files required to reproduce all analyses and results described in this manuscript can be found on the vAMPirus Analysis Repository (doi.org/10.5281/zenodo.7574173).
ORCID
AJV - 0000-0003-3118-5127
RERV - 0000-0002-6229-3537
CGG - 0000-0001-5083-4570
LHK - 0000-0002-8086-5869
AMSC - 0000-0003-0137-5042