Figure 5. vAMPirus-generated dinoRNAV major capsid protein gene
ASV and aminotype alpha (I, II) and beta (III, IV) diversity results
from stony coral colonies (Acropora sp., Pocillopora species complex) and corallivorous (coral-eating) fish feces. Plots include three sample types: 1. Acropora biopsies (blue,
triangle), 2. Corallivore feces (gray, x), and 3. Pocillopora biopsies (red, diamond). Letters beneath x axis labels on richness box
plots (I, II) indicate statistically different groups. ASV- and
aminotype-based NMDS plots (III, IV) were generated with Bray-Curtis distances
(stress values of 0.04 and 0.03, respectively).
Discussion
Targeted gene sequencing is increasingly being applied to explore
spatiotemporal patterns of viral diversity (Adriaenssens & Cowan, 2014;
Finke & Suttle, 2019; Frantzen & Holo, 2019; Grupstra et al., 2022;
Gustavsen & Suttle, 2021; Howe-Kerr et al., 2022; Y. Li et al., 2018;
Montalvo-Proaño et al., 2017; Prodinger et al., 2020; Short et al.,
2010; Tong et al., 2016). The field of virology can now greatly benefit
from the development of readily standardizable and reproducible
pipelines for analyzing amplicon sequence datasets. Here, we present
vAMPirus, a freely available, powerful, and flexible bioinformatics tool
that streamlines the processing, analysis, and visualization of virus
gene amplicon data. The availability of diverse bioinformatics
approaches and tools within the vAMPirus program (e.g., ASV
calling, clustering, translation, phylogenetic clustering) empowers the
user to adapt and set informed standards for their study system and
easily share these standards with colleagues. With a user-friendly
design and robust documentation, vAMPirus democratizes comprehensive
virus amplicon sequencing analyses, making it a timely and valuable tool
for virologists.
To inform virus amplicon data analyses, virologists have primarily
relied on pipelines and tutorials geared towards bacterial or
microeukaryote amplicon data (e.g., mothur (Schloss, 2020) and QIIME2
(Bolyen et al., 2019)). Although valuable insights have been made using
these resources, an accessible virus-focused amplicon analysis pipeline
will advance the field by offering (1) automated pipelines that
standardize approaches for viral amplicon analyses (e.g., ASV and
aminotype calling); (2) non-cluster-based alternatives to partitioning
virus gene sequences (e.g., MED and phylogrouping); and (3)
virus-focused taxonomy databases. Virus amplicon analyses have
traditionally applied de novo clustering of marker gene sequences
into OTUs based on a percent identity value (e.g., 97%
nucleotide identity; Callahan et al., 2017). However, clustering virus
amplicons into biologically accurate de novo OTUs is challenging
because the optimal clustering percentage is often unknown. vAMPirus provides
users with the opportunity to transition from traditional de novo OTUs
in virus amplicon sequencing analyses to using ASVs and aminotypes.
We have illustrated here that ASV and aminotype-based analyses generally
recapitulate findings generated via de novo OTU-based analyses
(Figures 3, 4; Supplemental Table S1), while enabling reproducibility
and cross-study comparisons (Callahan et al., 2017). Running analyses of
amino acid and nucleotide sequence data in tandem, which is possible in
vAMPirus, can aid in resolving virus phylogenies and reveal
non-synonymous mutations that indicate virus protein property
variability within a community (DeFilippis & Villarreal, 2000). This
synergistic approach has been effective in developing dinoRNAVs and
their dinoflagellate hosts (family Symbiodiniaceae, endosymbionts of
stony corals) as a nascent study system. To characterize dinoRNAVs,
studies have used the mcp gene, which has a high mutation rate
and is hypothesized to be important in host cell attachment (Tomaru et
al., 2004). vAMPirus aminotyping uncovered non-synonymous mutations in
dinoRNAV mcp sequences, which may represent phenotypic
differences that correlate with the distribution of host lineages across
reefs (Grupstra et al., 2022; Howe-Kerr et al., 2022, this study).
Aminotyping also effectively reduced noise from high mutation rates in
ASV results, revealing temperature-driven increases in dinoRNAV
infection productivity and community diversity across time and space
(Grupstra et al., 2022, time only; Howe-Kerr et al., 2022). By making
viral protein sequence analyses readily accessible in an amplicon
sequence analysis workflow, vAMPirus helps reveal biological patterns in
DNA and highly mutable RNA virus lineages by increasing the signal-to-noise
ratio in results (through the collapse of synonymous nucleotide mutations;
Wernersson & Pedersen, 2003).
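The collapse of synonymous nucleotide variation described above can be illustrated with a minimal Python sketch. This is not the vAMPirus implementation; the function names (translate, collapse_to_aminotypes) and the toy sequences are hypothetical, and the sketch assumes in-frame sequences and the standard genetic code. ASVs that differ only by synonymous substitutions translate to the same protein sequence and therefore fall into the same group.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt_seq):
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    seq = nt_seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def collapse_to_aminotypes(asvs):
    """Map each unique translation to the list of ASV ids that encode it."""
    groups = defaultdict(list)
    for asv_id, nt_seq in asvs.items():
        groups[translate(nt_seq)].append(asv_id)
    return dict(groups)


# Toy, made-up sequences (not real dinoRNAV data).
asvs = {
    "ASV1": "ATGGCTAAAGGT",
    "ASV2": "ATGGCCAAAGGT",  # GCT -> GCC is synonymous (Ala)
    "ASV3": "ATGGCTAGAGGT",  # AAA -> AGA is non-synonymous (Lys -> Arg)
}
aminotypes = collapse_to_aminotypes(asvs)
```

In this toy case, ASV1 and ASV2 share one translation while ASV3 forms its own group, so three nucleotide variants reduce to two protein-level units.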
The increasing application of amplicon sequencing to the study of
microbial diversity and dynamics has spurred efforts to improve the
proficiency of tools that parse marker gene data. Such tools include the
programs TreeCluster (Balaban et al., 2019) and oligotyping (Eren et
al., 2015), which were developed as alternatives to de novo
clustering for partitioning genetic sequences into distinct units. In
vAMPirus, these programs are utilized to assign ASVs and aminotypes to
phylo- or MED groups based on user-set criteria (see Section 2.2.2).
Assignment of ASV/aminotype sequences to groups rather than use of
cluster representative sequences in analyses (as in the case of
de novo OTUs and cASVs; Callahan et al., 2017) is done by
vAMPirus to maintain reproducibility and comparability of results, while
still permitting virus sequence classification into phylogenetically or
ecologically distinct groups. These grouping approaches are instrumental
for investigators because they can expose underlying patterns obscured
by high sequence diversity (e.g., lactococcal phage phylogrouping
results, Section 3, Figure 4). Phylogeny-based sequence clustering with
TreeCluster has been applied to assess the diversity of microorganisms
(and barley, Chen et al., 2022) and has been used to resolve virus
transmission dynamics (HIV, Balaban et al., 2019; SARS-CoV-2, Plyusnin
et al., 2022) and phylogenies (Ni et al., 2023). However, TreeCluster’s
potential utility for virus amplicon analyses is, for the most part,
untapped. The inclusion of TreeCluster in the vAMPirus pipeline also
opens the door to epidemiological insights, such as virus genetic
linkage, transmission dynamics, and subpopulation mixing, from viral
datasets (Balaban et al., 2019; Bezemer et al., 2015; Eshleman et al.,
2011; Hué et al., 2014). Similarly, the program oligotyping developed by
Eren et al. (2015) has been applied extensively to investigate
microorganism diversity from marker gene data (cited 332 times as of
February 2, 2023, Web of Science). However, only one published study has
applied Minimum Entropy Decomposition sequence clustering with
oligotyping to virus amplicon data (Needham et al., 2017). The MED
grouping with oligotyping option provided by vAMPirus is a powerful
approach for deciphering virus community diversity because it enables
the grouping of sequences based on potential physiologically and/or
ecologically relevant similarities. For example, users can identify gene
sequence positions with non-synonymous mutations via aminotyping and
then specify these positions in MED grouping to partition sequences into
units of similar protein phenotypes (e.g., host cell attachment; see
Harvey et al., 2021). The option to incorporate cutting-edge
bioinformatic approaches, such as phylogrouping and MED grouping, into
analyses of virus amplicon data makes vAMPirus a highly useful
“raw-reads-to-results” environmental virology workflow.
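The idea of partitioning sequences by information-rich alignment positions can be sketched in a few lines of Python. This is a simplified illustration only: it does not call the oligotyping program and does not reproduce the iterative decomposition of Minimum Entropy Decomposition. The function names (column_entropy, group_by_positions) and the toy alignment are hypothetical. It computes the Shannon entropy of each alignment column and then groups sequences by the residues found at the chosen positions.

```python
import math
from collections import Counter, defaultdict


def column_entropy(aligned_seqs):
    """Shannon entropy (in bits) of each column of an equal-length alignment."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    n = len(aligned_seqs)
    entropies = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        entropies.append(-sum((c / n) * math.log2(c / n) for c in counts.values()))
    return entropies


def group_by_positions(aligned, positions):
    """Group sequence ids by the residues found at the given 0-based positions."""
    groups = defaultdict(list)
    for seq_id, seq in aligned.items():
        key = "".join(seq[p] for p in positions)
        groups[key].append(seq_id)
    return dict(groups)


# Toy, made-up aligned protein sequences.
aligned = {
    "aminotype1": "MAKG",
    "aminotype2": "MARG",
    "aminotype3": "MAKG",
    "aminotype4": "MARG",
}
entropy = column_entropy(list(aligned.values()))
# Columns with entropy above zero are the variable positions.
variable = [i for i, h in enumerate(entropy) if h > 0]
groups = group_by_positions(aligned, variable)
```

Here only the third column varies, so the four sequences split into two groups defined by the residue at that position. A user-supplied list of positions could be passed to group_by_positions in place of the entropy-derived one.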
vAMPirus is an easy-to-use, open-source, and flexible tool that
streamlines and simplifies the process of analyzing viral amplicon data.
vAMPirus is designed to be community-driven; new features and programs
(e.g., built-in lineage specific configuration files or databases, new
bioinformatic tools) can easily be implemented at the request of
investigators or when advances in best practices are made. vAMPirus
advances studies of viral community diversity by facilitating informed
analyses of amplicon sequence data with its DataCheck and Analyze
pipelines in a standardized and reproducible manner.
Acknowledgments
The authors would like to thank Jan F. Finke, Curtis A. Suttle, Julia A.
Gustavsen, Cyril Frantzen, Helge Holo, Florian Prodinger and Hiroyuki
Ogata (and their respective co-authors) for providing raw virus amplicon
sequence data for vAMPirus testing and for permission to reprint
original figures. We thank Samantha R. Coy for feedback and insight
during early stages of vAMPirus development, and Nikolaos Schizas for
access to computational resources during vAMPirus development. This work
represents a contribution of the Moorea Coral Reef (MCR) LTER Site (NSF
OCE 16-37396). This research was funded by a U.S. National Science
Foundation Grant OCE 22-24354 (and earlier awards) to the Moorea Coral
Reef LTER as well as a generous gift from the Gordon and Betty Moore
Foundation. Research was completed under permits issued by the
Territorial Government of French Polynesia (Délégation à la Recherche)
and the Haut-Commissariat de la République en Polynésie Française (DTRT)
(MCR LTER Protocole d’Accueil 2005–2022; Adrienne Correa
Protocole d’Accueil 2013–2019), and we thank
the Délégation à la Recherche and DTRT for their continued support.
Start-up funds from Rice University, an NSF CAREER Award (OCE-2145472)
and an Early-Career Research Fellowship (#2000009651) from the Gulf
Research Program of the National Academies of Sciences to AMSC also
contributed to this work. Additional funding was provided by Lewis and
Clark Grants for Exploration to LHK and CGG, and Wagoner Foreign Study
awards to LHK, CGG, and AJV. The Kirk W. Dotson Endowed Graduate
Fellowship in Ecology and Evolutionary Biology also helped support AJV
as this work was conducted. RERV acknowledges funding from the European
Union’s Horizon 2020 research and innovation programme under the Marie
Skłodowska-Curie grant agreement No. 764840 (ITN IGNITE).
References
Adriaenssens, E. M., & Cowan, D. A. (2014). Using signature genes as
tools to assess environmental viral ecology and diversity. Applied
and Environmental Microbiology, 80(15), 4470–4480.
https://doi.org/10.1128/AEM.00878-14
Balaban, M., Moshiri, N., Mai, U., Jia, X., & Mirarab, S. (2019).
TreeCluster: Clustering biological sequences using phylogenetic trees.
PLOS ONE, 14(8), e0221068.
https://doi.org/10.1371/journal.pone.0221068
Bezemer, D., Cori, A., Ratmann, O., Sighem, A. van, Hermanides, H. S.,
Dutilh, B. E., Gras, L., Faria, N. R., Hengel, R. van den, Duits, A. J.,
Reiss, P., Wolf, F. de, Fraser, C., & ATHENA observational cohort. (2015).
Dispersion of the HIV-1 Epidemic in Men Who Have Sex with Men in the
Netherlands: A Combined Mathematical Model and Phylogenetic Analysis.
PLOS Medicine, 12(11), e1001898.
https://doi.org/10.1371/journal.pmed.1001898
Bigot, T., Temmam, S., Pérot, P., & Eloit, M. (2020). RVDB-prot,
a reference viral protein database and its HMM profiles (8:530).
F1000Research. https://doi.org/10.12688/f1000research.18776.2
Bolyen, E., Rideout, J. R., Dillon, M. R., Bokulich, N. A., Abnet, C.
C., Al-Ghalith, G. A., Alexander, H., Alm, E. J., Arumugam, M., Asnicar,
F., Bai, Y., Bisanz, J. E., Bittinger, K., Brejnrod, A., Brislawn, C.
J., Brown, C. T., Callahan, B. J., Caraballo-Rodríguez, A. M., Chase,
J., … Caporaso, J. G. (2019). Reproducible, interactive, scalable
and extensible microbiome data science using QIIME 2. Nature
Biotechnology, 37(8), 852–857.
https://doi.org/10.1038/s41587-019-0209-9
Braga, L. P. P., Spor, A., Kot, W., Breuil, M.-C., Hansen, L. H.,
Setubal, J. C., & Philippot, L. (2020). Impact of phages on soil
bacterial communities and nitrogen availability under different assembly
scenarios. Microbiome, 8(1), 52.
https://doi.org/10.1186/s40168-020-00822-z
Breitbart, M., Bonnain, C., Malki, K., & Sawaya, N. A. (2018). Phage
puppet masters of the marine microbial realm. Nature
Microbiology, 3(7), 754–766.
Brister, J. R., Ako-Adjei, D., Bao, Y., & Blinkova, O. (2015). NCBI
viral genomes resource. Nucleic Acids Research, 43(Database issue),
D571–577. https://doi.org/10.1093/nar/gku1207
Buchfink, B., Xie, C., & Huson, D. H. (2015). Fast and sensitive
protein alignment using DIAMOND. Nature Methods, 12(1),
Article 1. https://doi.org/10.1038/nmeth.3176
Callahan, B. J., McMurdie, P. J., & Holmes, S. P. (2017). Exact
sequence variants should replace operational taxonomic units in
marker-gene data analysis. The ISME Journal, 11(12),
Article 12. https://doi.org/10.1038/ismej.2017.119
Capella-Gutiérrez, S., Silla-Martínez, J. M., & Gabaldón, T. (2009).
trimAl: A tool for automated alignment trimming in large-scale
phylogenetic analyses. Bioinformatics, 25(15), 1972–1973.
https://doi.org/10.1093/bioinformatics/btp348
Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: An ultra-fast
all-in-one FASTQ preprocessor. Bioinformatics, 34(17),
i884–i890. https://doi.org/10.1093/bioinformatics/bty560
Chen, Y.-Y., Schreiber, M., Bayer, M. M., Dawson, I. K., Hedley, P. E.,
Lei, L., Akhunova, A., Liu, C., Smith, K. P., Fay, J. C., Muehlbauer, G.
J., Steffenson, B. J., Morrell, P. L., Waugh, R., & Russell, J. R.
(2022). The evolutionary patterns of barley pericentromeric chromosome
regions, as shaped by linkage disequilibrium and domestication.
The Plant Journal, 111(6), 1580–1594.
https://doi.org/10.1111/tpj.15908
Correa, A. M. S., Howard-Varona, C., Coy, S. R., Buchan, A., Sullivan,
M. B., & Weitz, J. S. (2021). Revisiting the rules of life for viruses
of microorganisms. Nature Reviews Microbiology, 19(8),
501–513. https://doi.org/10.1038/s41579-021-00530-x
Correa, A. M. S., Welsh, R. M., & Vega Thurber, R. L. (2013). Unique
nucleocytoplasmic dsDNA and +ssRNA viruses are associated with the
dinoflagellate endosymbionts of corals. The ISME Journal,
7(1), Article 1. https://doi.org/10.1038/ismej.2012.75
Darriba, D., Posada, D., Kozlov, A. M., Stamatakis, A., Morel, B., &
Flouri, T. (2020). ModelTest-NG: A New and Scalable Tool for the
Selection of DNA and Protein Evolutionary Models. Molecular
Biology and Evolution, 37(1), 291–294.
https://doi.org/10.1093/molbev/msz189
DeFilippis, V. R., & Villarreal, L. P. (2000). An Introduction to the
Evolutionary Ecology of Viruses. Viral Ecology, 125–208.
https://doi.org/10.1016/B978-012362675-2/50005-7
Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E.,
& Notredame, C. (2017). Nextflow enables reproducible computational
workflows. Nature Biotechnology, 35(4), Article 4.
https://doi.org/10.1038/nbt.3820
Domingo, E., & Perales, C. (2019). Viral quasispecies. PLOS
Genetics, 15(10), e1008271.
https://doi.org/10.1371/journal.pgen.1008271
Edgar, R. C. (2010). Search and clustering orders of magnitude faster
than BLAST. Bioinformatics, 26(19), 2460–2461.
https://doi.org/10.1093/bioinformatics/btq461
Edgar, R. C. (2016a). UCHIME2: Improved chimera prediction for
amplicon sequencing (p. 074252). bioRxiv.
https://doi.org/10.1101/074252
Edgar, R. C. (2016b). UNOISE2: Improved error-correction for
Illumina 16S and ITS amplicon sequencing (p. 081257). bioRxiv.
https://doi.org/10.1101/081257
Edgar, R. C. (2021). MUSCLE v5 enables improved estimates of
phylogenetic tree confidence by ensemble bootstrapping (p.
2021.06.20.449169). bioRxiv. https://doi.org/10.1101/2021.06.20.449169
Eren, A. M., Morrison, H. G., Lescault, P. J., Reveillaud, J., Vineis,
J. H., & Sogin, M. L. (2015). Minimum entropy decomposition:
Unsupervised oligotyping for sensitive partitioning of high-throughput
marker gene sequences. The ISME Journal, 9(4), Article 4.
https://doi.org/10.1038/ismej.2014.195
Eshleman, S. H., Hudelson, S. E., Redd, A. D., Wang, L., Debes, R.,
Chen, Y. Q., Martens, C. A., Ricklefs, S. M., Selig, E. J., Porcella, S.
F., Munshaw, S., Ray, S. C., Piwowar-Manning, E., McCauley, M.,
Hosseinipour, M. C., Kumwenda, J., Hakim, J. G., Chariyalertsak, S., de
Bruyn, G., … Hughes, J. P. (2011). Analysis of Genetic Linkage of
HIV From Couples Enrolled in the HIV Prevention Trials Network 052
Trial. The Journal of Infectious Diseases, 204(12),
1918–1926. https://doi.org/10.1093/infdis/jir651
Finke, J. F., & Suttle, C. A. (2019). The Environment and Cyanophage
Diversity: Insights From Environmental Sequencing of DNA Polymerase.
Frontiers in Microbiology, 10.
https://www.frontiersin.org/article/10.3389/fmicb.2019.00167
Frantzen, C. A., & Holo, H. (2019). Unprecedented Diversity of
Lactococcal Group 936 Bacteriophages Revealed by Amplicon Sequencing of
the Portal Protein Gene. Viruses, 11(5), Article 5.
https://doi.org/10.3390/v11050443
Fu, L., Niu, B., Zhu, Z., Wu, S., & Li, W. (2012). CD-HIT: Accelerated
for clustering the next-generation sequencing data.
Bioinformatics, 28(23), 3150–3152.
https://doi.org/10.1093/bioinformatics/bts565
Grupstra, C. G. B., Lemoine, N. P., Cook, C., & Correa, A. M. S.
(2022). Thank you for biting: Dispersal of beneficial microbiota through
“antagonistic” interactions. Trends in Microbiology,
30(10), 930–939. https://doi.org/10.1016/j.tim.2022.03.006
Grupstra, C. G. B., Rabbitt, K. M., Howe-Kerr, L. I., & Correa, A. M.
S. (2021). Fish predation on corals promotes the dispersal of coral
symbionts. Animal Microbiome, 3(1), 25.
https://doi.org/10.1186/s42523-021-00086-4
Grupstra, C. G., Howe-Kerr, L. I., Veglia, A. J., Bryant, R. L., Coy, S.
R., Blackwelder, P. L., & Correa, A. (2022). Thermal stress triggers
productive viral infection of a key coral reef symbiont. The ISME
Journal, 1–12.
Gustavsen, J. A., & Suttle, C. A. (2021). Role of Phylogenetic
Structure in the Dynamics of Coastal Viral Assemblages. Applied
and Environmental Microbiology, 87(11), e02704-20.
https://doi.org/10.1128/AEM.02704-20
Harvey, W. T., Carabelli, A. M., Jackson, B., Gupta, R. K., Thomson, E.
C., Harrison, E. M., Ludden, C., Reeve, R., Rambaut, A., Peacock, S. J.,
& Robertson, D. L. (2021). SARS-CoV-2 variants, spike mutations and
immune escape. Nature Reviews Microbiology, 19(7), Article
7. https://doi.org/10.1038/s41579-021-00573-0
Howe-Kerr, L. I. (2022). Viruses of a key coral symbiont exhibit
temperature-driven productivity across a reefscape.
https://doi.org/10.21203/rs.3.rs-1899377/v1
Hué, S., Brown, A. E., Ragonnet-Cronin, M., Lycett, S. J., Dunn, D. T.,
Fearnhill, E., Dolling, D. I., Pozniak, A., Pillay, D., Delpech, V. C.,
& Leigh Brown, A. J. (2014). Phylogenetic analyses reveal HIV-1
infections between men misclassified as heterosexual transmissions.
AIDS, 28(13), 1967.
https://doi.org/10.1097/QAD.0000000000000383
Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A., &
Jermiin, L. S. (2017). ModelFinder: Fast model selection for accurate
phylogenetic estimates. Nature Methods, 14(6), Article 6.
https://doi.org/10.1038/nmeth.4285
Labadie, T., Batéjat, C., Leclercq, I., & Manuguerra, J.-C. (2020).
Historical Discoveries on Viruses in the Environment and Their Impact on
Public Health. Intervirology, 63(1–6), 17–32.
https://doi.org/10.1159/000511575
Li, W., & Godzik, A. (2006). Cd-hit: A fast program for clustering and
comparing large sets of protein or nucleotide sequences.
Bioinformatics, 22(13), 1658–1659.
https://doi.org/10.1093/bioinformatics/btl158
Li, Y., Hingamp, P., Watai, H., Endo, H., Yoshida, T., & Ogata, H.
(2018). Degenerate PCR Primers to Reveal the Diversity of Giant Viruses
in Coastal Waters. Viruses, 10(9), 496.
https://doi.org/10.3390/v10090496
Metcalf, T. G., Melnick, J. L., & Estes, M. K. (1995). ENVIRONMENTAL
VIROLOGY: From Detection of Virus in Sewage and Water by Isolation to
Identification by Molecular Biology—A Trip of Over 50 Years.
Annual Review of Microbiology, 49(1), 461–487.
https://doi.org/10.1146/annurev.mi.49.100195.002333
Minh, B. Q., Schmidt, H. A., Chernomor, O., Schrempf, D., Woodhams, M.
D., von Haeseler, A., & Lanfear, R. (2020). IQ-TREE 2: New Models and
Efficient Methods for Phylogenetic Inference in the Genomic Era.
Molecular Biology and Evolution, 37(5), 1530–1534.
https://doi.org/10.1093/molbev/msaa015
Montalvo-Proaño, J., Buerger, P., Weynberg, K. D., & van Oppen, M. J.
H. (2017). A PCR-Based Assay Targeting the Major Capsid Protein Gene of
a Dinorna-Like ssRNA Virus That Infects Coral Photosymbionts.
Frontiers in Microbiology, 8.
https://www.frontiersin.org/articles/10.3389/fmicb.2017.01665
Needham, D. M., Sachdeva, R., & Fuhrman, J. A. (2017). Ecological
dynamics and co-occurrence among marine phytoplankton, bacteria and
myoviruses shows microdiversity matters. The ISME Journal,
11(7), Article 7. https://doi.org/10.1038/ismej.2017.29
Ni, X.-B., Cui, X.-M., Liu, J.-Y., Ye, R.-Z., Wu, Y.-Q., Jiang, J.-F.,
Sun, Y., Wang, Q., Shum, M. H.-H., Chang, Q.-C., Zhao, L., Han, X.-H.,
Ma, K., Shen, S.-J., Zhang, M.-Z., Guo, W.-B., Zhu, J.-G., Zhan, L., Li,
L.-J., … Cao, W.-C. (2023). Metavirome of 31 tick species
provides a compendium of 1,801 RNA virus genomes. Nature
Microbiology, 8(1), Article 1.
https://doi.org/10.1038/s41564-022-01275-w
O’Leary, N. A., Wright, M. W., Brister, J. R., Ciufo, S., Haddad, D.,
McVeigh, R., Rajput, B., Robbertse, B., Smith-White, B., Ako-Adjei, D.,
Astashyn, A., Badretdin, A., Bao, Y., Blinkova, O., Brover, V.,
Chetvernin, V., Choi, J., Cox, E., Ermolaeva, O., … Pruitt, K. D.
(2016). Reference sequence (RefSeq) database at NCBI: Current status,
taxonomic expansion, and functional annotation. Nucleic Acids
Research, 44(D1), D733–745. https://doi.org/10.1093/nar/gkv1189
Paez-Espino, D., Chen, I.-M. A., Palaniappan, K., Ratner, A., Chu, K.,
Szeto, E., Pillay, M., Huang, J., Markowitz, V. M., Nielsen, T.,
Huntemann, M., K. Reddy, T. B., Pavlopoulos, G. A., Sullivan, M. B.,
Campbell, B. J., Chen, F., McMahon, K., Hallam, S. J., Denef, V.,
… Kyrpides, N. C. (2017). IMG/VR: A database of cultured and
uncultured DNA Viruses and retroviruses. Nucleic Acids Research,
45(D1), gkw1030. https://doi.org/10.1093/nar/gkw1030
Plyusnin, I., Truong Nguyen, P. T., Sironen, T., Vapalahti, O., Smura,
T., & Kant, R. (2022). ClusTRace, a bioinformatic pipeline for
analyzing clusters in virus phylogenies. BMC Bioinformatics,
23(1), 196. https://doi.org/10.1186/s12859-022-04709-8
Prodinger, F., Endo, H., Gotoh, Y., Li, Y., Morimoto, D., Omae, K.,
Tominaga, K., Blanc-Mathieu, R., Takano, Y., Hayashi, T., Nagasaki, K.,
Yoshida, T., & Ogata, H. (2020). An Optimized Metabarcoding Method for
Mimiviridae. Microorganisms, 8(4), Article 4.
https://doi.org/10.3390/microorganisms8040506
Rognes, T., Flouri, T., Nichols, B., Quince, C., & Mahé, F. (2016).
VSEARCH: A versatile open source tool for metagenomics. PeerJ,
4, e2584. https://doi.org/10.7717/peerj.2584
Schloss, P. D. (2020). Reintroducing mothur: 10 Years Later.
Applied and Environmental Microbiology, 86(2).
https://doi.org/10.1128/AEM.02343-19
Schoch, C. L., Ciufo, S., Domrachev, M., Hotton, C. L., Kannan, S.,
Khovanskaya, R., Leipe, D., Mcveigh, R., O’Neill, K., Robbertse, B.,
Sharma, S., Soussov, V., Sullivan, J. P., Sun, L., Turner, S., &
Karsch-Mizrachi, I. (2020). NCBI Taxonomy: A comprehensive update on
curation, resources and tools. Database: The Journal of Biological
Databases and Curation, 2020, baaa062.
https://doi.org/10.1093/database/baaa062
Shannon, C. E. (1948). A mathematical theory of communication. The
Bell System Technical Journal, 27(3), 379–423.
https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Short, S. M., Chen, F., & Wilhelm, S. (2010). The construction and
analysis of marker gene libraries. Manual of Aquatic Viral
Ecology, 82–91.
Suttle, C. A. (2007). Marine viruses—Major players in the global
ecosystem. Nature Reviews Microbiology, 5(10), 801–812.
https://doi.org/10.1038/nrmicro1750
Thurber, R. V., Payet, J. P., Thurber, A. R., & Correa, A. M. S.
(2017). Virus–host interactions and their roles in coral reef health
and disease. Nature Reviews Microbiology, 15(4), 205–216.
https://doi.org/10.1038/nrmicro.2016.176
Tomaru, Y., Katanozaka, N., Nishida, K., Shirai, Y., Tarutani, K.,
Yamaguchi, M., & Nagasaki, K. (2004). Isolation and characterization of
two distinct types of HcRNAV, a single-stranded RNA virus infecting the
bivalve-killing microalga Heterocapsa circularisquama. Aquatic
Microbial Ecology, 34(3), 207–218.
https://doi.org/10.3354/ame034207
Tong, Y., Liu, B., Liu, H., Zheng, H., Gu, J., Liu, H., Lin, M., Ding,
Y., Song, C., & Li, Y. (2016). New universal primers for genotyping and
resistance detection of low HBV DNA levels. Medicine,
95(33), e4618. https://doi.org/10.1097/MD.0000000000004618
Uyaguari-Diaz, M. I., Chan, M., Chaban, B. L., Croxen, M. A., Finke, J.
F., Hill, J. E., Peabody, M. A., Van Rossum, T., Suttle, C. A.,
Brinkman, F. S. L., Isaac-Renton, J., Prystajecky, N. A., & Tang, P.
(2016). A comprehensive method for amplicon-based and metagenomic
characterization of viruses, bacteria, and eukaryotes in freshwater
samples. Microbiome, 4(1), 20.
https://doi.org/10.1186/s40168-016-0166-1
Veglia, A. J., Bistolas, K. S. I., Voolstra, C. R., Hume, B. C. C.,
Planes, S., Allemand, D., Boissin, E., Wincker, P., Poulain, J., Moulin,
C., Bourdin, G., Iwankow, G., Romac, S., Agostini, S., Banaigs, B.,
Boss, E., Bowler, C., Vargas, C. de, Douville, E., … Thurber, R.
L. V. (2022). Endogenous viral elements reveal associations
between a non-retroviral RNA virus and symbiotic dinoflagellate genomes
(p. 2022.04.11.487905). bioRxiv.
https://doi.org/10.1101/2022.04.11.487905
Wernersson, R. (2006). Virtual Ribosome—A comprehensive DNA
translation tool with support for integration of sequence feature
annotation. Nucleic Acids Research, 34(suppl_2),
W385–W388. https://doi.org/10.1093/nar/gkl252
Wernersson, R., & Pedersen, A. G. (2003). RevTrans: Multiple alignment
of coding DNA from aligned amino acid sequences. Nucleic Acids
Research, 31(13), 3537–3539.
Xie, Y., Allaire, J. J., & Grolemund, G. (2018). R Markdown: The
Definitive Guide. Chapman and Hall/CRC, Boca Raton, Florida.
https://bookdown.org/yihui/rmarkdown/
Zayed, A. A., Wainaina, J. M., Dominguez-Huerta, G., Pelletier, E., Guo,
J., Mohssen, M., Tian, F., Pratama, A. A., Bolduc, B., Zablocki, O.,
Cronin, D., Solden, L., Delage, E., Alberti, A., Aury, J.-M., Carradec,
Q., da Silva, C., Labadie, K., Poulain, J., … Sullivan, M. B.
(2022). Cryptic and abundant marine viruses at the evolutionary origins
of Earth’s RNA virome. Science, 376(6589), 156–162.
https://doi.org/10.1126/science.abm5847
Conflict of Interest
The authors declare that they have no financial conflict of interest
with the content of this article.
Authors’ Contributions
AJV, CGG, and LHK conceived of the program with support from AMSC; CGG and
LHK contributed R code used in the vAMPirus reports; RERV contributed R
code and helped execute vAMPirus incorporation into Nextflow; CGG, LHK,
and AMSC processed samples and generated the RNA virus dataset; AJV
designed the pipelines with input from CGG and LHK; AJV wrote bash and R
code used in the program, analyzed data, and wrote the initial draft of
the manuscript, with contributions by all authors.
Data availability
Source code, scripts, and help documentation are available online at
github.com/aveglia/vAMPirus. RNA virus sequencing libraries are
available on NCBI SRA associated with the BioProject PRJNA923642 as
well as in the vAMPirus Analysis Repository
(doi.org/10.5281/zenodo.7574173). All non-read files required to
reproduce all analyses and results described in this manuscript can be
found on the vAMPirus Analysis Repository
(doi.org/10.5281/zenodo.7574173).
ORCID
AJV - 0000-0003-3118-5127
RERV - 0000-0002-6229-3537
CGG - 0000-0001-5083-4570
LHK - 0000-0002-8086-5869
AMSC - 0000-0003-0137-5042