Introduction
SARS-CoV-2 spread across 188 countries/regions within the first six months of the COVID-19 pandemic, infecting more than 354 million people (Dong, Du, & Gardner, 2020). This highly infectious virus possesses a single-stranded, positive-sense RNA genome of nearly 30 kb (Mousavizadeh & Ghasemi, 2020). Both synonymous and non-synonymous mutations have been identified in the genomic regions that code for non-structural proteins (NSP1-16), structural proteins (spike, membrane, envelope, and nucleocapsid proteins), and/or seven other accessory proteins (ORF3a, ORF6, ORF7a, ORF7b, ORF8a, ORF8b, ORF8, and ORF10) (M. R. Islam et al., 2020; Kamitani, 2020; Liu, Fung, Chong, Shukla, & Hilgenfeld, 2014; Ou et al., 2020). Researchers have demonstrated that the predominant mutations may contribute to virulence (Alam, Islam, Hasan, et al., 2020; Rahman et al., 2020; Q. Wang et al., 2020). The Global Initiative on Sharing All Influenza Data (GISAID) (Shu & McCauley, 2017) has classified the virus into six clades, namely GH, GR, G, V, L, and S, on the basis of clustered, co-evolving, clade-defining point mutations.
The G clade is defined by the mutation C241T together with C3037T, C14408T (RdRp:p.P323L), and A23403G (S:p.D614G). An additional mutation on top of the G clade markers, either N protein:p.RG203-204KR (GGG28881-28883AAC) or ORF3a:p.Q57H (G25563T), defines the GR or GH clade, respectively. The V clade is defined by the co-evolving mutations G11083T (NSP6:p.L37F) and G26144T (ORF3a:p.G251V), whereas S clade strains carry the C8782T and T28144C (NS8:p.L84S) variations. L clade strains carry the original (wild-type) nucleotides at the positions that define the other five clades (Mercatelli & Giorgi, 2020).
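The clade definitions above can be written as a simple rule set. The following Python sketch is illustrative only and is not part of the assay described in this study: the function name assign_clade, the string encoding of mutations, and the "unassigned" label for genomes that fit no rule (or that carry both the GR and GH markers) are assumptions introduced here. It also assumes that G-family markers are checked before the V and S markers.

```python
# Illustrative sketch: assign a clade from a set of observed mutations,
# using only the marker mutations listed in the text above.
# Names and the "unassigned" fallback are hypothetical.

G_MARKERS = frozenset({"C241T", "C3037T", "C14408T", "A23403G"})
GR_MARKER = "GGG28881-28883AAC"   # N protein:p.RG203-204KR
GH_MARKER = "G25563T"             # ORF3a:p.Q57H
V_MARKERS = frozenset({"G11083T", "G26144T"})
S_MARKERS = frozenset({"C8782T", "T28144C"})
ALL_MARKERS = G_MARKERS | {GR_MARKER, GH_MARKER} | V_MARKERS | S_MARKERS


def assign_clade(mutations):
    """Return 'G', 'GR', 'GH', 'V', 'S', 'L', or 'unassigned'."""
    observed = set(mutations)

    # G-family clades require all four G markers.
    if G_MARKERS <= observed:
        has_gr = GR_MARKER in observed
        has_gh = GH_MARKER in observed
        if has_gr and has_gh:
            return "unassigned"  # assumption: conflicting markers are not resolved
        if has_gr:
            return "GR"
        if has_gh:
            return "GH"
        return "G"

    if V_MARKERS <= observed:
        return "V"
    if S_MARKERS <= observed:
        return "S"

    # L clade: wild type at every clade-defining position.
    if not (observed & ALL_MARKERS):
        return "L"

    # Partial marker sets do not match any definition given above.
    return "unassigned"
```

For example, a genome carrying the four G markers plus G25563T would be assigned to GH, while a genome carrying none of the listed markers would be assigned to L. Mutations outside the marker list are ignored by this sketch.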
Previous studies showed that the prevalence of phylogenetic clades differed by region and time and was closely related to the variable death-case ratio (Alam et al., 2020; Toyoshima et al., 2020). In the early phase of the pandemic, the G clade variant was predominant in Europe (Korber et al., 2020) and the USA (Brufsky, 2020), where this clade caused high mortality. This variant has gradually been spreading in Southeast Asia (Alam et al., 2020; Islam et al., 2020) and Oceania (Mercatelli & Giorgi, 2020). In contrast, the GR and GH clades emerged at the end of February 2020, and the GR variant is now the leading type, causing more than one-third of infections globally (Mercatelli & Giorgi, 2020). It is therefore essential to identify the clades circulating in a specific region. In addition, several reports have suggested that SARS-CoV-2 reinfection can occur with phylogenetically different strains that belong to separate clades (Li et al., 2020; To et al., 2020). The dominance of a particular viral clade over others might determine virulence, disease severity, and infection dynamics (Alam et al., 2020). However, the implications of different clades for effective drug and vaccine development are yet to be clearly elucidated (Chellapandi & Saranya, 2020).
Identifying phylogenetic clades requires detecting specific mutations in the viral genome. This is done by whole-genome sequencing with next-generation sequencing (NGS), which had raised the number of sequences deposited in GISAID to 139,000 as of October 6, 2020. Another high-throughput NGS alternative is clade-based genetic barcoding, which targets PCR amplicons encompassing the featured mutations, as described by Guan et al. (2020). However, most laboratories in low-income countries have limited access to this state-of-the-art technique. A low-throughput, small-scale genotyping option is Sanger-based targeted sequencing (Alam, Islam, Rahman, Islam, & Hossain, 2020), but it is labor-intensive, time-consuming, inconvenient, and difficult to perform at low cost. Consequently, the distribution of circulating clades remains largely unreported for many countries, such as Afghanistan, Maldives, Iraq, Syria, Yemen, Ethiopia, Sudan, Zimbabwe, Bolivia, Paraguay, and Chile, most probably because of the lack of sequencing facilities and of technical personnel trained to perform this state-of-the-art technique. A PCR-based technique for discriminating point mutations, known as the amplification refractory mutation system (ARMS), has previously proven useful in identifying subtypes or clades of other respiratory viruses (Brister, Barnum, Reedy, Chambers, & Pusterla, 2019; Lee, Kim, Shin, & Song, 2016; W. Wang et al., 2009). In this study, we aimed to develop and validate a novel ARMS-based multiplex PCR to identify the clade-specific point mutations of the circulating SARS-CoV-2 clades.