Data analyses and statistics
Descriptive statistics were used to summarize the patients’ demographic, clinical and laboratory characteristics. Continuous variables (e.g. age) were expressed as median ± SD and were compared using the Mann-Whitney U test. Categorical variables were expressed as numbers and percentages and were compared using the χ² or Fisher’s exact test. Correlation and agreement between RAT and RT-qPCR results were calculated using Pearson’s correlation (r) and Cohen’s kappa (κ), respectively (Watson & Petrie, 2010). Measures of the diagnostic performance of the RAT (sensitivity, specificity, positive predictive value, negative predictive value, accuracy and likelihood ratio) for all subjects and for subject subgroups were calculated from contingency tables containing the numbers of each outcome. Confidence intervals (CI) were calculated using the Wilson-Brown method (Brown, Cai, & DasGupta, 2001). Participant categories based on Ct values were defined following a previous report (Nalumansi et al., 2020). A receiver operating characteristic (ROC) curve was generated to provide another assessment of the diagnostic power of the RAT. These two analyses were done using GraphPad Prism version 8.0.0 for Windows (GraphPad Software, San Diego, California, USA; www.graphpad.com). To investigate whether combining measurements of blood parameters would enhance the predictive accuracy of the RAT and thus raise its clinical utility, a support vector machine (SVM) model with Monte-Carlo cross-validation was applied as described previously (de Araujo et al., 2019), and the performance of the top-ranked combination (best model) was evaluated for sensitivity, specificity and accuracy using class probability analyses. This analysis was done on data from 68 subjects (the other 15 subjects had no data on any of the laboratory features).
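For illustration, the contingency-table measures and Cohen's kappa described above can be sketched in a few lines of Python. This is a minimal sketch and not the GraphPad Prism implementation: it assumes the plain Wilson score interval (Prism's Wilson-Brown variant may differ slightly), the function names are hypothetical, and the example counts are invented and are not data from this study.

```python
import math

def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a binomial proportion (95% by default)."""
    if total == 0:
        return (float("nan"), float("nan"))
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

def diagnostic_performance(tp, fp, fn, tn):
    """Diagnostic measures of an index test against a reference test.

    tp, fp, fn, tn are the cells of the 2x2 contingency table
    (index test result versus reference result).
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (tp + fn)
    return {
        "sensitivity": sensitivity,
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": specificity,
        "specificity_ci": wilson_interval(tn, tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / n,
        # Likelihood ratios are undefined (infinite) when the denominator is 0.
        "lr_positive": sensitivity / false_positive_rate
        if false_positive_rate > 0 else math.inf,
        "lr_negative": false_negative_rate / specificity
        if specificity > 0 else math.inf,
    }

def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return (observed - expected) / (1 - expected)

# Invented counts for illustration only (not study data).
example = diagnostic_performance(tp=40, fp=2, fn=10, tn=48)
print(example)
print(cohen_kappa(tp=40, fp=2, fn=10, tn=48))
```

The same functions can be applied to each subgroup table (e.g. the Ct-based categories) by passing that subgroup's counts.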
Random forest classification was used to identify the demographic and clinical parameters that are most important in discriminating individuals with positive results from those with negative results for both RAT and RT-qPCR. In both the SVM and random forest models, the singular value decomposition method was used to impute missing values (Stacklies, Redestig, Scholz, Walther, & Selbig, 2007). These analyses were done using the MetaboAnalyst online server (Pang, Chong, Li, & Xia, 2020).
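The Monte-Carlo cross-validation used with the SVM model amounts to repeated random sub-sampling: the data are split at random into training and test sets many times, and performance is averaged over the splits. The sketch below illustrates only that resampling scheme. It is not the MetaboAnalyst procedure: the classifier is a deliberately trivial single-feature threshold rule standing in for the SVM, and the function names, number of splits, test fraction and toy data are all assumptions made for illustration.

```python
import random

def fit_threshold(values, labels):
    """Stand-in classifier (NOT an SVM): midpoint between the two class means."""
    positives = [v for v, y in zip(values, labels) if y == 1]
    negatives = [v for v, y in zip(values, labels) if y == 0]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2

def predict_threshold(threshold, value):
    """Predict class 1 when the value lies above the fitted threshold."""
    return 1 if value > threshold else 0

def monte_carlo_cv(values, labels, fit, predict,
                   n_splits=50, test_fraction=1 / 3, seed=0):
    """Mean test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    indices = list(range(len(values)))
    n_test = max(1, round(len(values) * test_fraction))
    accuracies = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = fit([values[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, values[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / n_test)
    return sum(accuracies) / len(accuracies)

# Toy, perfectly separable data invented for illustration (not study data).
toy_values = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5]
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(monte_carlo_cv(toy_values, toy_labels, fit_threshold, predict_threshold))
```

Unlike k-fold cross-validation, the test sets of different splits may overlap, so a sample can be tested several times or not at all. The stand-in classifier also assumes both classes appear in every training set, which holds for the toy data here but would need a guard on small or unbalanced data.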