Results
The original search, performed on October 20, 2020, identified 3,323 potentially eligible reviews, of which 419 SRs were included in the final analysis (Fig 1). Of these, 414 reviews (207 pairs) and 384 reviews (192 pairs) were eligible for the analysis of CoE and effect size, respectively. The total number of trials included in the 414 reviews was 4,217 (1,814 before and 2,403 after the update); the mean number of trials per meta-analysis was 10 (minimum: 1; maximum: 133). The total number of participants was 3,057,956; the mean number of participants per meta-analysis was 10,506 (minimum: 16; maximum: 1,202,382). Interrater agreement (kappa) varied from 0.79 to 0.97.
Fig 2 compares CoE in the original and updated Cochrane reviews across all categories of CoE (Fig 2a) and dichotomized as very low/low vs. moderate/high (Fig 2b) according to GRADE criteria. Consistent with EBM principles, evidence judged to be of very low/low CoE had 2.1 times higher odds (1.19 to 4.12; p=0.0065) of being upgraded in future studies than moderate/high CoE (Fig 2b). Similarly, across all categories of CoE, the test for trend was highly significant, indicating an increased probability of change in CoE from very low to high CoE (p=0.0021 for linear trend). We observed no instance in which high or moderate quality evidence was re-assessed as very low quality evidence in the updated SR, whereas very low CoE was upgraded to moderate or high CoE in 9 of 39 updated SRs (Fig 2a).
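As an illustration of how an odds ratio of the kind reported for Fig 2b can be obtained from a 2x2 table, the following is a minimal sketch. The counts are hypothetical and are not the study data, and the Wald interval on the log scale is an assumption, since the interval method is not stated in this section.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald interval for the 2x2 table [[a, b], [c, d]].

    a, b: upgraded / not upgraded in the first group (e.g. very low/low CoE)
    c, d: upgraded / not upgraded in the second group (e.g. moderate/high CoE)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only (not taken from the study).
or_, lo, hi = odds_ratio_wald(20, 80, 10, 90)
print(f"OR = {or_:.2f} ({lo:.2f} to {hi:.2f})")
```

With sparse cells an exact or continuity-corrected method would usually be preferred over the Wald approximation.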
However, we detected no effect of change in CoE on the magnitude of treatment effects [ROR=1.02 (95% CI: 0.74 to 1.39) for change of CoE from very low/low to moderate/high vs. 1.02 (95% CI: 0.44 to 2.37) for change from moderate/high to very low/low CoE]. The test for difference between the subgroups was not significant (p=1) (Fig 3). Although, as explained earlier, GRADE typically groups CoE as moderate/high vs. low/very low from the perspective of guideline recommendations, we also attempted to compare the effect sizes at the two extremes of CoE: very low vs. high. Because we observed no study in which high CoE changed to very low CoE (Fig 2a), ROR could not be calculated for this comparison.
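To make the ROR concrete, a minimal sketch follows. It assumes that the ROR for one pair of reviews is the pooled odds ratio of the updated review divided by that of the original review, and that the two estimates are independent. Both are simplifying assumptions for illustration: an updated review usually contains the trials of the original review, so a real analysis would need to account for that overlap. The function name and the input values are hypothetical.

```python
import math

def ratio_of_odds_ratios(or_updated, se_log_updated,
                         or_original, se_log_original, z=1.96):
    """ROR = OR_updated / OR_original with an interval on the log scale.

    Assumes independent estimates, so the variances of the two log odds
    ratios simply add (an illustrative simplification).
    """
    log_ror = math.log(or_updated) - math.log(or_original)
    se_log_ror = math.sqrt(se_log_updated ** 2 + se_log_original ** 2)
    lower = math.exp(log_ror - z * se_log_ror)
    upper = math.exp(log_ror + z * se_log_ror)
    return math.exp(log_ror), lower, upper

# Hypothetical pooled estimates for one original/updated pair.
ror, lo, hi = ratio_of_odds_ratios(0.75, 0.10, 0.70, 0.15)
print(f"ROR = {ror:.2f} ({lo:.2f} to {hi:.2f})")
```

An ROR of 1 means the pooled effect did not change between the original and the updated review; an interval that spans 1 is compatible with no change.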
Nevertheless, dispersion in ROR was larger in meta-analyses in which CoE changed from moderate/high to very low/low than in the opposite direction. This was probably driven by the low power of the analysis rather than by a truly larger effect size when CoE changed from moderate/high to very low/low than the other way around. [We had half as many meta-analyses available for the assessment of ROR based on change of CoE from moderate/high to very low/low (n=16) as those in which CoE changed from very low/low to moderate/high (n=33).]
aROR was similar between the subgroups [median (IQR): 1.12 (1.07 to 1.57) vs. 1.21 (1.12 to 2.43)] (Fig 4a, Table 1). As in the case of ROR, we observed larger dispersion in aROR in meta-analyses in which CoE changed from moderate/high to very low/low than in the opposite direction (Fig 4a, Fig 4b).
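The median (IQR) summary above can be sketched as follows, assuming that aROR denotes the absolute ROR, that is, the ROR folded so that it is always at least 1 regardless of direction (exp of the absolute log ROR). This reading of aROR is an assumption made for illustration, as are the function names and the input values; the quartile convention may also differ from the one used in the analysis.

```python
import math
import statistics

def absolute_ror(ror):
    """Fold an ROR around 1 so that 0.5 and 2.0 both map to 2.0."""
    return math.exp(abs(math.log(ror)))

def median_iqr(rors):
    """Median and quartiles of the absolute RORs of a set of review pairs."""
    arors = [absolute_ror(r) for r in rors]
    # 'inclusive' treats the data as the whole population of interest.
    q1, _, q3 = statistics.quantiles(arors, n=4, method="inclusive")
    return statistics.median(arors), q1, q3

# Hypothetical RORs for illustration only.
median, q1, q3 = median_iqr([0.9, 1.1, 0.6, 1.8, 1.05])
print(f"aROR median (IQR): {median:.2f} ({q1:.2f} to {q3:.2f})")
```

Because the absolute ROR discards direction, it measures how much effect estimates moved between versions, not whether they grew or shrank.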
The meta-analyses with no change in CoE had similar ROR [ROR=1.01 (95% CI: 0.85 to 1.21)] (Fig 3b) and aROR [median (IQR): 1.13 (1.04 to 1.66)] (Table 1, App Fig 4 and App Fig 4a) to those MAs in which CoE changed (Fig 4 and App Fig 4a). Inconsistency was large across all meta-analytic estimates (I²=99%). Likewise, the ratio of standard errors was > 1 [median: 1.09; IQR: 0.72 to 1.46], indicating imprecision in the estimates.
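For readers less familiar with the inconsistency statistic, the sketch below shows the standard calculation of I² from Cochran's Q under inverse-variance weighting. The inputs are hypothetical log effect estimates and standard errors, not the study data, and the function name is illustrative.

```python
def i_squared(log_effects, standard_errors):
    """I-squared (%) from Cochran's Q with inverse-variance weights."""
    if len(log_effects) != len(standard_errors) or len(log_effects) < 2:
        raise ValueError("need at least two estimates with matching SEs")
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    if q <= 0:
        return 0.0
    # Share of total variation beyond what chance alone would produce.
    return max(0.0, (q - df) / q) * 100

# Hypothetical log RORs and standard errors for illustration only.
print(f"I^2 = {i_squared([0.0, 0.4, -0.3, 0.9], [0.1, 0.1, 0.15, 0.2]):.0f}%")
```

Values near 100% indicate that almost all of the observed variation reflects between-estimate differences and not sampling error.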
Qualitative analysis indicated that the direction of the effect changed in only 6 SR/MAs: 2 in reviews in which CoE changed from very low/low to moderate/high (of which one was statistically significant) and 4 in SR/MAs with no change in the assessment of CoE (of which one was statistically significant) (Fig 5, App Figs 12 and 13).
Sensitivity analyses for all defined subgroups showed no change in the results. In fact, when non-randomized studies or outliers were excluded from the analyses, no statistically significant changes were seen in any of the analyses (Appendix).