Figure 3. Performance of rwTTD prediction in a homogeneous population during cross-validation. a. Example terminated ratio curve at a termination rate of 0.0008. b. Comparison between predicted and gold-standard curves for different base learners at different termination rates. c. Cumulative error at different termination rates. d. Cumulative error with different numbers of training examples. e. Cumulative error with different numbers of predictive features. f. Cumulative error at different feature noise levels.
As the number of training examples increases, the percent error decreases steadily (Fig. 3d, Fig. S1b, Fig. S3). This is expected: with more training examples, inference of the overall curve improves. With 100 examples, the median cumulative errors are 19.84%, 22.92%, and 20.22% for ExtraTreeRegressor, Linear Regression, and SVM, respectively. In contrast, with 10,000 examples, the median cumulative errors are 6.81%, 7.95%, and 6.28% for ExtraTreeRegressor, Linear Regression, and SVM, respectively. We attribute this to more stable performance and parameter inference in models trained on more examples. On the other hand, the number of predictive features does not affect performance (Fig. 3e, Fig. S1c, Fig. S4). Additionally, with a sufficient number of examples (5000), the noise level on individual features does not affect model performance (Fig. 3f, Fig. S1d, Fig. S5). Together, these results demonstrate the overall robust performance of the model when the patients are derived from the same population.
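The trend with training-set size can be illustrated with a minimal sketch. It is not the rwTTD pipeline: it assumes synthetic data from a hypothetical linear ground truth, uses a one-feature least-squares fit as a stand-in for the base learners, and scores with a simple percent absolute error rather than the cumulative error defined in this work. All function names and parameters (`fit_line`, `percent_error`, `learning_curve`, `noise`, `repeats`) are illustrative assumptions.

```python
import random
import statistics


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a*x + b."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percent_error(n_train, rng, n_test=200, noise=3.0):
    """Fit on n_train noisy examples; score against the noise-free truth."""
    # Hypothetical ground truth y = 2x + 5; not taken from the paper.
    train_x = [rng.uniform(0.0, 10.0) for _ in range(n_train)]
    train_y = [2.0 * x + 5.0 + rng.gauss(0.0, noise) for x in train_x]
    slope, intercept = fit_line(train_x, train_y)

    test_x = [rng.uniform(0.0, 10.0) for _ in range(n_test)]
    truth = [2.0 * x + 5.0 for x in test_x]
    pred = [slope * x + intercept for x in test_x]
    abs_err = sum(abs(p - t) for p, t in zip(pred, truth))
    return 100.0 * abs_err / sum(truth)


def learning_curve(sizes, repeats=30, seed=0):
    """Median percent error over repeated random draws, per training size."""
    rng = random.Random(seed)
    return {
        n: statistics.median(percent_error(n, rng) for _ in range(repeats))
        for n in sizes
    }


curve = learning_curve([10, 100, 1000])
for n, err in curve.items():
    print(f"{n:>5} training examples: median error {err:.2f}%")
```

Taking the median over repeated draws mirrors the reporting of median errors above. The fitted slope and intercept vary less from draw to draw as the training set grows, which is the mechanism suggested for the decrease in error.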