Kelly Stevens

and 6 more

Objective: To develop a model that predicts surgical re-intervention within two years after endometrial ablation (EA) using a random forest (RF) technique, and to compare its performance with a previously published multivariate logistic regression (LR) model (1). Design: Retrospective cohort study. Setting: Data from two non-university teaching hospitals in the Netherlands were used. Population: 446 pre-menopausal women who underwent EA for heavy menstrual bleeding between January 2004 and April 2013. Methods: The RF model was trained in MATLAB (2018b) using the TreeBagger function in the Statistics and Machine Learning Toolbox. Main outcome measures: The performance of the two models was compared using the area under the receiver operating characteristic (ROC) curve (AUROC). Measurements and Main Results: The LR model had an AUROC of 0.71 (95% CI 0.64-0.78). The RF model had an AUROC of 0.63 (95% CI 0.54-0.71), and an AUROC of 0.65 (95% CI 0.56-0.74) after hyperparameter optimization. Conclusion: The RF model is not superior to the LR model in predicting surgical re-intervention within two years after EA. Machine learning techniques are gaining popularity in the development of clinical prediction tools, but they are not necessarily superior to traditional logistic regression. The performance of a model is influenced by the sample size, the number of features, hyperparameter tuning and the linearity of the associations. Both techniques should be considered when developing a prediction model.
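As an illustration of the outcome measure, the sketch below computes an AUROC with a 95% confidence interval for a set of predicted risks. It is not the study's code: the abstract does not state how the confidence intervals were obtained, so the percentile bootstrap used here is an assumption, the function names `auroc` and `bootstrap_ci` are hypothetical, and the data are synthetic. It is written in Python with the standard library only, rather than in MATLAB.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive case outscores
    a random negative case (Mann-Whitney formulation); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (an assumed method,
    not one confirmed by the abstract)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one outcome class.
        if 0 < sum(ys) < n:
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, max(0, int((1 - alpha / 2) * n_boot) - 1))
    return stats[lo_idx], stats[hi_idx]


if __name__ == "__main__":
    # Synthetic example: 100 cases, predicted risks loosely tied to outcome.
    rng = random.Random(42)
    y = [1 if rng.random() < 0.2 else 0 for _ in range(100)]
    risk = [0.3 * yi + rng.random() for yi in y]
    point = auroc(y, risk)
    low, high = bootstrap_ci(y, risk, n_boot=500)
    print(f"AUROC {point:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Applying the same two functions to the predicted risks of both models on the same cases would give directly comparable AUROC estimates, which is the kind of comparison the abstract reports for the LR and RF models.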