We also investigate the effect of ensemble learning over multiple sequences, which can guide the choice of appropriate sequences for PLDC. In each DA setting, models using multiple sequences are always more effective than models using any single sequence alone. Moreover, although ADC or hDWI alone always yields the worst classification results, ensembling T2 with one or both of them clearly enhances the model's performance. This finding is consistent with the clinical practice of using mpMRI for PCa diagnosis, in which radiologists usually treat ADC and hDWI as secondary references. Notably, the all-sequence-ensembled models (i.e. the ensemble of T2, ADC, and hDWI) perform strongly in most DA settings. Although the ensemble of the three sequences does not give the best performance in the second DA setting (i.e. P-x → LC-A), it still attains an AUC of 0.91, only 0.01 below the highest AUC (0.92). These results suggest that using more sequences helps multi-cohort MRI harmonization and thus boosts the final classification performance. Furthermore, for the same target domain (i.e. either LC-A or LC-B), the CMD²A-Net transferred from P-x attains a higher AUC than the one transferred from a local cohort domain, for every sequence combination. This implies that more source samples could enhance the model's cross-domain knowledge transferability and thereby improve its generalization in the target domain. The superior performance also demonstrates CMD²A-Net's capability to transfer knowledge from a public dataset to our local cohort domains.
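The comparison between single-sequence and ensembled models described above can be illustrated with a minimal sketch. It assumes that the ensemble is a simple soft vote (the per-case mean of the malignancy probabilities predicted for each sequence), which is an assumption made here for illustration and not necessarily the fusion rule used in CMD²A-Net. The helper names `auc` and `soft_vote` and all probability values are hypothetical toy inputs, not results from this study.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Equivalent to the normalized Mann-Whitney U statistic; ties count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def soft_vote(prob_by_sequence, sequences):
    """Average the per-case probabilities over the chosen sequences.

    Assumption: soft voting stands in for the ensembling step.
    """
    columns = [prob_by_sequence[name] for name in sequences]
    return [mean(case) for case in zip(*columns)]


# Hypothetical toy data: 1 = csPCa, 0 = non-csPCa.
labels = [1, 1, 1, 0, 0, 0]
probs = {
    "T2":   [0.9, 0.6, 0.4, 0.5, 0.2, 0.1],
    "ADC":  [0.7, 0.3, 0.6, 0.4, 0.5, 0.2],
    "hDWI": [0.6, 0.5, 0.7, 0.3, 0.4, 0.6],
}

# Evaluate each single sequence and every combination that includes T2.
combos = [["T2"], ["ADC"], ["hDWI"],
          ["T2", "ADC"], ["T2", "hDWI"], ["T2", "ADC", "hDWI"]]
results = {" + ".join(c): auc(labels, soft_vote(probs, c)) for c in combos}

for name, value in results.items():
    print(f"{name}: AUC = {value:.3f}")
```

With these toy inputs, each sequence misranks different cases, so averaging the three cancels part of the individual errors. This is the intuition behind the benefit of multi-sequence ensembling, although the effect on real data depends on how correlated the per-sequence errors are.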
Figure 2 shows coarse lesion detection results for correctly classified and misclassified examples. Two DA settings (i.e. P-x → LC-A and P-x → LC-B) were selected as representatives for lesion detection evaluation, and the results of the all-sequence-ensembled method were selected for analysis. In the correctly classified examples, the coarse lesion contours encircle the lesion ground-truth point in all sequences (as shown in Figure 2a). In the misclassified examples, however, the coarse lesion position could not be precisely detected in most sequences, as shown in the third row. In the example from LC-A, the lesion on T2 is correctly detected, but the lesion contours on the ADC and hDWI maps are falsely identified. A possible reason is that the coarse lesion masks used as the training ground truth do not depict the actual lesion contours accurately. This suggests that accurate detection on ADC and hDWI also plays a role in enhancing the ensembled classification, although lesion detection generally relies heavily on T2 images. In the future, robust weak-label processing methods (e.g., the deep extreme level set evolution method [36]) are expected to be employed. In the example from LC-B, under-segmentation of the prostate region can be found on the T2 image, which could lead to failed lesion detection. Because the prostate regions on ADC and hDWI were transformed using T2, under- or over-segmentation of the prostate gland on T2 would degrade lesion detection in the other two sequences. Despite the inaccurate lesion detection on ADC and hDWI, it should be noted that the models with multi-sequence input still outperform the models using T2 alone in lesion classification, owing to the re-use of prostate features from ADC and hDWI.
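The detection criterion used in this discussion (whether the coarse lesion contour encircles the ground-truth point) can be sketched as a point-in-mask test. The sketch assumes that each predicted contour is available as a filled binary mask and that the three sequences share one pixel grid after alignment to T2, so a single (row, column) point applies to all of them. The function names and the toy masks are hypothetical and serve only to illustrate the criterion.

```python
def lesion_hit(mask, point):
    """Return True if the ground-truth point lies inside the predicted region.

    mask  : 2D list of 0/1 values, 1 marking the filled coarse lesion region
    point : (row, col) index of the lesion ground-truth point
    """
    row, col = point
    if not (0 <= row < len(mask) and 0 <= col < len(mask[0])):
        return False  # a point outside the image cannot be encircled
    return bool(mask[row][col])


def hits_by_sequence(masks, point):
    """Apply the same test to every sequence (assumes a shared pixel grid)."""
    return {name: lesion_hit(mask, point) for name, mask in masks.items()}


# Hypothetical 4x4 toy masks, patterned after the LC-A failure case:
# T2 covers the ground-truth point, ADC and hDWI are detected elsewhere.
masks = {
    "T2":   [[0, 0, 0, 0],
             [0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]],
    "ADC":  [[1, 1, 0, 0],
             [1, 0, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 0]],
    "hDWI": [[0, 0, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 1],
             [0, 0, 1, 1]],
}
ground_truth_point = (1, 2)

hits = hits_by_sequence(masks, ground_truth_point)
print(hits)
```

A per-sequence report of this kind makes it easy to tell cases in which only T2 localizes the lesion from cases in which all three sequences agree.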

2.4. Comparisons with the State-of-the-art Methods

We compared our model with three state-of-the-art models, i.e. ResNet50 [37], DANN [38], and Deep CORAL [25], using AUC as the evaluation metric. The P-x dataset was used as the source domain, and our local cohort datasets, LC-A and LC-B, served as the target domains. Both the individual sequences (i.e. T2, ADC, and hDWI) and the all-sequence ensemble (i.e. T2 + ADC + hDWI) were evaluated. The other ensembles (T2 + ADC, T2 + hDWI, and ADC + hDWI) were not included here because of their inferior performance, as discussed in Section 4.2. Detailed comparison results are summarized in Table 5.
Table 5. AUC comparisons on malignancy classification (i.e. csPCa or non-csPCa) with the three existing models.