In this work, we develop a Coarse Mask-guided Deep Domain Adaptation
Network (CMD²A-Net) for both coarse prostate lesion detection and
lesion malignancy classification. We also extend the proposed network
into an open-source system. This executable end-to-end system takes
mpMRI sequences as input and outputs coarse lesion contours as well as
lesion malignancy. The system can be downloaded online. Our
contributions are summarized below:
- Development of a deep-learning-based system for fully automated
prostate lesion assessment. Our end-to-end system is dedicated to PLDC
on multi-cohort mpMRI without the need for prior manual processing of
mpMRI sequences.
- Design of a UDA model (i.e., CMD²A-Net) that leverages cross-site
representation transfer to achieve accurate PLDC without requiring
target labels. Weakly-supervised coarse lesion segmentation modules are
incorporated to extract informative lesion features, thus facilitating
feature alignment between domains.
- Experimental evaluation of CMD²A-Net on one public dataset (i.e.,
PROSTATEx [12]) and three local cohort datasets, including lesion
assessments with various mpMRI sequence inputs, comparisons with
state-of-the-art models, and an ablation study. Its capability of
transferring knowledge from PROSTATEx to our small-scale local cohort
datasets is demonstrated against the state-of-the-art models.
Related Work
CNNs have proven effective and have been widely applied to mpMRI-based
PCa classification with promising performance. Wang and Wang [13a]
explored optimal mpMRI sequence combinations as the CNN's input, and
their model achieved an AUC of 0.95, which was reported to outperform
all models in the PROSTATEx Challenge. Going beyond PCa classification
alone, Kiraly, Abi Nader, Tuysuzoglu, Grimm, Kiefer, El-Zehiry and
Kamen [27] developed a model with an encoder-decoder architecture to
detect prostate lesions and simultaneously classify lesion malignancy.
However, these studies required manually cropped prostate regions,
which are time-consuming and expensive to obtain [22a, 28].
End-to-end PLDC frameworks have also been investigated, with the aim of
avoiding manual prostate segmentation. Yang, Liu, Wang, Yang, Le Min,
Wang and Cheng [2] incorporated a CNN for automatic prostate
segmentation prior to PLDC. However, the shallow (i.e., five-layer)
network extracted insufficient prostate image features, which could
considerably degrade the overall segmentation accuracy. Later, Wang,
Liu, Cheng, Wang, Yang and Cheng [29] proposed a deeper prostate
segmentation model capable of detecting more complex features. Apart
from improving the segmentation performance, fusing
spatial features using 3D CNNs is another means to enhance the PCa
classification accuracy. Mehta, Antonelli, Ahmed, Emberton, Punwani and
Ourselin [30] employed a patient-level 3D model for binary
classification using volumetric mpMRI, achieving AUCs of 0.79 and 0.86
on their local cohort dataset and PROSTATEx, respectively. However,
only single-cohort datasets were used to evaluate the model. Domain
shift would occur when it is directly applied to an unseen cohort
[17-18]. The very few studies provided with mpMRI sequences from
multiple cohorts (e.g., Mehta, Antonelli, Ahmed, Emberton, Punwani and
Ourselin [30]) simply combined the heterogeneous images directly. This
yields enough samples for model training but inevitably ignores data
source heterogeneity. The resulting models would be prone to severe
domain shift, biasing predictions toward particular cohorts.
Recently, many studies have investigated DA approaches to alleviate
inter-site distributional variability, among which UDA methods have
demonstrated their advantages in exploiting unlabeled target samples
[20]. Such UDA methods can be categorized into two groups: (1) image
translation and (2) feature alignment approaches. The former performs
image appearance alignment [17, 22]. The resultant models translate
images across domains using GAN-based networks [23]. However, texture
similarity between the synthesized target image and the source image is
crucial for the PLDC problem. The DA process would fail with
insufficient texture similarity, particularly in the generated lesion
area [22c]. Besides, lesions could be missed during the translation
process due to varying transferability among image regions, thus
worsening the DA process [31]. Moreover, GAN models could distort the
appearance of non-lesion regions, further causing unreliable lesion
assessment results [24].
In feature alignment approaches, domain-invariant features are
extracted to reduce domain shift [26]. A common way is to minimize the
distribution discrepancy between domains (e.g., measured by
second-order correlation [25]) using a Siamese network architecture.
Adversarial learning [26a] can also align features by making the
cross-domain features indistinguishable to a domain classifier. For
instance, Wang, Feng, Zhang, Wang, Lv and Yi [14] developed a GAN-based
method to learn domain-invariant features on mammographic images
acquired for breast cancer screening. However, these models were
usually trained with entire images, treating all voxels equally
[26b, 28]. Previous works [24, 26b] revealed that not all image regions
facilitate knowledge transfer across domains. Roughly aligning the
features over the whole image set would introduce irrelevant knowledge,
resulting in ineffective DA. We hypothesize that the background regions
on mpMRI sequences, such as regions outside the prostate gland, would
contribute little to DA in our PLDC problem. To our knowledge, only a
few works have reported PCa classification using multi-site ultrasound
images [32], histopathology images [33], or T2 image slices only [13b].
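The second-order alignment idea discussed in this paragraph can be
sketched as follows. This is only a minimal illustration, assuming a
correlation-alignment style loss (the squared Frobenius distance between
the source and target feature covariances, scaled by 1/(4d²)); it is
not claimed to be the formulation of [25] or of CMD²A-Net. The function
names, feature dimensions, and synthetic features are hypothetical.

```python
import numpy as np


def covariance(features: np.ndarray) -> np.ndarray:
    """Unbiased covariance of an (n_samples, n_dims) feature matrix."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / (features.shape[0] - 1)


def second_order_discrepancy(source: np.ndarray, target: np.ndarray) -> float:
    """Squared Frobenius distance between covariances, scaled by 1/(4 d^2)."""
    d = source.shape[1]
    diff = covariance(source) - covariance(target)
    return float(np.sum(diff ** 2) / (4.0 * d * d))


# Synthetic stand-ins for features extracted from two cohorts (assumption:
# the "target" differs from the "source" only by a per-dimension rescaling).
rng = np.random.default_rng(0)
source_feats = rng.normal(size=(200, 8))
target_feats = source_feats * np.linspace(0.5, 2.0, 8)

same = second_order_discrepancy(source_feats, source_feats)
shifted = second_order_discrepancy(source_feats, target_feats)
```

In a training loop, a term of this kind would typically be added to the
task loss so that the feature extractor is penalized when the two
domains' feature statistics diverge. Restricting the input rows to
features from prostate or lesion regions, rather than the whole image,
would be one way to reflect the region-selectivity concern raised above.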
2. Results and Discussion
2.1. Datasets
Table 1. Characteristics of the five MRI datasets for prostate
segmentation and PLDC.