Prostate region segmentation

Our models (i.e., Mask R-CNN, CM-Net, and CMD²A-Net) were trained on a GeForce GTX 1080 Ti GPU (Nvidia, California, USA) using the Keras API [45]. For Mask R-CNN training, data augmentation with random rotation was applied to the 646 T2-weighted image slices from I2CVB. The slices were split into training, validation, and testing sets at a ratio of 7:2:1. The input shape of Mask R-CNN was set to 512 × 512 pixels. The Adam optimizer was used with a learning rate of 10⁻³, the batch size was set to 4, and the model was trained for 200 epochs. During training, the model with the highest Dice coefficient on the validation set was retained.

For CM-Net and CMD²A-Net training, the prostate regions from P-x, LC-A, and LC-B were scaled to 224 × 224 pixels, and random rotation by {±3°, ±6°, ±9°, ±12°, ±15°} was applied for data augmentation. The Adam optimizer was used with a learning rate of 10⁻⁵, and the batch size was set to 2. When training CM-Net, owing to the limited sample size, the slices were split into training and testing sets at a ratio of 4:1 using the hold-out method. The segmentation loss was optimized first to accelerate model convergence, and CM-Net was then trained further with the pre-trained coarse segmentation module.

For CMD²A-Net, both branches were initialized with the weights of the pre-trained CM-Net to facilitate convergence. Specifically, the coarse segmentation module and the classifier of CM-Net were first trained on the combined samples from both domains. The total loss of CMD²A-Net was then optimized with labeled source samples and unlabeled target samples. By co-training all modules, the model with the highest accuracy was saved for malignancy evaluation in the target domain.
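The discrete-angle rotation augmentation described above can be sketched as follows. This is a minimal illustration, assuming SciPy's `ndimage.rotate` for the rotation itself; the helper name `augment_slice` is illustrative and not from the original implementation:

```python
import numpy as np
from scipy.ndimage import rotate

# Discrete rotation angles used for augmentation: {±3°, ±6°, ±9°, ±12°, ±15°}.
ANGLES = [a for mag in (3, 6, 9, 12, 15) for a in (mag, -mag)]

def augment_slice(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Rotate a 2-D slice by an angle drawn from ANGLES, keeping the shape."""
    angle = rng.choice(ANGLES)
    # reshape=False keeps the 224 × 224 input size; bilinear interpolation (order=1).
    return rotate(img, angle, reshape=False, order=1, mode="nearest")

rng = np.random.default_rng(0)
img = np.zeros((224, 224), dtype=np.float32)
out = augment_slice(img, rng)
print(out.shape)  # (224, 224)
```

Keeping `reshape=False` ensures every augmented slice retains the 224 × 224 shape the networks expect.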
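The 7:2:1 hold-out split of the 646 slices can be sketched as below; the helper `holdout_split` and the seed are illustrative, not the authors' code:

```python
import numpy as np

def holdout_split(n: int, ratios=(0.7, 0.2, 0.1), seed: int = 42):
    """Shuffle slice indices and split them into train/val/test by the given ratios."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train = int(round(n * ratios[0]))
    n_val = int(round(n * ratios[1]))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# The 646 I2CVB T2-weighted slices split 7:2:1.
train, val, test = holdout_split(646)
print(len(train), len(val), len(test))  # 452 129 65
```

The same helper with `ratios=(0.8, 0.0, 0.2)` would reproduce the 4:1 train/test split used for CM-Net.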
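As a concrete reference for the checkpointing criterion, a minimal NumPy sketch of the Dice coefficient used to select the best validation model (the function name is illustrative):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice score between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # eps guards against division by zero when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# A perfect prediction scores 1.0; partial overlap scores between 0 and 1.
a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [0, 0]])
print(round(dice_coefficient(a, a), 4))  # → 1.0
print(round(dice_coefficient(a, b), 4))
```

During training, the epoch whose validation masks maximize this score would be the one whose weights are retained.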