| Datasets | T2 P-x | T2 LC-A | T2 LC-B | T2 LC-C | ADC P-x | ADC LC-A | ADC LC-B | ADC LC-C | hDWI P-x | hDWI LC-A | hDWI LC-B | hDWI LC-C |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P-x only | 0.91 | 0.35 | 0.55 | 0.65 | 0.67 | 0.53 | 0.42 | 0.51 | 0.81 | 0.41 | 0.68 | 0.50 |
| LC-A only | N/A | 0.61 | 0.55 | 0.66 | N/A | 0.69 | 0.38 | 0.53 | N/A | 0.70 | 0.57 | 0.49 |
| LC-B only | N/A | 0.39 | 0.61 | 0.67 | N/A | 0.52 | 0.61 | 0.48 | N/A | 0.47 | 0.88 | 0.59 |
| Joint P-x, LC-A | 0.89 | 0.67 | N/A | N/A | 0.73 | 0.54 | N/A | N/A | 0.73 | 0.54 | N/A | N/A |
| Joint P-x, LC-B | 0.88 | N/A | 0.59 | N/A | 0.66 | N/A | 0.55 | N/A | 0.76 | N/A | 0.87 | N/A |
| Joint LC-A, LC-B | N/A | 0.63 | 0.53 | N/A | N/A | 0.74 | 0.61 | N/A | N/A | 0.72 | 0.91 | N/A |
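One way to read the separate-model rows of the table above is to compare each model's in-domain AUC with its mean AUC on the other sites. The following is a minimal sketch in Python using only the T2 values of the three separately trained models. The dictionary layout and the helper name `domain_gap` are illustrative assumptions, not part of the original analysis.

```python
# T2 AUCs of the three separately trained models, copied from the table above.
# None marks a test set that was not evaluated (N/A in the table).
T2_AUC = {
    "P-x":  {"P-x": 0.91, "LC-A": 0.35, "LC-B": 0.55, "LC-C": 0.65},
    "LC-A": {"P-x": None, "LC-A": 0.61, "LC-B": 0.55, "LC-C": 0.66},
    "LC-B": {"P-x": None, "LC-A": 0.39, "LC-B": 0.61, "LC-C": 0.67},
}


def domain_gap(train_site, row):
    """Return in-domain AUC minus the mean AUC over all evaluated other sites.

    Hypothetical helper for illustration only.
    """
    others = [
        auc for site, auc in row.items()
        if site != train_site and auc is not None
    ]
    return row[train_site] - sum(others) / len(others)


gaps = {site: domain_gap(site, row) for site, row in T2_AUC.items()}

for site, gap in gaps.items():
    print(f"{site} only: in-domain minus mean cross-site AUC = {gap:.3f}")
```

Under this simple summary, the gap is largest for the P-x model (about 0.39) and much smaller for the LC-A and LC-B models. This simple average weights every test site equally and ignores sample size, so it should be read only as a rough indicator.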
Next, we analyzed cross-site heterogeneity in our multi-cohort datasets (P-x, LC-A, LC-B, and LC-C). We aimed to verify whether prior MR [7] image intensity normalization (e.g., Liu et al. [8]) is effective in reducing domain shift when domain knowledge is not considered. The Coarse Mask-guided Network (CM-Net, Figure 4) was used for this analysis. Here, training a model on an individual dataset is defined as the “separate learning approach”, while training a model on a combined dataset of multi-cohort samples is defined as the “joint learning approach”. As shown in Table 2, we trained CM-Net on the individual and combined datasets from P-x, LC-A, and LC-B. Three separate models were trained, one on each of these three domains, and served as baselines for comparison with the joint models. During the testing phase, each separate model was tested on the four datasets. LC-C served only as a hold-out test set for domain shift analysis, because its small size (only 29 cases) would cause overfitting in training and biased predictions in testing. Note that, owing to the limited sample sizes of the local cohorts (74 and 108 cases for LC-A and LC-B, respectively), the separate models for LC-A and LC-B were pre-trained on the large-scale dataset P-x (330 cases) and then fine-tuned on the corresponding domain. This transfer learning [9] strategy is intended to reduce the overfitting caused by data scarcity. A common preprocessing method, “scaled”, was employed to normalize the image intensities to the range [0, 1].
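The [0, 1] intensity scaling mentioned above can be sketched as a per-image min-max rescaling. This is a minimal sketch: the text does not give the exact formula, so per-image min-max scaling, the function name `scale_to_unit_range`, and the `eps` guard for constant images are all assumptions made for illustration.

```python
import numpy as np


def scale_to_unit_range(image, eps=1e-8):
    """Min-max scale an image to [0, 1] (assumed form of the 'scaled' method).

    A constant image has no intensity range, so it is mapped to all zeros
    to avoid division by zero.
    """
    image = np.asarray(image, dtype=np.float32)
    lo = image.min()
    hi = image.max()
    if hi - lo < eps:
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)


# Usage on a small synthetic array standing in for one MR slice.
slice_2d = np.array([[100.0, 300.0], [500.0, 900.0]])
scaled = scale_to_unit_range(slice_2d)
```

Because the minimum and maximum are computed per image, this normalization does not use any scanner-specific or cohort-specific information, in line with the setting where domain knowledge is not considered.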
Table 3. Comparisons of AUC using six image preprocessing methods.