Genes Identified as Key Predictors
Eighteen genes (predictors) were shared by at least two approaches (see Figure 4). The largest overlap was between RFE and SVM (16 genes), whereas the overlap between SVM and GLMNET was smaller (4 genes). No genes were shared exclusively between RFE and GLMNET. The Jaccard index was highest between SVM and RFE (0.739), whereas it was 0.111 for SVM-GLMNET and 0.052 for GLMNET-RFE. Similarly, the odds ratio indicated a strong association between SVM and RFE (10265.036, p < 0.001). Overall, the shared genes were mainly protein-coding, with the exception of “GB40714”, which corresponds to a non-coding RNA. We retrieved functional information for most of the genes from annotations of the honeybee genome (see Table 2). The exceptions were “GB50940” and “GB45448”, for which annotations were available only for closely related insects (Apis dorsata and Apis cerana, respectively), and “GB54617”, for which we could not find any information.
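To make the overlap statistics concrete, the following minimal Python sketch shows how a Jaccard index and an odds ratio can be computed for two gene lists. It is illustrative only and is not the pipeline used in this study. It assumes that the odds ratio is derived from a 2x2 contingency table built against a background gene universe and that significance is assessed with a one-sided Fisher's exact test. The gene identifiers, the universe size, and all function names are hypothetical.

```python
from math import comb


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def contingency(a, b, universe_size):
    """Return (both, only_a, only_b, neither) counts for two gene sets."""
    a, b = set(a), set(b)
    both = len(a & b)
    only_a = len(a) - both
    only_b = len(b) - both
    neither = universe_size - both - only_a - only_b
    return both, only_a, only_b, neither


def odds_ratio(a, b, universe_size):
    """Sample odds ratio (both * neither) / (only_a * only_b)."""
    both, only_a, only_b, neither = contingency(a, b, universe_size)
    denominator = only_a * only_b
    if denominator == 0:
        # Undefined (nan) when the numerator is also zero, otherwise infinite.
        return float("inf") if both * neither > 0 else float("nan")
    return (both * neither) / denominator


def fisher_p_greater(a, b, universe_size):
    """One-sided Fisher's exact test: P(overlap >= observed) under independence."""
    both, only_a, only_b, _ = contingency(a, b, universe_size)
    n_a = both + only_a
    n_b = both + only_b
    total = comb(universe_size, n_b)
    # Sum the hypergeometric probabilities of the observed and larger overlaps.
    tail = sum(
        comb(n_a, k) * comb(universe_size - n_a, n_b - k)
        for k in range(both, min(n_a, n_b) + 1)
    )
    return tail / total


# Toy input: hypothetical gene lists and a hypothetical universe of 20 genes.
list_a = ["g1", "g2", "g3", "g4"]
list_b = ["g3", "g4", "g5"]
universe = 20

print("Jaccard index:", jaccard(list_a, list_b))
print("Odds ratio:", odds_ratio(list_a, list_b, universe))
print("Fisher p (one-sided):", fisher_p_greater(list_a, list_b, universe))
```

Under these assumptions, the odds ratio depends on the size of the background universe, whereas the Jaccard index depends only on the two lists. This is why two lists that share most of their members within a large universe can yield a very large odds ratio alongside a Jaccard index below 1.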