Gray shading indicates nonsignificant gene expression effects. Candidate genes discussed in text are highlighted with bold underlining.
1 GeneID and description taken from the NCBI RefSeq annotation release 100.
2 Log2-fold change (log2FC) values are included with the corresponding statistical significance (* FDR p < 0.05, ** FDR p < 0.01). Positive values indicate elevated expression in SB/SB samples relative to samples with Sb-bearing genotypes; negative values indicate elevated expression for Sb-bearing genotypes.
3 Presence in the supergene is based on the previously published linkage mapping data (Pracana, Priyam, et al., 2017; Wang et al., 2013).
4 Additional support indicates whether each gene has been reported as differentially expressed in previous studies. a indicates (Wang et al., 2008), b indicates (Wang et al., 2013), c indicates (Nipitwattanaphon et al., 2013), d indicates (Pracana, Levantis, et al., 2017), and e indicates haplotype-specific gene duplications identified in (Fontana et al., 2019). Criteria for inclusion of genes in each category were specific, conservative P-value thresholds conceived in our study (see Methods for details).
5 Details on uncharacterized candidate loci expression are reported in Supplementary Data 2.
SUPPLEMENTARY DATA
Supplementary Data 1: RNA quality, microsatellite-based relatedness assay, sequence quality control and alignment statistics for each of our samples.
Supplementary Data 2: Compendium of differential expression and allele-specific expression for all genes.
Supplementary Data 3: GO term enrichment for differential expression analyses (P < 0.05).
Supplementary Data 4: Overlap and correlations for differentially expressed genes when using the Wurm et al. (2011) and Yan et al. (2020) genome assemblies.
SUPPLEMENTARY METHODS
Polar Dominance Analysis. To quantify the dominance patterns of the differentially expressed genes, we needed to use information from the SB/SB vs. SB/Sb comparison and the SB/SB vs. Sb/Sb comparison simultaneously. Additionally, we needed to check all genes affected by the presence of the supergene, not only the genes that were differentially expressed in one of the two comparisons. To this end, we used the full GLM approach to query for any genes whose expression was significantly affected by the presence of the supergene (FDR < 0.01). We then computed the angle between the log2-fold change in the SB/SB vs. SB/Sb comparison and the log2-fold change in the SB/SB vs. Sb/Sb comparison (Figure S3). This angle describes the dominance pattern of that particular gene.
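The angle computation can be illustrated with a minimal Python sketch. It assumes the heterozygote log2FC (SB/SB vs. SB/Sb) is placed on the x-axis and the homozygote log2FC (SB/SB vs. Sb/Sb) on the y-axis, and that the angle is reported in degrees in the range [0, 360); the axis assignment, the angular range, and the function name `dominance_angle` are assumptions made for illustration and are not taken from the analysis itself.

```python
import math


def dominance_angle(lfc_het, lfc_hom):
    """Polar angle (degrees, 0 <= angle < 360) of the point (lfc_het, lfc_hom).

    lfc_het: log2FC from the SB/SB vs. SB/Sb comparison (assumed x-axis).
    lfc_hom: log2FC from the SB/SB vs. Sb/Sb comparison (assumed y-axis).
    """
    # atan2 keeps track of the quadrant, so the signs of both
    # fold changes are preserved in the resulting angle.
    return math.degrees(math.atan2(lfc_hom, lfc_het)) % 360.0


# Hypothetical genes, not data from this study.
example_genes = {
    "gene_equal_shift": (1.0, 1.0),    # same shift in both comparisons
    "gene_hom_only": (0.0, 2.0),       # shift only in Sb/Sb
    "gene_intermediate": (1.0, 2.0),   # heterozygote shift is half the homozygote shift
}

for name, (lfc_het, lfc_hom) in example_genes.items():
    print(name, round(dominance_angle(lfc_het, lfc_hom), 1))
```

Under this assumed convention, a gene with equal log2FC in both comparisons falls at about 45 degrees (a single Sb copy produces the full effect), whereas a gene that changes only in Sb/Sb samples falls at about 90 degrees (the Sb effect is masked in heterozygotes).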
Odorant Binding Protein Differential Expression. Leveraging the gene models from previously published work (Pracana, Levantis, et al., 2017), we tested for differential expression amongst the most complete set of odorant binding proteins (OBPs) available in the system. We re-mapped our aligned reads to the OBP gene models using Rsubread's featureCounts function (Liao et al., 2013) and then computed differential expression using edgeR as described in the Methods (McCarthy et al., 2012). We then plotted the log2-fold changes and FDR-corrected p-values using the pheatmap package in R. GP-9 is described here as SiOBP3.
GO term enrichment analysis. GO terms were determined for each gene using BLAST2GO to search for homologous sequences in the Drosophila melanogaster genome (version r6.20). We then used topGO with default parameters to compute enriched GO terms amongst significantly differentially expressed genes (FDR < 0.01) compared against a background of genes that passed our coverage filtration. Terms with P < 0.05 were deemed significant.
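To illustrate the logic of testing a GO term for over-representation against a background gene set, the following minimal Python sketch computes a one-sided hypergeometric (Fisher-type) p-value for a single term. This is a simplified stand-in and not the topGO procedure used above: topGO additionally accounts for the structure of the GO graph, which this sketch ignores. The function name `enrichment_p_value` and all counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_de, n_de_term):
    """One-sided hypergeometric p-value for enrichment of a single GO term.

    n_background: genes in the background (e.g. those passing coverage filters)
    n_term:       background genes annotated with the term
    n_de:         differentially expressed genes within the background
    n_de_term:    differentially expressed genes annotated with the term
    Returns P(X >= n_de_term) when n_de genes are drawn without replacement.
    """
    total = comb(n_background, n_de)
    upper = min(n_term, n_de)
    # Sum the probability mass for observing n_de_term or more annotated genes.
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_de - k)
        for k in range(n_de_term, upper + 1)
    )
    return tail / total


# Hypothetical counts, not values from this study.
p = enrichment_p_value(n_background=10000, n_term=50, n_de=200, n_de_term=6)
print(f"enrichment p-value: {p:.4g}")
```

Exact integer binomial coefficients are used so that the sketch needs only the Python standard library; a term would be called enriched at the threshold given above if the resulting p-value is below 0.05.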