Materials and Methods
Alate queen (gyne) collection. Young Solenopsis
invicta gynes exhibit distinct rates of sexual and reproductive
maturation depending on social chromosome genotype (Nipitwattanaphon,
Wang, Dijkstra, & Keller, 2013); these gynes do not attempt to leave
their natal nest on mating flights until they are fully sexually mature.
We collected gynes embarking on mating flights in the field to achieve
physiological matching across genotypes, colonies, and forms based on
maturity rather than strict age-matching (Figure 1), although, based on
the colony phenologies and weather conditions at the collection locale
in spring of 2015 in Athens-Clarke Co., Georgia, USA, it is doubtful
that any sampled gynes differed by more than a couple of weeks in age
from one another. Alate (winged) gynes were aspirated from the tops of
mounds and frozen on dry ice in the field in and around Athens-Clarke
Co. on days of mating flights in April 2015. Alate gynes were collected
from polygyne nests at three sites (four nests from 33°55’18”N
83°21’2”W, three nests from 33°54’48”N 83°20’46”W, and one nest from
33°54’36”N 83°20’6”W), and from eight monogyne nests at one site
(33°42’17”N 83°16’53”W). All samples were stored at -80°C pending
further processing. We processed a total of eight gynes from as many
monogyne colonies and three gynes each (one of each genotype) from eight
polygyne colonies.
Sample genotyping. RNAlater -ICE (Thermo Fisher
Scientific, MA) was added to tubes holding the frozen gynes, which were
submerged for >24 hours at -20°C. Gynes from polygyne
colonies then were sorted by size in a Petri dish on top of dry ice. The
smallest gynes were selected as prospective Sb/Sb , medium asSB/Sb , and the largest as SB/SB , based on prior data on
the association of gyne mass with Gp-9 genotype (C. DeHeer et
al., 1999). Legs were removed for DNA extraction and genotype
confirmation, then bodies were stored at -80°C until dissection and RNA
extraction. DNA was extracted using a Puregene DNA isolation kit
(Qiagen, Valencia, CA). To obtain the social chromosome genotype of each
gyne from presumed polygyne colonies, a Gp-9 PCR assay was
performed (Valles & Porter, 2003). The presence of gynes withSB/Sb and Sb/Sb genotypes also confirmed colony social
form of polygyne colonies because only polygyne colonies contain
individuals with the Sb haplotype (for which theGp-9b allele is fully diagnostic). To confirm
social form of the monogyne colonies from which gynes were sampled,
15-20 individuals (mix of workers and alate gynes) were collected from
each colony and the same Gp-9 PCR assay was performed. All gynes
sampled from polygyne colonies were genotyped at 9-13 previously
described polymorphic microsatellite loci (Ascunce, Bouwma, &
Shoemaker, 2009). These microsatellite data revealed that no sampled
individuals were triploid (Krieger, Ross, Chang, & Keller, 1999) and
that no sampled nestmates were siblings (Supplementary Data 1).
Organ dissection. Gynes were dissected under an Olympus
SZ61 stereomicroscope. After making an opening in the cuticle on the
back of the head, a minutin probe was used to free the brain from the
head capsule. The majority of tracheae and glands were removed from each
brain; however, some traces of tracheae and possibly residual gland
material remained. The ovaries were extricated from the hindgut, and any
Malpighian tubules present were removed, along with some excess fat
body. Due to the extensive tracheal intrusion in the ovaries, removing
all traces of tracheae and fat body from the ovaries was not feasible.
Tissues were stored at -20°C in RNAlater -ICE until RNA
extraction.
RNA extraction, library preparation, and sequencing. RNA
was extracted from brains using the RNeasy Micro Kit and from ovaries
using the RNeasy Mini Kit, both with DNase treatment (Qiagen, Valencia,
CA). Extracted total RNA integrity and concentration were evaluated for
every sample on an Agilent 2100 bioanalyzer (Supplementary Data 1).
Illumina sequencing libraries were prepared following the Smart-seq2
protocol (Picelli et al., 2014), which was developed for low input RNA
sequencing applications. Based on bioanalyzer readings, approximately
1.2 ng of total RNA was used to make each brain library and
approximately 3.6 ng of total RNA was used to make each ovary library
(Supplementary Data 1). Samples were barcoded and pooled prior to
sequencing at the Georgia Genomics and Bioinformatics Core (Athens, GA)
on an Illumina NextSeq instrument to produce 75 bp single-end reads. We
sequenced all brain samples on one high-output flow cell and all ovaries
on a second high-output flow cell.
Quality control and read alignment. Most rRNA presumably was
removed during RNA-Seq library preparation by poly-A mRNA isolation in
the Smart-seq2 protocol. Nonetheless, any reads in our data that aligned
to Myrmecia croslandi (Australian bull ant) 18S rRNA, 5.8S rRNA,
or 28S rRNA genes were removed using BLAT (version 3.5). The remaining
reads were trimmed with Trimmomatic version 0.32 (TRAILING:3, LEADING:3,
SLIDINGWINDOW:4:15, MINLEN:36) (Bolger, Lohse, & Usadel, 2014). Reads
were aligned to S. invicta assembly gnG (NCBI accession
GCF_000188075.1, generated from an SB male) using STAR version
2.5.3a (Dobin et al., 2013; Wurm et al., 2011). We utilized previously
published linkage mapping information to determine which scaffolds
mapped to each linkage group as well as to the Sb supergene
(Pracana, Priyam, et al., 2017; Wang et al., 2013). One SB/SBpolygyne ovary sample (104DO) was removed from further analysis due to
excessively low mapping (~0.9% uniquely mapped reads)
(Supplementary Data 1). In addition, we resequenced two libraries that
had lower than expected numbers of sequenced reads (239AB and 1EB).
After alignment, we had between 10 million and 31 million uniquely
mapped reads available for each sample (Supplementary Data 1).
Differential Expression Analysis. Aligned reads were mapped to
the NCBI annotation release 100 RefSeq gene models using the
FeatureCounts function from the RSubread package (Liao, Smyth, & Shi,
2013). Genes were filtered from the analysis if fewer than 8 libraries
had a Counts-per-million (CPM) > 10/9 at that locus (more
than 10 reads aligned to a gene in the majority of libraries). This
decreased the possibility of inferring differential expression from
structured background noise. The data were then analyzed using two
statistical approaches in edgeR (Robinson, McCarthy, & Smyth, 2009) –
a ‘multifactorial model’ and a ‘pairwise model’. The
multifactorial-model parameters included colony identity, social form,
organ type, social chromosome genotype, and a covariance term for organ
and genotype. We used this model to ask broad questions about our data
while making use of all of the information from all of our samples. The
pairwise comparisons simply treated each sample type (e.g., polygyneSB/SB brain, polygyne SB/Sb brain, etc.) as an independent
treatment for analyzing specific differences between two types. For each
approach, genes were considered differentially expressed if they
exhibited an FDR-corrected P-value < 0.01. Each comparison was
visualized using volcano plots with a custom script using the ggplot2 R
package (Villanueva & Chen, 2019).
Overlap of significant differences for various comparisons was
visualized using UpSetR (Conway, Lex, & Gehlenborg, 2017). Next,
candidate gene categories were further refined by classifying genes as
‘ovary-specific,’ ‘brain-specific,’ or ‘Sb/Sb -specific’ if they
exhibited an FDR-corrected P-value < 0.01 in the relevant
comparisons (e.g., ovary SB/SB versus SB/Sb and ovarySB/SB versus Sb/Sb in the ovary-specific category) while
also exhibiting an FDR-corrected P-value > 0.1 in the other
categories (e.g., brain SB/SB versus SB/Sb and brainSB/SB versus Sb/Sb in the ovary-specific category). We
note that genes that exhibit brain- or ovary-specific expression
differences in our study, as well as those that do not, may exhibit
differential expression in unsampled tissues. Visualization of the
genomic location of differentially expressed genes (DEGs) was performed
using KaryoPloteR (Gel & Serra, 2017). Non-random spatial distribution
of DEGs, with respect to higher or lower expression of Sb -bearing
samples, was assessed using a two-sided Wald-Wolfowitz runs test (Wald
& Wolfowitz, 1940) after significant DEGs were converted into binary
representation based on the directionality of differential expression.
Allele-specific Expression Analysis. To perform
allele-specific expression analysis, we generated a high-confidence set
of haplotype-specific SNPs from our RNA-seq data following a previously
described protocol (Auwera et al., 2013). In short, we performed a
2-pass alignment of the reads using STAR (Dobin et al., 2013), formatted
the reads for GATK using Picard (“Picard toolkit,” 2018), then
identified variants using the GATK HaplotypeCaller (Auwera et al.,
2013). We leveraged our homozygous samples (SB/SB andSb/Sb ) to retain only those variants that consistently
differentiated SB and Sb haplotypes. We removed Sbvariants that were found as polymorphisms in the SB population as
well as those that were found in fewer than three Sb/Sb samples.
A total of 3994 SNPs diagnostic for the Sb and SBhaplotypes remained following this procedure (Supplementary Data 2). We
then n-masked the genome for these variants using Bedtools (Quinlan &
Hall, 2010) to eliminate any alignment bias and re-aligned the
heterozygous SB/Sb samples to the masked genome using the 2-pass
method in STAR (Dobin et al., 2013). The GATK ASEReadCounter function
was used to generate a per-SNP count matrix for SB/Sb samples
(Auwera et al., 2013), which were mapped to genes using Bedtools
intersect. Both the SNP-level and gene-level allele-specific expression
counts were next passed to edgeR and fit to a model with variance terms
for colony identity, organ, allele, and a covariance term for organ and
allele effects (McCarthy, Chen, & Smyth, 2012). Genes and SNPs were
considered to display significant allele-specific expression if the
FDR-corrected P < 0.01. Visualization of the genomic location
of SNPs displaying significant allele-specific expression was performed
using KaryoPloteR (Gel & Serra, 2017). The overlap between genes
displaying differential expression and allele-specific expression
results was visualized using UpSetR (Conway et al., 2017).