Materials & Methods

Tested termite and cockroach species

In a previous study, chemical profiles analyzed among the same taxa where found to be solely discriminable on the species level, rather than on the colony (termites) or population (cockroaches) level (Golian et al. 2022). Therefore, we restricted ourselves to two respective laboratory colonies per termite species and two respective laboratory populations per cockroach species. All species used in this study were maintained in the Federal Institute of Materials Research and Testing (BAM), Berlin. Termite colonies of R. flavipes, C. formosanus, K. flavicollis and N. castaneus were kept in a darkened room at 26°C and 84% humidity, and colonies of M. darwiniensis  were maintained at 28°C and 83% humidity. All colonies were fed regularly with pre-decayed birch wood. The cockroaches B. germanica  and B. orientalis  were maintained in mixed open rearing boxes in 12-hour light/dark cycles at 26 °C and 50 % humidity, from the day of egg-laying until disposal of older adults. Cockroaches were reared on a mixture of 77.0 % dog biscuit powder, 19.2 % oat flakes, 3.8 % brewer’s yeast and supplied with water ad libitum and weekly with apple and carrot slices. All cockroaches and termites were freeze-killed and stored at -20 °C until further analysis.

CHC extraction and analysis

To yield comparable amounts of extracts between our cockroaches and termites that vary largely in size, we had to adjust extraction volumes and pool smaller individuals. For this, we used 300 and 3000 µl MS pure hexane (UniSolv, Darmstadt, Germany) on single B. germanica andB. orientalis individuals, respectively, and 100 µl on pools of 3 individuals per termite species for extraction. Extraction procedures were then equalized to ensure comparability. Extractions were performed in glass vials (20 ml for cockroaches and 2 ml for termites, Agilent, Santa Clara, California, USA) on an orbital shaker (IKA KS 130 Basic, Staufen, Germany) for 10 minutes. Afterwards, the extract was evaporated under a constant stream of gaseous carbon dioxide. Then, it was resuspended in 5 μl in a hexane solution containing 7,5 ng/μl dodecane (C12) as an internal standard. 3 μl of the resuspended extract were then injected into a gas-chromatograph coupled with a tandem mass spectrometer (GC-MS/MS) (GC: 7890B, Triple Quad: 7010B; Agilent Technologies, Waldbronn, Germany) equipped with a fused silica column (DB-5MS ultra inert; 30 m x 250 μm x 0.25 μm; Agilent J&W GC columns, Santa Clara, California, USA) in splitless mode at a temperature of 300 °C with helium used as a carrier gas under constant flow of 2.25 ml/min. The temperature program started at 60 °C held for 5 min, increasing 20 °C per minute up to 200 °C and then increasing 3 °C per minute to the final temperature of 325 °C, held for 5 min.
CHC peak detection, integration, quantification and identification were all carried out with Quantitative Analysis MassHunter Workstation Software (Version B.09.00 / Build 9.0.647.0, Agilent Technologies, Santa Clara, California, USA). The pre-defined integrator Agile 2 was used for the peak integration algorithm to allow for maximum flexibility, quantification was carried out over total ion chromatograms (TIC). All peaks were then additionally checked for correct integration and quantification, and, where necessary, re-integrated manually. CHC compound identification was then carried out based on their characteristic diagnostic ions and retention indices. Analysis was focused exclusively on non-polar CHC compounds due to their repeatedly demonstrated involvement in chemical signaling in both solitary and eusocial Blattodea (e.g. Lihoreau and Rivault 2008; Hoffmann et al. 2014; Hamilton et al. 2019). The obtained values for the absolute peak area integrals were standardized by dividing them through the total of all CHC peak area integrals per sample, generating relative proportions for all CHC compounds. These proportions were then summarized for the individual CHC compound classes. Sample sizes for our individual species were B. germanica : 9, B. orientalis : 11, C. formosanus : 5, K. flavicollis : 4, M. darwiniensis : 4,N. castaneus : 5 and R. flavipes : 5. We focused on termite workers to obtain the general colony-specific chemical profiles as the vast majority of individuals constituting the respective colonies are workers (Korb 2007; Roisin and Korb 2010; Korb and Thorne 2017). Moreover, we attempted to render our study comparable to other studies on chemical complexity in eusocial species also focusing exclusively on profiles obtained from the worker caste (Martin and Drijfhout 2009; Kather and Martin 2015). Similarly, we pooled the sexes of both respective cockroach species to focus on representative species-specific chemical profiles, with less emphasis on the more subtle sex-specific differences (Pei et al. 2021).

Comparison of chemical and phylogenetic divergence

To standardize the absolute peak area values for chemical clustering, the normalization method of the function “decostand” of the community ecology R package “vegan” was used (Oksanen et al. 2008), based on the following formula:
\begin{equation} T_{x,y}=\ \frac{P_{x,y}}{\sqrt{\sum P_{y^{2}}}}\nonumber \\ \end{equation}
Tx,y refers to the transformed peak area x of individual y, Px,y to the absolute peak area x of individual y and Σ Py2 to the squared sums of all absolute peak areas of individual y. This widely applied method for normalizing ecological data was chosen to make the peak areas comparable between our groups, to highlight the relative peak area differences and to correct for size-dependent variation. Agglomerative hierarchical cluster analysis (“U nweighted P air-G roupM ethod with A rithmetic means”, i.e. UPGMA) was performed with the R package “ape” (Paradis et al. 2004), based on average chemical Manhattan distances reflecting the median CHC divergence separating the different cockroach and termite species. The formula for calculating Manhattan distances is as follows:
\begin{equation} \sum{[Y_{j}-Y_{k}]}\nonumber \\ \end{equation}
The actual difference between two data points, in this case Yj and Yk, is used based on the total amount of CHC variation between the species. In contrast with Euclidean distance where squared differences are used, the Manhattan distance is less prone to be dominated by single large differences between the compared component. It has thus been suggested that for multidimensional phenotypes such as CHC profiles, the Manhattan distance metric is the most ecologically meaningful (Oksanen 2009). The molecular phylogeny was obtained and adapted from the latest published Blattodea phylogeny in He et al. (2021). A Mantel test (Mantel 1967), conducted with the R package “ade4” (Dray and Dufour 2007), compared the molecular distances based on the published Blattodea phylogeny with the average Manhattan CHC divergence. The Mantel test was performed five times with 9999 permutations for each single test, the average probability is presented.

Analysis of transcriptomic gene counts

We retrieved translated whole genome transcriptome sequences based on whole-body RNA extractions for each of the seven investigated Blattodea species as described in He et al. (2021). Total RNA was isolated from individuals for all species. Due to the large body size, adult cockroaches were cut into 4-6 parts for separate extraction, followed by re-pooling. For the extraction itself, samples were suspended in pre-cooled Trizol (Thermo Fisher Scientific) and homogenized twice at 10 s at 2 M/s with a 5-mm steal bead (Qiagen) using a tissue homogenizer (MP Biomedicals). Total RNA was isolated with a chloroform extraction, followed by isopropanol precipitation, according to instructions from Trizol. Extracted total RNA was dissolved in RNA storage solution (Ambion) and then incubated with 2 units of TurboDNase (Ambion) for 30 min at 37 °C, followed by purification with an RNAeasy Mini kit (Qiagen) according to manufacturer’s instructions. Quantity and quality of RNA were determined by Qubit and Bioanalyzer 2100, respectively. Following pooling described in sample collection part, total RNA was used to construct barcoded cDNA libraries using a NEXTflex™ Rapid Directional mRNA-seq kit (Bioo Scientific). Briefly, mRNA was enriched using poly-A beads from total RNA and subsequently fragmentated. First and second-strand cDNA was synthesized and barcoded with NEXTflex™RNA-seq Barcode Adapters. The libraries were sequenced on an Illumina NextSeq500/550 platform at Berlin Center for Genomics in Biodiversity Research (BeGenDiv). We obtained orthologous sequences of enzymes which have been selected according to a demonstrated function in CHC biosynthesis via targeted knockdown studies (summarized in Holze et al. 2021) from NCBI. In order to estimate numbers of transcripts orthologous to these enzymes, we first created Hidden Markov Models (HMMs) for each of their protein sequences. For this, each of the query sequences was blasted against a database of proteomes from 17 insect genomes (Acyrthosiphon pisum, Bemisia tabaci, Blattella germanica, Clitarchus hookeri, Coptotermes formosanus, Cryptotermes secundus, Diploptera punctata, Drosophila melanogaster, Glossina morsitans, Locusta migratoria, Macrotermes natalensis, Medauroidea extradentata, Musca domestica, Periplaneta americana, Rhopalosiphum maidis, Stomoxys calcitrans, Zootermopsis nevadensis ) with blastp (version 2.7.1+, Camacho et al. 2009). Blast output was filtered to contain only hits with an e-value < 1e-10 and a minimum sequence identity of 50%. For each query gene, protein sequences of all significant hits were retrieved from the protein database and aligned with PRANK (version v.170427; Löytynoja 2014) at default settings. Hidden Markov Models (HMM) were created for each alignment using hmmbuild (version 3.1b2; Wheeler and Eddy 2013) at default settings. These HMMs were then used to search the proteomes of our seven focal Blattodea species using hmmsearch with a maximum e-value of 1e-5. Output of these hmm-searches were then filtered to contain only hits with a score of at least 100. If a transcript appeared in multiple lists, it was attributed to the query HMM for which it received the highest score. Finally, to verify these results, we blasted all transcript sequences against the swissprot database (accessed November 2020) with blastp (version 2.7.1+; Altschul et al. 1990). Any transcripts without clear orthology to the gene of interest were then excluded from the transcript counts. We performed used a generalized linear model (GLM) with Poisson family distribution to compare the variation in transcript counts among the levels of social complexity in our tested species and visualized the results with a heatmap, utilizing the function “heatmap” provided by the R package “stats”. Furthermore, we compared total gene transcript counts with the total number of detected CHC compounds per species with a χ2 (chi-square) test.