Microbiome diversity and predictive functional profiling
Three α diversity metrics were calculated: the abundance-based coverage estimator (ACE) to assess ASV richness, the inverse Simpson index (ISI) to assess ASV evenness, and Faith’s phylogenetic diversity (FPD) to assess phylogenetic richness. To evaluate differences in microbial α diversity between larvae raised on different host plants and/or between sites, we used two-way ANOVAs. For F1 larvae, the model included Site (BP: Basin Plat, M: Manapany) as a random factor and Host Plant (C. grandis, C. sativus, S. melongena) as a fixed orthogonal factor. For F0 adults, the model included Site (BP: Basin Plat, M: Manapany) as a random factor and Parental Group (groups 1, 2, 3) as a random factor nested in Site. ANOVAs were implemented using the GAD package. The count data from which the diversity metrics were calculated were not normalized, as all rarefaction plots reached a plateau (S1). To ensure homoscedasticity, a log transformation was applied to the ISI and a fourth-root transformation was applied to the ACE and FPD. Homogeneity of variances was tested with Cochran’s C tests, also using the GAD package. Pairwise comparisons were made using F tests with Holm correction for multiple comparisons, using the phia package.
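As an illustration of the quantities named above, the following minimal Python sketch computes the inverse Simpson index, applies the two transformations, and calculates Cochran’s C statistic and Holm-adjusted p-values. It is not the analysis code used in this study, which relied on the R packages GAD and phia; all function names are hypothetical, and only the test statistic of Cochran’s C test is shown, not its critical value.

```python
import math
from statistics import variance


def inverse_simpson(counts):
    """Inverse Simpson index, 1 / sum(p_i^2), from the ASV counts of one sample."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no reads")
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def log_transform(value):
    """Natural-log transformation (applied to the ISI in the text)."""
    return math.log(value)


def fourth_root(value):
    """Fourth-root transformation (applied to ACE and FPD in the text)."""
    return value ** 0.25


def cochran_c(groups):
    """Cochran's C statistic: largest group variance over the sum of group variances."""
    variances = [variance(g) for g in groups]
    return max(variances) / sum(variances)


def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted
```

A perfectly even sample of four ASVs, for example, has an inverse Simpson index of 4, the number of ASVs; the index decreases as the community becomes dominated by fewer ASVs.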
Before calculating β diversities, we removed all ASVs that occurred in only one sample and normalized the counts by converting them to proportions, so that they represent community structure.
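The two preprocessing steps can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used in the study: the count table is assumed to be a dictionary mapping each sample to its ASV counts, an ASV is taken to "occur" in a sample when its count is above zero, and both function names are hypothetical.

```python
def drop_single_sample_asvs(table):
    """Keep only ASVs with a non-zero count in at least two samples.

    table: dict mapping sample name -> dict mapping ASV id -> count.
    """
    prevalence = {}
    for counts in table.values():
        for asv, count in counts.items():
            if count > 0:
                prevalence[asv] = prevalence.get(asv, 0) + 1
    keep = {asv for asv, n in prevalence.items() if n >= 2}
    return {
        sample: {asv: c for asv, c in counts.items() if asv in keep}
        for sample, counts in table.items()
    }


def to_proportions(table):
    """Convert each sample's counts to proportions that sum to 1."""
    result = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        result[sample] = {
            asv: (c / total if total else 0.0) for asv, c in counts.items()
        }
    return result


# Toy example with hypothetical ASV ids: asv3 occurs only in sample s1.
toy = {
    "s1": {"asv1": 30, "asv2": 10, "asv3": 5},
    "s2": {"asv1": 20, "asv2": 20, "asv3": 0},
}
proportions = to_proportions(drop_single_sample_asvs(toy))
```

Filtering before normalizing means the proportions are computed over the retained ASVs only, so each sample still sums to 1.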
Generalized UniFrac distances (using the d5 matrix) and unweighted UniFrac distances were calculated as β diversity metrics. Because UniFrac distances take phylogenetic relationships into account, bacterial 16S sequences were aligned with the DECIPHER algorithm, and a midpoint-rooted maximum likelihood tree of the bacterial relationships was constructed using a general time reversible substitution model in FastTree.
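To make the two distances concrete, the sketch below computes unweighted and generalized UniFrac for a pair of samples on a toy tree. It is a simplified Python illustration, not the implementation used here. It assumes that the d5 matrix corresponds to a generalized UniFrac weighting exponent of α = 0.5, and it represents the rooted tree as a hypothetical list of branches, each given as a branch length and the set of tips (ASVs) descending from it.

```python
def branch_proportions(branches, sample):
    """Proportion of a sample's community descending from each branch."""
    return [sum(sample.get(tip, 0.0) for tip in tips) for _, tips in branches]


def unweighted_unifrac(branches, a, b):
    """Share of the observed branch length that leads to only one of the two samples."""
    pa = branch_proportions(branches, a)
    pb = branch_proportions(branches, b)
    unique = 0.0
    total = 0.0
    for (length, _), x, y in zip(branches, pa, pb):
        if x > 0 or y > 0:
            total += length
            if (x > 0) != (y > 0):
                unique += length
    return unique / total


def generalized_unifrac(branches, a, b, alpha=0.5):
    """Generalized UniFrac; alpha = 0.5 is assumed to match the d5 matrix."""
    pa = branch_proportions(branches, a)
    pb = branch_proportions(branches, b)
    numerator = 0.0
    denominator = 0.0
    for (length, _), x, y in zip(branches, pa, pb):
        s = x + y
        if s > 0:
            weight = length * s ** alpha
            numerator += weight * abs(x - y) / s
            denominator += weight
    return numerator / denominator


# Toy rooted tree ((A,B),C) with hypothetical branch lengths.
toy_branches = [
    (1.0, {"A"}),
    (1.0, {"B"}),
    (2.0, {"C"}),
    (1.0, {"A", "B"}),  # internal branch above the (A,B) clade
]
sample_1 = {"A": 0.5, "B": 0.5}
sample_2 = {"A": 0.5, "C": 0.5}
```

Setting alpha to 1 gives the normalized weighted UniFrac distance, whereas smaller values reduce the weight of the most abundant lineages; the unweighted distance uses presence and absence only.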
Differences in microbiome β diversity between larvae raised on different host plants and/or between sites were tested using a two-way Permutational Analysis of Variance (PERMANOVA) with Site (BP: Basin Plat, M: Manapany) as a random factor and Host Plant (C. grandis, C. sativus, S. melongena) as a fixed orthogonal factor. The False Discovery Rate (FDR) correction with experiment-wise p < 0.05 was used to correct for multiple testing. Differences between larvae raised on different host plants were visualized with a Principal Coordinate Analysis (PCoA), and 95% confidence ellipses were drawn using the ggplot2 package.
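The logic of a permutation test on a distance matrix can be sketched as below. This Python example is deliberately simplified to a one-way design with a single grouping factor; it does not reproduce the two-way model with a random Site factor described above, and the function names and default settings (999 permutations, fixed seed) are assumptions made for illustration.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """One-way PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for label in labels:
        members = [i for i in range(n) if groups[i] == label]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, permutations=999, seed=0):
    """Return the observed pseudo-F and its permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign group labels at random
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy example: four samples on a line, two well-separated groups.
points = [0.0, 1.0, 10.0, 11.0]
toy_dist = [[abs(p - q) for q in points] for p in points]
toy_groups = ["plant_a", "plant_a", "plant_b", "plant_b"]
f_value, p_value = permanova(toy_dist, toy_groups)
```

With only four samples there are very few distinct label arrangements, so the p-value in this toy case cannot become small; the example shows the mechanics rather than a realistic result.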
To test for differential abundance of microbial genera (i.e. genera with relatively more sequences assigned to them in one group than in another) among larvae raised on different host plants and from different sites, we used ALDEx2. ASVs that could not be classified were assigned to distinct, unidentified genera. Genera whose effect size for the difference between two treatments lay between -1 and 1 were filtered out to reduce the false positive rate. Significance was assessed with both the Welch t test and the Wilcoxon rank sum test, followed by FDR correction with experiment-wise p < 0.05, as FDR is better suited for exploratory analyses.
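The filtering and correction steps can be illustrated with the Python sketch below, which is not the ALDEx2 implementation. It assumes the Benjamini-Hochberg procedure as the FDR correction, assumes that a genus is retained only when both tests are significant after correction, and treats an absolute effect size of exactly 1 as retained; the record keys and function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_genera(results, alpha=0.05, min_abs_effect=1.0):
    """Keep genera with |effect| >= min_abs_effect and both adjusted p-values < alpha.

    results: list of dicts with the hypothetical keys
    'genus', 'effect', 'p_welch' and 'p_wilcoxon'.
    """
    q_welch = benjamini_hochberg([r["p_welch"] for r in results])
    q_wilcoxon = benjamini_hochberg([r["p_wilcoxon"] for r in results])
    kept = []
    for record, qw, qx in zip(results, q_welch, q_wilcoxon):
        if abs(record["effect"]) >= min_abs_effect and qw < alpha and qx < alpha:
            kept.append(record["genus"])
    return kept


# Made-up values for illustration only.
toy_results = [
    {"genus": "A", "effect": 2.0, "p_welch": 0.001, "p_wilcoxon": 0.002},
    {"genus": "B", "effect": 0.5, "p_welch": 0.001, "p_wilcoxon": 0.001},
    {"genus": "C", "effect": -1.5, "p_welch": 0.01, "p_wilcoxon": 0.01},
    {"genus": "D", "effect": 3.0, "p_welch": 0.6, "p_wilcoxon": 0.7},
]
selected = select_genera(toy_results)
```

In this toy input, genus B is dropped because its effect size is small despite low p-values, and genus D is dropped because neither test is significant.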
We predicted the possible metabolic functions of the microbiomes by applying Tax4Fun2 to the 16S rRNA sequences. For the functional prediction, we used Ref100NR, the standard 16S rRNA sequence reference dataset of Tax4Fun2, with a 97% sequence similarity cut-off to reference genomes. The predicted KEGG (Kyoto Encyclopedia of Genes and Genomes) orthology groups were then filtered by removing all KEGG groups unrelated to bacterial metabolism or host-microbe symbiosis (e.g. pathways related to human diseases). Functional profiles were visualized as heatmaps using the pheatmap package.