Materials and Methods
To determine whether crossing success was impacted by floral similarity, we quantified floral traits of the species used in subsequent crossing trials. We generated data from five flowers per species for the following traits: length and width of the corolla tube, throat, and lobes; peduncle thickness; style length; and ovary length. To determine whether crossing success was influenced by vegetative similarity (as opposed to floral similarity, above), we additionally quantified vegetative phenotypic divergence for these species based on five leaves per species for the following traits: leaf length, leaf width, petiole length, number of secondary veins, leaf apex angle, and leaf base angle. We used an Ocean Optics JAZ Spectrometer to assess floral color differences following McCarthy et al. (2017). Floral reflectance was measured three times per representative corolla at a 45° angle. Resulting curves were averaged and then compared across species. Overlapping spectra suggested five clear floral color bins based on curve shape, reflectance wavelength, and median peak height: purple, red, pink, yellow/green, or white (Supplementary Appendix).
To quantify the potential for hybridization, we attempted interspecific crosses for 16 species of Ruellia (Fig. 2) growing in controlled-environment glasshouses at the University of Colorado. These species were selected because they derive from the full Neotropical geographical range of Ruellia, with some occurring regularly in sympatry and others not. Because not all species flower at the same time, we were able to attempt crosses between a total of 33 pairs of species, in both directions. We focus on these pairwise comparisons when estimating drivers of crossing success, including floral similarity and geographical range overlap.
Hand pollinations. Hand pollinations were conducted on fresh, fully anthetic flowers by brushing mature, pollen-coated anthers against receptive stigmas (protocol adapted from Long 1966). This approach mirrors the direct transfer of pollen by animal pollinators in natural environments, which characterizes all species of Ruellia. Prior to pollinations, pollen grains were assessed visually under 10× hand-lens magnification for maturity, which is correlated with anther dehiscence in Ruellia. To ensure pollen grain viability, one of the four anthers produced by each species was removed and inspected using the lactophenol-aniline blue stain protocol (Maneval 1936). Stigmas were assumed to be receptive at the time of pollen maturity. For each cross, we mimicked normal pollen load by estimating the average mass produced by anthers of the maternal plant and then adjusting the dosage of pollen donated by the paternal plant accordingly. All crosses consisted of 100% interspecific pollen. Pollinations were conducted between 09:00 and 17:00. Immediately following hand pollination, receptive flowers were marked using a colored thread system to track multiple crosses on a single individual. Threads were tied loosely but securely around floral peduncles. A small pilot study conducted on flowers and leaves of six species prior to implementation of the above tracking method indicated that loose threading neither caused nor hastened tissue senescence over a two-week period. Following visual inspection of seeds resulting from successful crosses, one to several seeds per fruit were germinated to further confirm cross success. We additionally attempted to germinate seeds from crosses deemed to be unsuccessful based on visual assessment, and none germinated.
All crosses were conducted in a controlled environment in a manner that emulates direct pollen transfer by animal pollinators. Crosses were conducted reciprocally, alternating the donor/recipient status in each cross (n = 66 combinations in total for 33 species pairs). The total number of attempted crosses for each combination varied from 2 to 50, with 88% of all species pair combinations being attempted at least 10 times. Crosses were monitored daily until they were determined to have either failed or succeeded. Crosses that failed to form fruits were treated as failed crosses. Crosses that formed fruits but yielded immature and/or non-viable seeds were taken to indicate embryo failure and were also treated as failed crosses. Fruits that yielded one or more mature, viable seeds, based on visual inspection followed by subsequent germination trials, were treated as successful crosses.
Molecular Methods. To account for potential effects of genetic (i.e., phylogenetic) distances between species pairs, we employed the matrix from Tripp and McDade (2014a), which was constructed using three chloroplast markers plus the nuclear ITS+5.8S. We pruned this matrix to contain only taxa relevant to the present study (Fig. 2). The new matrix was aligned using PhyDE (Müller et al. 2016), then analyzed using maximum likelihood implemented in RAxML v8.2 (Stamatakis 2008). We then constructed a temporally calibrated molecular phylogeny using BEAST v1.82 (Drummond et al. 2012), with three fossil constraints (Supplementary Table 1) derived from Tripp and McDade (2014b), to assess temporal divergence between species pairs. Divergence time estimation methods followed Tripp and McDade (2014b).
Statistical Analyses. To formally test for reproductive character displacement in sympatric species pairs, we used a modified ANOSIM (analysis of similarities) approach (Clark 1993). First, following Coyne and Orr (1989) and Moyle et al. (2004), we classified a given species pair as sympatric if the two species overlap in some portion of their ranges. Co-occurrence was determined through collection notes and localities of herbarium specimens and the extensive field data generated by the first author and taxonomic expert on the genus. We quantified overall reproductive character similarity as the mean Euclidean distance between species in a multivariate decomposition of floral trait space, derived from a principal component analysis of the correlation matrix of the nine quantified floral characters. We also quantified the Euclidean distance between species for each individual floral character. Measures of Euclidean distance (or difference) between species for overall leaf form and individual leaf characters were calculated in the same way.
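The distance calculation can be sketched as follows. A principal component analysis of the correlation matrix is an orthogonal rotation of the z-scored traits, so distances computed on all component scores equal distances computed on the z-scored traits directly. The sketch relies on that equivalence and therefore assumes that every component was retained and that "mean Euclidean distance" refers to the average over all between-species pairs of individual flowers; neither is stated in the text. The species names, trait values, and function names are hypothetical.

```python
import math
from itertools import product
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each trait (column) to mean 0 and unit SD across all rows."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def mean_pairwise_distance(a_rows, b_rows):
    """Mean Euclidean distance over all flower pairs, one flower from each species."""
    return mean(math.dist(a, b) for a, b in product(a_rows, b_rows))

# Hypothetical data: species -> flowers, each [tube length, tube width, style length]
flowers = {
    "sp_A": [[30.0, 4.0, 35.0], [32.0, 4.2, 36.0], [31.0, 3.9, 34.5]],
    "sp_B": [[55.0, 6.0, 60.0], [53.0, 6.3, 58.0], [56.0, 5.8, 61.0]],
    "sp_C": [[31.5, 4.1, 36.5], [29.5, 4.0, 33.0], [30.5, 4.3, 35.0]],
}

# Standardize across all flowers of all species, then regroup by species.
labels = [sp for sp, rows in flowers.items() for _ in rows]
z = zscore_columns([row for rows in flowers.values() for row in rows])
by_species = {sp: [r for r, lab in zip(z, labels) if lab == sp] for sp in flowers}

d_ab = mean_pairwise_distance(by_species["sp_A"], by_species["sp_B"])
d_ac = mean_pairwise_distance(by_species["sp_A"], by_species["sp_C"])
```

Passing a single trait column to the same functions gives the per-character distance; if only a subset of components were retained, an explicit PCA step would be needed instead.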
Our modified ANOSIM approach consists of ranking in decreasing order the Euclidean distances between all species pairs for a given character and then calculating how much the mean observed rank of sympatric species pairs differs from that of allopatric species pairs. Specifically:
\begin{equation} R_{\text{anosim}} = \frac{\bar{r}_{s} - \bar{r}_{a}}{n(n-1)/4} \nonumber \end{equation}
where \(\bar{r}_{s}\) equals the mean rank of distances between sympatric species and \(\bar{r}_{a}\) equals the mean rank of distances between allopatric species. The \(R_{\text{anosim}}\) statistic varies from -1 to 1. A value of 0 would indicate that allopatric and sympatric species pairs are no more different from each other than expected by chance. A value of 1 would indicate that sympatric species pairs are always more different in floral form for a given floral character than allopatric species pairs, while a value of -1 would indicate that allopatric species pairs are always more different. To assess whether these differences between sympatric and allopatric species pairs are significantly greater than expected by chance, we used a permutation approach in which we shuffled the rows and columns of the dissimilarity matrix for a given character and obtained null expectations for the R value, given the pairwise values being considered (mimicking the same matrix permutation used in standard ANOSIM). This controls for non-independence of data points involving the same species when assessing significance. For a one-tailed test of the hypothesis that sympatric species pairs will diverge significantly more for a given trait than allopatric species pairs, we determined whether the observed R statistic was greater than that in 95% of the permutations.
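A minimal sketch of the statistic and its permutation test is given below, under several stated assumptions. Ranks are assigned so that larger distances receive larger ranks, which makes the sign of R match the interpretation given above. n is taken to be the number of species in the distance matrix. The toy example ranks every possible species pair, so that n(n-1)/4 is half the number of ranked pairs. The function names and the distance matrix are hypothetical.

```python
import random
from itertools import combinations

def midranks(values):
    """Ascending ranks (1 = smallest); tied values share their mean rank."""
    s = sorted(values)
    return [s.index(v) + (s.count(v) + 1) / 2 for v in values]

def r_anosim(dist, sympatric, order=None):
    """R = (mean rank of sympatric pairs - mean rank of allopatric pairs) / (n(n-1)/4).

    dist: symmetric n x n distance matrix for one character.
    sympatric: dict mapping species-index pairs (i, j) to True/False.
    order: optional relabelling of species, applied to rows and columns together.
    """
    n = len(dist)
    order = order or list(range(n))
    pairs = list(sympatric)
    ranks = midranks([dist[order[i]][order[j]] for i, j in pairs])
    r_s = [r for r, p in zip(ranks, pairs) if sympatric[p]]
    r_a = [r for r, p in zip(ranks, pairs) if not sympatric[p]]
    return (sum(r_s) / len(r_s) - sum(r_a) / len(r_a)) / (n * (n - 1) / 4)

def permutation_test(dist, sympatric, n_perm=999, seed=0):
    """One-tailed p: share of shuffles with R at least as large as observed."""
    rng = random.Random(seed)
    observed = r_anosim(dist, sympatric)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(len(dist)))
        rng.shuffle(order)  # shuffles rows and columns with the same relabelling
        if r_anosim(dist, sympatric, order) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)

# Hypothetical 4-species example: the two sympatric pairs have the largest distances.
dist = [
    [0, 9, 1, 2],
    [9, 0, 3, 4],
    [1, 3, 0, 8],
    [2, 4, 8, 0],
]
sympatric = {pair: pair in {(0, 1), (2, 3)} for pair in combinations(range(4), 2)}
observed_r, p_value = permutation_test(dist, sympatric)
```

With only four species there are few distinct relabellings, so the p-value cannot become small in this toy case; the example is meant to show the mechanics only.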
To test whether sympatric species pairs are more likely to differ in flower color than allopatric species pairs, we conducted an initial chi-squared analysis to assess whether these two categories of species pairs (sympatric vs. allopatric) had different ratios of species pairs with the same versus different flower colors. As this initial test showed no difference (χ² = 0.01, p = 1), we did not pursue additional analyses that would have controlled for non-independence of data points.
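For reference, a chi-squared test of this kind on a 2 × 2 table (sympatric vs. allopatric by same vs. different flower color) can be sketched as below. The counts are illustrative only and are not the study's data. Whether a continuity correction was applied is not stated in the text, so it is left as an option.

```python
import math

def chi2_2x2(table, yates=True):
    """Pearson chi-squared test of independence for a 2 x 2 table (1 df)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(0.0, diff - 0.5)  # Yates continuity correction
            stat += diff ** 2 / expected
    # Survival function of the chi-squared distribution with 1 df.
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Illustrative counts: rows = sympatric, allopatric; columns = same color, different color.
stat, p = chi2_2x2([[10, 20], [30, 40]], yates=False)
```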
To assess drivers of interspecific crossing success, we used a generalized linear mixed model (GLMM) framework to test how geographical range overlap, similarity in floral shape and color, and similarity in leaf shape affected the success of interspecific crosses. The response variable was the binomially distributed number of successes and failures for each attempted cross. We included donor and recipient species identities as random effects to control for non-independence of crosses involving the same species and because both donor and recipient species identities have significant effects on crossing success (likelihood ratio tests of binomial GLM with species identity as fixed effect versus null model, for donor: χ² = 71.2, p < 0.001; and recipient: χ² = 95.2, p < 0.001). There was no relationship between crossing success in one direction versus the other (Supplementary Fig. 1; r = 0.06, p = 0.751), and including individual species pairs as a random effect did not improve our statistical models or change estimates of fixed effects. We also included the genetic distance between species as a fixed effect in analyses to control for this additional potential driver of crossing success. We first compared the performance of models with a single fixed effect (and the random effects) to models with only random effects using likelihood ratio tests. We then constructed a full model with all fixed and random effects and compared this full model to sub-models in which each fixed effect was dropped in turn, again using likelihood ratio tests. The formula for the full model is:
bin(number of crosses, probability of success) ~ flower colour similarity + floral shape similarity + leaf shape similarity + genetic distance + allopatry vs. sympatry + (1 | Recipient Species Identity) + (1 | Donor Species Identity)
We tested for model overdispersion using a chi-squared test with the residual deviance and degrees of freedom. We did not attempt to test for interactions between our fixed effects due to limited sample size.
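Both the likelihood ratio tests and the overdispersion check compare a statistic against a chi-squared distribution. The sketch below illustrates the two calculations without fitting the mixed model itself. The likelihood ratio example covers only the simpler binomial GLM with species identity as a single fixed factor, whose maximum-likelihood fit has a closed form. The overdispersion function takes the residual deviance and degrees of freedom of an already fitted model as inputs. All function names and numbers are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail

def xlogy(x, y):
    """x * log(y), defined as 0 when x is 0."""
    return 0.0 if x == 0 else x * math.log(y)

def loglik(groups, probs):
    """Binomial log-likelihood, omitting the constant binomial coefficients."""
    return sum(xlogy(s, p) + xlogy(t - s, 1 - p) for (s, t), p in zip(groups, probs))

def lrt_species_identity(groups):
    """Species-specific success probabilities versus one shared probability.

    groups: list of (successes, trials), one entry per species.
    """
    pooled = sum(s for s, _ in groups) / sum(t for _, t in groups)
    ll_null = loglik(groups, [pooled] * len(groups))
    ll_full = loglik(groups, [s / t for s, t in groups])
    stat = 2 * (ll_full - ll_null)
    df = len(groups) - 1
    return stat, df, chi2_sf(stat, df)

def overdispersion_test(residual_deviance, residual_df):
    """Dispersion ratio and chi-squared p-value for a fitted binomial model."""
    return residual_deviance / residual_df, chi2_sf(residual_deviance, residual_df)

# Hypothetical inputs, not the study's data.
lr_stat, lr_df, lr_p = lrt_species_identity([(2, 20), (10, 20), (18, 20)])
ratio, od_p = overdispersion_test(70.0, 60)
```

A dispersion ratio near 1 with a non-significant p-value would give no indication of overdispersion. Comparing nested mixed models works the same way once their log-likelihoods are available from the fitting software.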
Our statistical approach accounts for non-independence of data points due to the same species being used in multiple crosses and to variation in phylogenetic relatedness of species (following Tobias et al. 2014), and it correctly models our binomially distributed crossing success data. It is not, however, identical to ‘phylogenetically corrected’ approaches used in previous studies that tested the effect of sympatry vs. allopatry on reproductive isolation. To ensure comparability with previous studies, we conducted an additional statistical test following procedures used by Coyne and Orr (1989) and Moyle et al. (2004). Specifically, we averaged the proportion of successful crosses for all pairs of species that span a given node in our phylogeny to yield a single estimate of crossing success for each node in the phylogeny. Four of the nodes in our phylogeny were not spanned by any species pair in our study and were omitted from further analysis. Seven nodes in the phylogeny have only allopatric species pairs spanning them, while four nodes have sympatric species pairs spanning them. We compared the mean crossing success values for nodes with only allopatric species pairs to those for nodes spanned by sympatric species pairs using a one-tailed non-parametric Wilcoxon test.
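The node-based comparison can be sketched as follows. The sketch averages crossing success per node and then runs an exact one-tailed Wilcoxon rank-sum test by enumerating every possible assignment of ranks to the first group, which is feasible for group sizes as small as seven and four. The direction of the one-tailed test (here, allopatric nodes having higher crossing success) is an assumption, as are the node identifiers, the values, and the function names.

```python
from itertools import combinations
from statistics import mean

def node_means(pair_records):
    """Average crossing success over all species pairs spanning each node.

    pair_records: iterable of (node_id, proportion_successful, is_sympatric).
    Returns node_id -> (mean success, True if any spanning pair is sympatric).
    """
    by_node = {}
    for node, success, is_sympatric in pair_records:
        entry = by_node.setdefault(node, {"values": [], "sympatric": False})
        entry["values"].append(success)
        entry["sympatric"] = entry["sympatric"] or is_sympatric
    return {node: (mean(e["values"]), e["sympatric"]) for node, e in by_node.items()}

def midranks(values):
    """Ascending ranks (1 = smallest); tied values share their mean rank."""
    s = sorted(values)
    return [s.index(v) + (s.count(v) + 1) / 2 for v in values]

def rank_sum_p_greater(x, y):
    """Exact one-tailed p-value for the alternative that x tends to exceed y."""
    ranks = midranks(list(x) + list(y))
    observed = sum(ranks[:len(x)])
    sums = [sum(c) for c in combinations(ranks, len(x))]
    return sum(s >= observed - 1e-9 for s in sums) / len(sums)

# Hypothetical per-pair records: (node, proportion of successful crosses, sympatric?)
records = [("n1", 0.2, False), ("n1", 0.4, False), ("n2", 0.1, True), ("n2", 0.3, False)]
per_node = node_means(records)

# Hypothetical node-level means for seven allopatric-only and four sympatric nodes.
allopatric_nodes = [0.6, 0.5, 0.7, 0.4, 0.8, 0.55, 0.65]
sympatric_nodes = [0.1, 0.2, 0.3, 0.45]
p_one_tailed = rank_sum_p_greater(allopatric_nodes, sympatric_nodes)
```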