Statistical analyses
The three regions differ greatly in climate, topography, and altitude range (Table 1). Furthermore, the set of regions that a species occupies is the outcome of many factors. Therefore, in the statistical analyses we treat region as a synoptic trait that encompasses many unspecified environmental factors. The three regions are contiguous, and we assume that, over the evolutionary time scale represented by the diversity of the 9,370 species in the data set, dispersal imposes little constraint; this assumption is confirmed by the analyses (see “Results”).
To partition the relative contributions of climate region, plant growth form, and phylogenetic conservatism to the variation in fruit type, we used the R package “rr2” (Ives & Li 2018; Ives 2019) to calculate partial R2s for phylogenetic logistic regression models fit using the phyloglm function in the R package “phylolm” (Tung Ho & Ané 2014). For models fit with phyloglm, the rr2 package can calculate only one type of partial R2, R2lik, which is based on the likelihood. R2lik is the appropriate R2 when comparing models according to the statistical significance of differences between them. The partial R2lik for each factor was calculated by comparing the full model with a reduced model in which that factor was removed, and measuring the consequent reduction in explained variance. The full model is a logistic regression model with fruit type (fleshy or dry) as the dependent variable, climate region (tropical, subtropical, or temperate) and growth form (woody or herbaceous) as independent variables, and the phylogeny specifying the covariances in the residual variation.
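The comparison of full and reduced models can be illustrated with a minimal sketch. It assumes the commonly used likelihood-ratio form, R2lik = 1 - exp(-(2/n)(logLik_full - logLik_reduced)), which we take to be the form implemented in rr2; the package documentation should be consulted for the details specific to phyloglm fits. The function name and all log-likelihood values below are hypothetical and serve only to show the arithmetic; only the species count of 9,370 comes from the data set described above.

```python
import math


def partial_r2_lik(loglik_full, loglik_reduced, n):
    """Likelihood-ratio partial R2 (assumed form, see lead-in).

    loglik_full    -- log-likelihood of the full model
    loglik_reduced -- log-likelihood of the model with one factor removed
    n              -- number of observations (here, species)
    """
    return 1.0 - math.exp(-2.0 / n * (loglik_full - loglik_reduced))


# Number of species in the data set (from the text).
n_species = 9370

# HYPOTHETICAL log-likelihoods, invented purely for illustration.
# "no_phylogeny" stands for an ordinary logistic regression without
# phylogenetic covariances (an assumption about the reduced model).
loglik = {
    "full": -3000.0,
    "no_region": -3010.0,
    "no_growth_form": -3050.0,
    "no_phylogeny": -4500.0,
}

# One partial R2lik per factor: full model versus the model lacking that factor.
partial = {
    name: partial_r2_lik(loglik["full"], ll, n_species)
    for name, ll in loglik.items()
    if name != "full"
}

for name, value in partial.items():
    print(f"{name}: partial R2lik = {value:.4f}")
```

With this form, a factor whose removal barely changes the log-likelihood receives a partial R2lik near zero, whereas a factor whose removal greatly reduces the log-likelihood receives a large partial R2lik.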
To give more information about the pattern of phylogenetic conservatism, we performed the analyses not only with the full time-scaled phylogeny, but also with phylogenies from which relatively recent phylogenetic diversification was removed. Specifically, we created phylogenies with reduced “recent” phylogenetic structure by collapsing nodes above a given threshold together to form a “star”. For example, for a threshold of 0.67, any node above the 66.7% (2/3) mark on the phylogeny was collapsed so that evolution above this threshold was assumed to occur independently among species (Fig. 1). The specific threshold of 0.67 corresponds roughly to the taxonomic scale of families. To determine the importance of phylogenetic patterns across the entire depth of the phylogeny, we performed the same analysis using thresholds from 1 down to 0.2. Re-analyzing the data across this range of thresholds addresses the relative importance of “recent” versus “ancient” species relationships in explaining phylogenetic conservatism.
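One way to picture the collapsing procedure is through the phylogenetic covariance matrix, whose off-diagonal entries give the branch length shared by two species from the root to their most recent common ancestor. The sketch below is an interpretation, not the code used for the analyses: it assumes an ultrametric tree and assumes that collapsing at a threshold is equivalent to capping every shared branch length at threshold × tree height, so that species evolve independently above that point. The function name and the four-species example tree are hypothetical.

```python
def collapse_recent_structure(vcv, threshold):
    """Cap shared branch lengths at threshold * tree height.

    vcv       -- square list of lists; vcv[i][j] is the root-to-MRCA path
                 length shared by species i and j, and vcv[i][i] is the
                 root-to-tip height of species i
    threshold -- fraction of tree height (1 = full phylogeny)

    Returns a new matrix; the input is left unchanged.
    """
    n = len(vcv)
    height = max(vcv[i][i] for i in range(n))
    cap = threshold * height
    return [
        [vcv[i][j] if i == j else min(vcv[i][j], cap) for j in range(n)]
        for i in range(n)
    ]


# HYPOTHETICAL ultrametric tree ((A,B),(C,D)) of height 1:
# A and B diverge at depth 0.9 (recent), C and D at depth 0.5 (older).
vcv_full = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]

# With a threshold of 0.67, the A-B node (depth 0.9) is collapsed to the
# threshold, while the C-D node (depth 0.5) lies below it and is unchanged.
vcv_067 = collapse_recent_structure(vcv_full, 0.67)

# A threshold of 1 leaves the full phylogeny intact.
vcv_1 = collapse_recent_structure(vcv_full, 1.0)
```

Under this reading, lowering the threshold from 1 to 0.2 progressively removes shared history, starting with the most recent divergences.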
The logistic regression showed that fruit type was poorly explained by climate region and growth form, but that the residual variation was well explained by the phylogeny. Because we had expected climate region and growth form to be good predictors of fruit type, we investigated all three variables – fruit type, climate region, and growth form – separately to determine whether they show similar patterns of phylogenetic conservatism. If climate region and/or growth form show less phylogenetic conservatism than fruit type, this would suggest that climate region and/or growth form are phylogenetically more labile and therefore might not be expected to predict fruit type. If climate region and/or growth form show the same phylogenetic conservatism as fruit type, then this, in combination with their low explanatory power for fruit type, would imply that their evolution is uncorrelated with that of fruit type.