Appendix B: Power analyses and Monte Carlo simulations
To support our claim that our sample sizes are sufficient for detecting
differences between the groups of interest (i.e., SSI radiation, SSI
generalist-only, and Caribbean), we include the results of power
analyses, which calculate the minimum required sample size across the
range of effect sizes (eta2: 0.0002-0.12), numbers of individuals per
population (n = 7-21), and correlations of trait estimates for individuals
observed in our dataset (corr: 0.00046-0.50). From these iterative
power analyses, we were able to determine for which combination of our
parameters we would have sufficient sample sizes to detect differences
between groups with 80% power. We compared these cutoffs to the
observed values for each trait and population (Figure B1). Ultimately,
we found that we have enough samples to detect differences between
groups for all traits except for ascending process length, orbit
diameter, and cranial height.
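The effect sizes above are reported as eta-squared, whereas many power calculators for ANOVA-type designs take Cohen's f as input. As a minimal sketch, assuming the standard conversion f = sqrt(eta2 / (1 - eta2)), the endpoints of the reported effect-size range can be translated as follows. The function name is hypothetical and is not taken from the analysis scripts.

```python
import math


def eta_squared_to_cohens_f(eta_squared):
    """Convert eta-squared to Cohen's f using f = sqrt(eta2 / (1 - eta2))."""
    if not 0 <= eta_squared < 1:
        raise ValueError("eta_squared must lie in [0, 1)")
    return math.sqrt(eta_squared / (1 - eta_squared))


# Endpoints of the effect-size range reported in this appendix.
for eta_squared in (0.0002, 0.12):
    f = eta_squared_to_cohens_f(eta_squared)
    print(f"eta2 = {eta_squared}: Cohen's f = {f:.3f}")
```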
We also performed Monte Carlo simulations to investigate at which sample
sizes estimates of variation stabilize based on observed means and
standard deviations of traits from our dataset. To perform these
simulations, we first calculated the mean trait values (range: -0.71 to
0.42) and standard deviations (range: 0.01-0.8) for each trait across
the potential groups of our dataset (19 groups representing unique
species per population, plus the additional three parameter estimates
associated with Caribbean, SSI generalist-only, and SSI radiating
groups). We used these observed means and standard deviations to
generate normal distributions to represent hypothetical populations to
sample from. We resampled from these distributions using sample sizes
of 1-100, with 1,000 iterations each, recalculating the mean each time.
For each of our 22 groups we ended up with 1,800,000 independent estimates
of means across the sample size range of 1-100. At each sample size we then
calculated the standard deviation as a metric representing the amount of
variation introduced due to sample size. As expected, the standard
deviation decreased as sample size increased; however, the amount of
variation at low sample sizes depended on the original parameters of the
dataset (Figure B2 A&B).
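The resampling procedure described above can be sketched for a single trait and group as follows. This is an illustrative sketch using only the Python standard library, not the code used for the analysis. The function name, the seed argument, and the example mean and standard deviation in the usage note are assumptions.

```python
import random
import statistics


def sd_of_means_by_sample_size(mu, sigma, max_n=100, iterations=1000, seed=0):
    """Simulate the variation in sample means introduced by sample size.

    For each sample size n in 1..max_n, draw `iterations` samples of size n
    from Normal(mu, sigma), record each sample mean, and return a dict
    mapping n to the standard deviation of those means.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    sd_by_n = {}
    for n in range(1, max_n + 1):
        means = [
            statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
            for _ in range(iterations)
        ]
        sd_by_n[n] = statistics.stdev(means)
    return sd_by_n
```

Calling, for example, sd_of_means_by_sample_size(mu=0.0, sigma=0.8) returns one curve of SD against sample size; the hypothetical values 0.0 and 0.8 fall within the observed ranges reported above. For normally distributed data, the curve is expected to track sigma / sqrt(n), which is why groups with larger observed standard deviations show more variation at low sample sizes.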
To determine at which sample sizes we observed a stabilization of
variation (i.e., when adding more samples did not significantly affect
the estimates of SD) we grouped sample sizes into sets of 5 (e.g., 1-5,
6-10, 11-15, etc.) and calculated the derivatives for each of these
groups across the sampling range. We used a one-sample t-test to
determine the sampling range at which the derivative no longer differed
significantly from zero, using a conservative cutoff for
non-significance of p >= 0.1. Here, a derivative of zero indicates that
variation in SD is no longer significantly affected by the addition of
more samples (i.e., that variation is stable). We determined at which
sample size we first observed no significant deviation from zero for each
of the observed parameters from our dataset and visualized these results
using a histogram (Figure B2.C). Overall, we found that the median sample
size at which stabilization occurred was 20 individuals (range: 5-35
individuals; Figure B2.C).
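One way to implement the stabilization criterion is sketched below, under stated assumptions: the derivative is approximated by a forward finite difference between consecutive sample sizes, the one-sample t-test is applied to the five derivative values within each bin, and p >= 0.1 is evaluated by comparing |t| to the two-sided critical value for 4 degrees of freedom (2.132). The function and constant names are hypothetical, and the original analysis may have computed the derivatives or defined the test replicates differently.

```python
import math
import statistics

# Two-sided critical t value for alpha = 0.10 with 4 degrees of freedom
# (bins of 5 derivative values); |t| <= T_CRIT_DF4 corresponds to p >= 0.1.
T_CRIT_DF4 = 2.132


def first_stable_bin(sd_by_n, bin_width=5, t_crit=T_CRIT_DF4):
    """Return (start, end) of the first bin of sample sizes in which the mean
    derivative of SD does not differ significantly from zero, else None.

    `sd_by_n` maps sample size -> SD of the simulated means at that size.
    `t_crit` must match bin_width - 1 degrees of freedom.
    """
    sizes = sorted(sd_by_n)
    # Forward finite difference: slope of SD from n to the next sample size.
    slope = {
        n: (sd_by_n[m] - sd_by_n[n]) / (m - n)
        for n, m in zip(sizes, sizes[1:])
    }
    for start in range(sizes[0], sizes[-1], bin_width):
        values = [slope[n] for n in range(start, start + bin_width) if n in slope]
        if len(values) < bin_width:
            continue  # incomplete bin: the critical value would not apply
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)
        if sd == 0:
            stable = mean == 0  # identical slopes: stable only if all zero
        else:
            t_stat = mean / (sd / math.sqrt(len(values)))
            stable = abs(t_stat) <= t_crit
        if stable:
            return (start, start + bin_width - 1)
    return None
```

Applying this function to the SD curve of each observed mean and standard deviation combination and collecting the upper bound of each returned bin would give the kind of per-parameter stabilization values summarized in the histogram.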