Parameter estimation – Bayesian sampling
MSOM was implemented using Bayesian statistics in the JAGS language
(Plummer & others, 2003). Bayesian models take prior probabilities
(beliefs; priors) of each parameter (e.g. occupancy of a species
in a patch) and use the observed data to update these probabilities –
posterior probabilities, thereby generating a distribution of
probabilities for each parameter of interest. This differs from
traditional maximum likelihood estimators, which provide a single
best-fit estimate. Null models/hypotheses and p-values do not exist in
Bayesian statistics.
To define a threshold for ”significance”, we calculated the Highest
Density Interval (HDI) for each model parameter (e.g. the parameter
defining the effect of habitat amount on species occupancy). HDIs were
calculated as the interval where 95% of parameter posterior probability
lies. We also defined a Region of Practical Equivalence (ROPE), ranging
from -0.1 to 0.1, to determine the parameter space where an effect would
be weak/equivalent to zero from a biological perspective. All
comparisons of HDI and ROPE were performed using the coefficients
obtained with the standardized predictor variables. Based on HDI and
ROPE, three possible outcomes were defined for the effect of each
predictor variable (see Fig. S2):
1) No effect, when HDI is completely within the Region of Practical
Equivalence;
2) Lack of support, when HDI partially overlaps the Region of Practical
Equivalence;
3) Evidence of effect, when HDI lies completely outside the Region of
Practical Equivalence.
In our data, the [-0.1; 0.1] limits of ROPE indicate that the area
effect was considered “significant” only if species occupancy
increases or decreases at least 0.00268 with 1-ha of additional patch
area at the steepest part of the Species-Area Relationship (assuming S =
cAz). Similarly, an absence of practical effect occurs
when the effect is weaker than this 95%-probability threshold.
Regarding diversity, these intervals would represent an increase or
decrease in 16% in species richness comparing the smallest (2.4 ha) to
the largest patch ( 144,805 ha). This interval is conservative
considering most of the conservation implications of the results, and
therefore ”significant” effects represent strong associations with a
high degree of certainty. We also include in results and discussion the
exact number of species estimated to be gained or lost with the change
in patch area or cattle intrusion, and provide the exact estimate
whenever possible.
To estimate parameters, we calculated the posterior probability of
parameters as:
In this equation, y represents the observed data (counts of individual
species at each individual Winkler extractor), L(y | Ψ, p)
represents the likelihood function, and P(Ψ) and P(p) denote the prior
distributions for occupancy and detection rates. The likelihood function
is defined as:
where N represents the number of patches surveyed, m is the number of
Winkler extractors per patch (m = 10 for all patches), and
P(z|Ψ) is the probability of the latent variable z that defines
the true occurrence of a species per patch:
The joint posterior probability captures the simultaneous estimation of
both occupancy and detection rates based on the observed data. The
species-specific occupancy probability (Ψ) and detection probability (p)
for species s at a given fragment i depend on the effect
of environmental covariates:
The priors for the species-specific coefficients associated with the
occupancy probability (Ψ) and detection probability (p) are assumed to
follow Gaussian distributions with overall means and standard
deviations:
These priors reflect the expected values and variability of occupancy
and detection rates across species (random effects). On the other hand,
the hyperpriors (fixed effects) capture the overall patterns across
species by specifying the prior distributions for the overall mean of
the occupancy probability. The hyperpriors provide a way to incorporate
prior knowledge or assumptions about the underlying patterns in
occupancy and detection across the entire species assemblage. All model
priors were defined as non-informative/flat priors (no prior expectation
on model parameters, totally informed by the data):
Posterior probabilities were calculated using three Monte Carlo Markov
Chains (MCMC) with 50,000 steps each. The chains were initiated at
different parameter values and chain convergence was checked by visual
inspection and using the Gelman-Rubin statistic (Gelman & Rubin, 1992;
convergence in parameters with R-hat lower than1.05). To remove the
effect of initial parameters on the final estimates, a burn-in phase of
5,000 steps was run in each chain. The three chains were run in parallel
(simultaneously in three computer cores) with different random number
seeds.
All predictor variables were standardized (mean = 0, sd = 1) before
analyses. Patch area was log-transformed before standardization and
analyses. Untransformed measures of habitat amount (forest cover within
a buffer) were more strongly correlated with the observed species
richness, so these variables were not log-transformed (but log
transformation provided almost identical results; results not shown).
Because patch size, forest cover, and cattle presence were highly
correlated with each other (r > 0.7), we present results
using area and cattle in the same model and also each of these variables
separately. Using non-standardized predictor variables produced
qualitatively the same results (Fig. S3 in Supporting Information). We
present the results using standardized predictors in the main text to
allow for direct comparisons of the magnitude of effect directly based
on the coefficients.
Effects on individual species
On the basis of MSOMs, we obtained detection, occupancy, and the effect
of predictor variables on detection and occupancy for each species. We
modeled individual species responses to each predictor as a random
effect, centered around a community-wise mean. To illustrate the
consistency or variability of these responses across species, we present
the standard deviation of the random effects around the parameter
estimates in the results. This allows us to showcase how species’
responses may be consistent or variable in relation to the predictor
variables. In all models, species responses to habitat loss and other
predictors was highly variable (see Results). To estimate how species
traits are associated with their detection, occupancy, and responses to
covariates (i.e. why some species are more common and/or more affected
by habitat loss than others), we ran separate regressions and analyses
of variance (ANOVAs) associating species body mass and foraging strategy
(predictor variables) with species detection, occupancy, and their
responses to environmental predictors. Here, individual species, rather
than forest patches, represent units of analysis.
Regressions and ANOVAs were run using Bayesian statistics using a
similar setup to the occupancy models described above. Species traits
were not directly incorporated into MSOMs to facilitate model
convergence.
Alpha, beta, and gamma diversities
In each patch, species richness was estimated as the sum of individual
species occupancies obtained from MSOMs (alpha-diversity; sum of rows in
the estimated species x patch matrix). Regional species richness
(gamma-diversity) was calculated as the sum of all species occupancies
regionally (number of species with at least one occurrence in the
estimated species x patch matrix). We measured beta-diversity for each
pair of patches using the turnover component of the pairwise Jaccard
similarity index (Baselga 2010), which was calculated using the
bias-corrected occupancy matrix obtained from MSOM (i.e. presences
corrected for imperfect detection). The Jaccard similarity index is
measured as the percentage of shared species between a pair of patches,
which is likely to be grossly underestimated using raw occurrence data
because many shared species are rare and go undetected in at least one
of the sites in a pairwise comparison (Chao et al. , 2005; Chiuet al. , 2014). Moreover, the classic Jaccard index results from
differences in species richness (nestedness) and true species turnover
between any two patches (Baselga, 2010). The turnover component
represents species replacements (beta-diversity) and was the only
pairwise metric used (but see Supporting Figures for further details and
comparisons with classic metrics).
To estimate the effect of predictor variables on alpha-diversity, we
correlated species richness with the predictor variables. These
correlation coefficients were used to measure the strength of the
association, and the posterior probabilities of the correlation
coefficient were used to assess the “significance” of this
association.
To estimate the effect of predictor variables on pairwise
beta-diversity, we created pairwise environmental distance matrices to
represent the predictors of species turnover. These matrices were
created by calculating pairwise differences for each predictor variable
across all patches (e.g. two cattle-trampled patches have a difference
of zero for cattle intrusion). These pairwise differences were then
correlated with the pairwise turnover of the Jaccard similarity index
using a multiple regression of distance matrices (similarly to Mantel
tests; Tuomisto & Ruokolainen, 2006). Similarity in species composition
is usually strongly associated with geographic distance (Nekola &
White, 1999), thus distances separating patches were included as a
predictor of the Jaccard index in all beta-diversity models along with
differences in patch area, cattle intrusion, connectivity, and habitat
amount. Because our study was conducted on a single landscape, gamma
diversity across the entire landscape was represented as a single value,
which cannot be directly used as a response variable in statistical
models.
Although the association between alpha- or beta-diversity and predictor
variables informs the degree to which these variables affect
regional/landscape diversity, they fail to reveal the relative
contribution of each component to regional diversity or how these
contributions change with the amount of habitat across the landscape or
the size of individual patches. Therefore, to determine if the effect of
fragmentation on beta-diversity can compensate for losses in
alpha-diversity within patches one should investigate the combined
contribution of alpha and beta-components (Socolar et al. , 2016).
To understand the combined contribution of alpha- and beta-diversity to
regional/gamma diversity with increasing spatial scales (scaling
relationship), we ordered each patch from smallest to largest. We then
sequentially grouped subsequent patches and calculated the maximum and
average patch diversity (alpha) and overall/landscape diversity (gamma).
Gamma diversity increases when either alpha-diversity increases or
species composition diverges between patches (beta-diversity).
Beta-diversity can be calculated using the additive (gamma = alpha +
beta) or multiplicative (gamma = alpha × beta) partitioning of gamma
diversity (Jost 2007). We used additive diversity partitioning to
estimate the absolute number of species added by beta-diversity
to the regional diversity with increasing habitat amount across the
landscape, and multiplicative partitioning to estimate therelative contribution of beta-diversity to regional diversity.
Species diversities (alpha, beta, and gamma) were calculated for each of
the MCMC runs of MSOM, generating a distribution of values and density
intervals for these diversity components and their scaling relationships
(posterior probabilities).
The sequential grouping of patches allowed us to create cumulative areas
and estimate diversities at each grouping of patches. By doing so, we
were able to examine how alpha and beta-diversities changed from a
single small forest fragment to the entire landscape, capturing the
effects of habitat expansion and increasing spatial scale.
Species-Area Relationship (SAR) and landscape configuration
We did not only estimate the contribution of species turnover to gamma
diversity but also compared the relative contribution of smaller patches
to regional diversity based on the Species-Area Relationship (SAR). To
do this, we used the model proposed by Arnillas et al. (2017) to compare
the observed SAR with the expectation based on a unitary community,
assuming free species dispersal and no spatial segregation. This allowed
us to assess how the presence of smaller patches influenced regional
diversity compared to the theoretical expectation under idealized
dispersal and spatial conditions. This comparison is summarized by R,
the relative difference between the observed and expected SAR:
In this equation, forest parches are sorted from largest to smallest;
pi represents the probability of a patch having a
species that is absent from larger patches; c is a constant; z is the
species-area exponent; and Ai represents patch area. The
numerator represents the number of species in a region with the observed
SAR, and the denominator represents the expected number of species based
on a uniform distribution of species across the landscape. c and z were
estimated using a linear regression between the species richness (log S)
and patch area (log A).
The R values range from R<1 when communities are nested
(smaller patches have a subset of species from continuous forests), R =
1 when small and large communities are equivalent (small areas can still
add species due to area effects), to R>1 when smaller
patches have a distinct set of species and add more species than would
be expected by chance based on landscape area alone (Arnillas et al.,
2017).
All analyses were conducted in R (R Core Team, 2019) and JAGS (Plummer,
2019) programs using the rjags (Plummer, 2019), runjags(Denwood, 2016), reshape2 (Wickham, 2007), raster (Hijmanset al. , 2015), and vegan (Oksanen et al. , 2018)
packages. All statistical models created in these analyses are provided
with a tutorial on how to run MSOM in R (see Supporting Information).