Methods

Study area

Our study focused on the Southern Africa region, comprising ten countries: Angola, Botswana, Eswatini, Lesotho, Malawi, Mozambique, Namibia, South Africa, Zambia, and Zimbabwe, with a land area of around 4,000,000 km2 (Bezeng, et al. 2015). The history of plant modern introductions in Southern Africa dates back to the late 18th century when European settlers arrived in the region (Wells, et al. 1986). The lattest global IPCC report (IPCC 2023) shows that southern Africa is likely to become substantially hoter, while precipitation is likely to decrease in most regions. With the predicted changes in temperature, precipitation regimes, and water availability, Southern Africa is expected to become one of the global climate-change hotspots (Hoegh-Guldberg, et al. 2019).

Species selection and occurrence records

For our study, we used a list of cultivated alien plants of Southern Africa extracted from (Cultivated Plants of Southern Africa; Glen 2002). The initial list included more than 5,316 taxa that are described to be cultivated in at least one region in Southern Africa. To harmonize the list of cultivated alien plants of Southern Africa with other datasets used in this study (see below), we standardized the names of the species following The Plant List (version 1.1; http://www.theplantlist.org) using the R package ‘Taxonstand’ (Cayuela, et al. 2019). Intraspecific taxa (varieties and subspecies) were merged at the species level to reduce complexity. The resulting list, therefore, consisted of 5,212 cultivated alien plants with accepted names.
We collected occurrence data on the global distribution of these species from the Global Biodiversity Information Facility (GBIF.org 2021; https://doi.org/10.15468/dl.9jsscb) using the ‘rgbif’ library in R (Chamberlain, et al. 2021). To account for the full realized niche of the species, we considered native and introduced occurrences globally (Early and Sax 2014, Fernández and Hamilton 2015, Pearman, et al. 2008). Erroneous records (e.g., those that occur on ocean surfaces due to possible georeferencing errors and those in capitals, where they might have been planted) were automatically removed using the ’CoordinateCleaner’ library in R (Zizka, et al. 2019). Additionally, we removed duplicate data points (that is, multiple occurrence records within each 10’ × 10’ grid cell, ~ 20 x 20 km) for bias correction. The resulting species list, therefore, consisted of 1,527 species with at least 50 occurrences per species from their native and alien ranges.

Climatic data

We retrieved global climate data from WorldClim version 2.1(10’ resolution for the period 1970-2000) (Fick and Hijmans 2017). From all the available bioclimatic variables, we selected the following five: (1) Temperature Seasonality (standard deviation ×100), (2) Max Temperature of Warmest Month (°C), (3) Precipitation of Wettest Month (°C), (4) Precipitation of Driest Month (mm), (5) Precipitation Seasonality (coefficient of variation). We selected these variables because they are known to strongly affect plant distributions (Root, et al. 2003). In addition, we used human population density (person / 10’ × 10’ grid cell), available from the NASA Socioeconomic Data and Applications Center, as an interaction term with nativeness as an indicator of propagule pressure (Gao 2020). Moreover, all explanatory variables have pairwise Pearson’s r values < 0.7 (Supporting Information Fig. S1), limiting the risk of biased model estimates due to multicollinearity (Dormann, et al. 2013).
To represent possible future climatic conditions, we used projected climate data for the period 2081–2100 (means of the above-listed climatic variables), again retrieved from WorldClim version 2.1 (Fick and Hijmans 2017). We also used human population density projection for the year 2100, retrieved from NASA Socioeconomic Data and Applications Center. We used two Shared Socioeconomic Pathways (SSPs) to characterize future climate conditions, specifically SSP1 and SSP5 ­­­– to represent a best-case scenario (the sustainability/taking the green road scenario) and a worst-case scenario (fossil-fueled development/taking the highway scenario), respectively (O’Neill, et al. 2017, Riahi, et al. 2017). Because different global circulation models (GCMs) significantly affect species range projections, we selected three GCMs for each SSP scenario, namely CanESM5, CNRM-ESM2-1, and MIROC6. According to The Inter-Sectoral Impact Model Intercomparison Project (Lange 2019), these GCMs represent relatively low, moderate, and high global projected mean precipitation and temperature.

Data on naturalization status, native origins and biomes

To analyze whether the potential current and future climatic suitability differ according to plants’ naturalization status, biogeographical origin and biome within Sothern Africa, we first extracted the naturalization status of each species (that is, cultivated but not yet naturalized or cultivated and naturalized) using the latest version of the Global Naturalized Alien Flora (GloNAF) database (van Kleunen, et al. 2019). Second, we used the nine level-1 regions of the World Geographical Scheme for Recording Plant Distributions of the Taxonomic Databases Working Group (TDWG; Brummitt 2001) to identify the native geographical origin of each species. This data was extracted from the Germplasm Resources Information Network (GRIN; https://ars-grin.gov), the World Checklist of Selected Plant Families (WCSP; http://apps.kew.org/wcsp), and the Plants of the World Online database (POWO 2019); http://www.plantsoftheworldonline.org/). Finally, we assigned each 10’ × 10’ grid cell in Southern Africa to one of the biomes defined by Dinerstein, et al. (2017) (Supporting Information Fig. S4).

Species distribution modeling

To define the current and future potential climatic suitability for the cultivated alien plants in Southern Africa, we combined the bioclimatic variables with presence records and randomly generated pseudo-absence data using the BIOMOD2 platform as implemented in the ’biomod2’ R package version 3.4-6 (Thuiller, et al. 2020). We used four modeling algorithms: i) two regression techniques (that is, i) generalized linear model (GLM) and ii) general additive model (GAM)) and two classification techniques: iii) random forest (RF) and iv) boosted regression trees (BRT). We kept the default argument settings of these four modeling algorithms in biomod2. For all the models, we randomly generated three sets of 10,000 pseudo-absence records from the GBIF presence-only data. All models were fitted using an equal weight of presences and pseudoabsences. Finally, to evaluate our models, each model was separately run three times using a random split-sampling approach in which data was split into 80% calibration and 20% evaluation datasets for each of the three pseudo-absence datasets (resulting in nine models per modeling algorithm and a total of 36 models for each species). We used the True Skill Statistic (TSS) of Allouche, et al. (2006) to assess the predictive performance of the SDMs. TSS values range from −1 to 1, where 0 indicates a random prediction, negative values indicate that predictions perform worse than random, and 1 indicates perfect agreement.
We then used the calibrated models to project the current and future climatic suitability in Southern Africa using a weighted mean ensemble forecast (Thuiller, et al. 2009). To do that, we first aggregated all the models of the repeated pseudo-absences and split-sampling into an ensemble projection to reduce uncertainties associated with each technique. The contribution of each model was weighted according to its TSS score (we only included models with a TSS score > 0.5). Then, the mean weighted ensemble was transformed into binary maps using a threshold that maximized the TSS to predict presences and absences for the ‘current’ climate and each of the two climate change scenarios. Three binary projections were produced for each SSP scenario, one for each GCM. We then combined these three projections into one consensus map where each cell was identified as suitable when the majority of GCMs (that is, two of three) predicted it as suitable; otherwise, the cell was identified as unsuitable.
This subset of modeled species (n = 1,527) effectively represents the cultivated flora of Southern Africa. This conclusion is supported by the similar distribution of native origins between the modeled and unmodeled species (n = 3,685) (Supporting Information Fig. S2). With the exception of species native to Europe and Southern America, the proportion of species from each native origin was quantitatively consistent in the modeled species subset when compared to the entire pool of the cultivated flora.
We explored the potential impact of each bioclimatic variable on the future distribution of the cultivated alien species. To do so, we made predictions for future climatic conditions fixing one of the five predictors at its value of the reference period 1970–2000 in turn. We then compared these predictions to those of the fully adapted model (= all predictors set to future conditions) by computing the difference in the number of suitable cells. The rationale is that a predictor variable has more impact the more the two projected future distributions differ (Supporting Information Fig.S3).

Hotspot analysis

To identify potential invasion hotspots for cultivated alien plants in Southern Africa for each climatic scenario, we stacked the binary consensus maps of all 1,527 modeled species. We then calculated a per grid cell (10’ × 10’) sum of number of species that find suitable climatic conditions there. We defined invasion hotspots as the top 10% of grid cells that provide suitable climatic conditions to the highest numbers of species. To depict the potential contraction or expansion of invasion hotspots, we also identified the high-risk region under future climatic scenarios by applying the top 10% cut-off value (128 species) determined under current conditions to the future climatic scenarios.

Climatic suitability and biomes

We assessed the difference between biomes within Southern Africa with respect to changes in climatic suitability under climate change. To do this, we calculated for each grid cell in the different biomes of Southern Africa (see above) the number of alien species that encounter suitable climatic conditions under current and future scenarios. Then, we calculated the difference in number of alien species between current and future climate scenarios and divided it by the number of alien species under current climatic scenario (proportion of change). To test whether the potential number of alien species in each biome will, on average, increase, decrease, or remain constant under future climate change scenarios, we calculated the mean proportion of change for each group and then calculated the 95 % confidence intervals around these means with 1,000 bootstrap replications using the boot.ci function in the “boot” R package version 1.3-28 (Canty and Ripley 2021). We considered the mean proportion of change of each group to deviate from zero if the confidence intervals did not overlap with zero. We also tested if there are differences in the numbers of potential alien species among biomes. To account for the fact that each species can potentially occur in multiple biomes, we applied simple randomizations to determine whether the mean potential number of alien species in each biome deviated from those expected by chance (p < 0.05, two-tailed test; see Divíšek, et al. 2018, Omer, et al. 2021). Therefore, from the pool of all proportions of changes, we randomly drew 999 times as many species as are in each biome. We defined the observed mean proportions of change to differ from random expectations if it was in or beyond the lower 2.5% or upper 2.5% of the distributions of random draws.

Climatic suitability and naturalization status

Under both current and future climatic scenarios, 824 (which accounts for 53.9% of all modeled species) were predicted to lack suitable climatic conditions in Southern Africa. Therefore, we will not include these species in our subsequent analysis. Speices with modeled unsuitable climatic suitability may still be successfully cultivated in gardens and public spaces as under cultivation, weeding, irrigation and tendering may allow them to persist and flourish.
We tested whether the naturalization status of cultivated alien plants in Southern Africa is correlated with the size of the current potential range (= number of suitable 10’ × 10’ cells) and whether cultivated alien and naturalized plants differ in their response to climate change (change in the future range size compared to the current range size). We first calculated the difference in potential range sizes between current and future climate scenarios. Then we divided that difference by the potential range size under current climatic conditions (proportion of change). Negative and positive values indicate a net reduction or expansion in the climatically suitable area under climate change, respectively. Then, we fitted three generalized linear models (GLMs) with a binomial error distribution and a logit link function. For each GLM model, we set naturalization status as the binary response variable and the number of climatically suitable cells under the three climatic scenarios (that is, current and change in SSP1 and SSP5 to the current climate) as explanatory variables. To facilitate comparisons of the estimates within and between the models, we also scaled each explanatory variable to a mean of zero standard deviation of one (Schielzeth 2010).

Climatic suitability and geographical origins

We assessed whether current potential range sizes would, on average, increase, decrease, or remain constant under future climate change scenarios for species of different geographical origins. First, we calculated the mean proportion of change for each group and then calculated 95 % confidence intervals around these means with 1,000 bootstrap replications using the boot.ci function in the “boot” R package version 1.3-28 (Canty and Ripley 2021). We considered the mean proportion of change to deviate from zero if the confidence intervals did not overlap with zero. We also tested if there are differences in the projected range-size change among geographical origins. Again to account for the fact that each species can be native to multiple continents, we applied simple randomizations to determine whether the mean projected range size change of each geographical origin deviated from those expected by chance (p < 0.05, two-tailed test; see Divíšek, et al. 2018, Omer, et al. 2021). Therefore, from the pool of all proportions of changes, we randomly drew 999 times as many species as are in each geographical origin. We defined the observed mean proportions of change to differ from random expectations if it was in or beyond the lower 2.5% or upper 2.5% of the distributions of random draws.

Unmodeled species climatic suitability imputation

Due to the limited availability of the species’ geographical distribution data, we could only predict the current and future species distribution for 1,527, which represents just 29.2% of our entire pool of 5,212 cultivated species. Although we showed that this subset of modeled species is representative of the entire pool of cultivated flora (Supporting Information Fig. S2), we conducted further analysis to evaluate the impact of the unmodeled species on our conclusions. To address this, we used the result of the 1,527 modeled species to impute the climatic suitability for the unmodeled species in our pool of cultivated flora of Southern Africa. To do so, we fitted three separate linear models using the number of climatically suitable cells under the three climatic scenarios (that is, current and change in SSP1 and SSP5 to the current climate) as response variables. Naturalization status and geographical origins were used as explanatory variables. We then used the fitted values to predict the number of climatically suitable cells under the three climatic scenarios for species that were not included in the SDMs. Finally, we redid the analysis of how the change in climatic suitability is related to naturalization status and native origins using the pool of all species (including imputed species). The results using the entire pool of cultivated species show a more or less similar trend for the effects of naturalization status (Supporting Information Figure S6) and native origins (Supporting Information Figure S7).
All analyses were done in R, version 3.6.1 (R Core Team 2019).