Materials and methods
Study region
Our study was conducted in the southern part of the Korean Peninsula,
which is located in East Asia (33°–38° N, 125°–131° E; Fig.
1). The total area of the study region is 100,033 km²,
and the human population is 51 million (Ministry of Land Infrastructure
and Transport, 2016). The mean annual air temperature ranges from
10 to 15 °C, and the mean annual precipitation ranges from 1000 to 1900
mm (Korean Meteorological Administration, 2020).
The Korean Peninsula is situated adjacent to the west Pacific Ocean and
is surrounded by water in three directions—east, west, and south. It
is a temperate region with four distinct seasons associated with the
East Asian monsoon that occurs on the far eastern side of the Asian
continent (Yi, 2011). Winter (December–February) is cold and dry
because of the formation of the strong Siberian anticyclone from the
Tibetan Plateau, while summer (June–August) is hot and humid, with
around 70% of the annual precipitation concentrated in this period (Korean
Meteorological Administration, 2020; Appendix A2). Although the Korean
Peninsula is located at the eastern edge of the Eurasian continent, the
humid air supplied from the Yellow Sea to the west affects the diversity
and distribution of local plants.
The Korean Peninsula comprises a large number of mountains centered in
the Baekdudaegan Mountain Range, and only 22.5% of the peninsula is
flat land (Ministry of Land Infrastructure and Transport, 2016;
Appendix A3). In addition, although the elevation is not generally high, the
region displays complex tectonic characteristics with a relatively
diverse topography. Because the altitude gradients are shallower than
those in other regions in East Asia, the borders between mountains and
plateaus are relatively indistinct, making the region well suited for
the spatiotemporal movement of plants (Ministry of Land Infrastructure
and Transport, 2016).
There are around 4,300 known species of vascular plants on the Korean
Peninsula (with approximately 3,000 species in the southern part),
including 280 species of pteridophytes, 53 species of gymnosperms, and
3,963 species of angiosperms. The specialized genera present include
Pentactina, Echinosophora, Abeliophyllum, Hanabusaya, Mankyua, and
Megaleranthis.
According to the Whittaker biome classification (Whittaker, 1962), the
southern part of the Korean Peninsula is mostly occupied by temperate
seasonal forest biomes but may also contain some temperate rain forest
and woodland/shrubland biomes (Fig. 1). In terms of the remnant
vegetation landscape of the Korean Peninsula, strong policies to promote
agriculture throughout the Joseon Period (1392–1910) led to a large
decrease in forests and an increase in grassland and shrubland habitats.
Later, in the southern part of the Korean Peninsula, the South Korean
government pursued policies to promote forests from the 1970s, resulting
in most natural habitats being located in forests (Cho et al., 2018).
Currently, approximately 30.3% of the southern part of the Korean
Peninsula is urbanized or used for agriculture and 63.8% is occupied by
forests, with other land covers accounting for the remaining 5.9%
(Ministry of Land Infrastructure and Transport, 2016).
Plant distribution data
We used vascular plant distribution data based on specimen and
coordinate data for plants collected between 2003 and 2015 in the
southern part of the Korean Peninsula (Korea National Arboretum, 2016).
The vascular plant distribution maps contained coordinate data for
309,333 specimens, corresponding to 2,954 taxa in 175 families and 919
genera. For analysis, a grid system was overlaid on a national
topographic map (cell size, 11.2 km × 13.9 km), and the taxa recorded in
each grid cell were combined with the location coordinates into
a single data set (Graham and Hijmans, 2006; Lenormand et al.,
2019). All 771 grid cells were used in the analysis; however, some
large urban regions were excluded from the floristic survey conducted by
the Korea National Arboretum, and these were left as empty cells.
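The gridding step described above can be illustrated with a minimal Python sketch. It is not the authors' workflow: it assumes that specimen coordinates have already been projected to planar kilometres, that the grid origin is at (0, 0), and that records arrive as (taxon, x, y) tuples. The record layout, function names, and taxon labels are hypothetical. Only the cell size is taken from the text.

```python
# Sketch only: bin specimen records into grid cells and build a
# presence-absence matrix. Coordinates are ASSUMED to be projected (km);
# the record layout and taxon labels are hypothetical.

CELL_W_KM = 11.2  # cell width reported in the text
CELL_H_KM = 13.9  # cell height reported in the text


def cell_index(x_km, y_km, x0_km=0.0, y0_km=0.0):
    """Return the (row, col) of the grid cell containing a projected point."""
    col = int((x_km - x0_km) // CELL_W_KM)
    row = int((y_km - y0_km) // CELL_H_KM)
    return row, col


def presence_absence(records, x0_km=0.0, y0_km=0.0):
    """Build a cells x taxa matrix of 0/1 values from (taxon, x_km, y_km) records."""
    taxa_by_cell = {}
    for taxon, x_km, y_km in records:
        cell = cell_index(x_km, y_km, x0_km, y0_km)
        taxa_by_cell.setdefault(cell, set()).add(taxon)
    cells = sorted(taxa_by_cell)
    taxa = sorted({t for members in taxa_by_cell.values() for t in members})
    matrix = [[1 if t in taxa_by_cell[c] else 0 for t in taxa] for c in cells]
    return cells, taxa, matrix


# Three hypothetical records falling in two neighbouring cells.
records = [("taxon_a", 5.0, 5.0), ("taxon_b", 6.0, 3.0), ("taxon_a", 12.0, 5.0)]
cells, taxa, matrix = presence_absence(records)
```

In this sketch, cells without any record simply do not appear as rows; a full grid with explicitly empty cells, as described in the text, would need the complete list of cell indices as an additional input.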
Analysis of floristic assemblage clusters and characteristics
Using distribution data for the 771 grid cells and 2,954 plant taxa, a
self-organizing map (SOM) training data set was constructed in the form
of a presence-absence
matrix (771 rows × 2,954 columns) (Fig. 2). The ‘kohonen’ R package was
used for the SOM algorithm (Wehrens and Kruisselbrink, 2018), and the
output layer was composed of 81 output nodes arranged in a square
lattice. To determine the floristic types, hierarchical cluster analysis was
applied to the weight vectors of the SOM map units after conversion to
a Euclidean distance matrix (using the hclust function in R with the
complete linkage method). The optimal number of types was calculated by
applying the silhouette coefficient to the range of 2–15 types
(Rousseeuw, 1987). When mapping the regionalization results, the grid
cells that were empty because of exclusion from the survey were assigned
the most frequent type among the eight surrounding cells.
Because some island regions (Ulleungdo and Dokdo) showed heterogeneous
values owing to their distance from the adjacent grid cells, they were
mapped using type values within the local range.
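The choice of the number of types can be illustrated with a small, self-contained Python sketch. It is a stand-in for the R workflow (hclust with complete linkage plus the silhouette coefficient), not a reproduction of it: the clustering routine is a naive implementation, the function names are hypothetical, and the nine two-dimensional vectors are toy data standing in for SOM weight vectors. The study scanned 2–15 types; the toy example scans a shorter range.

```python
# Sketch only: complete-linkage clustering on Euclidean distances, with the
# number of clusters chosen by the mean silhouette coefficient.
from math import dist


def complete_linkage_labels(vectors, k):
    """Naive agglomerative clustering (complete linkage) down to k clusters."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance between the farthest members.
                d = max(dist(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def mean_silhouette(vectors, labels):
    """Mean of s(i) = (b - a) / max(a, b); singletons contribute 0."""
    n = len(vectors)
    total = 0.0
    for i in range(n):
        by_cluster = {}
        for j in range(n):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(
                    dist(vectors[i], vectors[j]))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            continue  # singleton cluster or only one cluster overall
        a = sum(own) / len(own)  # mean distance within the own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


def best_number_of_types(vectors, candidates):
    """Return the candidate k with the highest mean silhouette, plus all scores."""
    scores = {k: mean_silhouette(vectors, complete_linkage_labels(vectors, k))
              for k in candidates}
    return max(scores, key=scores.get), scores


# Toy vectors forming three well-separated groups (hypothetical data).
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
       (20, 0), (20, 1), (21, 0)]
best_k, scores = best_number_of_types(toy, range(2, 6))
```

The naive clustering loop is cubic in the number of vectors, which is acceptable for 81 map units but would not scale to large inputs.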
The overlap in species composition among the floristic zones was
analyzed using Venn diagrams (in the ‘VennDiagram’ package) based on
lists of species in each zone (Chen and Boutros, 2011). After producing
species catalogs for each zone, the common taxa (those appearing in all
zones) and specific taxa (those appearing in only specific zones) were
distinguished. Then, floristic compositions were investigated by
identifying the specific taxa at the family level.
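The distinction between common and specific taxa amounts to set operations on the per-zone species lists. The following Python sketch shows one way to express it. It assumes that "specific" means taxa recorded in exactly one zone, which is one reading of the text; the function name, zone names, and taxon labels are hypothetical.

```python
# Sketch only: split per-zone species lists into common and zone-specific taxa.
# ASSUMPTION: "specific" is read as "present in exactly one zone".

def common_and_specific(zone_taxa):
    """zone_taxa maps a zone name to the set of taxa recorded in that zone."""
    common = set.intersection(*zone_taxa.values())
    specific = {}
    for zone, taxa in zone_taxa.items():
        # Taxa found in any other zone are removed from this zone's list.
        others = set().union(*(t for z, t in zone_taxa.items() if z != zone))
        specific[zone] = taxa - others
    return common, specific


# Hypothetical species catalogs for three zones.
zones = {
    "zone_1": {"a", "b", "c"},
    "zone_2": {"a", "c", "d"},
    "zone_3": {"a", "e"},
}
common, specific = common_and_specific(zones)
```

Taxa shared by some but not all zones (here "c") fall into neither group, which corresponds to the intermediate regions of a Venn diagram.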
Environmental data and analysis
Geographic and climate factors were analyzed as macro-environmental
factors for the defined floristic zones. For geographic factors,
latitude and longitude were used; for climate factors, air
temperature and precipitation data provided by the Korean
Meteorological Administration (2020), collected at 583 points
between 1970 and 2010, were used. In addition to these direct
environmental data, the warmth index (WI) and coldness index (CI) were
calculated and used as indirect climate data (Kira, 1945) (Eqs. 1 and 2).
The values for these environmental factors were converted to values
covering the whole southern part of the Korean Peninsula by linear
interpolation, accounting for topography and altitude, using ArcGIS
(ver. 10.0). The mean values of the environmental factors in
each grid cell were then calculated and used in the analysis.
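As a worked illustration of the warmth and coldness indices, the Python sketch below sums monthly mean temperatures relative to the 5 °C threshold. The sign convention follows the equations as written in this section, so CI comes out as a non-negative number; the function names and the twelve monthly values are hypothetical and are not data from the study.

```python
# Sketch only: warmth index (WI) and coldness index (CI) from monthly mean
# air temperatures (°C). Sign convention follows the equations in this section.

THRESHOLD_C = 5.0


def warmth_index(monthly_means_c):
    """Sum of (t - 5) over the months with t > 5 °C."""
    return sum(t - THRESHOLD_C for t in monthly_means_c if t > THRESHOLD_C)


def coldness_index(monthly_means_c):
    """Negative of the sum of (t - 5) over the months with t < 5 °C."""
    return -sum(t - THRESHOLD_C for t in monthly_means_c if t < THRESHOLD_C)


# Hypothetical monthly means for January to December.
monthly = [-2.0, 0.5, 5.5, 12.0, 17.5, 22.0, 25.0, 25.5, 21.0, 14.5, 7.0, 0.0]
wi = warmth_index(monthly)    # 105.0
ci = coldness_index(monthly)  # 16.5
```

Months with a mean of exactly 5 °C contribute to neither index, because both conditions are strict inequalities.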
\begin{equation}
\text{WI}=\sum_{1}^{n}\left(t-5\right),\quad t>5
\end{equation}
\begin{equation}
\text{CI}=-\sum_{1}^{n}\left(t-5\right),\quad t<5
\end{equation}
where t is the monthly mean air temperature (°C) and each sum is taken
over the n months in which t is above (WI) or below (CI) 5 °C.
As physical factors affecting plant distribution, parent materials,
topography, effective soil depth, and soil texture for the southern part
of the Korean Peninsula were used (Rural Development Administration,
2010). Parent materials were categorized as acidic rock, metamorphic
rock, sedimentary rock, Quaternary deposit, volcanic ash, or other;
topography was categorized as mountain, hill, pediment, interrill
area, fan, lava terrace, or other; effective soil depth was categorized
into four classes (<20 cm, 20–50 cm, 50–100 cm, and
>100 cm); and soil texture was categorized as sandy gravel,
silt and sandy loam, clay loam, or clay (Appendix A4).
To test the effect of environmental factors on the floristic composition
and zonation, the geographic and climate (mean annual temperature,
annual precipitation, warmth index, and coldness index) data were
analyzed using one-way analysis of variance (ANOVA) followed by Tukey’s
test (Zar, 1984). The categorical physical factors (parent materials,
topography, effective soil depth, and soil texture) were analyzed using
box plots for each zone. The ‘ggplot2’ R package was used for data
visualization (Wickham, 2016). Statistical analyses were performed using
R (R Core Team, 2019).
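To make the one-way ANOVA step concrete, the following Python sketch computes the F statistic for one environmental variable grouped by floristic zone. It is illustrative only: the study used R, the function name and the three groups are hypothetical values, and the sketch stops at the F statistic because p-values and Tukey's test require distribution functions from a statistics library.

```python
# Sketch only: one-way ANOVA F statistic for a variable grouped by zone.
# The groups below are hypothetical values, not data from the study.

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within-group sum of squares around each group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Three hypothetical zones with three grid-cell values each.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

For these toy groups the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F of approximately 13 on 2 and 6 degrees of freedom.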