Materials and Methods
We sampled a total of 416 bullfrogs (Lithobates catesbeianus ) from 11 different countries between June 2013 and December 2017 (Supporting Information 1). Samples were obtained from feral specimens, museums and private collections; and include samples from both populations in Brazil collected by Jorgewich-Cohen et al. (in press). Tissues from liver, muscle or skin were removed and stored in 90% ethanol at -20°C. DNA extraction was performed with the DNeasy Blood and Tissue extraction kit (Quiagen, Valencia, CA, USA), following the manufacture’s guidelines.
Seven microsatellite loci were amplified following Austin et al. (2003), with PCR conducted on a Veriti™ thermal cycler (Applied Biosystems) using the following thermal profile: 95°C for 7 min followed by 10 cycles of 95°C for 30 s, touchdown from 62–57°C for 45 s, and 72°C for 30 s, followed by 30 cycles of 95°C for 30 s, 50°C for 30 s, and 72°C for 30s, and final extension at 72°C for 7 min. The reaction mix, with a total volume of 10 µl, contained 1.0 µl of buffer, 0.5 µl 2 mM dNTPs, 0.5 µl fluorescent dye (VIC for RcatJ11 and RcatJ44b; NED for RcatJ21 and RcatJ41; PET for RcatJ54 and Rcat3-2b or 6-FAM for RcatJ8; applied biosystems), 0.5 µl of mixed forward and reverse primers (5 µM), 0.125 µl Taq polymerase, 3.375 µl distilled deionized water, and 3.0 µl template DNA. We diluted the PCR products to a proportion of 1:4 and submitted them to sequencing by a third party.
We used Gene Marker ver. 2.6.3 (SoftGenetics) to score results and Microchecker (Van Oosterhout et al., 2004) to test for null alleles, allele dropout, and stuttering. Around 22% (91/416) of the samples were genotyped again in order to evaluate the percentage of homozygotes that were actually null alleles. Control samples were included in every procedure.
To assess current genetic structure, we performed a discriminant analysis of principal components (DAPC; Jombart et al., 2010) using the adegenet package (Jombart, 2008) on platform R (R Core Team, 2017). Samples were grouped by country of origin in cases where we could not sample in more than one site or either when we could not get enough samples to run the analysis without facing bias by using too different sample sizes between locations. Samples from Brazil were divided based on the known genetic population structure in the country (Jorgewich-Cohen et al., in press).
In order to quantify the pairwise genetic differentiation between populations observed on the DAPC, we used Jost’s D index (Jost, 2008). Next, we tested the significance of the differentiation between pairs of populations with 10000 permutations. We subjected the results to a Benjamini and Hochberg procedure (Hochberg & Benjamini, 1990) in order to control false discovery rates (FDR) and avoid type I error (Hauser et al., 2019). These analyzes were conducted using the DEMEtics R package (Jueterbock et al., 2010). To calculate the diversity indices: expected (He) and observed (Ho) heterozygosity, number of alleles (A) per locus, effective number of alleles (Ae) per locus, and size corrected Wright’s inbreed coefficient (Fis), we used the gstudio package (Dyer, 2014), also on platform R. We checked for Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) deviations through a permutation procedure with 100 batches of 1000 iterations, by performing a probability test in Genepop ver. 3.4 (Rousset, 2008). P values were corrected with the Benjamini and Hochberg procedure.
Samples that showed graphic overlap on the DAPC were considered the same genetic population for further analysis. Based on those results, we selected the most variable samples from each genetic population (Supporting Information 2) and amplified a 1047bp segment of the mitochondrial cytochrome b gene (cytb) that includes the fragment used by Austin et al. (2004). We used a combination of the primers MVZ15L (Moritz et al., 1992) and cyt-bAR-H (Goebel et al., 1999) and a thermal profile for polymerase chain reaction that consisted of 95°C for 10 min, followed by 45 cycles at 95°C for 30 s, 50°C for 40 s, and 72°C for 40 s, with a final extension step at 72°C for 5min. The reaction mix, with a total volume of 25 µl, contained 0.15 µl Go Taq G2 Flexi DNA Polymerase (Promega corporation), 2.5 µl of Go Taq flexi Buffer, 1.0 µl 2 mM dNTPs, 2.0 µl 25 mM MgCl2, 1.0 µl of each primer (10 pM), 15.35 µl distilled deionized water, and 2.0 µl template DNA. PCR amplification products were cleaned using Agencourt AMPure XP DNA Purification and Cleanup kit (Beckman Coulter Genomics, Brea, CA, USA), and they were sequenced by a third party using fluorescent-dye labelled terminators (ABI Prism Big Dye Terminators v. 1.1 cycle sequencing kits; Applied Biosystems, Foster City, CA, USA) with an ABI 3730XL (Applied Biosystems, Foster City, CA, USA).
We used Arlequin 3.5 (Excoffier & Lischer, 2010) for most genetic analysis. Two diversity indices were calculated: Hd and π, described by Austin et al. (2004) respectively as “the relative frequencies of haplotypes in a population, without consideration of their relationships” and the “weighted mean of pairwise divergence among haplotypes”. We also calculated the pairwise differences between the introduced populations using θST index and evaluated its significance by performing 10,100 permutations. P values were corrected under the Benjamini and Hochberg procedure (Hochberg & Benjamini, 1990). Significant differentiation between populations were interpreted as cases of different native populations descendants, as it has been done before (Funk et al., 2011; Kamath et al., 2016). We used analysis of molecular variance (AMOVA) to assess the native origin of introduced populations. Information on native range is available at Austin et al. (2004), where four groups were delimited through a nested clade analysis. To perform AMOVA, we gathered populations that showed non-significant differentiation on θST analysis and compared them with each of four source populations. We also performed a significance test with 10,000 permutations. The smallest value found between the four AMOVAs was considered the indication of origin for the tested population (Bai et al., 2012; Ficetola et al., 2008; Funk et al., 2011; Kamath et al., 2016).
We used Geneious ver. 10.2.3 (Kearse et al., 2012) for sequence edition and contig formation of the cytb sequences based on the chromatograms obtained from the automated sequencer. All samples were sequenced in both directions to check for potential errors. The sequences were aligned with the MAFFT (Katoh & Standley, 2013) plugin in Geneious 10.2.3 (Biomatters) with the G-INS-I strategy. We trimmed the sequence alignment to a length of 937bp.
For better understanding haplotype relationship, we performed a phylogenetic analysis under a maximum-likelihood (ML) based approach in RAxML ver. 8 (Stamatakis, 2014) with 100 runs, and ML branch support was evaluated by 10,000 bootstrap replicates, following the same phylogenetic method conducted by Kamath et al. (2016). We used unique haplotypes from our data and added haplotypes from Austin et al. (2003, 2004), Ficetola et al. (2008), Bai et al. (2012), and Kamath et al. (2016) to compare the relationships among haplotypes from introduced and native populations, and between those haplotypes that have not been sampled in the native range. Tree rooting was based on the outgroup method (Farris, 1982; Nixon & Carpenter, 1993), and outgroup selection was based on a previous phylogenetic analysis (Pyron, 2014). We usedLithobates septentrionalis to root the tree and L. clamitans , L. okaloosae , and L. heckscheri as additional outgroup species (respective GenBank accession numbers: AY083273, AY083281, AY083286, AY083299).