2.3. Sequence Alignment and Phylogenetic Analysis
Nucleotide sequences were edited, assembled and aligned using the Muscle
plugin (Edgar, 2004) within the program Codon Code Aligner (v. 5.1.5,
Codon Code Corporation), primers sequences used for amplification were
excluded from the analysis, COI sequences were translated into amino
acid sequences based on the Ascidian mitochondrial code (translation
table 13) to further improve sequencing quality and screen for
frameshift mutations and stop codons.
Genetic polymorphism analysis was run for each population calculating
the number of haplotypes (Nh), haplotype diversity (h) and nucleotide
diversity (π) using DnaSP version 5.10.01 (Librado & Rozas, 2009).
Sequences of 18S rDNA were phased with the PHASE v2.1.1 algorithm
(Stephens & Donnelly, 2003; Stephens, Smith, & Donnelly, 2001) in
DnaSP using default parameters. Pair-wise FST among all
populations and AMOVA were calculated using ARLEQUIN v. 3.5.2.2
(Excoffier, Laval, & Schneider, 2005). The significance of the variance
components and pairwise FST values were assessed by a
permutation test with 10,000 replicates. To test Isolation by distance
in C. verrucosa populations, Mantel test with 1000 permutations
was performed using IBD Macintosh application v. 1.52 (Bohonak, 2002).
Scatter plot of geographical distance and genetic distance was performed
in R. The genetic distances among populations were expressed as
FST pairwise differences. The geographical distances
between populations were represented by the shortest coastline distance.
Species delimitation was carried out using the online version of
Automatic Barcode Gap Discovery, ABGD
(http://wwwabi.snv.jussieu.fr/public/abgd/) using Kimura
p-distance. ABGD delimits a “barcode gap” in the distribution of
pairwise differences (Puillandre, Lambert, Brouillet, & Achaz, 2012).
The haplotype network was created with the Haploviewer application
(available at www.cibiv.at/~greg/haploviewer), based on
multiple alignments of the sequences and on a neighbour joining tree
that was constructed using the software MEGA 7.0.21 (Kumar, Stecher, Li,
Knyaz, & Tamura, 2018) using 500 bootstrap replicates and p-distance.
For phylogenetic reconstruction, the best fitting model was determined
with the software jModeltest 2.1.9 v20160115, using Phyml v.3.0 with 88
candidate models. The best-fit model for COI was HKY85+G+ I, and for 18S
the best-fit model was HKY85+G+I, these substitutions models were
applied in Maximum Likelihood (ML) and Bayesian Inference (BI) analysis.
ML analysis was run using PhyML 3.0 (Guindon et al., 2010) using 1000
bootstrap replicates for both markers independently. BI analysis was run
using sing Markov Chain Monte Carlo (MCMC) simulations in Mr. Bayes 3.2
(Ronquist et al., 2012), sampling every 100 generations, samples of the
substitution model parameters were checked whether the likelihoods
reaching stationarity, and weather the standard deviation of split
frequencies was below 0.05. Mitochondrial COI reached stationarity with
a total of 500000 MCMC generations (split=0.04), while 18S with a total
of 200000 MCMC generations (split=0.02). The remaining trees sampled
were used to infer Bayesian posterior probabilities (BPP) at the nodes
and produce the consensus tree.
In order to estimate divergence time since speciation, the BEAST 1.8.0
software package was used to analyse COI sequences (Drummond, Suchard,
Xie, & Rambaut, 2012). First, xml files were generated using BEAUti to
execute them in BEAST. Data from other marine invertebrates were used as
a proxy since due to lack of adequate fossil records, no calibrated
mutation rates for ascidians for COI exist in the bibliography. Nydam
and Harrison (2011) estimated from data based on other marine
invertebrate taxa (crabs, shrimp, urchins), a mutation rate range of
0.016 to 0.026 substitutions per site / million years. Two independent
analyses were run, a first one using strict clock model with a
substitution rate of 0.016 substitutions per site / million years
(107 generations), and a second one at 0.026
substitutions per site / million years (107generations), for both a burn-in of 20 % was applied. The tree prior
was set to Yule speciation. The GTR+G substitution model was used. The
xml files were then executed in BEAST. Results were analysed using
Tracer v1.6.0 to check the convergence to a stationary distribution of
parameters.