Genetic diversity and demographic history
Genetic diversity (θπ ) was calculated for different site categories across the genome based on the publicly available genome annotation in MonarchBase (Zhan & Reppert, 2013). This was done separately in windows of 10kb for eastern and western monarchs. Genomic positions were categorized as intergenic, intronic, 1st, 2nd, 3rdcodon positions and 4-fold degenerate sites (4D).
We used two different approaches to analyze demographic history. First, we used a diffusion approximation method of ∂a∂i (Gutenkunst, Hernandez, Williamson & Bustamante, 2010) to investigate the joint demographic history of eastern and western monarchs. For this analysis we only considered autosomal scaffolds as Z chromosomes have different effective population sizes and a low density of SNPs, which could affect the resulting site frequency spectrum. We generated a Two-Dimensional Joint Site Frequency Spectrum (2D-JSFS) of eastern and western monarchs using the “dadi.Misc.make_data_dict_vcf” function provided with ∂a∂i. We scanned for the likelihoods of a set of 15 demographic models to address the following questions (Fig. S9): (1) have eastern and western monarchs diverged in the past; (2) if there is divergence, is there migration between the two populations; (3) if there is migration, what is the rate of migration; (4) what is the most likely scenario to explain changes in effective population size? We simulated 4 two-population models provided with ∂a∂i (Fig. S9-L to S9-O) and 11 two-population models provided with dadi_pipeline (github.com/dportik/dadi_pipeline) (Portik et al. , 2017). We used log likelihoods to find the most likely model that can explain the joint site frequency spectrum of eastern and western monarchs. We performed parameter optimization by running 100 iterations performed in a total of 4 rounds (10, 20, 30 and 40 iterations for the first, second, third and fourth round, respectively). Parameters with the highest log-likelihood were used as starting parameters of the next round. We used the Broyden Fletcher Goldfarb Shanno (BFGS) algorithm to optimize the parameters. Results of all 16 optimized models were summarized using “Summarize_Outputs.py” provided with “dadi_pipeline”. This script extracts the iterations with the highest log-likelihood and compares them between models. The model with the highest log-likelihood score and the lowest Akaike Information Criterion (AIC) was considered the most likely model to explain the 2D-JSFS.
As a second approach to understand the demographic history of eastern and western monarchs, we analyzed the genome-wide patterns of Tajima’s D (TD ) calculated in windows of 10 kb using Vcftools v. 0.1.15 (Danecek et al. , 2011) to determine if eastern and western monarchs have different demographic histories.