2.4.3 | Case-study population genetics datasets
We investigate, as two separate case studies, the admixture histories of
the African American (ASW) and Barbadian (ACB) population samples from
the 1000 Genomes Project Phase 3 (1000 Genomes Project Consortium,
2015). Previous studies identified, within the same database, the West
European Great Britain (GBR) and the West African Yoruba (YRI)
populations as reasonable proxies for the sources of both ACB and ASW,
consistent with the macro-history of the Transatlantic Slave Trade
(Baharian et al., 2016; Martin et al., 2017; Verdu et al., 2017).
Samples in the 1000 Genomes Project were selected a priori to be
unrelated at the family level. To avoid confounding due to cryptic
relatedness in our sample relative to MetHis simulations, we excluded
individuals more closely related than first cousins in each of the four
populations separately using RELPAIR (Epstein, Duren, & Boehnke, 2002),
as previously done (Verdu et al., 2017). We also excluded the three ASW
individuals showing traces of Native American or East Asian admixture,
as reported in previous studies (Martin et al., 2017). Among the
remaining individuals, we randomly drew 50 individuals from each of the
two admixed target populations, ACB and ASW, and included all remaining
90 YRI individuals and 89 GBR individuals.
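The subsampling step above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `draw_panel`, the random seed, the identifiers, and the post-filtering counts for ACB and ASW (80 and 55) are placeholder assumptions; only the target of 50 admixed individuals per population and the 90 YRI and 89 GBR counts come from the text.

```python
import random


def draw_panel(samples_by_pop, admixed=("ACB", "ASW"), n_admixed=50, seed=0):
    """Subsample each admixed population to n_admixed individuals.

    Source-proxy populations (here YRI and GBR) are kept whole.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for a reproducible draw
    panel = {}
    for pop, ids in samples_by_pop.items():
        if pop in admixed:
            if len(ids) < n_admixed:
                raise ValueError(f"{pop}: only {len(ids)} individuals available")
            # Sampling without replacement
            panel[pop] = sorted(rng.sample(list(ids), n_admixed))
        else:
            panel[pop] = list(ids)
    return panel


# Placeholder identifiers standing in for the individuals left after the
# relatedness and admixture filters; the ACB and ASW counts are hypothetical.
filtered = {
    "ACB": [f"ACB_{i:03d}" for i in range(80)],
    "ASW": [f"ASW_{i:03d}" for i in range(55)],
    "YRI": [f"YRI_{i:03d}" for i in range(90)],
    "GBR": [f"GBR_{i:03d}" for i in range(89)],
}

panel = draw_panel(filtered)
print({pop: len(ids) for pop, ids in panel.items()})
```

Fixing the seed is a design choice of this sketch so that the same 50 individuals are recovered on every run; the text does not state how the draw was seeded.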
We extracted biallelic polymorphic sites (SNPs as defined by the 1000
Genomes Project Phase 3) from the merged ACB+ASW+GBR+YRI data set,
excluding singletons. Since MetHis only simulates independent
markers, we LD-pruned the ACB and ASW SNP sets using the PLINK
(Purcell et al., 2007) --indep-pairwise option with a sliding window
of 100 SNPs, moving in increments of 10 SNPs, and an r² threshold of
0.1. Finally, we randomly drew 100,000 SNPs from the remaining SNP set.
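As a rough sketch of the pruning and marker-drawing steps, the snippet below builds a PLINK command line with the parameters given above (window of 100 SNPs, step of 10 SNPs, r² threshold of 0.1) and then draws a random subset of SNP identifiers. The file prefixes, the helper names `ld_prune_command` and `draw_snps`, the seed, and the toy SNP identifiers are assumptions made for illustration; the command is only constructed here, not executed.

```python
import random


def ld_prune_command(bfile, out, window=100, step=10, r2=0.1, plink="plink"):
    """Build a PLINK call for --indep-pairwise <window> <step> <r2>.

    `bfile` and `out` are hypothetical file prefixes. PLINK writes the
    retained SNP identifiers to `<out>.prune.in`.
    """
    return [
        plink,
        "--bfile", bfile,
        "--indep-pairwise", str(window), str(step), str(r2),
        "--out", out,
    ]


def draw_snps(snp_ids, n=100_000, seed=0):
    """Randomly draw n distinct SNPs from the LD-pruned set."""
    if len(snp_ids) < n:
        raise ValueError(f"only {len(snp_ids)} SNPs available, {n} requested")
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return rng.sample(list(snp_ids), n)


# Hypothetical prefixes; pass the list to subprocess.run(cmd, check=True)
# to execute it where PLINK is installed.
cmd = ld_prune_command("acb_asw_no_singletons", "acb_asw_pruned")
print(" ".join(cmd))

# Toy stand-in for the contents of the .prune.in file, with a small n.
toy_pruned = [f"snp_{i}" for i in range(1000)]
subset = draw_snps(toy_pruned, n=5)
print(len(subset))
```

With real data, `toy_pruned` would be replaced by the identifiers read from the `.prune.in` file and `n` left at its default of 100,000.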