1.1 | Maximum-likelihood methods to reconstruct admixture histories
Two classes of maximum-likelihood (ML) methods have been extensively
deployed to infer admixture histories from genetic data. The first relies
on the moments of allele frequency spectrum divergences among populations
(Lipson et al., 2013; Patterson et al., 2012; Pickrell & Pritchard, 2012).
The second relies on admixture linkage-disequilibrium (LD) patterns, that
is, the distribution of LD within the admixed chunks of DNA inherited from
the source populations in the genomes of admixed individuals
(Chimusa et al., 2018; Gravel, 2012; Hellenthal et al., 2014;
Loh et al., 2013; Moorjani et al., 2011). Notably, Gravel (2012) developed
an approach to fit the observed curves of admixture-LD decay to those
theoretically expected under admixture models involving one or two pulses
of historical admixture. These approaches have significantly improved our
understanding of past admixture histories using genetic data
(e.g. Baharian et al., 2016; Martin et al., 2013).
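
To make the curve-fitting idea concrete, the following minimal Python sketch fits a single-pulse model to a synthetic admixture-LD decay curve. It is not the implementation of any of the cited methods. It assumes the standard single-pulse expectation that admixture LD decays exponentially with genetic distance d (in Morgans) at a rate equal to the number of generations t since admixture, plus a constant offset, and it assumes independent Gaussian noise, in which case least squares coincides with maximum likelihood. The function name `fit_single_pulse`, the grid of candidate times, and all simulated values are hypothetical choices made for illustration.

```python
import numpy as np

def fit_single_pulse(d, ld, t_grid):
    """Profile least-squares fit of ld ~ A * exp(-t * d) + c.

    d      : genetic distances in Morgans
    ld     : admixture-LD values observed at those distances
    t_grid : candidate admixture times, in generations
    Under i.i.d. Gaussian noise (an assumption of this sketch), the fit
    minimising the residual sum of squares is also the ML fit.
    """
    best = None
    for t in t_grid:
        # For a fixed t the model is linear in (A, c), so solve it exactly.
        X = np.column_stack([np.exp(-t * d), np.ones_like(d)])
        coef, *_ = np.linalg.lstsq(X, ld, rcond=None)
        rss = float(np.sum((ld - X @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(t), float(coef[0]), float(coef[1]))
    rss, t_hat, amplitude, offset = best
    return {"t": t_hat, "amplitude": amplitude, "offset": offset, "rss": rss}

# Synthetic curve: one pulse 12 generations ago (hypothetical values).
rng = np.random.default_rng(0)
d = np.linspace(0.005, 0.5, 100)
true_t = 12.0
ld = 0.05 * np.exp(-true_t * d) + 0.001 + rng.normal(0.0, 1e-4, d.size)

fit = fit_single_pulse(d, ld, np.arange(1.0, 51.0, 0.5))
print(fit)
```

A two-pulse model would add a second exponential term with its own time and amplitude. Real admixture histories may not match either form, which is the concern raised in the remainder of this section.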
Despite these major achievements, ML admixture history inference methods
suffer from inherent limitations acknowledged by their authors
(Gravel, 2012; Hellenthal et al., 2014; Lipson et al., 2013). First, most
ML approaches can only consider one or two pulses of admixture in the
history of the hybrid population. However, admixture processes are often
expected to be much more complex, and it is not yet clear how ML methods
behave when they can consider only simplified versions of the true
admixture history underlying the observed data
(Gravel, 2012; Hellenthal et al., 2014; Lipson et al., 2013;
Loh et al., 2013; Medina, Thornlow, Nielsen, & Corbett-Detig, 2018;
Ni et al., 2019).
Second, the ML values obtained from fitting models with different
parameters to the observed data can be compared as a guideline to find the
“best” model. However, a formal statistical comparison of how well
competing models explain the observed data is often out of reach of ML
approaches (Foll, Shim, & Jensen, 2015; Gravel, 2012; Ni et al., 2019).
Finally, admixture-LD methods, in particular, rely on fine mapping of
local ancestry segments in individual genomes and thus require substantial
amounts of genomic data and, sometimes, accurate phasing, both of which
remain difficult to obtain in numerous case studies.
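
As an illustration of the second limitation above, the sketch below compares a one-pulse and a two-pulse fit using the Akaike information criterion (AIC), one common way to turn fitted likelihoods into a guideline. This is an assumption of the example, not a procedure taken from the cited studies. It further assumes least-squares fits with independent Gaussian errors, and the residual sums of squares and parameter counts are hypothetical numbers chosen for illustration.

```python
import math

def gaussian_aic(rss, n_points, n_params):
    """AIC of a least-squares fit under i.i.d. Gaussian errors.

    The value is defined up to an additive constant shared by all models
    fitted to the same data, so only differences between models matter.
    """
    return n_points * math.log(rss / n_points) + 2 * n_params

# Hypothetical fits of the same 100-point admixture-LD decay curve.
# One pulse: time, amplitude, offset (3 free curve parameters).
aic_one = gaussian_aic(rss=2.0e-6, n_points=100, n_params=3)
# Two pulses: two times, two amplitudes, offset (5 free curve parameters).
aic_two = gaussian_aic(rss=1.9e-6, n_points=100, n_params=5)

# A negative difference favours the two-pulse model.
delta = aic_two - aic_one
print(f"AIC(two pulses) - AIC(one pulse) = {delta:.3f}")
```

With these hypothetical numbers the difference is small, so the ranking is suggestive at best. Such a criterion orders the candidate models but does not test whether either of them adequately explains the data.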