Introduction
Genomics promises exciting advances towards understanding adaptive genetic variation and the evolutionary potential of plants under a rapidly changing and often increasingly variable environment (Hoffmann & Sgrò 2011; Savolainen et al. 2013; Harrisson et al. 2014). Intraspecific genetic variation represents the potential for adaptive change in response to new selective challenges, which is critical for local species persistence under environmental change (Rice & Emery 2003; Bell & Gonzalez 2009). Adaptation to local climate conditions has been considered typical for tree populations (Langlet 1971; Ying & Liang 1994; Kitzmiller 2005; Wright 2007), but organisms with such long generation times and a sessile lifestyle can become maladapted if environmental shifts occur rapidly (Aitken et al. 2008; Anderson et al. 2012; Alberto et al. 2013). Plants also exhibit plastic changes in their growth form and physiology in response to stress, and the level of plasticity can itself be heritable (Van Kleunen & Fischer 2005; Auld et al. 2010) and may be under selection (Zettlemoyer & Peterson 2021). Understanding the distribution of genetic variation related to environmental responses may help us better predict changes and manage forests in a shifting climate (Neale & Kremer 2011; Oney et al. 2013). This includes selecting seed sources for restoration or breeding that have desirable characteristics such as drought tolerance (Beaulieu et al. 2014; Isik 2014).
Landscape genomics offers enormous potential to discover genes responsible for local adaptation by investigating the statistical association between genetic variation at individual loci and the causative environmental factors (Eckert et al. 2010, 2015; Sork et al. 2013; Lu et al. 2019). This approach is sometimes known as Genotype-Environment Association (GEA) analysis. Prior studies in Arabidopsis, the primary plant model organism, have found that environmentally associated SNPs can predict performance in common gardens (Hancock et al. 2011). A Pinus pinaster study suggests this could be true in trees as well, even when only a modest number of the genetic variants involved have been identified (Jaramillo-Correa et al. 2015). However, GEA studies do not by themselves reveal why specific alleles are more prevalent in particular environments: for example, are they responsible for selectively favored traits? Genotype-Phenotype Association (GPA) analysis identifies loci linked to a specific phenotype (Eckert et al. 2009; Holliday et al. 2010). In plant GPA studies, individuals are typically grown in a common environment to eliminate the effects of environmental variation on phenotypes. However, this approach does not reveal whether a trait variant would be favored in the field. GEA and GPA analyses are thus complementary, and combining them might better identify the loci and traits that are selectively favored in particular conditions than either could alone (Eckert et al. 2015; Mahony et al. 2020).
The large genome size of conifer trees (>19 Gbp) represents a challenge for analysis. Most association studies in conifers have focused on SNPs within a few hundred genes (Eckert et al. 2009, 2015; Holliday et al. 2010; Hamilton et al. 2013; Dillon et al. 2014; Housset et al. 2018), or fewer than 2,000 genome-wide SNPs (Uchiyama et al. 2013). One notable exception is a recent study on lodgepole pine that used a sequence capture dataset created by mapping the Pinus contorta transcriptome to the P. taeda genome sequence (Mahony et al. 2020). A genome-wide SNP climate-association study was also recently completed for P. lambertiana, one of the few other pine species with a full genome sequence (Weiss et al. 2022). Still, most conifers have neither a published genome sequence nor a complete transcriptome. Though targeted sequencing is efficient, candidate gene approaches may miss other vital genes with previously unsuspected roles in local adaptation, and focusing solely on variants within genes may miss significant variants within regulatory regions.
Several approaches that use next-generation sequencing (NGS) to identify more genetic variants for genome-wide association studies (GWAS) have been proposed in recent years (Davey et al. 2011; Poland & Rife 2012). Genotyping-by-Sequencing (GBS), which can generate tens of thousands of single nucleotide polymorphism (SNP) markers without the need for a reference genome or whole transcriptome, has emerged as a cost-effective strategy (Elshire et al. 2011; Andrews et al. 2016). By combining the power of multiplexed NGS with restriction-enzyme-based genome complexity reduction, GBS can genotype large populations of individuals for thousands of SNPs in an increasingly rapid and inexpensive way (Poland et al. 2012; Poland & Rife 2012).
Despite the high economic and ecological importance of ponderosa pine (Pinus ponderosa) in the western United States (Graham & Jain 2005), no previous study has attempted to identify the relationship between gene sequence variation and drought tolerance in this species. Some studies have investigated P. ponderosa’s evolutionary history and phylogeography using mitochondrial DNA markers; these reflect the long-term biogeographical processes contributing to the modern distribution of the species but have limited adaptive significance in themselves (Johansen & Latta 2003; Potter et al. 2013). Other studies have emphasized the importance of intraspecific variation of P. ponderosa in environmental responses but focus on the phenotypic variation within and among populations without identifying the underlying genetic variation (Kolb et al. 2016; Maguire et al. 2018). California’s historic 2012–2016 drought may represent an increasingly common condition as the climate changes (Griffin & Anchukaitis 2014; Berg & Hall 2015). Such “hot droughts” can lead to mass tree mortality, even in relatively drought-tolerant species like ponderosa pine, negatively impacting the sustainability of conifer forests (Fettig et al. 2019). A deep understanding of the genetic basis of adaptation in ponderosa pine and other western conifers is critical for successful reforestation and conservation programs.
In this study, we conducted a GEA analysis on 223 ponderosa pine genotypes from a range of climates across the central Sierra Nevada mountains of California. We then planted seeds collected from a subset of these trees in the greenhouse. The resulting seedlings provided the basis of a GPA analysis of putative drought-response traits. We performed gene annotation to ascribe biological functions to the genes that contained or were adjacent to the associated SNPs. We then assessed overlap in SNP identity or gene function between the GEA and GPA analyses that might indicate particular importance for local adaptation.