Potential gene-interaction effects
To identify potential gene-gene interactions across the four primary adaptive chromosomes related to body size and maturity, we conducted Generalized Multifactor Dimensionality Reduction (GMDR). Analyses were conducted for maturity using adult females from the WFA♀ (N=133) data set and for total length using the WFA♀ and T_BON (N=883) data sets. We used the software GMDR version 0.9 (Lou et al. 2007; Chen et al. 2011) to conduct an exhaustive search of all possible one- to four-locus models. The best model was defined as the model with the maximal cross-validation consistency. For additional details on GMDR and analysis methods, see Parker et al. (2019).
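The search space and the selection criterion described above can be illustrated with a minimal sketch. This is not the GMDR software's implementation: the function names are hypothetical, the four markers are used only as an example locus set, and the per-fold winners are invented placeholder values rather than results from this study.

```python
from collections import Counter
from itertools import combinations


def candidate_models(loci, max_order=4):
    """Enumerate every locus combination from one locus up to max_order loci."""
    return [
        combo
        for order in range(1, max_order + 1)
        for combo in combinations(loci, order)
    ]


def cv_consistency(fold_winners):
    """Return the model chosen most often across training folds and its count."""
    model, count = Counter(fold_winners).most_common(1)[0]
    return model, count


# Example locus set (assumption: one marker per adaptive chromosome).
loci = ["Etr_464", "Etr_5317", "Etr_1806", "Etr_4281"]
models = candidate_models(loci)

# Placeholder per-fold winners for a 10-fold run (illustrative only).
fold_winners = [("Etr_464",)] * 6 + [("Etr_1806",)] * 4
best_model, consistency = cv_consistency(fold_winners)
print(len(models), best_model, f"{consistency}/{len(fold_winners)}")
```

With four loci the exhaustive search covers 4 + 6 + 4 + 1 = 15 candidate models, and a consistency of 6/10 means the same model was selected in six of the ten training sets.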
Results
The GMDR gene-interaction analysis for egg mass in the WFA♀ collection identified Etr_464 (chromosome 1) as the best single-locus model (Table S10). However, this model was selected in only 6 of the 10 training data sets, indicating limited support. Additionally, the low cross-validation consistency of higher-order models (two-locus 4/10; three-locus 3/10; four-locus 2/10) indicated a lack of support for gene combinations associated with egg mass. This result contrasts with Parker et al. (2019), who found evidence for a two-locus interaction model including chromosomes 1 and 4 for egg mass in Klamath River collections of Pacific lamprey. The discrepancy may reflect a difference in sampling: Parker et al. (2019) collected Pacific lamprey that had recently initiated their freshwater migration, whereas the individuals analyzed herein were collected further upstream. The latter data set likely contains a mixture of current-year and hold-over individuals, whereas the former contained only current-year migrants.
For total length, GMDR produced different results depending on the data set (Table S10). The gene-interaction analysis for the WFA♀ collection supported a single-locus model including Etr_1806 (chromosome 4), with a cross-validation consistency of 10/10 and a testing balanced accuracy of 77%. Higher-order models with more loci were not supported. In contrast, for T_BON the model with maximal cross-validation consistency (10/10) and the highest testing balanced accuracy (73%) was a three-locus interaction model (Table S10). The testing balanced accuracy for the one-locus model (Etr_5317, chromosome 2) was 67%. The two-locus interaction model, which included Etr_5317 (chromosome 2) and Etr_1806 (chromosome 4), increased testing balanced accuracy by 5 percentage points (72%). However, the three-locus interaction model, which included Etr_5317 (chromosome 2), Etr_1806 (chromosome 4), and Etr_4281 (chromosome 22), added only 1 further percentage point (73%). Models involving four loci had considerably lower cross-validation consistency (3/10), indicating a lack of support. Under the best three-locus model, individuals are classified as large body size if Etr_5317 = AA, Etr_1806 = AA, and Etr_4281 = AA or AT; all other genotype combinations are classified as small body size. Classifying T_BON individuals in this way produces a mean total length of 681 mm for large body size and 632 mm for small body size. The analysis of Parker et al. (2019) also suggested support for a two-locus interaction model for total length involving chromosomes 2 and 4.
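The classification rule from the best three-locus model can be written as a short function. This is a sketch under stated assumptions: the function and variable names are hypothetical, genotypes are assumed to be coded as two-letter allele strings, and treating "TA" as equivalent to "AT" is a convenience assumption about the coding, not something specified in the analysis.

```python
def normalize(genotype):
    """Sort alleles so that, for example, 'TA' and 'AT' compare equal (assumed coding)."""
    return "".join(sorted(genotype.upper()))


# Genotype combinations (Etr_5317, Etr_1806, Etr_4281) classified as large body size.
LARGE_COMBINATIONS = {
    ("AA", "AA", "AA"),
    ("AA", "AA", "AT"),
}


def classify_body_size(etr_5317, etr_1806, etr_4281):
    """Return 'large' for the two large-bodied combinations, otherwise 'small'."""
    key = (normalize(etr_5317), normalize(etr_1806), normalize(etr_4281))
    return "large" if key in LARGE_COMBINATIONS else "small"


print(classify_body_size("AA", "AA", "AT"))
print(classify_body_size("AT", "AA", "AA"))
```

The first call matches one of the two large-bodied combinations; the second does not, because Etr_5317 is heterozygous.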
References
Chen, G. B., Xu, Y., Xu, H. M., Li, M. D., Zhu, J., & Lou, X. Y. (2011). Practical and theoretical considerations in study design for detecting gene-gene interactions using MDR and GMDR approaches. PLoS ONE, 6, e16981.
Kielbasa, S. M., Wan, R., Sato, K., Horton, P., & Frith, M. C. (2011). Adaptive seeds tame genomic sequence comparison. Genome Research, 21, 487-493.
Lou, X.-Y., Chen, G. B., Yan, L., Ma, J. Z., Zhu, J., Elston, R. C., & Li, M. D. (2007). A generalized combinatorial approach for detecting gene-by-gene and gene-by-environment interactions with application to nicotine dependence. The American Journal of Human Genetics, 80, 1125-1137.
Tang, H., Zhang, X., Miao, C., Zhang, J., Ming, R., Schnable, J., Schnable, P., Lyons, E., & Lu, J. (2015). ALLMAPS: robust scaffold ordering based on multiple maps. Genome Biology, 16(1), 3.
Weisenfeld, N. I., Kumar, V., Shah, P., Church, D. M., & Jaffe, D. B. (2017). Direct determination of diploid genome sequences. Genome Research, 27(5), 757-767.
Zhao, H., Sun, Z., Wang, J., Huang, H., Kocher, J.-P., & Wang, L. (2013). CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics (Oxford, England), btt730.