Genome-wide variation between O. kokonorica and C. songorica
Genome-wide variation, including SNPs, indels, duplications and structural variations (SVs), was identified from a whole-genome alignment of the O. kokonorica and C. songorica genome assemblies. A total of 9,522,801 SNPs, 44,618 indels, 2,768 duplications, and 925 inversions were identified using NUCDIFF (Figure 3A). Using Assemblytics, we identified a total of 41,559 SVs, of which presence/absence variants (PAVs) accounted for 23.78%. Most SVs were detected in non-coding regions, while 26.57% of the SVs were located in exon regions (Figure S9), which could have affected gene function and led to divergence between the two genera. A total of 14,604 genes were categorized as SV-high-impact genes (i.e., the SV is assumed to have a high or disruptive impact on the protein by causing protein truncation, causing loss of function, or triggering nonsense-mediated decay). These genes were mainly enriched in ‘flower development’, ‘post-embryonic development’ and ‘regulation of gene expression, epigenetic’, and were involved in the ‘mismatch repair’, ‘cytochrome P450’ and ‘homologous recombination’ pathways (Figure S9).
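As an illustration of how a PAV proportion such as the one reported above could be derived from an SV call set, the following is a minimal sketch rather than the pipeline used in this study. It assumes a tab-separated, Assemblytics-style BED table in which the seventh column holds the variant type, and it assumes that insertions and deletions are the classes counted as PAVs; both are assumptions, as are the names `summarize_sv_types`, `pav_percentage` and `TOY_BED`. The four records are toy data and do not come from the O. kokonorica or C. songorica assemblies.

```python
from collections import Counter
import csv
import io

# Assumption: the variant type sits in the 7th column (index 6) of an
# Assemblytics-style BED table (reference, ref_start, ref_stop, ID, size,
# strand, type, ...). Adjust if the actual file layout differs.
TYPE_COLUMN = 6

# Assumption: insertions and deletions are treated as PAVs in this sketch.
PAV_TYPES = {"Insertion", "Deletion"}


def summarize_sv_types(bed_text):
    """Count SV records per variant type in tab-separated BED text."""
    counts = Counter()
    for row in csv.reader(io.StringIO(bed_text), delimiter="\t"):
        # Skip blank lines, header/comment lines and truncated records.
        if not row or row[0].startswith("#") or len(row) <= TYPE_COLUMN:
            continue
        counts[row[TYPE_COLUMN]] += 1
    return counts


def pav_percentage(counts):
    """Return the percentage of SVs whose type is in PAV_TYPES."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    pav = sum(n for sv_type, n in counts.items() if sv_type in PAV_TYPES)
    return 100.0 * pav / total


# Toy records for illustration only (hypothetical coordinates and IDs).
TOY_BED = "\n".join([
    "#reference\tref_start\tref_stop\tID\tsize\tstrand\ttype",
    "chr1\t100\t350\tSV_1\t250\t+\tInsertion",
    "chr1\t900\t1500\tSV_2\t600\t+\tDeletion",
    "chr2\t40\t400\tSV_3\t360\t+\tTandem_expansion",
    "chr2\t800\t1000\tSV_4\t200\t+\tRepeat_contraction",
])

if __name__ == "__main__":
    counts = summarize_sv_types(TOY_BED)
    print(dict(counts))
    print(f"PAV share: {pav_percentage(counts):.2f}%")
```

The same per-type counts could in principle be intersected with exon coordinates to estimate the share of SVs overlapping coding sequence, although that step depends on the annotation format and is not shown here.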