Gene families and phylogenetic analysis
OrthoMCL was used to identify orthologous groups among 16 species: 12 Gramineae species (O. kokonorica, C. songorica, O. thomaeum, E. tef, S. bicolor, Z. mays, Digitaria exilis, Panicum hallii, Setaria italica, Brachypodium distachyon, Hordeum vulgare and O. sativa), two commelinid species (Ananas comosus and Musa acuminata), one monocot species (Phalaenopsis equestris) and one rosid species (A. thaliana). For the three tetraploids (O. kokonorica, C. songorica and E. tef), both subgenomes were used for orthologous group construction and phylogenetic analysis. Gene family expansion and contraction across these 16 species were inferred using CAFE v. 3.1 (Han et al., 2013), and families with p < 0.01 were considered significantly expanded or contracted. GO enrichment and KEGG analyses were then performed on the expanded gene families in O. kokonorica.
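As an illustration of how orthologous clustering output can be turned into per-species gene counts, the following minimal Python sketch parses group lines and flags groups with exactly one member per species. It assumes the commonly used OrthoMCL group format (`group_id: taxon|gene taxon|gene ...`); the taxon codes, gene IDs and function names are hypothetical and are not taken from the pipeline used in this study.

```python
from collections import Counter


def parse_groups(lines):
    """Map each group ID to a Counter of gene numbers per taxon code."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        # Each member is assumed to be written as taxon|gene_id;
        # only the taxon code is needed for counting.
        groups[group_id.strip()] = Counter(
            member.split("|", 1)[0] for member in members.split()
        )
    return groups


def single_copy(groups, taxa):
    """Return IDs of groups with exactly one gene in every listed taxon."""
    return [
        gid
        for gid, counts in groups.items()
        if set(counts) == set(taxa) and all(counts[t] == 1 for t in taxa)
    ]


# Hypothetical toy input with three taxa.
example = [
    "OG1: osat|g1 sbic|g7 atha|g3",
    "OG2: osat|g2 osat|g9 sbic|g8 atha|g4",
    "OG3: osat|g5 sbic|g6",
]
taxa = ["osat", "sbic", "atha"]
groups = parse_groups(example)
print(single_copy(groups, taxa))  # ['OG1']
```

The per-taxon counts in `groups` correspond to the kind of family-size table that gene family evolution tools take as input, although the exact input format required by a given tool should be checked against its documentation.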
Single-copy orthologous genes were extracted from the orthologous clustering results and aligned using MAFFT v. 7.158b (Katoh & Standley, 2013). Gblocks v. 0.9171 (Castresana, 2000) was then used to remove poorly aligned or highly divergent regions from the multiple alignments. A maximum likelihood phylogenetic tree was reconstructed from the single-copy orthologous gene data set using RAxML v. 8.1.17 (Stamatakis, 2014) with the PROTGAMMAJTTX model and 1,000 bootstrap replicates. Nucleotide substitution rates and divergence times were estimated from the four-fold degenerate sites (4DTv) of the single-copy orthologous genes. The nucleotide substitution rate was estimated using BASEML v. 4.8a, a program within PAML v. 4.0 (Yang, 2007). Species divergence times were inferred using MCMCtree in PAML, calibrated with the approximate divergence time between P. equestris and M. acuminata (102-120 Ma) obtained from the TimeTree database (http://www.timetree.org).
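To make the use of four-fold degenerate sites concrete, the sketch below extracts third codon positions from an in-frame codon alignment and computes the uncorrected proportion of transversions between two sequences. It assumes the standard genetic code and equal-length aligned sequences, and it keeps only codons whose first two bases are identical across all sequences. The function names and toy sequences are hypothetical; this is not the BASEML-based estimate described above, which relies on a substitution model.

```python
# Codon prefixes (first two bases) whose third position is four-fold
# degenerate under the standard genetic code.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}


def fourfold_sites(aln):
    """Return, per sequence, the third-position bases at four-fold degenerate codons."""
    names = list(aln)
    length = len(aln[names[0]])
    sites = {name: [] for name in names}
    for i in range(0, length - length % 3, 3):
        codons = [aln[name][i:i + 3].upper() for name in names]
        prefixes = {codon[:2] for codon in codons}
        # Require one shared prefix that belongs to a four-fold family.
        if len(prefixes) != 1 or next(iter(prefixes)) not in FOURFOLD_PREFIXES:
            continue
        # Skip codons with gaps or ambiguous bases at the third position.
        if any(codon[2] not in "ACGT" for codon in codons):
            continue
        for name, codon in zip(names, codons):
            sites[name].append(codon[2])
    return {name: "".join(bases) for name, bases in sites.items()}


def transversion_rate(a, b):
    """Uncorrected fraction of sites that differ by a purine/pyrimidine change."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    transversions = sum((x in PURINES) != (y in PURINES) for x, y in zip(a, b))
    return transversions / len(a)


# Hypothetical toy alignment: CTA GGA ATG CCC GTT vs CTC GGA ATG CCT GTA.
aln = {"sp1": "CTAGGAATGCCCGTT", "sp2": "CTCGGAATGCCTGTA"}
sites = fourfold_sites(aln)
print(sites)                                         # {'sp1': 'AACT', 'sp2': 'CATA'}
print(transversion_rate(sites["sp1"], sites["sp2"]))  # 0.5
```

In the toy alignment, the ATG codon is skipped because its third position is not four-fold degenerate, and two of the four retained sites differ by a transversion.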