2.3 Identification of one-to-one orthologous gene sets
We predicted the coding sequence (CDS) of each unigene based on its alignments to the Nr and Swiss-Prot databases, and extracted the longest open reading frame (ORF) from the longest transcript of each gene. For unigenes without alignment results, ESTScan 3.0.3 (Iseli et al., 1999) was used to determine the sequence direction. The CDSs extracted from the unigenes were then translated into amino acid sequences using the standard codon table. We used OrthoMCL v2.0.3 (Li et al., 2003) with an E-value cutoff of 1e-3 to identify orthologous genes; OrthoMCL clusters sequences with the Markov Cluster algorithm (MCL) (Enright et al., 2002). The longest protein sequence of each gene was retained, and the resulting one-to-one orthologous genes among the eight species were used in downstream analyses.
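The one-to-one filtering step can be illustrated with a minimal sketch. It is not the pipeline used in this study. It assumes the orthologous groups are available as text lines of the form `group_id: taxon|gene taxon|gene ...`, and the function names `parse_groups` and `one_to_one_groups`, as well as the taxon codes in the usage note, are hypothetical.

```python
from collections import Counter


def parse_groups(lines):
    """Parse lines of the assumed form 'group_id: taxon|gene taxon|gene ...'.

    Returns a dict mapping each group ID to a list of (taxon, gene) tuples.
    """
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        group_id, members = line.split(":", 1)
        groups[group_id.strip()] = [
            tuple(member.split("|", 1)) for member in members.split()
        ]
    return groups


def one_to_one_groups(groups, species):
    """Keep groups that contain exactly one gene from every listed species
    and no genes from any other taxon.

    Returns a dict mapping each retained group ID to a {taxon: gene} dict.
    """
    wanted = set(species)
    kept = {}
    for group_id, members in groups.items():
        counts = Counter(taxon for taxon, _gene in members)
        if set(counts) == wanted and all(n == 1 for n in counts.values()):
            kept[group_id] = dict(members)
    return kept
```

For example, with three hypothetical taxa `spA`, `spB` and `spC`, calling `one_to_one_groups(parse_groups(lines), ["spA", "spB", "spC"])` retains a group only if each taxon contributes exactly one gene. Groups with paralogs in any taxon, or with a taxon missing, are discarded. The same call with eight taxon codes would correspond to the eight-species case described above.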