2.6 Gene prediction and annotation
De novo prediction and homology-based gene search were used to
annotate protein-coding genes in the C. japonica genome.
The repeat-masked genome was used for subsequent analysis following the
EVidenceModeler (EVM) v1.1.1 genome annotation pipeline (Haas et
al., 2008). First, we used BRAKER v2
(https://github.com/Gaius-Augustus/BRAKER) to perform de novo gene
prediction. Second, the protein sequences of lepidopteran insects were
downloaded from NCBI RefSeq as templates for homology-based
prediction with GenomeThreader v1.7.3 (Gremme et al., 2005).
Finally, EVidenceModeler was used to integrate the two lines of evidence
with different weights and to produce the final gene models in GFF3
format. The number of annotated genes was then determined from the
final gene set.
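
The integration and counting steps can be illustrated with a minimal Python sketch. It is not the script used in this study. EVM reads a three-column, tab-separated weights file (evidence class, source label, weight), and the number of genes can be obtained by counting `gene` features in the resulting GFF3 file. The weight values and the source labels (`AUGUSTUS`, `gth`) below are hypothetical placeholders, since the actual values are not reported here. The source labels must match column 2 of the respective evidence GFF3 files.

```python
from collections import Counter

# Hypothetical weights for illustration only; the values used in the
# study are not stated. Source labels are assumptions and must match
# column 2 of the corresponding evidence GFF3 files.
EXAMPLE_WEIGHTS = [
    ("ABINITIO_PREDICTION", "AUGUSTUS", 1),  # BRAKER de novo predictions
    ("PROTEIN", "gth", 5),                   # GenomeThreader protein alignments
]


def format_evm_weights(weights):
    """Render (evidence_class, source, weight) tuples as EVM weights-file text."""
    return "".join(f"{cls}\t{src}\t{w}\n" for cls, src, w in weights)


def count_features(gff3_lines):
    """Count GFF3 features by type (column 3), skipping comments and FASTA."""
    counts = Counter()
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # blank line, comment, or directive
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9:  # a well-formed GFF3 feature line has 9 columns
            counts[cols[2]] += 1
    return counts
```

In use, the text returned by `format_evm_weights(EXAMPLE_WEIGHTS)` would be written to the weights file passed to EVM, and `count_features(open(path))["gene"]` would give the gene count for an EVM output file at a user-supplied `path`.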