3.3 | Effects of using different in silico mate pairs on genome assemblies of T. bimaculatus
Assembling the genome of T. bimaculatus using only the
paired-end reads yielded an NGA50 of 4.7 Kb and 1,626 complete
BUSCOs (Table 2). Both the original in silico method and the
optimized in silico method improved the genome assembly of
T. bimaculatus significantly. Compared to the
original in silico method (using one reference from the same
genus, ‘rub’: T. rubripes or ‘fla’: T. flavidus), the
optimized in silico method (using two references from the same
genus, ‘rub’ and ‘fla’) increased the NGA50 (rub*: 140.2 Kb; fla*: 131.4
Kb vs. rub-fla**: 183.8 Kb) and markedly reduced misassemblies
(rub*: 5,143; fla*: 5,148 vs. rub-fla**: 4,188), with a comparable number of
complete BUSCOs (rub*: 2,358; fla*: 2,366 vs. rub-fla**: 2,367).
Compared to the original in silico method, the optimized in
silico method that generated conserved mate pairs using more than two
reference genomes (3 references: two from the same genus, ‘rub’ and ‘fla’,
and one from the same family, ‘nig’; 4 references: two from the same
genus, ‘rub’ and ‘fla’, one from the same family,
‘nig’, and one from the same order, ‘mol’) drastically reduced
misassemblies (rub*: 5,143; fla*: 5,148; nig*: 5,843; mol*: 4,132 vs.
rub-fla-nig**: 2,159; rub-fla-nig-mol**: 1,796), but failed to increase
either the NGA50 (rub*: 140.2 Kb; fla*: 131.4 Kb; nig*: 7.2 Kb; mol*:
4.7 Kb vs. rub-fla-nig**: 7.5 Kb; rub-fla-nig-mol**: 4.6 Kb) or the number
of complete BUSCOs (rub*: 2,358; fla*: 2,366; nig*: 1,772; mol*: 1,625 vs.
rub-fla-nig**: 1,842; rub-fla-nig-mol**: 1,671).
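
To make the size of the two-reference improvement easier to read, the short sketch below recomputes the relative changes in NGA50 and misassemblies from the values quoted above. The variable and function names are illustrative and do not come from the original pipeline; only the numbers are taken from the text.

```python
# Values transcribed from the text above (Table 2 in the source).
# Names such as SINGLE_REF and relative_changes are illustrative assumptions.
SINGLE_REF = {
    "rub": {"nga50_kb": 140.2, "misassemblies": 5143},
    "fla": {"nga50_kb": 131.4, "misassemblies": 5148},
}
TWO_REF = {"nga50_kb": 183.8, "misassemblies": 4188}  # rub-fla


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def relative_changes() -> dict:
    """Percent change of each metric: two-reference run vs. each single-reference run."""
    return {
        ref: {metric: percent_change(value, TWO_REF[metric])
              for metric, value in metrics.items()}
        for ref, metrics in SINGLE_REF.items()
    }


if __name__ == "__main__":
    for ref, changes in relative_changes().items():
        for metric, pct in changes.items():
            print(f"{ref} -> rub-fla  {metric}: {pct:+.1f}%")
```

With these inputs, the NGA50 rises by roughly 31% (vs. rub) and 40% (vs. fla), while misassemblies fall by about 19% in both comparisons.
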
We compared the mate pairs
generated using one reference genome (T. rubripes) with the
conserved mate pairs generated using two reference genomes (T.
rubripes and T. flavidus). We found that the extra mate pairs
generated only with the single reference were mostly inverted on the target genome
(60.03% to 66.62%), while the remaining extra mate pairs either deviated
in length on the target genome or were mapped to different scaffolds of
the target genome (Table S12).
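
The three categories described above (inverted, length deviation, different scaffolds) can be illustrated with a minimal sketch. It is not the authors' implementation: the `MappedRead` record, the forward/reverse orientation convention, and the 20% length tolerance are all assumptions chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedRead:
    """Hypothetical record for one read of a mate pair mapped to the target genome."""
    scaffold: str
    pos: int      # leftmost mapped coordinate
    strand: str   # "+" or "-"


def classify_pair(r1: MappedRead, r2: MappedRead,
                  expected_insert: int, tolerance: float = 0.2) -> str:
    """Assign a mate pair to one category.

    Assumptions (not from the source): a concordant pair has the left read
    on "+" and the right read on "-", and a span is acceptable if it lies
    within `tolerance` * expected_insert of the expected insert size.
    """
    if r1.scaffold != r2.scaffold:
        return "different_scaffolds"
    left, right = sorted((r1, r2), key=lambda r: r.pos)
    if not (left.strand == "+" and right.strand == "-"):
        return "inverted"
    span = right.pos - left.pos
    if abs(span - expected_insert) > tolerance * expected_insert:
        return "length_deviation"
    return "concordant"


def category_fractions(pairs, expected_insert: int) -> dict:
    """Fraction of pairs in each category; `pairs` is an iterable of (r1, r2)."""
    counts = Counter(classify_pair(a, b, expected_insert) for a, b in pairs)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}
```

Applying `category_fractions` to the mate pairs found only in the single-reference set would give a breakdown of the kind reported in Table S12, under the stated assumptions.
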