Computational Performance
Finally, we compared the average performance of the four tools in terms of
speed, memory efficiency, disk space usage, gene recovery, and accuracy
(Figure 3-A). GeneMiner demonstrated a significant speed advantage in
Test I, outperforming Easy353 and HybPiper, while aTRAM lagged due to
its reliance on BLAST tasks. GeneMiner also stood out for its efficient
memory usage, which users can control by selecting different subsets of
reference sequences. Most of GeneMiner’s memory is consumed in
constructing the reference hash table, so usage depends on how many
different genes the reference set covers and on how similar the
sequences of each gene are across species. When reference sequences of
the same gene are highly similar, adding more reference sequences does
not substantially increase memory usage. In Test I, Poaceae and
Brassicaceae comprised
224,049 sequences from 347 species and 53,379 sequences from 349
species, respectively. Although the Poaceae reference set contained more
sequences than the Brassicaceae set, its peak memory usage was lower,
mainly because the 353 genes are less variable within Poaceae than
within Brassicaceae. In practice, if the reference sequences come only
from a closely related genus, GeneMiner requires only approximately
0.33 GB of memory per sample. Regarding disk space usage, GeneMiner was
the second-best
performer in Test I, using an average of 0.30 GB, slightly more
than Easy353 (0.225 GB). In conclusion, GeneMiner’s speed, memory, and
disk usage make it well suited to small servers and even personal
computers, and therefore a practical tool for a wide range of users
performing gene retrieval tasks.
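
The relationship between reference similarity and hash-table size described above can be illustrated with a small simulation. The sketch below is not GeneMiner's implementation: it assumes the table is keyed by k-mers (the text does not specify this), and the helper names, the value k = 21, the sequence length, and the substitution rates are all hypothetical choices made for illustration. It builds one table from many nearly identical references and another from fewer but more divergent references, then compares the number of distinct keys, which serves here as a rough proxy for memory.

```python
import random


def mutate(seq, rate, rng):
    """Return a copy of seq in which each base is substituted with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            # Always substitute with a different base so the mutation is real.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def build_kmer_table(sequences, k):
    """Hash table mapping each distinct k-mer to its occurrence count."""
    table = {}
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            table[kmer] = table.get(kmer, 0) + 1
    return table


rng = random.Random(0)  # fixed seed so the toy data are reproducible
K = 21                  # illustrative k-mer size (assumption)
ancestor = "".join(rng.choice("ACGT") for _ in range(300))

# Many references that are almost identical to one another.
conserved_refs = [mutate(ancestor, 0.002, rng) for _ in range(200)]
# Fewer references, but each differs substantially from the others.
divergent_refs = [mutate(ancestor, 0.20, rng) for _ in range(50)]

conserved_table = build_kmer_table(conserved_refs, K)
divergent_table = build_kmer_table(divergent_refs, K)

print(len(conserved_refs), "conserved references ->",
      len(conserved_table), "distinct k-mers")
print(len(divergent_refs), "divergent references ->",
      len(divergent_table), "distinct k-mers")
```

In this toy setting the larger but more conserved reference set produces the smaller table, because highly similar sequences mostly increment the counts of existing keys rather than adding new ones. This mirrors the direction of the Poaceae and Brassicaceae comparison, although the actual memory figures depend on GeneMiner's own data structures.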