3. RESULTS and DISCUSSION
102.775 sequences were aligned to reference mitogenome from polyadenylation selected RNA-seq data. 76.175 of those mapped to any end of any mitochondrial gene. 33 of 37 mitochondrial genes were represented with at least one mapped polyadenylated sequence. It has been determined that not only protein-coding genes but also rRNA and tRNA genes are polyadenylated in the Apis mellifera mitochondrial transcriptome. In general, polyadenyls of five to 124 bases long were added to mitochondrial genes. The highest representation for all genes were polyadenylated forms with 5-9 adenines in length (19%). Polyadenylated transcripts with 10-14 and 15-19 adenines were represented by 16% and 13%, respectively. The length of polyadenylation and the number of representative transcripts seemed to be inversely proportional in logarithmic scale (Figure 2, R2= 0,9578). Specifically, polyadenylation length and transcript counts of each gene were given in Figure 3.
Results obtained from polyadenylation selection and related properties were given as a table in Appendix -2.
trnQ, trnS1, trnE and trnL2 tRNA genes were not mapped to any sense or antisense polyadenylated end-read. trnQ, trnS1 and trnE genes are located quite close to the noncoding A+T rich region; thus, it may be due the reduced transcript abundance that any end-read were not mapped to these genes. On the other hand, it is remarkable that there was no mapped read to trnL2 gene, which is located between the most expressed genes cox1 and cox2 . cox2 gene was represented with 60%, cox1 gene with 25% and rrnL rRNA gene with 9% of the total end- reads; while others have less than one percent representation (Figure 4c). It is determined that the rRNA genes were more under-expressed than expected, probably because of the sequencing platform used with polyA selection.
While 99.19% of the mapped reads represent sense orientated transcripts, antisense transcripts were also found (Figure 4d). Sense transcripts were polyadenylated longer than antisense counterparts for each gene group (Figure 5). The presence of both sense and antisense transcripts indicated that functional RNA molecules were generated in insects, similar to 7S RNA and nd6 lncRNA in humans (Merceret al. , 2011). Furthermore, insect mitochondrial gene content was distributed equally in both heavy and light chains unlike human mitogenome which codes only nd6 and a few tRNAs in light chain. Thus, it can be speculated that the potential of producing noncoding- functional RNAs from insect mitogenomes can be much more likely compared to that in human mitogenomes.
rrnL, nd4 , rrnS, nd3 and trnV genes had only sense, and trnN, trnW and trnM genes had only antisense transcripts (Figure 4b). Antisense transcripts had 22.85 adenines and sense transcripts had 31.00 adenines on average in their polyadenyl tails. Our data showed variable polyadenylation length for sense and antisense transcripts of each gene, and patterns of polyadenylation were generally independent of the expression level (Figure 3, 4e and Appendix-2).
We found that some of mitochondrial genes predominantly appear to form longer or shorter transcripts than annotated formal gene length. 21 of 33 mitochondrial genes tended to be found in longer transcripts. Nine of these genes add whole sequence following the intergenic region (atp6 , cytB , nd4 , nd5 , nd6 , rrnL, trnN, trnP, trnY) and other eight of 21 add at least one base following the intergenic region (cox3 , nd1 , nd3 , nd4L , trnG, trnR, trnS2, trnT). The remaining four contain, if they have any intergenic region, add at least one base following mitochondrial gene in the transcript (atp8 , nd2 , trnC, trnW) (Figure 6).
Intergenic regions that are estimated as gene rearrangement residues, seem to be conserved despite the reduced nature of mitogenome inApis . Mitochondrial regulator elements of transcription can spread into whole genome, even into gene regions (Barshad et al. , 2018).Thus, it was suggested that intergenic regions may be considered as transcription control regions (Barshad et al. , 2018; Aydemir and Korkmaz, 2020).
Long RNA molecules were evaluated as a result of a mis-processing of transcription termination in human mitochondria (Temperley et al. , 2010). Although this assumption is acceptable for protein-coding genes, the results of the rrnL gene is remarkable: 86.25% and 94.02% of all rrnL transcripts contain whole sequence and half of whole sequence of neighboring intergenic region, respectively. These high rates may also suggest the possibility of incorrect annotation of this noncoding gene since RNA gene annotations were performed using PCG boundaries and there is no accepted, standard, consensus system for annotation (Stewart and Beckenbach, 2009). cox1 , cox2 , rrnS, trnG and trnK genes tend to form a transcript that contain only annotated sequences. Seven tRNAs seem to be transcribed mostly in truncated forms (trnA, trnD, trnH, trnI, trnL1, trnM, trnV). Truncated mRNA molecules are determined in human mitochondria, potentially produced by secondary structures that may occur within the genes and evaluated as residues of RNA surveillance pathway based on their low frequencies with 1/161 (Szczesny et al. , 2009; Mercer et al. , 2011). These polyadenylated truncated transcripts have been evaluated as a clue for the occurrence of polyadenylation-dependent RNA degradation processes in human mitochondria (Slomovic et al. , 2005). Nevertheless, it is determined that 9,80% of all mitochondrial transcripts of Apis represent truncated forms. In recent studies, it is supposed that truncated mitochondrial RNA molecules may be precursors for circRNA molecules and are not translated (Mance et al. , 2020). Unlike in humans, truncated forms were found not only in mRNAs but also in rRNA and tRNA transcripts. Truncated mitochondrial tRNA molecules were detected and evaluated as t-elements in a freshwater alga Cyanophora paradoxa (Salinas-Giegé, Ubrig and Drouard, 2021). It has been proposed that these truncated tRNA molecules potentially hybridize with neighboring mRNAs and thereby protect and stabilize mRNAs from exonuclease digestion (Salinas-Giegé, Ubrig and Drouard, 2021).
Accordingly, 75.93% of mitochondrial transcripts containing polyadenyl residues at their ends consist of only the annotated gene (gene + polyA/T), 11.42% of them consist of the gene and the entire nucleotides of following intergenic region (gene + IG + polyA/T), 9.78% consist of truncated genes (gen- X base + polyA/T), 2.40% consist of the gene and a part of the following intergenic region (gene + X base IG + polyA/T) and 0.47% consist of gene, and if any intergenic region, and a part of the neighbor gene (gene(+ IG)+ gene+ polyA/T) (Figure 4f).
Genes which transcribed in the same polycistronic unit did not show similar expression abundance in RPKM analysis. cox2 was the most expressed, while trnE is the least expressed mitochondrial gene, and general pattern in expression levels wascox2 >cox1 >> trnD> rrnL> trnL2> trnC> trnY> trnL1> trnW> nd6 > rrnS> trnI> nd2 > trnV>atp6 > nd1 > trnR>cytB > trnN> trnF> trnK> trnG> trnS2> trnM> cox3 > nd3 > trnS1> trnH> trnP>nd5 > trnA> atp8 >nd4 > trnQ> trnT>nd4L > trnE. RPKM value of each gene was given in Table 1 and Figure 4a. Genes with high RPKM values were found to exhibit variation in their polyadenylation profiles (Figure 2 and Appendix 2). While rRNA expression levels are expected to be the highest, the overexpression of cox2 and cox1 is remarkable (Gelfand and Attardi, 1981). cox genes, especially cox2 mRNA accumulation can be the result of DWV (deformed wing virus) infection ofApis in this study. It was found that the expression ofcox2 increases due to the modulation of prostaglandin E2 (PGE2) synthesis during viral infection (Steer and Corbett, 2003).
According to the results of ORF Finder analyses, 205 potential ORFs were found to have equal or longer than 75 nucleotides (25 amino acids) (Appendix- 3). 32 of these were verified with BLAST searches, and 13 of those were known OXPHOS genes. 13 and one of them were aligned with bacterial and insect hypothetic proteins, respectively, and need further validation to understand whether they are actually translated into proteins. Four and one of 32 seemed to be variant forms of cox1and atp8 genes, respectively.