Research design
The PLC/PRF/5 cell line is a well-established in vitro cell model for studying hepatitis B virus (HBV) integration and hepatocellular carcinoma7. Therefore, we initially collected the high-throughput sequencing datasets associated with PLC/PRF/5 cells from GEO (https://www.ncbi.nlm.nih.gov/geo/) and SRA databases (https://www.ncbi.nlm.nih.gov/sra).
Our previous work identified multiple HBV integration events in PLC/PRF/5 cells using long-read whole-genome sequencing (PacBio)8. By integrating our previous PacBio DNA sequencing data from PLC/PRF/5 cells with data acquired through targeted DNA long-read sequencing in the studies by Meng and Ricardo Ramirez20,21, we constructed a reference genome for HBV integration in the PLC/PRF/5 cell line. Furthermore, we conducted long-read transcriptome sequencing (ONT) to identify the transcriptional characteristics of integrated HBV DNA in PLC/PRF/5 cells. Based on the transcriptomic results, we further used epigenetic sequencing data, including WGBS, ChIP-seq, and ATAC-seq, to determine the relationship between epigenetic modifications and the transcription of integrated HBV DNA. Cellular experiments were conducted to validate the regulatory mechanisms governing integrated HBV DNA transcription in PLC/PRF/5 cells. The analysis flow chart of this study is illustrated in Fig. 1. The datasets used in this study are summarized in Table S2.
The transcriptional landscape of integrated HBV DNA in PLC/PRF/5 cells
Due to the limitations of low throughput and short read lengths, neither Sanger sequencing nor second-generation high-throughput sequencing can accurately characterize the transcriptional features of integrated HBV DNA in PLC/PRF/5 cells. Therefore, we employed long-read transcriptome sequencing based on the ONT platform to sequence the transcripts from PLC/PRF/5 cells and investigate the transcriptional features of integrated HBV DNA.
A total of 7,238,100 clean reads were obtained, with an average length of 806 bp. Among these reads, 1630 were identified as chimeric fragments of HBV DNA and the human genome (Fig. S1). Initially, chimeric reads were aligned to the genome of each integrated HBV (previously acquired using PacBio WGS by our research group, as well as in the studies by Meng and Ricardo Ramirez) to assign a transcriptional signature to each integration event. As depicted in Fig. 2, the integrated HBV DNA on chr4 and t (1;8) (chromosomal fusion between chromosomes 1 and 8) did not initiate transcription, while the remaining integrated HBV DNA fragments did. Specifically, the integrated HBV DNA on chr11, chr12, chr13, chr16, and chr17 utilized HBV promoters, while those on chr3 and chr17 employed host promoters.
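A minimal sketch of how chimeric reads might be flagged is given here. It assumes that each read's aligned segments have already been reduced to (read ID, reference name) pairs and that the HBV reference contig is named "HBV". Both are illustrative assumptions; the alignment tool and the exact criteria used in this study are not described in this section.

```python
from collections import defaultdict


def find_chimeric_reads(alignments, hbv_contig="HBV"):
    """Return sorted IDs of reads that align to both HBV and a human contig.

    alignments: iterable of (read_id, reference_name) pairs, one per
    aligned segment. The contig name "HBV" is a hypothetical label.
    """
    refs_by_read = defaultdict(set)
    for read_id, ref_name in alignments:
        refs_by_read[read_id].add(ref_name)

    chimeric = []
    for read_id, refs in refs_by_read.items():
        has_hbv = hbv_contig in refs
        has_human = any(ref != hbv_contig for ref in refs)
        if has_hbv and has_human:
            chimeric.append(read_id)
    return sorted(chimeric)


# Toy input: r1 and r4 span an HBV-host junction, r2 and r3 do not.
example_alignments = [
    ("r1", "HBV"), ("r1", "chr13"),
    ("r2", "chr11"),
    ("r3", "HBV"),
    ("r4", "chr3"), ("r4", "HBV"),
]
chimeric_ids = find_chimeric_reads(example_alignments)
```

In practice the segment list would come from the primary and supplementary alignments of each long read, and a real pipeline would likely also filter on mapping quality and aligned length.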
Characteristics of HBV chimeric transcripts derived from different integrated HBV DNA
To investigate the transcriptional activity of HBV DNA at various integration sites in PLC/PRF/5 cells, we quantified chimeric transcripts originating from distinct integration sites using long-read transcriptome sequencing. The results showed that the transcription levels of HBV DNA at different integration sites varied substantially. The largest numbers of transcripts were derived from integrated HBV DNA on chr11, chr13, and t (5;13) (Fig. 3A), suggesting that these three integration sites exhibit more active HBV DNA transcription and are the primary sources of viral protein expression in PLC/PRF/5 cells.
Next, we focused on the transcripts from integrated HBV DNA on different chromosomes in PLC/PRF/5 cells to reveal the transcriptional features of the integrated HBV DNA. We found that the majority of integrated HBV DNA used the promoters SP1, SP2, and XP to initiate transcription, with human polyA signals serving as transcription termination signals, generating chimeric PreS1, PreS2/S, and HBx transcripts (Fig. 3B). Although some integrated HBV fragments contain multiple promoters (chr13, t (5;13)), their transcription exhibited promoter selectivity (Fig. 3C & Fig. S2B). In addition, RNA originating from integrated HBV DNA underwent splicing, forming fusion transcripts with host exons (chr12, Fig. 3D). Furthermore, we observed that certain integrated HBV DNA initiated transcription from a host promoter (chr3), with transcription terminating at the canonical HBV poly(A) signal (Fig. 3E). Moreover, multiple transcriptional modes can arise from the same integrated HBV DNA (Fig. 3F & Fig. S2A). Because homologous host sequences are shared between different integration sites, the origins of two types of transcripts could not be determined with certainty (Fig. S2C & S2D).
To verify the accuracy of the long-read sequencing results, we conducted reverse transcription-polymerase chain reaction (RT-PCR) experiments to detect the HBV chimeric transcripts with the highest expression (from chr13 & t (5;13)) and the lowest expression (from chr16). The locations of the primers and the sizes of the amplified fragments are shown in Fig. 3G and Table S3. The RT-PCR results revealed a prominent amplification band of approximately 500 bp for the chimeric transcript from chr13 & t (5;13), while the chimeric transcript from chr16 was nearly undetectable (Fig. 3H). This confirms that the transcription level of integrated HBV DNA on chr13 & t (5;13) is higher than that on chr16. Furthermore, we performed Sanger sequencing on the PCR amplification product of the HBV chimeric transcript from chr13 & t (5;13), and the obtained sequence was consistent with the ONT long-read sequencing result (Fig. 3I), further validating the accuracy of the long-read transcriptome sequencing of PLC/PRF/5 cells.
Taken together, these results indicate that the transcription levels and characteristics of integrated HBV DNA differ markedly between integration sites, and that the preservation of promoters on integrated HBV DNA fragments is important for the transcription of integrated HBV DNA.
The effect of chromatin accessibility on the transcription of integrated HBV DNA is limited in PLC/PRF/5 cells
Chromatin accessibility is an essential mechanism for regulating gene transcription22. To explore the correlation between chromatin states around HBV integration sites in PLC/PRF/5 cells and the transcriptional activity of integrated HBV DNA, we analyzed the ATAC-seq dataset (GSM4217243) from PLC/PRF/5 cells. We divided the integrated HBV DNA into two groups based on transcriptional activity and compared the chromatin accessibility levels around the integration sites (within 100 kb upstream and downstream). HBV DNA integrated into chr3, chr11, chr12, and chr13 is comparatively active in transcription and was defined as the transcriptionally active group, whereas HBV DNA integrated into chr4 and chr8 shows low levels of transcription and was defined as the transcriptionally inactive group. The results show that in some regions around transcriptionally active integration sites, ATAC-seq signals are significantly higher than those around transcriptionally inactive integration sites (Fig. 4A). Furthermore, we used the log2CPM of mapped reads in a given region as a normalized measure of chromatin accessibility in that region. The correlation analysis between this value and the transcription level of HBV DNA at each integration site showed a positive trend, but the correlation was not statistically significant (r = 0.3739, p = 0.3216) (Fig. 4B). Therefore, the role of chromatin accessibility in regulating integrated HBV DNA transcription appears to be limited in PLC/PRF/5 cells.
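The normalization and correlation step can be sketched as follows. The text specifies log2CPM but not the pseudocount or the type of correlation coefficient, so the pseudocount of 1 and the use of Pearson's r are assumptions made only for illustration, and the function names are hypothetical.

```python
import math


def log2_cpm(region_reads, total_mapped_reads, pseudocount=1.0):
    """log2 of counts per million for one region.

    The pseudocount (assumed, not taken from the study) avoids log2(0)
    for regions without reads.
    """
    cpm = region_reads / total_mapped_reads * 1_000_000
    return math.log2(cpm + pseudocount)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

Here one value of `log2_cpm` would be computed per integration site from the ATAC-seq reads in its flanking window and then correlated with the chimeric transcript count of that site. With only a handful of sites, a rank-based coefficient may be a more robust choice than Pearson's r.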
ATAC-seq relies on the Tn5 transposase, which binds and cuts open chromatin. When a transcription factor is bound to DNA, it prevents Tn5 cleavage in an otherwise nucleosome-free region, leaving a small protected region called a footprint23. For instance, in the case of the transcription factor NFYC, Tn5 cleavage within the footprint decreases markedly relative to the surrounding peak regions of high cleavage probability (Fig. 4C). Based on this, we used HINT-ATAC to predict transcription factors that potentially play a role in the transcription of integrated HBV DNA in PLC/PRF/5 cells. We selected the 100 kb regions upstream and downstream of each HBV integration site to predict transcription factors enriched in the vicinity of the integration site. As shown in Fig. 4D, NFYC, KLF15, FOS, JUN, and other transcription factors are enriched around the HBV integration sites compared to randomly selected background regions. Since previous studies have reported that these transcription factors can bind to HBV cccDNA promoter sequences and initiate HBV gene transcription24-26, we hypothesize that they may bind to integrated HBV DNA or to host DNA in the vicinity of the integration site and participate in the initiation of integrated HBV DNA transcription.
Methylation modification inhibits the transcription of integrated HBV DNA in PLC/PRF/5 cells
The methylation level of the genome is one of the important factors affecting gene transcription27. To explore whether the transcription of integrated HBV DNA is influenced by methylation levels, we first compared the average methylation level of integrated HBV DNA with that of the whole human genome in PLC/PRF/5 cells. As shown in Fig. 5A, the average methylation level of integrated HBV DNA was lower than that of the human genome (HBV 19.7% vs. human 32.9%). Considering that the transcriptional activity of cccDNA, which exists as a mini-chromosome in hepatocyte nuclei, is also regulated by DNA methylation, we further analyzed the methylation levels of cccDNA extracted from HBV-infected HepG2-NTCP cells (from dataset SRR10426842)28 and compared them with the methylation characteristics of integrated HBV DNA in PLC/PRF/5 cells. The results showed that the methylation levels of CpG island 1, CpG island 2, and CpG island 3 on integrated HBV DNA29 were higher than those on the corresponding CpG islands of cccDNA (Fig. S3A, B, C). This result suggests that the episomal HBV genome itself may be at a low methylation level and that, when HBV DNA is integrated into the human genome, it is methylated by the host methylation machinery, resulting in suppressed expression. This increased methylation of foreign integrated DNA may be one of the mechanisms of host cell self-protection.
Next, we analyzed the differences in methylation levels of the host chromosome at various distances from the HBV integration sites in PLC/PRF/5 cells (500 bp, 1 kb, 2 kb, 5 kb, 10 kb, 20 kb, and 50 kb). As shown in Fig. 5B, the average methylation level within the 500 bp region near HBV integration sites was relatively low compared to that of the genome farther from the integration site. Subsequently, we performed a correlation analysis between the average methylation level of the host genome within the 500 bp region near HBV integration sites and the number of HBV chimeric transcripts from each integration site. As shown in Fig. 5C, the transcription level of integrated HBV DNA was strongly negatively correlated with the methylation level (r = -0.8929, p = 0.0123). The host genome around the HBV integration sites on chromosomes 5, 11, 12, and 13 was relatively hypomethylated, as shown in Fig. 5B, and these sites exhibited high transcriptional activity of HBV DNA. Conversely, the methylation levels around the integration sites on chromosomes 3, 16, and 17 were relatively higher (see Fig. 5B), correlating with low transcriptional activity of the integrated HBV DNA. These results suggest that the methylation status of the adjacent host genome affects the transcriptional level of integrated HBV DNA.
5-aza-2′-deoxycytidine (AzaD) is a cytidine deoxynucleoside analogue. It inhibits DNA methyltransferases (DNMTs), leading to DNA hypomethylation, including in gene promoter regions30. To further verify the effect of DNA methylation on the transcription of integrated HBV DNA, we used AzaD to inhibit DNA methylation and examined its effects on the expression of integrated HBV DNA. The results showed that AzaD treatment upregulated HBs RNA levels in PLC/PRF/5 cells in a dose-dependent manner, with significantly higher HBs RNA levels at treatment concentrations of 1 μM and 4 μM (Fig. 5D). Consistent with this result, Western blotting confirmed that AzaD treatment upregulated intracellular HBsAg protein levels in a dose-dependent manner (Fig. 5E). In addition, we analyzed the transcription levels of HBV RNA at each integration site before and after AzaD treatment using quantitative reverse transcriptase PCR (qRT-PCR). The results showed that the transcription level of HBV DNA integrated into chr17 significantly increased after AzaD treatment (Fig. 5F), and this integration site has the highest level of methylation among the HBV integration sites in the PLC/PRF/5 cell line (Fig. 5B). These results further confirm that the transcription of integrated HBV DNA in PLC/PRF/5 cells is regulated by methylation.
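The distance-based methylation comparison (Fig. 5B) can be illustrated with a short sketch. It assumes per-CpG methylation fractions from WGBS on a single chromosome and treats each distance as a cumulative flank on both sides of the integration site. Whether the study used cumulative flanks or non-overlapping bins is not stated, and the function and variable names are hypothetical.

```python
# Flank sizes in bp, matching the distances listed in the text.
FLANKS_BP = [500, 1_000, 2_000, 5_000, 10_000, 20_000, 50_000]


def mean_methylation_in_window(cpg_sites, site_pos, flank):
    """Mean methylation fraction of CpGs within +/- flank bp of site_pos.

    cpg_sites: list of (position, methylated_fraction) tuples on the same
    chromosome as the integration site. Returns None if no CpG is covered.
    """
    values = [frac for pos, frac in cpg_sites if abs(pos - site_pos) <= flank]
    if not values:
        return None
    return sum(values) / len(values)


def methylation_profile(cpg_sites, site_pos, flanks=FLANKS_BP):
    """Map each flank size to the mean methylation around one site."""
    return {
        flank: mean_methylation_in_window(cpg_sites, site_pos, flank)
        for flank in flanks
    }


# Toy data: two weakly methylated CpGs near the site, one distant CpG.
toy_cpgs = [(1_000, 0.1), (1_400, 0.3), (3_000, 0.9)]
toy_profile = methylation_profile(toy_cpgs, site_pos=1_200)
```

The 500 bp value of such a profile is the quantity that would then be correlated with the number of chimeric transcripts per integration site.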
Transcription of integrated HBV DNA is associated with histone modifications in PLC/PRF/5 cells
Previous studies have demonstrated that H3K4me3 is present at the transcription start site (TSS) of actively transcribed genes, promoting transcription by rapidly recruiting RNA polymerase for mRNA synthesis, while H3K9me3 is predominantly found in transcriptionally silent heterochromatin, hindering RNA polymerase access to the promoter region and inhibiting mRNA transcription31. To investigate whether the transcription of integrated HBV DNA is associated with histone modifications in PLC/PRF/5 cells, we analyzed the effects of H3K4me3 and H3K9me3 on the transcription of integrated HBV DNA in PLC/PRF/5 cells using ChIP-seq data from the GEO database.
First, to understand how histone post-translational modifications (PTMs) are deposited on cccDNA and on integrated HBV DNA, we compared the deposition patterns of the two histone modifications on integrated HBV DNA in PLC/PRF/5 cells and on cccDNA from liver biopsy samples of HBeAg-positive (HBeAg+) and HBeAg-negative (HBeAg-) CHB patients (Fig. 6A & Fig. S4). We observed three deposition features: first, the inhibitory H3K9me3 modification is less enriched than H3K4me3 on integrated HBV DNA, HBeAg+ cccDNA, and HBeAg- cccDNA; second, H3K4me3 is highly enriched and exhibits a similar pattern on integrated HBV DNA and HBeAg+ patient cccDNA, while its enrichment level is lower on HBeAg- patient cccDNA; third, on integrated HBV DNA and HBeAg+ cccDNA, the activating histone PTM H3K4me3 is highly enriched around promoters SP1 and SP2 and the S gene ORF, while the inhibitory histone PTM H3K9me3 is relatively less enriched. Taken together, the deposition pattern of histone PTMs on integrated HBV DNA and HBeAg+ cccDNA is similar, with high enrichment of the activating histone PTM H3K4me3 and relatively low enrichment of the inhibitory histone PTM H3K9me3, which is consistent with active transcription of HBV DNA.
Next, we explored the correlation between the activating histone PTM H3K4me3 and the inhibitory histone PTM H3K9me3 on different integration sites of HBV DNA and integrated HBV DNA transcriptional activity. Since the ChIP-seq library was sequenced by NGS, limited by read length and shared overlapping sequences of integrated HBV DNA, we could only select histone PTM levels at the junction of HBV and the human genome to represent the histone PTM status at each integration site, and use the normalized read count of the HBV-host junction reads to represent the enrichment level of histone PTM. The result showed that the activating histone PTM H3K4me3 was positively correlated with integrated HBV DNA transcription levels, although not statistically significant (r = 0.6109, p = 0.0878), while the inhibitory histone PTM H3K9me3 enrichment level was negatively correlated with transcription levels (r = -0.7806, p = 0.0223) (Fig. 6B & 6C).
Furthermore, we analyzed the histone modification levels of the host genome within 5 kb upstream and downstream of the integration sites. The results revealed that integration sites with active transcription exhibited higher levels of activating histone PTMs, such as H3K4me1, H3K4me3, and H3K27ac, than transcriptionally inactive sites. Among these, H3K4me1 and H3K4me3 levels around integration sites were higher than H3K27ac levels (Fig. 6D). We then analyzed the correlation between histone modification levels around each integration site and the integrated HBV DNA transcription level at that site. The analysis revealed a significant positive correlation between H3K4me3 levels near integration sites and integrated HBV DNA transcription levels (r = 0.809, p = 0.0083), while no significant correlation was found for H3K4me1 or H3K27ac (Fig. 6E).
These findings suggest that histone modifications may play a crucial role in regulating integrated HBV DNA transcription.
Discussion
Previous studies have shown that integrated HBV DNA has the potential to express HBsAg and truncated HBx protein, ultimately triggering HCC3. Although numerous systematic studies have revealed the transcriptional characteristics of HBV cccDNA, detailed research on the transcriptional patterns of integrated HBV DNA remains insufficient. This study builds on the genomic architecture of integrated HBV DNA in PLC/PRF/5 cells identified previously by long-read WGS, and is the first to comprehensively determine the transcriptional landscape of each integrated HBV DNA in PLC/PRF/5 cells through RNA long-read sequencing.
We conducted transcriptome sequencing and obtained the transcription spectrum of integrated HBV DNA in PLC/PRF/5 cells for the first time. By assigning HBV chimeric transcripts to the integrated HBV genomes, we found that the transcriptional activity of different integrated HBV DNA varies significantly. The most active transcription of HBV DNA occurred at the integration sites on chr11, chr13, and t (5;13). The integrated HBV DNA on chr4 cannot initiate transcription due to the absence of HBV promoters. This implies that the preservation of promoters on integrated HBV DNA fragments is essential for the transcription of integrated HBV DNA. However, even though the integrated HBV DNA on chr8 retains complete SP1 and SP2 promoters, it does not undergo transcription. Furthermore, significant differences exist in the transcription levels of integrated HBV DNA at various sites, even with complete promoter preservation. This suggests that, beyond viral promoter sequences, host factors may influence the transcriptional activity of integrated HBV DNA.
According to previous research, HBV is more likely to integrate into transcriptionally active regions32, indicating that chromatin accessibility around integration sites may impact the transcription of integrated HBV DNA. By analyzing the ATAC-seq data of PLC/PRF/5 cells, we found that the chromatin around transcriptionally active integration sites was more accessible than that around integration sites without transcription. However, correlation analysis showed that the accessibility of chromatin near the integrated HBV DNA is not significantly correlated with its transcription level, suggesting that chromatin accessibility has limited regulatory effects on the transcription of integrated HBV DNA in the PLC/PRF/5 cell line. Furthermore, the ATAC-seq data predicted several transcription factors that may function around HBV integration sites, including NFYC, KLF15, FOS, JUN, and JUNB. Previous research has shown that these transcription factors can bind to HBV promoters and enhance their transcriptional activity. For instance, NFYC can bind to the HBV SP1 promoter26, KLF15 can bind to the HBV SP1 and CP promoters24, and JUN-FOS can bind to the SP2 promoter25. These results suggest that transcription factors may regulate the transcription of integrated HBV DNA in the PLC/PRF/5 cell line, but further experimental verification is needed to establish their actual functions in the transcription of integrated HBV DNA.
In eukaryotic cells, CpG islands in the promoter regions of host genes are methylated by methyltransferases, thereby inhibiting gene transcription33. cccDNA has been reported to contain three CpG islands, which also undergo methylation and affect virus replication34. Watanabe's study found that the methylation level of integrated HBV DNA is strongly correlated with that of the adjacent host genome35, but it remains unclear whether the methylation of integrated HBV DNA is regulated by the host, and whether methylation affects the transcription of integrated HBV DNA has not yet been elucidated. In this study, we found that the methylation level of integrated HBV DNA was higher than that of cccDNA but lower than that of the host genome. This result suggests that the episomal HBV genome itself may be at a low methylation level and that, when HBV DNA is integrated into the human genome, it is methylated by the host methylation machinery, resulting in suppressed expression. This increased methylation of foreign integrated DNA may be one of the mechanisms of host cell self-protection. In addition, we observed a significant negative correlation between the methylation of the host genome around integration sites and HBV DNA transcription levels. These results suggest that the HBV genome is methylated after it is integrated into the host genome, which inhibits its transcription, and that methylation is an important mechanism affecting the transcription of integrated HBV DNA.
Histone PTMs are another class of epigenetic modification that plays an important role in regulating gene transcription, occurring on both cccDNA and the host genome36. In this study, we found that H3K4me3 in the S gene region of integrated HBV DNA in PLC/PRF/5 cells had a deposition pattern similar to that of cccDNA from HBeAg+ patients, which suggests that both integrated HBV DNA in PLC/PRF/5 cells and cccDNA from HBeAg+ patients have higher transcriptional activity. Furthermore, we observed high enrichment of the inhibitory histone PTM H3K9me3 on the non-transcribed integrated HBV DNA on t (1;8), indicating that the chromatin around this integration site was in a heterochromatic state, which consequently inhibited the transcription of HBV DNA. Additionally, we observed that on the integrated HBV DNA in the PLC/PRF/5 cell line, the deposition of H3K4me3 and H3K9me3 seems to be mutually exclusive. However, in the liver tissues of patients with chronic hepatitis B, H3K4me3 and H3K9me3 are not always mutually exclusive, indicating that the epigenetic modifications of HBV cccDNA in the human liver are more complex.
In addition to the promoter of integrated HBV, methylation levels, and histone modification levels, we also focused on the impact of the cell cycle on HBV transcription in PLC/PRF/5 cells. We found that, compared to PLC/PRF/5 cells cultured with 10% FBS, those cultured with 0.5% FBS experienced cell cycle arrest, and at the same time, the transcription level of integrated HBV was increased (Fig. S5).
This study has several limitations. In our analysis of the correlation between histone modifications of integrated HBV DNA and its transcription levels, we only included data for the H3K4me3 and H3K9me3 modifications. Although these two modifications are representative, future studies will need to analyze correlations with more histone modifications to comprehensively validate the impact of epigenetic modifications on the transcription of integrated HBV DNA. Additionally, transcripts derived from HBV DNA fragments that are too short, or transcribed from homologous regions, cannot be assigned to a specific integration site with certainty, which affects the precision of the analysis. In this study, we focused solely on the transcriptional regulation of HBV integration in the PLC/PRF/5 cell line. In the future, it will be necessary to analyze the transcriptional regulation of HBV integration in primary hepatocytes under natural infection conditions using single-cell multi-omics technologies.
In summary, through comprehensive analysis, we constructed an integration and transcription map of HBV integration in the PLC/PRF/5 cell line. Our results indicate that the structure of integrated HBV DNA, methylation levels, and the levels of the H3K4me3 and H3K9me3 histone modifications affect the transcription of integrated HBV DNA. These results suggest that modulating the epigenetic modifications of integrated HBV DNA could lead to transcriptional silencing, and they provide novel insights for improving the functional cure rate of CHB patients while reducing the incidence of liver cancer.